Stream: git-wasmtime

Topic: wasmtime / PR #6245 Avoid introducing a move for struct r...


view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:17):

elliottt opened PR #6245 from elliottt:trevor/avoid-sret-reg-move to bytecodealliance:main:

RA2-0.7.0 removes support for processing program moves, and as such we need to burn down the list of moves present in VCode prior to it being handed of to RA2.

The handling of struct return currently will allocate a fresh VReg in Lower::new that will be copied into during Lower::gen_arg_setup. That new VReg will be used when a return is processed, to ensure that the sret param is threaded out correctly.

If we discover the sret param in Lower::new instead, we can avoid allocating the temporary and instead use it directly when processing a return.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:25):

elliottt edited PR #6245:

RA2-0.7.0 removes support for processing program moves, and as such we need to burn down the list of moves present in VCode prior to it being handed of to RA2.

The handling of struct return currently will allocate a fresh VReg in Lower::new that will be copied into during Lower::gen_arg_setup. That new VReg will be used when a return is processed, to ensure that the sret param is threaded out correctly.

If we cache the sret param in Lower::new instead, we can avoid allocating the temporary and instead use it directly when processing a return. This has a mixed effect on the filetests: you can see some cases where eagerly introducing the temporary register avoids a move later, and some where avoiding the temporary allocation also avoids a move.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:30):

elliottt updated PR #6245.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:53):

elliottt has marked PR #6245 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:53):

elliottt requested abrown for a review on PR #6245.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 18:53):

elliottt requested wasmtime-compiler-reviewers for a review on PR #6245.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp submitted PR review:

I think this is a big improvement! I just have some questions about the details.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp submitted PR review:

I think this is a big improvement! I just have some questions about the details.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp created PR review comment:

I don't think the returns array needs to be cloned now. I think after the changes in this PR, the loop body only reads from vcode.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp created PR review comment:

Have you identified what's causing this duplicate move (mov x8, x23 happens twice, with only the blr in between)?

It would also be nice if it weren't just the reverse of an earlier move (mov x23, x8), and therefore a no-op, but that was happening before too. So I'm really just wondering about the new duplicate move at the moment.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp created PR review comment:

I'm inclined to suggest unwrap here. I think if we're trying to lower a function that has no entry block, something has gone very wrong.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp created PR review comment:

This assert is in the middle of an if statement which only runs when the assert is true, I think. So you can delete the assert.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:03):

jameysharp created PR review comment:

I can see that this is a pretty direct transplant of the existing logic, but would you mind making this a zip instead of enumerate plus slice indexing?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:17):

elliottt created PR review comment:

Thanks for catching this! Originally I had left it out, but felt paranoid about removing asserts :)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:19):

elliottt created PR review comment:

I agree. The other place we use if-let for the entry block is gen_arg_setup, perhaps that one should change as well?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:21):

elliottt updated PR #6245.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:31):

jameysharp created PR review comment:

On further thought, maybe there's an equivalent assert we should add here, that sret_reg.is_some() after looping over all the params?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:41):

jameysharp created PR review comment:

I was hoping to find that gen_arg_setup is dominated by something that panics if there's no entry block. I got as far as cranelift/codegen/src/dominator_tree.rs where compute_postorder just silently returns without doing anything if there's no entry block. So in that case it looks like we just generate an empty block lowering order and proceed without complaint.

Since I can't quickly prove that lowering isn't expected to work when there's no entry block, I'm going to appeal to authority and ask @cfallin if there's some reason we should be supporting that case.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 19:59):

cfallin created PR review comment:

I think that the verifier should verify that there is an entry block? If not, it seems like a reasonable thing to add. I don't think we should expend any unnecessary energy doing special things for empty function bodies; they have no meaning/semantics anyway.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 21:05):

elliottt created PR review comment:

I like that! I'll add that assert in.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 19 2023 at 21:06):

elliottt updated PR #6245.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 20 2023 at 00:29):

elliottt created PR review comment:

This was an interesting one! The call will clobber the value in x8, but we require that value to be live to the end of the function so that we can return it, so RA2 rewrites the input constraint on the call to an any constraint and moves into x8 right before the call. This ensures that we have a value (x23) to restore x8 after the call, but does yield a slightly funny result.

The move from x23 into x8 on line 910 could probably be eliminated by the redundant move eliminator, but we could also look into resolving the need for the value duplication on the in-progress overlap support PR...

view this post on Zulip Wasmtime GitHub notifications bot (Apr 20 2023 at 00:39):

jameysharp created PR review comment:

That does fully explain this, doesn't it? Yeah, I guess I'm not too worried about an extra move that is:

Especially since this sounds like exactly the sort of situation that the RA2 overlapping live ranges work should fix.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 20 2023 at 02:11):

elliottt created PR review comment:

(Just for posterity, here's the location in RA2 where this rewrite happens)
https://github.com/bytecodealliance/regalloc2/blob/244ede83638762701831453a43e6011a7dbe0e94/src/ion/liveranges.rs#L557-L616

view this post on Zulip Wasmtime GitHub notifications bot (Apr 20 2023 at 18:11):

jameysharp submitted PR review:

Right then, this seems just fine. I think any remaining cleanup can safely wait until more RA2 work is ready, even if that ends up taking a while.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 20 2023 at 18:54):

elliottt merged PR #6245.


Last updated: Dec 23 2024 at 13:07 UTC