Stream: git-wasmtime

Topic: wasmtime / PR #7306 egraphs: don't let rematerialization ...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 19 2023 at 23:22):

cfallin opened PR #7306 from cfallin:licm-in-the-correct-direction to bytecodealliance:main:

This reworks the way that remat and LICM interact during aegraph elaboration. In principle, both happen during the same single-pass "code placement" algorithm: we decide where to place pure instructions (those that are eligible for movement), and remat pushes them one way while LICM pushes them the other.

The interaction is a little more subtle than simple heuristic priority, though -- it's really a decision ordering issue. A remat'd value wants to sink as deep into the loop nest as it can (to the use's block), but we don't know where the uses go until we process them (and make LICM-related choices), and we process uses after defs during elaboration. Or more precisely, we have some work at the use before recursively processing the def, and some work after the recursion returns; and the LICM decision happens after recursion returns, because LICM wants to know where the defs are to know how high we can hoist. (The recursion is itself unrolled into a state machine on an explicit stack so that's a little hard to see but that's what is happening in principle.)

The solution here is to make remat a separate just-in-time thing, once we have arg values. Just before we plug the final arg values into the elaborated instruction, we ask: is this a remat'd value, and if so, do we have a copy of the computation in this block yet. If not, we make one. This has to happen in two places (the main elab loop and the toplevel driver from the skeleton).

The one downside of this solution is that it doesn't handle recursive rematerialization by default. This means that if we, for example, decide to remat single-constant-arg adds (as we actually do in our current rules), we won't then also recursively remat the constant arg to those adds. This can be seen in the licm.clif test case. This doesn't seem to be a dealbreaker to me because most such cases will be able to fold the constants anyway (they happen mostly because of pointer pre-computations: a loop over structs in Wasm computes heap_base + p + offset, and naive LICM pulls a heap_base + offset out of the loop for every struct field accessed in the loop, with horrible register pressure resulting; that's why we have that remat rule. Most such offsets are pretty small.).

Fixes #7283.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Oct 19 2023 at 23:22):

cfallin requested fitzgen for a review on PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 19 2023 at 23:22):

cfallin requested wasmtime-compiler-reviewers for a review on PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 19 2023 at 23:23):

cfallin requested elliottt for a review on PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 19 2023 at 23:23):

cfallin requested alexcrichton for a review on PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 03:13):

cfallin updated PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt submitted PR review:

This looks great to me! I had some questions, but none of it should block merging.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt submitted PR review:

This looks great to me! I had some questions, but none of it should block merging.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt created PR review comment:

Would you mind reverting this change? I like the space between the imports and type decls :)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt created PR review comment:

I like how much simpler this gets as a result of moving the remat decision to when the args are being processed!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt created PR review comment:

Do we know if there's a single consumer for the value at this point? Would it make sense in the future to only clone when args were rematerialized if there are >1 uses?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 07:54):

elliottt created PR review comment:

Would you mind making an issue and referencing it here? This would be nice to track.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin updated PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin created PR review comment:

Oops, leftover from some fiddling to get Entry before I found another re-export; sorry about that!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin created PR review comment:

Done! (#7313)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin created PR review comment:

Indeed, it reads much better now!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin created PR review comment:

Unfortunately we don't; and the lazy cloning we use for the main remat loop doesn't work, because while both rewrite args, we rewrite to clones that are not full eclass values (not in the unionfind, not in the extraction tables, etc); so we have to preserve the original. (Discovered the hard way while debugging!)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 19:37):

cfallin has enabled auto merge for PR #7306.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 20 2023 at 20:44):

cfallin merged PR #7306.


Last updated: Nov 22 2024 at 16:03 UTC