wasmtime / PR #9071 `cranelift-frontend`: Fix stack maps ... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #9071 `cranelift-frontend`: Fix stack maps ...

Wasmtime GitHub notifications bot (Aug 02 2024 at 19:36):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 19:36):

fitzgen requested elliottt for a review on PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 19:36):

fitzgen opened PR #9071 from fitzgen:stack-maps-and-liveness-and-loops to bytecodealliance:main:

Previously, we were not properly handling back edges. This manifested in values incorrectly being considered not-live inside loop bodies where they definitely were live. Consider the following example:
block0:
  v0 = needs stack map

block1:
  call foo(v0)
  call foo(v0)
  jump block1
We were previously considering v0 live only for the first call foo(v0) but not the second, because we mistakenly concluded that v0 would not be used again after that second call. While it won't be used again in this iteration of the loop, it will be used again in the next iteration of the loop.

Trevor and I initially tried implementing a clever trick suggested by Chris where, if we know the minimum post-order index of all of a block's transitive predecessors, we can continue to compute liveness in a single pass over the IR. We believed we could compute the minimum predecessor post-order index via dynamic programming. It turns out, however, that approach doesn't provide correct answers out of the box for certain kinds of irreducible control flow, only nearly correct answers, and would require an additional clever fix-up pass afterwards. We deemed this cleverness on cleverness unacceptable.

Instead, Trevor and I opted to implement a worklist algorithm where we process blocks to a fixed-point. This has the advantages of being obviously correct and producing more-precise results. It has the disadvantage of requiring multiple passes over the IR in the face of loops and back edges. Because this analysis is only used when needs-stack-map values are present (i.e. when the function uses GC values) and is otherwise skipped, this additional compile-time overhead is tolerable.

Wasmtime GitHub notifications bot (Aug 02 2024 at 19:36):

fitzgen updated PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 19:46):

fitzgen requested cfallin for a review on PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 20:23):

cfallin submitted PR review:

Looks great, thanks! Some minor requests for clarification in comments but otherwise good to go.

Wasmtime GitHub notifications bot (Aug 02 2024 at 20:23):

cfallin created PR review comment:

This is pretty clever and it took me a second to understand why we don't simply avoid pushing duplicates; it's to do with the innermost-first heuristic mentioned above (so we want to process at the last duplicate pushed, not the first), is that right?

It might be worth a comment here explicitly noting that; we push multiple times because in effect we're "hoisting" the point at which we consider a block sooner into the future, and it would be too expensive to actually remove it from deeper in the stack.

(I usually write worklist algorithms with "if already in set, don't add to queue" logic and I've seen it that way more commonly so it's worth noting this trick here!)

Wasmtime GitHub notifications bot (Aug 02 2024 at 20:23):

cfallin submitted PR review:

Looks great, thanks! Some minor requests for clarification in comments but otherwise good to go.

Wasmtime GitHub notifications bot (Aug 02 2024 at 20:23):

cfallin created PR review comment:

extra word (successors) at end of sentence?

Wasmtime GitHub notifications bot (Aug 02 2024 at 20:23):

cfallin created PR review comment:

Do we ever iterate over the live-set and produce code based on its order? It'd be good to confirm (with a comment noting here) that we don't, in order to retain deterministic compiler output.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:35):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:35):

fitzgen created PR review comment:

The one place we do iterate over them, it is only after we've copied them into a smallvec and sorted it. But I added a note to the doc comment as well.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:44):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:44):

fitzgen created PR review comment:

This is pretty clever and it took me a second to understand why we don't simply avoid pushing duplicates; it's to do with the innermost-first heuristic mentioned above (so we want to process at the _last_ duplicate pushed, not the first), is that right?

Correct, filtering on push means we process the least-recently-pushed duplicate entry. Filtering on pop means we process the most-recently-pushed duplicate entry, which gives us that inner-loops-first behavior that is desirable for minimizing the number of iterations until we reach the fixed-point.

Added some more comments; like your "hoisting" wording.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:45):

fitzgen updated PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 21:45):

fitzgen has enabled auto merge for PR #9071.

Wasmtime GitHub notifications bot (Aug 02 2024 at 22:08):

fitzgen merged PR #9071.

Last updated: Apr 18 2025 at 06:04 UTC