Stream: git-wasmtime

Topic: wasmtime / PR #4902 ISLE: merge trie edges across priorit...


view this post on Zulip Wasmtime GitHub notifications bot (Sep 13 2022 at 03:18):

cfallin opened PR #4902 from isle-trie-build to main:

Previously, the ISLE trie-building process (which constructs the tree of match operations lowered into the generated Rust code) generated code for each priority level separately, for simplicity and to avoid subtle bugs in edge-ordering (see e.g. #4093, #4102 and #4117).

This PR implements a "priority frontier" approach that allows sharing a single match edge across rules of different priority levels, but in a fairly simple way: it builds the trie in rule-priority order (highest to lowest), and at each decision node, tracks the last edge that got an insertion at the previous priority level. Then a rule at the current priority level can be inserted into the subtree of any edge at or beyond that one, but not before it. This requires constant space per node and allows for significant flexibility. Within the range of possible insertion locations, we respect the sort order over trie symbols, as currently, so that matches on the same enum get placed adjacently to each other and can merge into match statements.

The advantage of this approach is that as we use more priority levels to break ties when rules overlap (as we are planning to do, for more semantic clarity), we do not regress compile-time performance by the use of more priority levels.

The generated code at least on AArch64 is actually almost unchanged (a few let ... = arg0; trivial arg-value match ops at the top become shared now), which indicates that our priority order already locked down the important tie-breaks and that this new trie-building mechanism otherwise respects the same ordering that we currently do.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Sep 13 2022 at 03:18):

cfallin requested elliottt for a review on PR #4902.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 13 2022 at 16:21):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 06:07):

elliottt created PR review comment:

Worth using a cached key here?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 06:07):

elliottt created PR review comment:

What do you think about adding a struct to name this pair? Something like InsertPointer maybe?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 06:07):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 06:07):

elliottt created PR review comment:

Is the priority value in last_prio tracked only for this assert?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 06:07):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 15:48):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 15:48):

jameysharp created PR review comment:

I haven't looked at the surrounding context, but I'd guess that this expression is a single memory access and maybe a couple of arithmetic instructions. So I'd guess that copying the cached keys into another array wouldn't save any time and would waste some space.

If we're micro-optimizing I'd go the other direction and use sort_unstable_by_key. If you want a stable sort, in this case you can get it by making the key be (prio, index).

Unrelated: you might consider using std::cmp::Reverse instead of negating the priority. IMO that serves as better documentation, enough that you could maybe delete the comment. I also wouldn't be surprised if it generates better code (by swapping the order of comparison rather than doing any negation) and it avoids worries about integer overflow for negating the most-negative signed value.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:22):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:23):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:23):

cfallin created PR review comment:

It was, yeah, and I guess as a side-effect of using the same type for cur_prio and last_prio (it's actually needed for cur_prio). This let start = expression got much simpler in the refactor though so it's no longer here.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:23):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:23):

cfallin created PR review comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:25):

cfallin created PR review comment:

Oh, neat, I didn't know about std::cmp::Reverse -- thanks! Updated to use that. (And agreed that caching likely doesn't make much sense here: it would just be avoiding the unwrap_or(0) but that's something like 3 instructions (cmp, generate 0, conditional-move) at the cost of more allocation and memory traffic.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2022 at 18:25):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 19 2022 at 19:17):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 00:30):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 00:45):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:11):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:11):

elliottt created PR review comment:

I think this is the same, but avoids testing start at each iteration.

            .skip(start.or_else(0))
            .position(|edge| edge.symbol == op)

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:12):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:20):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:20):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 01:20):

cfallin created PR review comment:

Not quite I think, as then position will return an index shifted down by the skipped amount, but we can add that back. Updated with that extra bit as another .map() in the chain.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:05):

elliottt created PR review comment:

Ah right, inserting between start and cur-prio.edge_idx means that cur_prio.edge_idx needs to be incremented to point at the same element :+1:

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:05):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:10):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:10):

elliottt created PR review comment:

This is also the case that currently requires that cur_prio.edge_idx be set to None on line 251. Is it worth calling that connection out with a comment?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:10):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 02:10):

elliottt edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:31):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:32):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:32):

cfallin created PR review comment:

Yup, added (actually at the comment above this where None is set).

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:45):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:45):

elliottt created PR review comment:

Yep, I was hoping to add another comment here referencing that one to tie the two together.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:56):

cfallin updated PR #4902 from isle-trie-build to main.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:56):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 03:56):

cfallin created PR review comment:

Ah, sure thing -- added one here too!

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 04:07):

elliottt submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 04:07):

elliottt created PR review comment:

There's a problem here with removing the edge at start from consideration when inserting a new node: if none of the rules at previous priorities had an op inserted that would sort less than the op we're inserting, the node at start may shadow it. Here's the extractor_ordering_bug.isle test with the additional rule (rule 3 (test (identity 3)) 3) that shows the bug:

(type u32 (primitive u32))

(decl identity (u32) u32)
(extern extractor infallible identity identity)

(decl is_zero (u32) u32)
(extern extractor is_zero is_zero)

(decl test (u32) u32)

(rule 3 (test (identity 3)) 3)

;; This exposes a bug where infallible extractors were running before fallible
;; ones, as the derived ordering for the `Extract` type was ordering them ahead
;; of the fallible ones. The result is that the fallible `is_zero` extractor
;; never runs, as the `identity` extractor will always succeed before it's
;; called.
(rule (test (identity x)) x)

(rule (test (is_zero x)) 2)

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `Some(0)`,
 right: `Some(2)`', /tmp/.tmpSbDsCf/extractor_ordering_bug_main.rs:21:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test test_run_run_extractor_ordering_bug ... FAILED

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 04:50):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 04:50):

cfallin created PR review comment:

So, the issue with that is that removing the "last frontier plus one" logic for the starting-point also creates a bug: it allows a lower-priority rule to insert an edge prior to an edge that leads to higher-priority leaves.

The fundamental issue is that the sorting of match ops within a priority is at odds with the sharing of the match edges between priorities. The only way for sharing to occur is for the match op in a given prio range to be last, and that same match op to be first in the next prio range.

So there are two possible design regimes:

My understanding of this whole situation is that by deciding to disallow overlapping rules not disambiguated by priorities, we have eliminated the need for cases like this: we would not allow the ISLE rules above to exist at the same priority, because they overlap.

It was in this context that we realized the need for the "sharing across priority levels" scheme, and this is how this PR arose. In other words, this PR relies on the fact that we are allowed to reorder rules arbitrarily within a priority, and does so in order to maximize sharing.

However, if we also still want to have stricter ordering within a priority, and hold this ordering as semantically binding (as the test case above does), then we must abandon this PR, and this optimization scheme generally. This is because we cannot both strictly sort ops within a priority, and share between priorities, unless the last op of a given priority (in sort order) happens to be the same as the first op of the next priority (in sort order).

So: do we ban overlaps, and then use the resulting freedom to merge edges, getting our efficiency back after adding prios to disambiguate overlaps? Or do we go back to the system that we had before, and strictly respect op ordering?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 20 2022 at 16:47):

cfallin closed without merge PR #4902.


Last updated: Jan 24 2025 at 00:11 UTC