fitzgen opened PR #11129 from fitzgen:revive-div-rem-magic-consts to bytecodealliance:main:
- Revive the old module for generating division-by-constant magic numbers.
- Add proptests for the divide-by-constant magic number generators, just for a little extra correctness assurance.
- Apply division-by-constant magic sequences in the ISLE mid-end.

The application of these sequences has been ported to ISLE to better integrate with our existing mid-end optimizations and to allow for recursive rule application in the optimized right-hand sides.
fitzgen requested alexcrichton for a review on PR #11129.
fitzgen requested wasmtime-compiler-reviewers for a review on PR #11129.
fitzgen requested wasmtime-default-reviewers for a review on PR #11129.
fitzgen edited PR #11129:
- Revive the old module for generating division-by-constant magic numbers.
- Add proptests for the divide-by-constant magic number generators, just for a little extra correctness assurance.
- Apply division-by-constant magic sequences in the ISLE mid-end.

The application of these sequences has been ported to ISLE to better integrate with our existing mid-end optimizations and to allow for recursive rule application in the optimized right-hand sides.

Fixes https://github.com/bytecodealliance/wasmtime/issues/6049
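For readers who have not seen the trick before, here is a minimal, self-contained sketch of the kind of multiply-and-shift rewrite these magic numbers enable, using the classic round-up method for unsigned 32-bit division. This is illustrative only: the function names (magic_u32, div_via_magic) and the exact formulation are not taken from the PR's module.

```rust
/// Sketch of the round-up method: find (m, s) with m = ceil(2^(32+s) / d)
/// so that (x * m) >> (32 + s) == x / d for every 32-bit x.
fn magic_u32(d: u32) -> (u64, u32) {
    assert!(d >= 2, "d == 0 has no quotient; d == 1 needs no magic");
    // Smallest s with d <= 2^s.
    let s = 32 - (d - 1).leading_zeros();
    // m may need up to 33 bits, so keep it in a u64 and multiply in u128.
    let m = ((1u128 << (32 + s)) + u128::from(d) - 1) / u128::from(d);
    (m as u64, s)
}

/// x / d computed via the magic constant: take the high bits of a widening multiply.
fn div_via_magic(x: u32, d: u32) -> u32 {
    let (m, s) = magic_u32(d);
    ((u128::from(x) * u128::from(m)) >> (32 + s)) as u32
}

fn main() {
    // Spot-check against the native division operator.
    for d in [3u32, 7, 10, 641, u32::MAX] {
        for x in [0u32, 1, 2, 99, 1_000_000, u32::MAX] {
            assert_eq!(div_via_magic(x, d), x / d);
        }
    }
    println!("ok");
}
```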
alexcrichton submitted PR review:
Very nice :+1:. Review-wise the main thing I did was double-check that the eval_* functions match the ISLE lowerings of the instructions, to ensure that all the goodness of the proptest/test suite/etc. mirrors into ISLE well. Very much appreciate the extensive test suite here!

A few thoughts:
- Could the proptest bits be expanded to {u,s}rem as well as division? That way the final bits of ISLE can all additionally be mapped into Rust/proptest as well.
- Is it true that in all cases this expansion is better than whatever the hardware does?
- More of a curiosity, but how do egraphs know to favor these instruction sequences? This is expanding to quite a few more instructions than a div, so do we have the cost of div/rem set really high?
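As an aside, a sketch of what the kind of remainder property test asked about above might look like, assuming the proptest crate and reusing the illustrative div_via_magic helper from the sketch further up (neither is the PR's actual test harness): derive the remainder from the magic-multiply quotient and compare it against the native `%` operator.

```rust
use proptest::prelude::*;

proptest! {
    // Hypothetical property test: the remainder recovered from the
    // magic-multiply quotient must match the native remainder for
    // every (x, d) with d >= 2.
    #[test]
    fn urem_via_magic_matches_native(x in any::<u32>(), d in 2u32..) {
        // `div_via_magic` is the illustrative helper from the earlier sketch,
        // not a function from this PR.
        let q = div_via_magic(x, d);
        // q == floor(x / d), so q * d <= x and the subtraction cannot underflow.
        let r = x - q * d;
        prop_assert_eq!(r, x % d);
    }
}
```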
alexcrichton created PR review comment:
FWIW, while these were probably copied from C or similar, you can probably simplify these greatly with 128-bit integers.
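A rough illustration of that suggestion, assuming the comment refers to the magic-constant generator (which Hacker's Delight-style code traditionally writes as a loop of narrow operations to avoid overflow): with u128 arithmetic the unsigned 64-bit case collapses to a couple of lines. The name magic_u64 and the exact shape are hypothetical, not the PR's code.

```rust
/// Sketch: compute m = ceil(2^(64+s) / d) for an unsigned 64-bit divisor
/// directly with 128-bit arithmetic, instead of an overflow-dodging loop.
fn magic_u64(d: u64) -> (u128, u32) {
    assert!(d >= 2);
    // Smallest s with d <= 2^s.
    let s = 64 - (d - 1).leading_zeros();
    // ceil(a / d) == floor((a - 1) / d) + 1 for a >= 1, which lets the
    // s == 64 case work even though 2^128 itself does not fit in a u128.
    let top = if s == 64 { u128::MAX } else { (1u128 << (64 + s)) - 1 };
    (top / u128::from(d) + 1, s)
}
```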
alexcrichton created PR review comment:
Also this sort of makes me wonder if we could have a monster ISLE rule to pattern match this and generate smulhi...
cfallin commented on PR #11129:
Is it true that in all cases this expansion is better than whatever the hardware does?
I guess I'm becoming the resident instruction latency/throughput oracle today but yes definitely -- 64-bit integer divide has a latency of 35-88 cycles (!) on Skylake, with a throughput of 1/6 (so not fully pipelined), while 64-bit integer multiply is 3 cycles and is fully pipelined (1 per cycle). I vaguely recall some issue discussion from when we first switched over to the egraph mid-end where we noticed some benchmarks where this showed up as minor regressions, too. Very happy to see this finally ported over!
github-actions[bot] commented on PR #11129:
Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle". Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file. Learn more.
fitzgen submitted PR review.
fitzgen created PR review comment:
The tricky bit with matching this gunk to produce smulhi is all of the commutative operations.
fitzgen updated PR #11129.
fitzgen commented on PR #11129:
More of a curiosity, but how do egraphs know to favor these instruction sequences? This is expanding to quite a few more instructions than a div, so do we have the cost of div/rem set really high?
We don't put the values defined by side-effectful instructions into the egraph, so we have a slightly different path. This is why we use the ISLE simplify_skeleton term instead of the usual simplify. This path assumes that all RHSes are an improvement over the LHS, and does a shallow/greedy cost comparison for choosing among multiple RHSes (rather than a dynamic-programming thing that looks at deep structure).

Concretely, in this case there will be one RHS candidate (the new magic sequence), the one RHS is trivially the best RHS, and for simplify_skeleton we always take RHSes over LHSes when available. So basically, "we favor these instructions because we generated them".
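A toy illustration of the shallow/greedy selection described here, with made-up opcodes and costs (this is not Cranelift's actual cost model or API): candidates are ranked by the cost of their root operator alone, and the original instruction is kept only when there are no candidates at all.

```rust
#[derive(Clone, Copy)]
enum Op { Udiv, Umulhi, Ushr, Iadd }

// Illustrative per-opcode costs only; not Cranelift's real numbers.
fn shallow_cost(op: Op) -> u32 {
    match op {
        Op::Udiv => 50, // division is far more expensive than the rest
        Op::Umulhi => 3,
        Op::Ushr | Op::Iadd => 1,
    }
}

// Greedy choice: if any candidate RHS exists, take the cheapest by root cost;
// only fall back to the original LHS when there are no candidates.
fn pick_replacement(original: Op, candidates: &[Op]) -> Op {
    candidates
        .iter()
        .copied()
        .min_by_key(|op| shallow_cost(*op))
        .unwrap_or(original)
}

fn main() {
    // A udiv with one rewrite candidate rooted at umulhi: the candidate wins.
    assert!(matches!(pick_replacement(Op::Udiv, &[Op::Umulhi]), Op::Umulhi));
}
```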
fitzgen has enabled auto merge for PR #11129.
fitzgen updated PR #11129.
fitzgen merged PR #11129.
jameysharp submitted PR review.
jameysharp created PR review comment:
I think what you're describing is the draft work I did, and comments I wrote, in https://github.com/bytecodealliance/wasmtime/pull/8653; I was trying to get all the way to wide imul, but I think similar reasoning should work for *mulhi.