wasmtime / Issue #1175 Replace temporary lowering of Wasm... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / Issue #1175 Replace temporary lowering of Wasm...

Wasmtime GitHub notifications bot (Jul 24 2020 at 21:48):

Having a specialized load_splat Cranelift instruction would be beneficial for the AArch64 backend (so that it could generate LD1R instructions). I tried the alternative approach quickly, i.e. smarter codegen, which in the AArch64 case means using maybe_input_insn(), but it didn't work (I was getting None as the return value) - the reason was probably because the load might have side effects.

Wasmtime GitHub notifications bot (Jul 24 2020 at 21:59):

abrown commented on Issue #1175:

Let's do that then; in the past, it's always been ok to create Cranelift instructions that map exactly to Wasm SIMD instructions so that seems like a good path forward. If you add it (somewhere in meta/src/shared/instructions.rs) feel free to tag me and I'll review it.

Wasmtime GitHub notifications bot (Jul 27 2020 at 22:31):

akirilov-arm commented on Issue #1175:

I'd be interested in having a wider discussion; in particular, @cfallin might have a different opinion with respect to the AArch64 backend.

Wasmtime GitHub notifications bot (Jul 27 2020 at 22:49):

cfallin commented on Issue #1175:

I think it's a reasonably pragmatic solution to have the dedicated instruction; it reduces work overall in the pipeline if we can carry the whole package of operations together as one unit from Wasm to machine code.

There's a possible concern if we have a lot of these; it's certainly nice to only have to worry about literal "load" and "store" instructions when analyzing loads and stores. But perhaps we get around that by having accessors on the instruction ("this inst is a load of some form, with address X and size Y"). Or, in the worst case, the instruction is a black-box with unknown side-effects from the point of view of other optimization passes, so it inhibits some optimization but is still correct.

@akirilov-arm, the reason that your (quite reasonable!) approach attempting to merge the ops in the backend didn't work is indeed because load is side-effecting, so will never be returned by maybe_input_insn(). This prevents the backend from arbitrarily sinking a side-effecting op, which could lead to a miscompile (consider if the load-splat were sunk past a store to an aliasing pointer, for example). The "correct" thing to do is to hoist the splat instead and then merge, but the framework isn't really set up to allow that.

Wasmtime GitHub notifications bot (Oct 28 2020 at 19:54):

cfallin closed Issue #1175:

What is the feature or code improvement you would like to do in Cranelift?

In bytecodealliance/cranelift#1347 I added a temporary lowering for Wasm's load_splat to two Cranelift instructions, load + splat. This generates extra instructions that could be removed by a specialized Cranelift load_splat instruction or by smarter codegen (e.g. complex addressing on splat).

What is the value of adding this in Cranelift?

Fewer instructions produced.

Do you have an implementation plan, and/or ideas for data structures or algorithms to use?

Seeking feedback on which way to proceed: specialized load_splat or smarter codegen.

Have you considered alternative implementations? If so, how are they better or worse than your proposal?

See above.

Last updated: Apr 17 2025 at 21:03 UTC