I'm curious -- if I am generating webassembly, how much should I care about generating machine code that is "optimized" for the WASM interpreter? For example, I could probably simplify my life if I store all local variables into memory, but I could also work a bit harder and store some subset of them (perhaps a large subset) in WASM local variables and the native WASM stack.
Does this matter? Can I expect wasmtime to do some kind of "Heap SSA"-like transformation and rewrite both into an equivalent form?
Similarly, should I try to use memory.copy to copy "small-ish" regions of memory (e.g., 16 bytes) or is it equivalent if I just generate 4 int32 loads and then 4 int32 stores?
We don’t have the equivalent of mem2reg, and wasm semantics actually make it a bit tricky to do so (any memory access could trap and the heap state post-trap is externally visible if exported; so at least stores must remain, and loads at least must bounds-check unless one can prove in-range or subsumed which we don’t do). So a good cost model is probably: operand stack and locals become SSA become registers, loads/stores remain and are more expensive than native
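To make that cost model concrete, here is a rough sketch of a toy generator that emits the same `x = a + b` computation two ways: once with Wasm locals, once with every variable in a memory frame. The function names, the `$fp` frame-pointer parameter, and the slot offsets are all made up for illustration, and the emitted text is a function fragment that assumes the enclosing module declares a memory.

```python
def emit_add_locals() -> str:
    """x = a + b with a, b, and x held in Wasm locals (no memory traffic)."""
    return "\n".join([
        "(func $add_locals (param $a i32) (param $b i32) (result i32)",
        "  (local $x i32)",
        "  local.get $a",
        "  local.get $b",
        "  i32.add",
        "  local.set $x",
        "  local.get $x)",
    ])


def emit_add_memory_slots() -> str:
    """Same computation, but a, b, and x live at fp+0, fp+4, fp+8 in linear memory.

    $fp is a hypothetical frame pointer; the offsets are arbitrary.
    """
    return "\n".join([
        "(func $add_mem (param $fp i32) (result i32)",
        "  local.get $fp        ;; address operand for the store to x",
        "  local.get $fp",
        "  i32.load offset=0    ;; a",
        "  local.get $fp",
        "  i32.load offset=4    ;; b",
        "  i32.add",
        "  i32.store offset=8   ;; x = a + b",
        "  local.get $fp",
        "  i32.load offset=8)   ;; read x back",
    ])


def count_memory_ops(wat: str) -> int:
    """Count emitted load/store instructions, one per line."""
    return sum(1 for line in wat.splitlines()
               if ".load" in line or ".store" in line)


if __name__ == "__main__":
    print(count_memory_ops(emit_add_locals()))
    print(count_memory_ops(emit_add_memory_slots()))
```

Under the model described above, the first form has nothing left to bounds-check after register allocation, while all four memory operations in the second form stay in the compiled code.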
We also don’t try to merge smaller accesses into larger, because of the same trapping semantics; I would suggest using wasm-simd to move 128 bits at a time where possible. memory.copy turns into a trampoline into the runtime IIRC so not so cheap for smaller copies
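For the 16-byte copy, the three options might look like this in a generator. This is only a sketch: the helper names and the `$dst`/`$src` local names are hypothetical, the i32 version interleaves each load with its store (so it assumes the two regions don't overlap), and the v128 version assumes the SIMD proposal is enabled in the target engine.

```python
def emit_copy16_i32(dst: str = "$dst", src: str = "$src") -> list[str]:
    """Copy 16 bytes as four i32 load/store pairs."""
    lines = []
    for off in range(0, 16, 4):
        lines += [
            f"local.get {dst}",         # store address
            f"local.get {src}",         # load address
            f"i32.load offset={off}",
            f"i32.store offset={off}",
        ]
    return lines


def emit_copy16_v128(dst: str = "$dst", src: str = "$src") -> list[str]:
    """Copy 16 bytes with a single 128-bit load/store pair (wasm-simd)."""
    return [
        f"local.get {dst}",
        f"local.get {src}",
        "v128.load",
        "v128.store",
    ]


def emit_copy_bulk(n: int, dst: str = "$dst", src: str = "$src") -> list[str]:
    """Copy n bytes with memory.copy (operands: dst, src, length)."""
    return [
        f"local.get {dst}",
        f"local.get {src}",
        f"i32.const {n}",
        "memory.copy",
    ]


if __name__ == "__main__":
    for body in (emit_copy16_i32(), emit_copy16_v128(), emit_copy_bulk(16)):
        print("\n".join(body), end="\n\n")
```

A generator could pick the v128 form for small fixed-size copies and keep `memory.copy` for larger or variable-length ones; where that threshold sits is something to measure rather than assume.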
Thanks!
I'm familiar with binaryen; I'm wondering what other wasm-to-wasm optimizers exist-- are there any that are willing to e.g. treat memory state as "not inspectable"?
I'm only aware of binaryen myself and IIRC it has opt-in flags for semantics like that to enable optimizations
mem2reg on Wasm is also hard because we don't have the non-aliasing guarantees of LLVM's alloca. Any memory location is up for grabs by almost any store anywhere.
Binaryen faces the same limitation. I don't think any tool will ever be able to do mem2reg well on Wasm.
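A tiny simulation of the aliasing problem, using a bytearray as a stand-in for linear memory. The slot address and function names here are invented for the example; the point is only that a store through an arbitrary pointer `p` may or may not hit the slot holding `x`, so the load of `x` can't be replaced by the earlier stored value without knowing `p`.

```python
import struct

memory = bytearray(64)   # stand-in for Wasm linear memory
SLOT_X = 8               # hypothetical address of variable x in a memory frame


def store_i32(addr: int, value: int) -> None:
    struct.pack_into("<i", memory, addr, value)


def load_i32(addr: int) -> int:
    return struct.unpack_from("<i", memory, addr)[0]


def run(p: int) -> int:
    """x = 1; *p = 42; return x;  with x kept in memory."""
    store_i32(SLOT_X, 1)
    store_i32(p, 42)      # may or may not alias SLOT_X
    return load_i32(SLOT_X)


def run_promoted(p: int) -> int:
    """What a naive mem2reg-style rewrite would produce: x kept in a 'register'."""
    x = 1
    store_i32(p, 42)
    return x


if __name__ == "__main__":
    print(run(16), run_promoted(16))   # same result when p does not alias x
    print(run(8), run_promoted(8))     # results differ when p == SLOT_X
```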
Interesting.
Seems like something that tools could emit as "opt-in" annotations of a sort
It's an interesting question. Wasm doesn't have an obvious place to attach such annotations, so I don't think it's trivial to do.
I sometimes wonder if it would make sense to design a Wasm-like compiler IR, which would be like Wasm plus instructions for things like "begin a stack frame", "end a stack frame", and alloca. And on top of that, some concept of provenance for pointers. Maybe also GlobalVariable and noalias concepts too. All these things could be lowered into plain Wasm, but in unlowered form would have enough nondeterminism to give a C-compiler-style optimizer room to breathe.
Last updated: Jan 24 2025 at 00:11 UTC