wasmtime / issue #11340 Figure out how to optimize away `... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #11340 Figure out how to optimize away `...

Wasmtime GitHub notifications bot (Jul 29 2025 at 17:42):

For example, consider tests/disas/component-model/direct-adapter-calls-inlining.wat. It begins by checking and setting the component flag globals for entry into a component:

https://github.com/bytecodealliance/wasmtime/blob/6b7465480b6fc831dad1b9dfad09b7cf58a05bd3/tests/disas/component-model/direct-adapter-calls-inlining.wat#L80-L100

And ends by resetting those globals to their original values:

https://github.com/bytecodealliance/wasmtime/blob/6b7465480b6fc831dad1b9dfad09b7cf58a05bd3/tests/disas/component-model/direct-adapter-calls-inlining.wat#L104-L133

This is the vast bulk of the inlined callee code, since the actual callee got const-prop'd away into a simple constant:

https://github.com/bytecodealliance/wasmtime/blob/6b7465480b6fc831dad1b9dfad09b7cf58a05bd3/tests/disas/component-model/direct-adapter-calls-inlining.wat#L143

Ideally the only thing we would have after inlining is that constant. I guess we also need to check the flags, but we shouldn't need to write to them at all.
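To make the shape of the problem concrete, here is a minimal Rust model of the check/set/reset sequence described above. It is only a sketch: `Flags`, `may_enter`, `Trap::Reentrance`, and `call_adapter` are hypothetical names for illustration, and the real adapter's flags and bit layout are not reproduced here.

```rust
/// Hypothetical stand-in for a component's flag global.
#[derive(Debug)]
struct Flags {
    may_enter: bool,
}

/// Hypothetical trap kind raised when re-entrance is detected.
#[derive(Debug, PartialEq)]
enum Trap {
    Reentrance,
}

/// Models the adapter: check the flag, clear it, run the callee, restore it.
fn call_adapter(flags: &mut Flags, callee: impl FnOnce() -> i32) -> Result<i32, Trap> {
    // The check has to stay: it is what detects re-entrance.
    if !flags.may_enter {
        return Err(Trap::Reentrance);
    }
    // Store 1: clear the flag for the duration of the call.
    flags.may_enter = false;
    // The callee cannot touch `flags` here, so it cannot observe store 1.
    let result = callee();
    // Store 2: put the original value back.
    flags.may_enter = true;
    Ok(result)
}
```

When the callee is inlined and folds to a constant, nothing between the two stores can read the flag, so together they leave it unchanged and only the check plus the constant are needed.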

Addressing this likely involves implementing dead-store-elimination in our alias analysis, but maybe also requires some higher-level analyses and logic in the adapter generation code as well, where we know exactly what can and cannot observe those globals and when.

Wasmtime GitHub notifications bot (Jul 29 2025 at 17:42):

fitzgen added the cranelift:goal:optimize-speed label to Issue #11340.

Wasmtime GitHub notifications bot (Jul 29 2025 at 17:42):

fitzgen added the wasm-proposal:component-model label to Issue #11340.

Wasmtime GitHub notifications bot (Jul 29 2025 at 17:43):

fitzgen commented on issue #11340:

cc @cfallin

Wasmtime GitHub notifications bot (Aug 12 2025 at 16:51):

cfallin commented on issue #11340:

This looks like something we could get to work fairly well if we have custom alias-analysis regions and use one per global; then the store-elimination condition will be fairly robust to any other intervening memory traffic.

The tricky part is the condition that no intervening may-trap operations occur. This is needed for soundness in the absence of more semantic info from the IR, because we ensure that memory state at a trap is as-if no optimizations occurred. But if we know that the flags are not observable after a trap, perhaps we could note that with a flag; then we can allow trapping ops not to clobber the last-store info. (Is it really a core Wasm-level global or some custom component-model storage? Is it accessible via any of our instance introspection APIs?)
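A toy version of that store-elimination condition might look like the following. This is a sketch under simplifying assumptions, not Cranelift's actual alias analysis: it handles straight-line code only, treats each region as a single location (one global per region), and uses the hypothetical names `Region`, `Inst`, and `dead_stores`. The `hidden_after_trap` list plays the role of the proposed "not observable after a trap" flag.

```rust
use std::collections::HashMap;

/// Hypothetical alias region id; one region per flag global.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Region(u32);

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Inst {
    Store { region: Region, value: i64 },
    Load { region: Region },
    /// Any operation that may trap (for example, a bounds-checked access).
    MayTrap,
}

/// Returns the indices of stores that are overwritten before anything can
/// observe them. Regions in `hidden_after_trap` keep their last-store info
/// across may-trap operations; all other regions are clobbered by them.
fn dead_stores(insts: &[Inst], hidden_after_trap: &[Region]) -> Vec<usize> {
    let mut last_store: HashMap<Region, usize> = HashMap::new();
    let mut dead = Vec::new();
    for (i, inst) in insts.iter().enumerate() {
        match inst {
            Inst::Store { region, .. } => {
                // A previous unobserved store to the same region is dead.
                if let Some(prev) = last_store.insert(*region, i) {
                    dead.push(prev);
                }
            }
            // A load observes the last store to its region.
            Inst::Load { region } => {
                last_store.remove(region);
            }
            // A trap may expose memory state, unless the region is hidden.
            Inst::MayTrap => {
                last_store.retain(|r, _| hidden_after_trap.contains(r));
            }
        }
    }
    dead
}
```

Because the tracking is per region, stores and loads to other regions never disturb a flag's last-store info. Note that this only covers the overwritten first store; recognizing that the final store writes back the value the flag already held would need an additional redundant-store check.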

Wasmtime GitHub notifications bot (Aug 12 2025 at 17:11):

fitzgen commented on issue #11340:

> This looks like something we could get to work fairly well if we have custom alias-analysis regions and use one per global

In https://github.com/bytecodealliance/wasmtime/issues/9348, I want to remove our hard-coded alias regions and replace them with an arbitrary u8 or something that the Cranelift embedder assigns semantics to. This would work well here, allowing us to give these flags unique regions. (In a world where we go crazy with regions, we might need to map multiple application-level logical regions onto the same Cranelift-level "physical" region due to bitpacking constraints and running out of u8 space, or else we would have to make a new alias region entity table; that is all something we can deal with if/when we get to it.)

> The tricky part is the condition that no intervening may-trap operations occur. This is needed for soundness in the absence of more semantic info from the IR, because we ensure that memory state at a trap is as-if no optimizations occurred. But if we know that the flags are not observable after a trap, perhaps we could note that with a flag; then we can allow trapping ops not to clobber the last-store info.

This perhaps seems like a property of the alias region: is this region observable after traps or not? If we want to start having additional metadata about alias regions like this, then maybe they really do need to become their own entity table.

> (Is it really a core Wasm-level global or some custom component-model storage? Is it accessible via any of our instance introspection APIs?)

They are implemented with Wasm-level globals but they are not directly observable by anything but the Wasm adapter functions we generate. Guest Wasm cannot access them directly.

However, they do prevent re-entrancy (that is what these gets/sets are implementing in the first place) and the traps we raise when we detect re-entrancy are visible to embedders (the guest component halts and cannot be re-entered after one of these traps, as attempts to do so will just keep hitting this re-entrancy trap).

They are not exported, and so embedders cannot get a wasmtime::Global to introspect them. I think they will show up in core dumps, but that would just be us leaking an implementation detail in our core dumps, essentially, and I would argue that doesn't "count". There definitely shouldn't be any way for the embedder to modify these flags directly.

Wasmtime GitHub notifications bot (Aug 12 2025 at 17:29):

cfallin commented on issue #11340:

Yep, your custom alias region proposal is what I had in mind :-) This would work nicely as a new kind of entity: something like region0 = region / region1 = region internal in the preamble. We could also consider whether other semantic properties make sense, e.g. noting a region is readonly, or guaranteed not to be altered by callees, etc. Perhaps it even makes sense to move all semantic info we carry on MemFlags into the region? (I'm struggling to think of a case where we could have loads/stores with different semantics at different points to the same disjoint subset of addresses...)
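As a rough illustration of regions as their own entity with attached semantic properties, consider the sketch below. Everything in it is hypothetical: `AliasRegion`, `AliasRegionData`, `AliasRegions`, and the field names are made up for this example, and reading `internal` as "not observable after a trap" is an assumption. A plain `Vec` stands in for whatever entity map the real implementation would use.

```rust
/// Hypothetical entity reference, in the spirit of `region0`, `region1`, ...
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AliasRegion(u32);

/// Hypothetical per-region properties discussed in this thread.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct AliasRegionData {
    /// Assumed meaning: contents are not observable after a trap.
    internal: bool,
    /// Never written after initialization.
    readonly: bool,
    /// Guaranteed not to be altered by callees.
    preserved_by_callees: bool,
}

/// A minimal entity table mapping region ids to their properties.
#[derive(Debug, Default)]
struct AliasRegions {
    entries: Vec<AliasRegionData>,
}

impl AliasRegions {
    /// Declares a new region and returns its id.
    fn declare(&mut self, data: AliasRegionData) -> AliasRegion {
        let id = AliasRegion(self.entries.len() as u32);
        self.entries.push(data);
        id
    }

    /// Whether a may-trap operation must clobber last-store info for `region`.
    fn trap_clobbers(&self, region: AliasRegion) -> bool {
        !self.entries[region.0 as usize].internal
    }
}
```

With a table like this, an optimization pass would consult the region's properties instead of per-instruction flags, which is what makes moving `MemFlags`-style information into the region plausible.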

If I recall correctly, one of the concerns we had at that time was whether we could fit a whole region identifier in the 16-byte InstructionData for a load/store, hence the u8 region ID; stores already have opcode, 32-bit offset, address, data, and flags (so three u32s and discriminant+flags packed in a fourth). Limiting to 256 regions seems anemic (consider: calling many different components from one large init function, each with their own reentrancy flag). But if we're replacing MemFlags (per above) by moving its flags to the region, then we have 16 or 24 bits (depending on InstructionData discriminant size) to play with, which seems reasonable.
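The space budget can be sketched as follows. This is not Cranelift's real `InstructionData` layout: `PackedStore` and its fields are hypothetical, and the sketch assumes an 8-bit discriminant so that the remaining 24 bits of the fourth word can hold the region id in place of the flags.

```rust
/// Hypothetical 16-byte store encoding: three u32-sized operands plus one
/// word that packs the discriminant (low 8 bits) and region id (high 24 bits).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedStore {
    opcode_and_region: u32,
    offset: i32,
    addr: u32,
    data: u32,
}

const REGION_BITS: u32 = 24;
const MAX_REGION: u32 = (1 << REGION_BITS) - 1;

impl PackedStore {
    /// Returns `None` if the region id does not fit in 24 bits.
    fn new(opcode: u8, region: u32, offset: i32, addr: u32, data: u32) -> Option<Self> {
        if region > MAX_REGION {
            return None;
        }
        Some(Self {
            opcode_and_region: (region << 8) | u32::from(opcode),
            offset,
            addr,
            data,
        })
    }

    fn opcode(&self) -> u8 {
        (self.opcode_and_region & 0xff) as u8
    }

    fn region(&self) -> u32 {
        self.opcode_and_region >> 8
    }
}

// Compile-time check that the encoding still fits the 16-byte budget.
const _: () = assert!(std::mem::size_of::<PackedStore>() == 16);
```

Under these assumptions a function could name roughly 16 million regions, compared with 256 for a u8 id; with a 16-bit discriminant the same scheme would leave 16 bits, or 65,536 regions.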

Wasmtime GitHub notifications bot (Aug 12 2025 at 17:39):

fitzgen commented on issue #11340:

> (I'm struggling to think of a case where we could have loads/stores with different semantics at different points to the same disjoint subset of addresses...)

The only flag I can think of would be can_move which depends more on the data-flow properties of the instruction than the memory region.

Last updated: Feb 24 2026 at 05:28 UTC