fitzgen opened PR #10340 from fitzgen:pure-memflag to bytecodealliance:main:
This flag represents whether the memory operation's safety (e.g. the validity of its
notrapandreadonlyclaims) is purely a function of its data dependencies.If this flag is
true, then it is okay to code motion this instruction to arbitrary locations, in the function, including across blocks and conditional branches, so long as data dependencies (and trap ordering, if relevant) are upheld.If this flag is
false, then the memory operation's safety potentially relies upon invariants that are not reflected in its data dependencies, and therefore it is not safe to code motion this operation. For example, this operation could be in a block that is dominated by a control-flow bounds check that makes this operation safe, and that invariant is not reflected in its operands. It would be unsafe to code motion such an instruction above its associated bounds check, even if its data dependencies would still be satisfied.I've added this flag because we were doing exactly that kind of code motion where we moved a
readonlyandnotrapmemory operation past its associated null-check and therefore it was no longer safe to perform and we would get a segfault. This could only be triggered when the Wasm typed-function-references proposal was enabled, which is not a tier-1 proposal, so it is not considered a vulnerability. Nonetheless, it is a pretty scary kind of bug, and other code paths weren't affected due to pretty subtle interactions. And this is the motivation for the new "pure" flag: without needing to explicitly opt into data-dependency-based code motion (i.e. set the "pure" flag), it is too easy to accidentally move loads past their control-flow-based safety guards.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
fitzgen requested cfallin for a review on PR #10340.
fitzgen requested wasmtime-compiler-reviewers for a review on PR #10340.
fitzgen requested alexcrichton for a review on PR #10340.
fitzgen requested wasmtime-core-reviewers for a review on PR #10340.
cfallin submitted PR review:
Thanks for finding this and for the excellent documentation of semantics of the new flag. I think this is a reasonable solution.
Would you mind adding two filetests (e.g. in
cranelift/filetests/filetests/egraph/) testing that a non-pure load won't LICM, and that a pure/readonly/notrap one will? (This will also force the parser bit below)
cfallin created PR review comment:
We'll need the corresponding change to the parser (
cranelift/reader/src/parser.rs) too, I think.
fitzgen updated PR #10340.
fitzgen submitted PR review.
fitzgen created PR review comment:
That all happens via
ir::MemFlags::set_by_name, so the parser itself doesn't actually need any updating.
fitzgen commented on PR #10340:
Would you mind adding two filetests (e.g. in
cranelift/filetests/filetests/egraph/) testing that a non-pure load won't LICM, and that a pure/readonly/notrap one will? (This will also force the parser bit below)Extended
load-hoist.clifto test both hoisting and non-hoisting. Good idea, thanks!
alexcrichton submitted PR review.
alexcrichton created PR review comment:
How come this is readonly? Can't the length change over time?
alexcrichton created PR review comment:
This raises an eyebrow from me. Functionally I don't think this is an issue, but it seems kind of weird to have a pure, but not readonly,
MemFlags. I could sort of extend that to "what is apurebut trapping load?".To be conservative though, do you think it would be reasonable to validate that pure memflags are always notrap+readonly? Or is there a use case for a trapping pure operation or non-readonly pure operation?
fitzgen submitted PR review.
fitzgen created PR review comment:
Nope, GC arrays have fixed length.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ah! Ok color me ignorant then
fitzgen submitted PR review.
fitzgen created PR review comment:
The semantics of
pureare "this memory operation's safety, including the validity of its other flags, depend only on its data dependencies". There is nothing about that statement that requires the memory operation also bereadonlyor also benotrap. Therefore, I'd rather not add additional validation code that ultimately isn't necessary.
fitzgen submitted PR review.
fitzgen created PR review comment:
We could potentially rename it to
data-pureor something if you think that would be better?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
What I'm worried about is that it's a pretty subtle definition and you basically can't do anything if you only check
pure, it's only if pure + readonly + notrap are all set that something can actually be done about moving the instruction around. I do realize it's more validation though... Do you feel that there's not that much risk of someone in the future usingpureby accident?In terms of bikeshedding I suppose my concern does indeed stem from the term "pure" being pretty overloaded and my gut feeling about "pure" things is "oh hey you can put that anywhere and it's fine". Basically I do think my concern might be assuaged with a different name. That being said "data-pure" doesn't feel all that different from "pure", but perhaps different enough?
I'm trying to think of other ideas...
notrap_alwaysnotrap_everywheresemantically_ok_to_put_anywhere(ok definitely not)cfg_independentI'm reading up on
readonlyand that already says that nothing about the memory can change from when the function starts to when it ends, so maybenotrap_fooas a name? Maybenotrap_pure? Something like that maybe...
cfallin submitted PR review.
cfallin created PR review comment:
I actually agree that it probably doesn't make sense to have a load that is
purebut notreadonly:puremeans that we can sink or hoist it anywhere (constrained by its address input data dependency); that should only be possible if the memory is unchanging. (Right? Or are there cases where we're ok with hoisting but the value is changing, and we might load a different value if hoisted?)
pureis a bit different than (and orthogonal to)notrapthough: it's ok to have an address that maybe traps, but traps everywhere the same. In other words,puresays "the trappiness of this address is unchanging" whilenotrapsays "this address is not trappy at this location".
cfallin submitted PR review.
cfallin created PR review comment:
Maybe we should name flags based on actions that we expect or permit the compiler to do instead:
can_trap: add trap metadata. This is the inverse ofnotrap(why invert it? because we're describing actions, not lack of actions)can_skip: allow dead-code elimination if unusedcan_move: can hoist or sink as data dependencies allowcan_merge: can reuse values from equivalent loads, even without alias analysis proving this (i.e., value is readonly)Is that sufficient to express what we need?
is_pure_for_egraphis thencan_skip+can_move+can_merge(since the egraph can do all three); DCE usescan_skip; if we do other things in the future we have the specific action-permissions defined for those.Alternately-alternately, we could look at the predicates we define today and give specific names to those: e.g., redefine
pureto be whatis_pure_for_egraphtests; and thennotrap(orcan_trap) for DCE; and I think that's it?
github-actions[bot] commented on PR #10340:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "wasmtime:ref-types"Thus the following users have been cc'd because of the following labels:
- fitzgen: wasmtime:ref-types
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
fitzgen submitted PR review.
fitzgen created PR review comment:
I like
can_{trap,skip,move,merge}and think that is a good place to aim towards ending up at. This PR'spureis equivalent tocan_move, but the rest of those flags don't align exactly with what we have today. For example,can_skipwould beis-not-a-store && notraptoday -- would it ever make sense to havecan_skipon a store?Anyways, what do we think of, in order to land this PR, renaming this PR's
puretomove,can_move, ormovableand then kicking the can down the road for the rest of the potential flags overhaul?
cfallin submitted PR review.
cfallin created PR review comment:
SGTM!
And yes, I think at least
can_skipandcan_mergeapply only to loads. Stores can definitely vary on trappiness, and I thinkcan_movemakes sense too -- some store that we want to happen at some point but don't care where (analogous to eventual consistency or relaxed-consistency atomics).
fitzgen updated PR #10340.
fitzgen has enabled auto merge for PR #10340.
fitzgen merged PR #10340.
Last updated: Dec 06 2025 at 06:05 UTC