cfallin commented on issue #4080:
@fitzgen or @abrown, this should be ready for review now. I've stacked it
on top of #4091 as otherwise it's a merge conflict. Updated description:This PR refactors the x64 backend address-mode lowering to use an
incremental-build approach, where it considers each node in a tree of
iadd
s that feed into a load/store address and, at each step, builds
the best possibleAmode
. It will combine an arbitrary number of
constant offsets (an extension beyond the current rules), and can
capture a left-shifted (scaled) index in any position of the tree
(another extension).This doesn't have any measurable performance improvement on our Wasm
benchmarks in Sightglass, unfortunately, because the IR lowered from
wasm32 will do address computation in 32 bits and thenuextend
it to
add to the 64-bit heap base. We can't quite lift the 32-bit adds to 64
bits because this loses the wraparound semantics.(We could label adds as "expected not to overflow", and allow those to
be lifted to 64 bit operations; wasm32 heap address computation should
fit this. This isadd nuw
(no unsigned wrap) in LLVM IR terms. That's
likely my next step.)Nevertheless, (i) this generalizes the cases we can handle, which should
be a good thing, all other things being equal (and in this case, no
compile time impact was measured); and (ii) might benefit non-Wasm
frontends.
fitzgen commented on issue #4080:
(We could label adds as "expected not to overflow", and allow _those_ to
be lifted to 64 bit operations; wasm32 heap address computation should
fit this. This isadd nuw
(no unsigned wrap) in LLVM IR terms. That's
likely my next step.)This is a pretty big change for CLIF: right now it is ~basically UB-free other than memory loads/stores (and invalid
raw_bitcast
s?). Division by zero, for example, is not UB and semantically raises a trap. Adding the equivalent ofadd nuw
would be the first instance of UB we have in basic arithmetic instructions, and that's something we shouldn't do lightly. I think this will have subtle and far-reaching effects that aren't immediately obvious to us now. For example, verification/superoptimization would have to keep track of UB, explicitly represent it in the SMT queries they create, and then check refinement rather than equivalence.
cfallin commented on issue #4080:
Updated! @abrown let me know if the additional comments make sense.
In the process of tweaking priorities to try to get everything to apply in the right order, I was hitting some weird behavior and suspect there may be an issue with priorities in general. I'm going to go play with that a bit now, but it shouldn't affect correctness here (the order in which rules apply seems wrong sometimes but the rules themselves are applied correctly).
cfallin commented on issue #4080:
This is a pretty big change for CLIF: right now it is ~basically UB-free other than memory loads/stores (and invalid raw_bitcasts?). Division by zero, for example, is not UB and semantically raises a trap. Adding the equivalent of add nuw would be the first instance of UB we have in basic arithmetic instructions, and that's something we shouldn't do lightly. I think this will have subtle and far-reaching effects that aren't immediately obvious to us now. For example, verification/superoptimization would have to keep track of UB, explicitly represent it in the SMT queries they create, and then check refinement rather than equivalence.
Yeah, I've been digging into cases where I thought this might help and it seems that the adds are coming from bona fine
i32.add
s in the Wasm, not some part of the legalization ofheap_addr
with offsets; in this case we actually have to implement the wrapping behavior faithfully, so nevermind the above. I do agree as well actually that nondeterminism isn't the best answer here; if we come back to a similar issue later maybe the right answer is to just usei64
s in the right spots (or the host pointer type) to avoid the extends.
Last updated: Dec 23 2024 at 12:05 UTC