abrown opened Issue #2256:
As suggested by @bnjbvr in https://github.com/bytecodealliance/wasmtime/pull/2248#issuecomment-702627995, we should benchmark whether clearing a register with `PXOR` before emitting the sequence for `splat` will cause a slowdown on x64. Currently, #2248 adds a weird meta-instruction, `XmmUninitializedValue`, that tells the register allocator that the `dst` register is a `def`, not a `mod`, because the sequence of instructions emitted for `splat` will overwrite all lanes of `dst`. `XmmUninitializedValue` is dangerous, though, because we must be very careful to ensure the "overwrite all lanes" invariant holds--it would be preferable to remove it. One way to do so would be to initially emit a `PXOR dst, dst`, which the new backend recognizes as a `def`. I avoided this in #2248 because of increased code size, potential slowdown, and the fact that the old backend did not have it, but if we find that its emission causes no slowdown, we should add it and remove `XmmUninitializedValue`.
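To make the def/mod distinction concrete, here is a toy Rust model of how a leading `PXOR dst, dst` changes what the register allocator sees. Everything in it is hypothetical: `Access`, `Inst`, `reads_dst_before_def`, and the instruction names in the two sequences are illustrative assumptions and are not Cranelift's or the register allocator's actual API or emitted code.

```rust
// Toy model only: these types are hypothetical, not Cranelift or regalloc APIs.

/// How a pseudo-instruction touches `dst`, from the allocator's point of view.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Access {
    None, // does not touch dst at all
    Def,  // writes dst without reading its old value
    Mod,  // reads dst, then writes it
}

/// A pseudo-instruction reduced to a label and its effect on `dst`.
struct Inst {
    name: &'static str,
    dst: Access,
}

/// Returns true if the sequence reads `dst` before anything has defined it,
/// meaning the sequence as a whole behaves like a `mod` of `dst`.
fn reads_dst_before_def(seq: &[Inst]) -> bool {
    for inst in seq {
        match inst.dst {
            Access::None => continue,
            Access::Def => return false,
            Access::Mod => return true,
        }
    }
    false
}

/// Assumed splat-like sequence with no explicit initialization of dst.
fn bare_splat() -> Vec<Inst> {
    vec![
        Inst { name: "pinsrb dst, src, 0", dst: Access::Mod },
        Inst { name: "pxor tmp, tmp", dst: Access::None },
        Inst { name: "pshufb dst, tmp", dst: Access::Mod },
    ]
}

/// The same sequence with a leading `pxor dst, dst`, modeled as a plain def.
fn splat_with_pxor() -> Vec<Inst> {
    let mut seq = vec![Inst { name: "pxor dst, dst", dst: Access::Def }];
    seq.extend(bare_splat());
    seq
}

fn main() {
    for (label, seq) in [("bare", bare_splat()), ("with pxor", splat_with_pxor())] {
        println!("{label}: reads dst before def = {}", reads_dst_before_def(&seq));
        for inst in &seq {
            println!("  {:<20} {:?}", inst.name, inst.dst);
        }
    }
}
```

In this model the bare sequence needs something like `XmmUninitializedValue` to vouch that every lane gets overwritten, while the `PXOR` variant is a `def` by construction, at the cost of one extra instruction. Whether that cost shows up at run time is the question the benchmark would have to answer.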
abrown labeled Issue #2256.
Last updated: Jan 10 2026 at 20:04 UTC