alexcrichton opened issue #8116:
On the heels of #8112 and #8113, this should succeed but it does not:
(module (memory 1) (func (export "main") (param i32 f64 i32) (result f64) local.get 2 f64.load f64.ceil return)) (assert_return (invoke "main" (i32.const 1) (f64.const 1) (i32.const 0xfff8) ) (f64.const 0))
$ wasmtime wast foo.wast -C cranelift-has-avx=false Error: failed to run script file 'foo.wast' Caused by: 0: failed directive on foo.wast:9:1 1: error while executing at wasm backtrace: 0: 0x2f - <unknown>!<wasm function 0> 2: memory fault at wasm address 0x10000 in linear memory of size 0x10000 3: wasm trap: out of bounds memory access
I discovered this by commenting out these conversions and auditing everything that relied on them. The instructions I found were
rounds{s,d}
which are related to instructions likef64.ceil
. Note that this requires AVX to be disabled (since AVX works on unaligned loads/stores) and additionally requires SSE4.1 (or some sse variant) to be enabled.I think that this is the only one relying on non-128-bit size registers, but at the same time I'm not certain.
alexcrichton commented on issue #8116:
I believe that this may just be limited to the round instructions because I believe that floating-point operations on f32 and f64 do not require alignment but working on 128-bit values does require alignment. That means instructions [like
addss](https://github.com/bytecodealliance/wasmtime/blob/daa7fdf2cef425f73fcd50f91bad2698d0d1813f/cranelift/codegen/src/isa/x64/inst.isle#L3348-L3351) specifically use
*_unalignedvariants of instructions to avoid this issue. However
rounds{s,d}were special in that I was "too lazy" to make a whole new instruction variant for a unaligned-unary-operation-with-immediate. That I think at least explains why it's just
rounds{s,d}` affected and it looks like no others.Even with that though with a fix like https://github.com/bytecodealliance/wasmtime/pull/8113 to "just" do the load eagerly and never take advantage of load sinking I'd fear that I would never be able to remember this when reviewing changes to the x64 backend.
alexcrichton edited a comment on issue #8116:
I believe that this may just be limited to the round instructions because I believe that floating-point operations on f32 and f64 do not require alignment but working on 128-bit values does require alignment. That means instructions like
addss
specifically use*_unaligned
variants of instructions to avoid this issue. Howeverrounds{s,d}
were special in that I was "too lazy" to make a whole new instruction variant for a unaligned-unary-operation-with-immediate. That I think at least explains why it's justrounds{s,d}
affected and it looks like no others.Even with that though with a fix like https://github.com/bytecodealliance/wasmtime/pull/8113 to "just" do the load eagerly and never take advantage of load sinking I'd fear that I would never be able to remember this when reviewing changes to the x64 backend.
cfallin commented on issue #8116:
Potentially very unpopular suggestion: we could build another newtype around xmm registers (and memory variants) that means "narrow
f{32,64}
value". It's a departure from how we handle GPRs (we have no problem usingGpr
for 8/16/32/64-bit-wide values, and we don't seem to have load-width bugs at the moment), but it at least is a speedbump for the ISLE rule author to remember that some special handling may be required.Another possibility would be to encode operand size into the
XmmMem
and at least assert it matches what the instruction it's been given to will do.Happy to talk more during the CL meeting in a few minutes!
fitzgen closed issue #8116:
On the heels of #8112 and #8113, this should succeed but it does not:
(module (memory 1) (func (export "main") (param i32 f64 i32) (result f64) local.get 2 f64.load f64.ceil return)) (assert_return (invoke "main" (i32.const 1) (f64.const 1) (i32.const 0xfff8) ) (f64.const 0))
$ wasmtime wast foo.wast -C cranelift-has-avx=false Error: failed to run script file 'foo.wast' Caused by: 0: failed directive on foo.wast:9:1 1: error while executing at wasm backtrace: 0: 0x2f - <unknown>!<wasm function 0> 2: memory fault at wasm address 0x10000 in linear memory of size 0x10000 3: wasm trap: out of bounds memory access
I discovered this by commenting out these conversions and auditing everything that relied on them. The instructions I found were
rounds{s,d}
which are related to instructions likef64.ceil
. Note that this requires AVX to be disabled (since AVX works on unaligned loads/stores) and additionally requires SSE4.1 (or some sse variant) to be enabled.I think that this is the only one relying on non-128-bit size registers, but at the same time I'm not certain.
Last updated: Jan 24 2025 at 00:11 UTC