Stream: git-wasmtime

Topic: wasmtime / issue #8116 More cases of aligned xmm loads lo...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 15:00):

alexcrichton opened issue #8116:

On the heels of #8112 and #8113, this should succeed but it does not:

(module
  (memory 1)
  (func (export "main") (param i32 f64 i32) (result f64)
    local.get 2
    f64.load
    f64.ceil
    return))

(assert_return
  (invoke "main"
    (i32.const 1)
    (f64.const 1)
    (i32.const 0xfff8)
  )
  (f64.const 0))
$ wasmtime wast  foo.wast -C cranelift-has-avx=false
Error: failed to run script file 'foo.wast'

Caused by:
    0: failed directive on foo.wast:9:1
    1: error while executing at wasm backtrace:
           0:   0x2f - <unknown>!<wasm function 0>
    2: memory fault at wasm address 0x10000 in linear memory of size 0x10000
    3: wasm trap: out of bounds memory access

I discovered this by commenting out these conversions and auditing everything that relied on them. The instructions I found were rounds{s,d} which are related to instructions like f64.ceil. Note that this requires AVX to be disabled (since AVX works on unaligned loads/stores) and additionally requires SSE4.1 (or some sse variant) to be enabled.

I think that this is the only one relying on non-128-bit size registers, but at the same time I'm not certain.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 15:07):

alexcrichton commented on issue #8116:

I believe that this may just be limited to the round instructions because I believe that floating-point operations on f32 and f64 do not require alignment but working on 128-bit values does require alignment. That means instructions [like addss](https://github.com/bytecodealliance/wasmtime/blob/daa7fdf2cef425f73fcd50f91bad2698d0d1813f/cranelift/codegen/src/isa/x64/inst.isle#L3348-L3351) specifically use *_unaligned variants of instructions to avoid this issue. However rounds{s,d} were special in that I was "too lazy" to make a whole new instruction variant for a unaligned-unary-operation-with-immediate. That I think at least explains why it's just rounds{s,d}` affected and it looks like no others.

Even with that though with a fix like https://github.com/bytecodealliance/wasmtime/pull/8113 to "just" do the load eagerly and never take advantage of load sinking I'd fear that I would never be able to remember this when reviewing changes to the x64 backend.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 15:07):

alexcrichton edited a comment on issue #8116:

I believe that this may just be limited to the round instructions because I believe that floating-point operations on f32 and f64 do not require alignment but working on 128-bit values does require alignment. That means instructions like addss specifically use *_unaligned variants of instructions to avoid this issue. However rounds{s,d} were special in that I was "too lazy" to make a whole new instruction variant for a unaligned-unary-operation-with-immediate. That I think at least explains why it's just rounds{s,d} affected and it looks like no others.

Even with that though with a fix like https://github.com/bytecodealliance/wasmtime/pull/8113 to "just" do the load eagerly and never take advantage of load sinking I'd fear that I would never be able to remember this when reviewing changes to the x64 backend.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 15:26):

cfallin commented on issue #8116:

Potentially very unpopular suggestion: we could build another newtype around xmm registers (and memory variants) that means "narrow f{32,64} value". It's a departure from how we handle GPRs (we have no problem using Gpr for 8/16/32/64-bit-wide values, and we don't seem to have load-width bugs at the moment), but it at least is a speedbump for the ISLE rule author to remember that some special handling may be required.

Another possibility would be to encode operand size into the XmmMem and at least assert it matches what the instruction it's been given to will do.

Happy to talk more during the CL meeting in a few minutes!

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 21:00):

fitzgen closed issue #8116:

On the heels of #8112 and #8113, this should succeed but it does not:

(module
  (memory 1)
  (func (export "main") (param i32 f64 i32) (result f64)
    local.get 2
    f64.load
    f64.ceil
    return))

(assert_return
  (invoke "main"
    (i32.const 1)
    (f64.const 1)
    (i32.const 0xfff8)
  )
  (f64.const 0))
$ wasmtime wast  foo.wast -C cranelift-has-avx=false
Error: failed to run script file 'foo.wast'

Caused by:
    0: failed directive on foo.wast:9:1
    1: error while executing at wasm backtrace:
           0:   0x2f - <unknown>!<wasm function 0>
    2: memory fault at wasm address 0x10000 in linear memory of size 0x10000
    3: wasm trap: out of bounds memory access

I discovered this by commenting out these conversions and auditing everything that relied on them. The instructions I found were rounds{s,d} which are related to instructions like f64.ceil. Note that this requires AVX to be disabled (since AVX works on unaligned loads/stores) and additionally requires SSE4.1 (or some sse variant) to be enabled.

I think that this is the only one relying on non-128-bit size registers, but at the same time I'm not certain.


Last updated: Jan 24 2025 at 00:11 UTC