Stream: git-wasmtime

Topic: wasmtime / issue #4315 Incorrect execution of openjph com...


view this post on Zulip Wasmtime GitHub notifications bot (Jun 24 2022 at 20:26):

yurydelendik opened issue #4315:

While working on the SIMD tests at https://github.com/bytecodealliance/sightglass/pull/189, I found that this file is not executed properly. The app informs:

Size: 400x400
ojph error 0x000300A1 at ojph_codestream.cpp:4067: Error decoding a codeblock

Though expected something like:

Size: 400x400
Line 100:
 [0]:  37 36 35 36 37....

It appears to be running correctly on Node and SM, so I assume it is an issue of correctness. Attaching a compiled benchmark (minus bench_start/_end):

test-case.zip

view this post on Zulip Wasmtime GitHub notifications bot (Jun 24 2022 at 20:37):

yurydelendik edited issue #4315:

While working on the SIMD tests at https://github.com/bytecodealliance/sightglass/pull/189, I found that this file is not executed properly. The app informs:

Size: 400x400
ojph error 0x000300A1 at ojph_codestream.cpp:4067: Error decoding a codeblock

Though expected something like:

Size: 400x400
Line 100:
 [0]:  37 36 35 36 37....

It appears to be running on Node and SM, so I assume it is an issue of correctness. Attaching the compiled benchmark (minus bench_start/_end):

test-case.zip

view this post on Zulip Wasmtime GitHub notifications bot (Jun 24 2022 at 20:37):

yurydelendik edited issue #4315:

While working on the SIMD tests at https://github.com/bytecodealliance/sightglass/pull/189, I found that this file is not executed properly. The app informs:

Size: 400x400
ojph error 0x000300A1 at ojph_codestream.cpp:4067: Error decoding a codeblock

Though expected something like:

Size: 400x400
Line 100:
 [0]:  37 36 35 36 37....

It appears to be running on Node and SM, so I assume it is an issue of correctness. Attaching the compiled benchmark (minus bench_start/_end):

test-case.zip

P.S. on Intel CPU x64

view this post on Zulip Wasmtime GitHub notifications bot (Jun 24 2022 at 21:31):

yurydelendik commented on issue #4315:

Also, works on wasmtime for ARM64

view this post on Zulip Wasmtime GitHub notifications bot (Jun 25 2022 at 03:57):

alexcrichton commented on issue #4315:

I've done some work to narrow this down but unfortunately I'm not being super successful at the moment. My strategy for narrowing this down has been:

Next I have a script which runs the binary on aarch64 and x86_64 with a binary-search-of-sorts. This pins down the precise fuel count where with one more fuel the execution differs. My goal here is to find the precise wasm instruction at which execution diverges here. This quickly-ish found results and I started sprinkling "debug stores" within the *.wat. Basically I'd take a value of the stack, shove it in a temporary global, store the global to linear memory address 0, and then load the global back onto the stack. The goal here was a quick-and-dirty way of debugging the wasm value stack.

In doing this I've narrowed this down to function 40 in the input file above. Specifically these two instructions:

  0x4942 | fd 0c 00 01 | V128Const { value: V128([0, 1, 0, 1, 0, 1, 0, 1, 4, 5, 4, 5, 4, 5, 4, 5]) }
         | 00 01 00 01
         | 00 01 04 05
         | 04 05 04 05
         | 04 05
  0x4954 | fd 0e       | I8x16Swizzle

Unfortunately this v128.const value is not actually going into i8x16.swizzle. With my "debug store" inserted between these two instructions it turns out that the value stored in memory is indeed not the v128.const here. Instead some other bit-pattern gets stored in memory and naturally when that gets fed into i8x16.swizzle everything goes awry.

Other than that though I have now hit a wall and am unable to make progress debugging this. Function 40 is huge and it's tough to explore the disassembly. Using rr I can find the portion of the function that actually creates the value which eventually gets fed into i8x16.swizzle. This _should_ feed in the v128.const value but it is not. Unfortunately I can't figure out how to translate this random snippet of assembly back to the source file.

I have confirmed, though, that 0.35.0 reproduces this issue so I don't believe this is a problem with recent developments like regalloc2, recent ISLE migrations, or the alias analysis pass added.

Others who are more familiar with the backend may know more about how to debug this though. IIRC there's some optimizations around constants, constant pools, and trying to not always reify something into a register or something like that, and something may be going awry there.


... aaaand as I am writing this up I took another look at the definition of the Swizzle instruction. This looks suspicious, specifically treating the mask as a writable reg since I think in this case it's the mask that's getting corrupted going into the swizzle instruction.

I added a move from the input to a temporary register and that didn't actually fix the original program but it did surprisingly allow it to make more progress. I think that means there's actually at least two bugs here!

Doing my bisection again I think I found a much simpler reproduction, namely:

(module
  (func (param v128 v128 i32) (result v128)
        local.get 0
        local.get 1
        local.get 2
        select
        ))

I don't think that our codegen for the select instruction is correct for v128 values. Namely this produces:

0000000000000000 <_wasm_function_0>:
       0:       55                      push   %rbp
       1:       48 89 e5                mov    %rsp,%rbp
       4:       85 d2                   test   %edx,%edx
       6:       0f 84 04 00 00 00       je     10 <_wasm_function_0+0x10>
       c:       f3 0f 10 c8             movss  %xmm0,%xmm1
      10:       66 0f 6f c1             movdqa %xmm1,%xmm0
      14:       48 89 ec                mov    %rbp,%rsp
      17:       5d                      pop    %rbp
      18:       c3                      retq

where the movss I don't believe is correct, I think that needs to be movdqa or something sized larger.

I'm running out of steam for tonight so I think this is as far as I'll get today. I'm actually quite worried that fuzzing hasn't discovered anything in this area. These are somewhat trivial bugs which in theory should be found almost immediately by any halfway decent fuzzing.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 25 2022 at 05:45):

alexcrichton commented on issue #4315:

Ok I can confirm that with https://github.com/bytecodealliance/wasmtime/pull/4317 and https://github.com/bytecodealliance/wasmtime/pull/4318 the original test case now has the same behavior on arm64 and on x86_64

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 19:14):

alexcrichton commented on issue #4315:

I realize now I forgot to say this earlier but thanks @yurydelendik for taking the time to file this! We'll be making an 0.38.1 release with these fixes soon and a few other fixes for Cranelift (unrelated to this).

We also plan on filing a CVE about these micompiles because while they don't affect hosts themselves this could affect in-wasm execution which is often just as important in some environments.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:42):

alexcrichton commented on issue #4315:

I believe everything is now merged so I'm going to close this.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:42):

alexcrichton closed issue #4315:

While working on the SIMD tests at https://github.com/bytecodealliance/sightglass/pull/189, I found that this file is not executed properly. The app informs:

Size: 400x400
ojph error 0x000300A1 at ojph_codestream.cpp:4067: Error decoding a codeblock

Though expected something like:

Size: 400x400
Line 100:
 [0]:  37 36 35 36 37....

It appears to be running on Node and SM, so I assume it is an issue of correctness. Attaching the compiled benchmark (minus bench_start/_end):

test-case.zip

P.S. on Intel CPU x64


Last updated: Jan 24 2025 at 00:11 UTC