Stream: git-wasmtime

Topic: wasmtime / Issue #1407 OOB memory access


Wasmtime GitHub notifications bot (Mar 25 2020 at 21:36):

abrown labeled Issue #1407:

What do you expect to happen? What does actually happen? Does it panic, and if so, with which assertion?

wasmtime traps with an OOB memory access; Node does not. In Node:

$ node --version
v13.9.0
$ node --experimental-wasm-simd emscripten-built-for-js.js
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
decode: vertex 16.32 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.33 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.57 ms (1.80 GB/sec), index 11.23 ms (1.99 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.35 ms (1.97 GB/sec)
decode: vertex 16.18 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
decode: vertex 16.12 ms (1.85 GB/sec), index 11.19 ms (2.00 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.15 ms (2.01 GB/sec)
decode: vertex 16.16 ms (1.85 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.15 ms (1.85 GB/sec), index 11.17 ms (2.00 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
pass 1: vertex data 18518204 bytes, index data 2001016 bytes
decode: vertex 16.12 ms (1.85 GB/sec), index 11.07 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.08 ms (2.02 GB/sec)
decode: vertex 16.11 ms (1.85 GB/sec), index 11.11 ms (2.01 GB/sec)
decode: vertex 16.21 ms (1.84 GB/sec), index 11.09 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.07 ms (1.86 GB/sec), index 11.13 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.06 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.04 ms (1.86 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.07 ms (2.02 GB/sec)
filters: oct8 data 4000000 bytes, oct12/quat12 data 8000000 bytes
filter: oct8 2.12 ms (1.76 GB/sec), oct12 2.26 ms (3.29 GB/sec), quat12 2.84 ms (2.63 GB/sec)
filter: oct8 2.11 ms (1.76 GB/sec), oct12 2.19 ms (3.40 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.11 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.25 ms (3.32 GB/sec), quat12 2.86 ms (2.61 GB/sec)
filter: oct8 2.10 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.80 ms (2.66 GB/sec)
filter: oct8 2.09 ms (1.78 GB/sec), oct12 2.16 ms (3.45 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.28 ms (3.26 GB/sec), quat12 2.82 ms (2.64 GB/sec)
filter: oct8 2.23 ms (1.67 GB/sec), oct12 2.16 ms (3.44 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.10 ms (1.78 GB/sec), oct12 2.15 ms (3.47 GB/sec), quat12 2.83 ms (2.63 GB/sec)
filter: oct8 2.14 ms (1.74 GB/sec), oct12 2.17 ms (3.44 GB/sec), quat12 2.80 ms (2.66 GB/sec)

In wasmtime (on branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift which implements needed instructions). I tried various versions of the same code built with different tools:

$ ls ../oob/*.wasm | xargs -I{} sh -c "cargo run -- run --enable-simd --disable-cache {}"


    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built-for-js.wasm`
Error: failed to run main module `../oob/emscripten-built-for-js.wasm`

Caused by:
    import module `a` was not found



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/emscripten-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @7d97
       wasm backtrace:
         0: <unknown>!<wasm function 74>
         1: <unknown>!<wasm function 37>
         2: <unknown>!<wasm function 75>
         3: <unknown>!<wasm function 28>
         4: <unknown>!<wasm function 67>



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built-extra-memory.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built-extra-memory.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a5
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a4
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start

Which Wasmtime version / commit hash / branch are you using?

On branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift which implements needed instructions.

What are the steps to reproduce the issue?

See above. Also, here are steps for building the Wasm modules from https://github.com/zeux/meshoptimizer/tree/9047ac1936351d0508bb26b5b82ec1101f9735b4:

# wasi-sdk-built.wasm (2^28 bytes of memory, 4096x64K pages)
$ /opt/wasi-sdk/bin/clang++ --version
clang version 11.0.0 (https://github.com/llvm/llvm-project 46bb6613a31fd43b6d4485ce7e71a387dc22cbc7)
Target: wasm32-unknown-wasi
Thread model: posix
InstalledDir: /opt/wasi-sdk/bin
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=268435456 -msimd128 -o codecbench-simd.wasm

# wasi-sdk-built-extra-memory.wasm (2^30 bytes, 16384x64K pages)
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=1073741824 -msimd128 -o codecbench-simd.wasm

# emscripten-built.wasm
$ emcc --version
emcc (Emscripten gcc/clang-like replacement) 1.39.10 (commit 1bd7d547598f3fc74699c172f6c9c59a1e8484f1)
$ make clean && make codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm

# then generated wat and dump files with
ls *.wasm | xargs -I{} sh -c "wasm2wat --enable-all {} > {}.wat"
ls *.wasm | xargs -I{} sh -c "wasm-objdump -d {} > {}.dump"

emscripten-built.wasm.txt
emscripten-built.wasm.wat.txt
emscripten-built-for-js.js.txt
emscripten-built-for-js.wasm.dump.txt
emscripten-built-for-js.wasm.txt
emscripten-built-for-js.wasm.wat.txt
wasi-sdk-built.wasm.dump.txt
wasi-sdk-built.wasm.txt
wasi-sdk-built.wasm.wat.txt
[wasi-sdk-built-extra-memory.wasm.dump.txt](https://github.com/bytecodealliance/wasmti
[message truncated]
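
The memory sizes quoted in the build comments above (2^28 bytes as 4096 pages, 2^30 bytes as 16384 pages) follow from the fixed 64 KiB WebAssembly page size. A minimal sketch of that arithmetic; the helper name `pages_for` is hypothetical and not part of any tool used in the issue:

```python
WASM_PAGE_SIZE = 64 * 1024  # bytes per WebAssembly linear-memory page


def pages_for(initial_memory_bytes: int) -> int:
    """Return how many 64 KiB pages a given initial memory size occupies."""
    pages, remainder = divmod(initial_memory_bytes, WASM_PAGE_SIZE)
    if remainder:
        raise ValueError("memory size must be a multiple of 64 KiB")
    return pages


# Values passed via -Wl,--initial-memory=... in the wasi-sdk builds.
print(pages_for(268435456))   # 2^28 bytes
print(pages_for(1073741824))  # 2^30 bytes
```

The same value is passed to Emscripten through `-s TOTAL_MEMORY=268435456`, so the Emscripten build would be expected to start with the same page count as the smaller wasi-sdk build.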

Wasmtime GitHub notifications bot (Mar 25 2020 at 21:36):

abrown opened Issue #1407:


Wasmtime GitHub notifications bot (Mar 25 2020 at 21:39):

abrown commented on Issue #1407:

It is interesting that the OOB is triggered in different places:

$ wasm-objdump -d emscripten-built.wasm | grep -A5 -B5 7d97
 007d8a: 0e 03 01 02 03 00          |                       br_table 1 2 3 0
 007d90: 0b                         |                     end
 007d91: 20 07                      |                     local.get 7
 007d93: 41 00                      |                     i32.const 0
 007d95: fd 0c                      |                     i32x4.splat
>007d97: fd 01 04 10                |                     v128.store 4 16
 007d9b: 0c 03                      |                     br 3
 007d9d: 0b                         |                   end
 007d9e: 20 07                      |                   local.get 7
 007da0: 20 00                      |                   local.get 0
 007da2: fd 00 00 04                |                   v128.load 0 4
abrown@abrown-desk:~/Code/oob$ wasm-objdump -d wasi-sdk-built.wasm | grep -A5 -B5 22a4
 002297: fd 06 00                   |                     i8x16.extract_lane_u 0
 00229a: 6a                         |                     i32.add
 00229b: 20 17                      |                     local.get 23
 00229d: 41 f0 b0 80 80 00          |                     i32.const 6256
 0022a3: 6a                         |                     i32.add
>0022a4: 2d 00 00                   |                     i32.load8_u 0 0
 0022a7: 6a                         |                     i32.add
 0022a8: 21 00                      |                     local.set 0
 0022aa: 0c 02                      |                     br 2
 0022ac: 0b                         |                   end
 0022ad: 20 0f                      |                   local.get 15

Wasmtime GitHub notifications bot (Mar 25 2020 at 21:39):

abrown edited a comment on Issue #1407:

It is interesting that the OOB is triggered in different places:

$ wasm-objdump -d emscripten-built.wasm | grep -A5 -B5 7d97
 007d8a: 0e 03 01 02 03 00          |                       br_table 1 2 3 0
 007d90: 0b                         |                     end
 007d91: 20 07                      |                     local.get 7
 007d93: 41 00                      |                     i32.const 0
 007d95: fd 0c                      |                     i32x4.splat
>007d97: fd 01 04 10                |                     v128.store 4 16
 007d9b: 0c 03                      |                     br 3
 007d9d: 0b                         |                   end
 007d9e: 20 07                      |                   local.get 7
 007da0: 20 00                      |                   local.get 0
 007da2: fd 00 00 04                |                   v128.load 0 4
$ wasm-objdump -d wasi-sdk-built.wasm | grep -A5 -B5 22a4
 002297: fd 06 00                   |                     i8x16.extract_lane_u 0
 00229a: 6a                         |                     i32.add
 00229b: 20 17                      |                     local.get 23
 00229d: 41 f0 b0 80 80 00          |                     i32.const 6256
 0022a3: 6a                         |                     i32.add
>0022a4: 2d 00 00                   |                     i32.load8_u 0 0
 0022a7: 6a                         |                     i32.add
 0022a8: 21 00                      |                     local.set 0
 0022aa: 0c 02                      |                     br 2
 0022ac: 0b                         |                   end
 0022ad: 20 0f                      |                   local.get 15
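
For readers following the dump, the immediate of `i32.const` is encoded as signed LEB128, which is why the five bytes `f0 b0 80 80 00` after the `41` opcode are shown as 6256. A small sketch of that decoding; the function name `decode_sleb128` is illustrative and not taken from wabt:

```python
def decode_sleb128(data: bytes) -> tuple:
    """Decode one signed LEB128 integer; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift  # low 7 bits carry payload
        shift += 7
        if byte & 0x80 == 0:              # high bit clear: last byte
            if byte & 0x40:               # sign bit of the final group
                result -= 1 << shift
            return result, index + 1
    raise ValueError("unterminated LEB128 sequence")


# Immediate bytes of the i32.const at offset 0x00229d in wasi-sdk-built.wasm.
value, used = decode_sleb128(bytes([0xF0, 0xB0, 0x80, 0x80, 0x00]))
print(value, used)
```

The padded five-byte form (trailing `80 80 00`) decodes to the same value as the shortest encoding would; the three extra bytes contribute only zero bits.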

Wasmtime GitHub notifications bot (Mar 25 2020 at 21:41):

abrown edited Issue #1407:

What do you expect to happen? What does actually happen? Does it panic, and if so, with which assertion?

wasmtime traps with an OOB memory access; Node does not.

In Node the code runs as expected (though we have to run it from the JS wrapper):

$ node --version
v13.9.0
$ node --experimental-wasm-simd emscripten-built-for-js.js
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
decode: vertex 16.32 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.33 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.57 ms (1.80 GB/sec), index 11.23 ms (1.99 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.35 ms (1.97 GB/sec)
decode: vertex 16.18 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
decode: vertex 16.12 ms (1.85 GB/sec), index 11.19 ms (2.00 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.15 ms (2.01 GB/sec)
decode: vertex 16.16 ms (1.85 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.15 ms (1.85 GB/sec), index 11.17 ms (2.00 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
pass 1: vertex data 18518204 bytes, index data 2001016 bytes
decode: vertex 16.12 ms (1.85 GB/sec), index 11.07 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.08 ms (2.02 GB/sec)
decode: vertex 16.11 ms (1.85 GB/sec), index 11.11 ms (2.01 GB/sec)
decode: vertex 16.21 ms (1.84 GB/sec), index 11.09 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.07 ms (1.86 GB/sec), index 11.13 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.06 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.04 ms (1.86 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.07 ms (2.02 GB/sec)
filters: oct8 data 4000000 bytes, oct12/quat12 data 8000000 bytes
filter: oct8 2.12 ms (1.76 GB/sec), oct12 2.26 ms (3.29 GB/sec), quat12 2.84 ms (2.63 GB/sec)
filter: oct8 2.11 ms (1.76 GB/sec), oct12 2.19 ms (3.40 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.11 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.25 ms (3.32 GB/sec), quat12 2.86 ms (2.61 GB/sec)
filter: oct8 2.10 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.80 ms (2.66 GB/sec)
filter: oct8 2.09 ms (1.78 GB/sec), oct12 2.16 ms (3.45 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.28 ms (3.26 GB/sec), quat12 2.82 ms (2.64 GB/sec)
filter: oct8 2.23 ms (1.67 GB/sec), oct12 2.16 ms (3.44 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.10 ms (1.78 GB/sec), oct12 2.15 ms (3.47 GB/sec), quat12 2.83 ms (2.63 GB/sec)
filter: oct8 2.14 ms (1.74 GB/sec), oct12 2.17 ms (3.44 GB/sec), quat12 2.80 ms (2.66 GB/sec)

In wasmtime (on branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift which implements needed instructions). I tried various versions of the same code built with different tools:

$ ls ../oob/*.wasm | xargs -I{} sh -c "cargo run -- run --enable-simd --disable-cache {}"


    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built-for-js.wasm`
Error: failed to run main module `../oob/emscripten-built-for-js.wasm`

Caused by:
    import module `a` was not found



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/emscripten-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @7d97
       wasm backtrace:
         0: <unknown>!<wasm function 74>
         1: <unknown>!<wasm function 37>
         2: <unknown>!<wasm function 75>
         3: <unknown>!<wasm function 28>
         4: <unknown>!<wasm function 67>



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built-extra-memory.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built-extra-memory.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a5
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a4
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start

Which Wasmtime version / commit hash / branch are you using?

On branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift which implements needed instructions.

What are the steps to reproduce the issue?

See above. Also, here are steps for building the Wasm modules from https://github.com/zeux/meshoptimizer/tree/9047ac1936351d0508bb26b5b82ec1101f9735b4:

# wasi-sdk-built.wasm (2^28 bytes of memory, 4096x64K pages)
$ /opt/wasi-sdk/bin/clang++ --version
clang version 11.0.0 (https://github.com/llvm/llvm-project 46bb6613a31fd43b6d4485ce7e71a387dc22cbc7)
Target: wasm32-unknown-wasi
Thread model: posix
InstalledDir: /opt/wasi-sdk/bin
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=268435456 -msimd128 -o codecbench-simd.wasm

# wasi-sdk-built-extra-memory.wasm (2^30 bytes, 16384x64K pages)
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=1073741824 -msimd128 -o codecbench-simd.wasm

# emscripten-built.wasm
$ emcc --version
emcc (Emscripten gcc/clang-like replacement) 1.39.10 (commit 1bd7d547598f3fc74699c172f6c9c59a1e8484f1)
$ make clean && make codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm

# then generated wat and dump files with
ls *.wasm | xargs -I{} sh -c "wasm2wat --enable-all {} > {}.wat"
ls *.wasm | xargs -I{} sh -c "wasm-objdump -d {} > {}.dump"

emscripten-built.wasm.txt
emscripten-built.wasm.wat.txt
emscripten-built-for-js.js.txt
emscripten-built-for-js.wasm.dump.txt
emscripten-built-for-js.wasm.txt
emscripten-built-for-js.wasm.wat.txt
wasi-sdk-built.wasm.dump.txt
wasi-sdk-built.wasm.txt
wasi-sdk-built.wasm.wat.txt
[wasi-sdk-b
[message truncated]

Wasmtime GitHub notifications bot (Mar 25 2020 at 21:53):

alexcrichton commented on Issue #1407:

Are the wasm files emitted here intended to be run directly? I would imagine that the JS does some sort of setup/glue which might prepare the runtime and/or size memory appropriately. Without that it might be expected that the wasm faults if run directly? (mostly in that node is running more code than we are, so a difference in behavior may not mean something wrong is happening)

Wasmtime GitHub notifications bot (Mar 25 2020 at 22:08):

abrown commented on Issue #1407:

Perhaps there is some Emscripten/Node-specific setup that I'm not aware of (@zeux, what do you think?). In the minified JS I do see code like var DYNAMIC_BASE=5249984,DYNAMICTOP_PTR=6944 that might be doing something special. But I wouldn't think that the wasi-sdk-built Wasm code should need any of that setup: the files compiled are normal-looking C++.

Wasmtime GitHub notifications bot (Mar 25 2020 at 22:13):

abrown commented on Issue #1407:

In answer to,

Are the wasm files emitted here intended to be run directly?

I think, yes, the files that are not *-for-js should be runnable directly. Or at least looking at the *.wat versions I do not see why not.

Wasmtime GitHub notifications bot (Apr 02 2020 at 23:28):

zeux commented on Issue #1407:

I'm wondering if wasmtime is hitting one of the cases where the code might actually hit an OOB access in practice. There's a couple of TODO comments in the code around this, where the right thing to do is to use a load_splat, but load_splat isn't available in node/Chrome so I'm not using it...

Let me look closer at what these accesses are actually doing.
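
To make the concern concrete, here is a minimal Python sketch of the per-access bounds check a Wasm runtime performs. It assumes (hypothetically) that the workaround reads a full 16-byte vector where a splat load would read a single byte. `LinearMemory`, `Trap`, and the method names are illustrative only and are not part of wasmtime or meshoptimizer.

```python
class Trap(Exception):
    """Stands in for a Wasm 'out of bounds memory access' trap."""


class LinearMemory:
    """Toy model of a Wasm linear memory with per-access bounds checks."""

    def __init__(self, size):
        self.data = bytearray(size)

    def load(self, addr, width):
        # Every byte of the access must lie inside the memory.
        if addr < 0 or addr + width > len(self.data):
            raise Trap(f"out of bounds: {width}-byte load at {addr}")
        return bytes(self.data[addr:addr + width])

    def v128_load(self, addr):
        # Full vector load: always touches 16 bytes.
        return self.load(addr, 16)

    def load8_splat(self, addr):
        # Splat load: touches 1 byte, then replicates it into 16 lanes.
        return self.load(addr, 1) * 16


mem = LinearMemory(64)
mem.data[63] = 0xAB

# The last byte of memory is readable with a splat load.
print(mem.load8_splat(63).hex())

# A full vector load at the same address runs past the end and traps.
try:
    mem.v128_load(63)
except Trap as e:
    print("trap:", e)
```

If the benchmark relied on such a wide load near the end of a buffer that happens to sit at the end of linear memory, a trap would be legitimate behavior rather than a runtime bug.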

view this post on Zulip Wasmtime GitHub notifications bot (Apr 02 2020 at 23:59):

abrown commented on Issue #1407:

This shouldn't theoretically be an issue but Cranelift is lowering load_splat to load + splat at the moment (eventually optimized by #1175)... in case that matters.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 01:16):

zeux commented on Issue #1407:

I tried to reproduce this but (maybe because I'm using a later version of Emscripten) I'm hitting this:

Error: failed to run main module `codecbench-simd.wasm`

Caused by:
    0: WebAssembly failed to compile
    1: Compilation error: function u0:73(i64 vmctx, i64, i32, i32, i32, i32, i32) -> i32 system_v {
...
       @7bf3                               v154 = iadd v152, v153
       ;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       ; error: inst149 (v154 = iadd.i32 v152, v153): arg 1 (v153) has type i8, expected i32

wasm-validate doesn't agree with this assessment but maybe it's because it doesn't validate something properly? Attaching the .wasm file in question.

codecbench-simd.zip

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 08:33):

bjorn3 commented on Issue #1407:

Where is v153 defined?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 14:34):

zeux commented on Issue #1407:

codecbench-simd.txt

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 18:19):

abrown commented on Issue #1407:

@zeux, I was in a special branch that has some fixes and additional instructions that make it possible to get past those types of errors: https://github.com/abrown/wasmtime/tree/additional-i8x16-shift. I'm waiting on a review for #1377 and then I can try to merge #1409; then all of that functionality should be in master.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 18:33):

zeux commented on Issue #1407:

@abrown Ahh, git :( I did check out that branch initially but had issues with the submodule links, and forgot to switch back to it after re-cloning it recursively. After syncing to this branch properly I can indeed reproduce the error, thanks! Will update once I understand this more.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 19:46):

zeux commented on Issue #1407:

I strongly suspect this isn't an issue in the benchmark; the behavior seems highly dependent on the inlining here. Adding noinline to decodeBytesGroupSimd & decodeBytesSimd & decodeVertexBlockSimd fixes this. With only decodeBytesSimd marked as noinline, I get this instead:

/mnt/c/work/meshoptimizer $ make -B codecbench-simd.wasm && ../wasmtime/target/debug/wasmtime --enable-simd codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/clusterizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm -g
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `codecbench-simd.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: call stack exhausted, source location: @-
       wasm backtrace:
         0: <unknown>!meshopt::decodeBytesSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long)
         1: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         2: <unknown>!meshopt_decodeVertexBuffer
         3: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         4: <unknown>!__original_main
         5: <unknown>!_start

The expected call sequence here is meshopt_decodeVertexBuffer -> decodeVertexBlockSimd -> decodeBytesSimd -> decodeBytesGroupSimd, with no recursion. Unsure what "call stack exhausted" indicates here...

Unfortunately trying to add prints to this to understand the behavior fixes the issue as well, so the investigation here might be complicated. Is there a way to coerce wasmtime to generate a debuggable binary?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 19:49):

zeux edited a comment on Issue #1407:

I strongly suspect this isn't an issue in the benchmark; the behavior seems highly dependent on the inlining here. Adding noinline to decodeBytesGroupSimd & decodeBytesSimd & decodeVertexBlockSimd fixes this. With only decodeBytesSimd marked as noinline, I get this instead:

/mnt/c/work/meshoptimizer $ make -B codecbench-simd.wasm && ../wasmtime/target/debug/wasmtime --enable-simd codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/clusterizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm -g
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `codecbench-simd.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: call stack exhausted, source location: @-
       wasm backtrace:
         0: <unknown>!meshopt::decodeBytesSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long)
         1: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         2: <unknown>!meshopt_decodeVertexBuffer
         3: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         4: <unknown>!__original_main
         5: <unknown>!_start

The expected call sequence here is meshopt_decodeVertexBuffer -> decodeVertexBlockSimd -> decodeBytesSimd -> decodeBytesGroupSimd, with no recursion. Unsure what "call stack exhausted" indicates here...

Also worth noting is that expanding various buffers to accommodate possible overruns didn't help; in some configurations I'm not getting a stack overflow, but meshopt_decodeVertexBuffer returns a non-0 result because it exits early during parsing, which suggests some issues with control flow here.

Unfortunately trying to add prints to this to understand the behavior fixes the issue as well, so the investigation here might be complicated. Is there a way to coerce wasmtime to generate a debuggable binary?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 19:50):

zeux edited a comment on Issue #1407:

I strongly suspect this isn't an issue in the benchmark; the behavior seems highly dependent on the inlining here. Adding noinline to decodeBytesGroupSimd & decodeBytesSimd & decodeVertexBlockSimd fixes this. With only decodeBytesSimd marked as noinline, I get this instead:

/mnt/c/work/meshoptimizer $ make -B codecbench-simd.wasm && ../wasmtime/target/debug/wasmtime --enable-simd codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/clusterizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm -g
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `codecbench-simd.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: call stack exhausted, source location: @-
       wasm backtrace:
         0: <unknown>!meshopt::decodeBytesSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long)
         1: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         2: <unknown>!meshopt_decodeVertexBuffer
         3: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         4: <unknown>!__original_main
         5: <unknown>!_start

The expected call sequence here is meshopt_decodeVertexBuffer -> decodeVertexBlockSimd -> decodeBytesSimd -> decodeBytesGroupSimd, with no recursion. Unsure what "call stack exhausted" indicates here...

Also worth noting is that expanding various buffers to accommodate possible overruns didn't help; also in some inlining/codegen configurations I'm not getting a stack overflow or OOB, but meshopt_decodeVertexBuffer returns a non-0 result because it exits early during parsing, which suggests some issues with control flow here.

Unfortunately trying to add prints to this to understand the behavior fixes the issue as well, so the investigation here might be complicated. Is there a way to coerce wasmtime to generate a debuggable binary?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 19:51):

bjorn3 commented on Issue #1407:

Pass -g if your wasm file was built with debuginfo. I don't know if wasm2obj supports it though.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 20:10):

zeux commented on Issue #1407:

-g doesn't seem to work with Emscripten-generated debug info here:

Error: failed to emit debug sections

Caused by:
    The end offset of a location list entry must not be before the beginning.

Might be an Emscripten bug, not sure.

codecbench-simd.zip

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 20:19):

zeux commented on Issue #1407:

One other observation is that --opt-level 0 doesn't trigger this bug:

/mnt/c/work/meshoptimizer $ ../wasmtime/target/debug/wasmtime --enable-simd --disable-cache --opt-level 0 codecbench-simd.wasm
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
decode: vertex 28.46 ms (1.05 GB/sec), index 25.10 ms (0.89 GB/sec); rv 0 ri 0
...

/mnt/c/work/meshoptimizer $ ../wasmtime/target/debug/wasmtime --enable-simd --disable-cache --opt-level 1 codecbench-simd.wasm
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `codecbench-simd.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @7ca4
       wasm backtrace:
         0: <unknown>!<wasm function 75>
         1: <unknown>!<wasm function 37>
         2: <unknown>!<wasm function 76>
         3: <unknown>!<wasm function 28>
         4: <unknown>!<wasm function 67>

This is on a file without Emscripten-generated debug info.
codecbench-simd.zip

I don't think I have enough understanding here to provide further help, but it looks to me as if the .wasm file in question has control flow that is complicated enough to trigger some miscompilation if optimizations are enabled, and the out of bounds access is just an odd side effect.
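
For anyone comparing runs like the two above, a small Python sketch can pull the trap reason and source location out of the wasmtime output. It assumes the `@7ca4` value is a hexadecimal offset into the module binary and that `wasm-objdump -d` prefixes each disassembly line with a zero-padded hex offset; both are assumptions to verify against your tool versions. `parse_trap` is a hypothetical helper, not a wasmtime API.

```python
import re

# Matches lines such as:
#   1: wasm trap: out of bounds memory access, source location: @7ca4
#   1: wasm trap: call stack exhausted, source location: @-
TRAP_RE = re.compile(
    r"wasm trap: (?P<reason>.+?), source location: @(?P<offset>[0-9a-fA-F]+|-)"
)


def parse_trap(log_text):
    """Return (reason, offset) for the first trap line, or None if absent.

    offset is an int, or None when the log prints '@-' (no location).
    """
    m = TRAP_RE.search(log_text)
    if m is None:
        return None
    raw = m.group("offset")
    offset = None if raw == "-" else int(raw, 16)
    return m.group("reason"), offset


log = """\
Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @7ca4
"""

reason, offset = parse_trap(log)
print(reason)
# String to search for in the disassembly dump (assumed offset format).
print(f"{offset:06x}")
```

Searching the `.dump` file for the printed offset should land on or near the faulting instruction, which makes it easier to compare what the module does at that point across opt levels.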

view this post on Zulip Wasmtime GitHub notifications bot (Apr 03 2020 at 20:28):

abrown commented on Issue #1407:

Glad you were able to replicate and that --opt-level 0 difference is actually pretty interesting. There is a pass that converts a load with an offset that is the result of a sum to a complex load; I wonder if something weird is happening there.
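
As a rough illustration of the kind of rewrite described here, the Python sketch below folds `load(iadd(a, b))` into a two-operand "complex" load on a toy IR. The classes are hypothetical stand-ins, not Cranelift's data structures, and the wraparound scenario at the end is only one imaginable way such a fold could change behavior; nothing in this thread establishes that it is what happened.

```python
from dataclasses import dataclass

MASK32 = 0xFFFF_FFFF


@dataclass(frozen=True)
class Iadd:
    lhs: int
    rhs: int


@dataclass(frozen=True)
class Load:
    addr: object  # an Iadd or a plain int address


@dataclass(frozen=True)
class LoadComplex:
    base: int
    index: int


def fold(inst):
    """Rewrite load(iadd(a, b)) into load_complex(a, b); leave the rest alone."""
    if isinstance(inst, Load) and isinstance(inst.addr, Iadd):
        return LoadComplex(inst.addr.lhs, inst.addr.rhs)
    return inst


def effective_address(inst, wrap32):
    """Address the instruction reads; wrap32 models 32-bit address arithmetic."""
    if isinstance(inst, Load):
        a = inst.addr
        total = a.lhs + a.rhs if isinstance(a, Iadd) else a
    else:
        total = inst.base + inst.index
    return total & MASK32 if wrap32 else total


original = Load(Iadd(0xFFFF_FFF0, 0x20))
folded = fold(original)

# Same arithmetic width before and after: the fold is harmless.
print(hex(effective_address(original, wrap32=True)))
print(hex(effective_address(folded, wrap32=True)))

# Hypothetical failure mode: the folded form adds at a wider width.
print(hex(effective_address(folded, wrap32=False)))
```

The point of the sketch is only that a fold is safe when the folded instruction computes exactly the same address as the original pair; any mismatch in width or operand type would surface as a load from the wrong place.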

view this post on Zulip Wasmtime GitHub notifications bot (Apr 29 2020 at 23:26):

abrown commented on Issue #1407:

I just re-ran the Wasm files above and they ran without issue in the master branch of wasmtime (except for emscripten-built-for-js.wasm of course--that failure is expected). It's hard to say exactly what has changed that would have fixed this but I'm going to close it since I can't reproduce now (thankfully!).

view this post on Zulip Wasmtime GitHub notifications bot (Apr 29 2020 at 23:26):

abrown closed Issue #1407:

What do you expect to happen? What does actually happen? Does it panic, and if so, with which assertion?

wasmtime traps with an OOB memory access; Node does not.

In Node the code runs as expected (though we have to run it from the JS wrapper):

$ node --version
v13.9.0
$ node --experimental-wasm-simd emscripten-built-for-js.js
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
decode: vertex 16.32 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.33 ms (1.83 GB/sec), index 11.15 ms (2.00 GB/sec)
decode: vertex 16.57 ms (1.80 GB/sec), index 11.23 ms (1.99 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.35 ms (1.97 GB/sec)
decode: vertex 16.18 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
decode: vertex 16.12 ms (1.85 GB/sec), index 11.19 ms (2.00 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.15 ms (2.01 GB/sec)
decode: vertex 16.16 ms (1.85 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.15 ms (1.85 GB/sec), index 11.17 ms (2.00 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.16 ms (2.00 GB/sec)
pass 1: vertex data 18518204 bytes, index data 2001016 bytes
decode: vertex 16.12 ms (1.85 GB/sec), index 11.07 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.08 ms (2.02 GB/sec)
decode: vertex 16.11 ms (1.85 GB/sec), index 11.11 ms (2.01 GB/sec)
decode: vertex 16.21 ms (1.84 GB/sec), index 11.09 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.07 ms (1.86 GB/sec), index 11.13 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.06 ms (2.02 GB/sec)
decode: vertex 16.17 ms (1.85 GB/sec), index 11.10 ms (2.01 GB/sec)
decode: vertex 16.04 ms (1.86 GB/sec), index 11.14 ms (2.01 GB/sec)
decode: vertex 16.19 ms (1.84 GB/sec), index 11.07 ms (2.02 GB/sec)
filters: oct8 data 4000000 bytes, oct12/quat12 data 8000000 bytes
filter: oct8 2.12 ms (1.76 GB/sec), oct12 2.26 ms (3.29 GB/sec), quat12 2.84 ms (2.63 GB/sec)
filter: oct8 2.11 ms (1.76 GB/sec), oct12 2.19 ms (3.40 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.11 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.79 ms (2.67 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.25 ms (3.32 GB/sec), quat12 2.86 ms (2.61 GB/sec)
filter: oct8 2.10 ms (1.77 GB/sec), oct12 2.17 ms (3.43 GB/sec), quat12 2.80 ms (2.66 GB/sec)
filter: oct8 2.09 ms (1.78 GB/sec), oct12 2.16 ms (3.45 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.13 ms (1.75 GB/sec), oct12 2.28 ms (3.26 GB/sec), quat12 2.82 ms (2.64 GB/sec)
filter: oct8 2.23 ms (1.67 GB/sec), oct12 2.16 ms (3.44 GB/sec), quat12 2.81 ms (2.65 GB/sec)
filter: oct8 2.10 ms (1.78 GB/sec), oct12 2.15 ms (3.47 GB/sec), quat12 2.83 ms (2.63 GB/sec)
filter: oct8 2.14 ms (1.74 GB/sec), oct12 2.17 ms (3.44 GB/sec), quat12 2.80 ms (2.66 GB/sec)

In wasmtime (on branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift, which implements the needed instructions), I tried various versions of the same code built with different tools:

$ ls ../oob/*.wasm | xargs -I{} sh -c "cargo run -- run --enable-simd --disable-cache {}"


    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built-for-js.wasm`
Error: failed to run main module `../oob/emscripten-built-for-js.wasm`

Caused by:
    import module `a` was not found



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/emscripten-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/emscripten-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @7d97
       wasm backtrace:
         0: <unknown>!<wasm function 74>
         1: <unknown>!<wasm function 37>
         2: <unknown>!<wasm function 75>
         3: <unknown>!<wasm function 28>
         4: <unknown>!<wasm function 67>



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built-extra-memory.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built-extra-memory.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a5
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start



    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/wasmtime run --enable-simd --disable-cache ../oob/wasi-sdk-built.wasm`
source: vertex data 32064032 bytes, index data 24000000 bytes
pass 0: vertex data 18518385 bytes, index data 2332680 bytes
Error: failed to run main module `../oob/wasi-sdk-built.wasm`

Caused by:
    0: failed to invoke `_start`
    1: wasm trap: out of bounds memory access, source location: @22a4
       wasm backtrace:
         0: <unknown>!meshopt::decodeVertexBlockSimd(unsigned char const*, unsigned char const*, unsigned char*, unsigned long, unsigned long, unsigned char*)
         1: <unknown>!meshopt_decodeVertexBuffer
         2: <unknown>!benchCodecs(std::__2::vector<Vertex, std::__2::allocator<Vertex> > const&, std::__2::vector<unsigned int, std::__2::allocator<unsigned int> > const&)
         3: <unknown>!__original_main
         4: <unknown>!_start

Which Wasmtime version / commit hash / branch are you using?

On branch https://github.com/abrown/wasmtime/tree/additional-i8x16-shift which implements needed instructions.

What are the steps to reproduce the issue?

See above. Also, here are steps for building the Wasm modules from https://github.com/zeux/meshoptimizer/tree/9047ac1936351d0508bb26b5b82ec1101f9735b4:

# wasi-sdk-built.wasm (2^28 bytes of memory, 4096x64K pages)
$ /opt/wasi-sdk/bin/clang++ --version
clang version 11.0.0 (https://github.com/llvm/llvm-project 46bb6613a31fd43b6d4485ce7e71a387dc22cbc7)
Target: wasm32-unknown-wasi
Thread model: posix
InstalledDir: /opt/wasi-sdk/bin
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=268435456 -msimd128 -o codecbench-simd.wasm

# wasi-sdk-built-extra-memory.wasm (2^30 bytes, 16384x64K pages)
$ make clean && make codecbench-simd.wasm
/opt/wasi-sdk/bin/clang++ tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -fno-exceptions -Wl,--initial-memory=1073741824 -msimd128 -o codecbench-simd.wasm

# emscripten-built.wasm
$ emcc --version
emcc (Emscripten gcc/clang-like replacement) 1.39.10 (commit 1bd7d547598f3fc74699c172f6c9c59a1e8484f1)
$ make clean && make codecbench-simd.wasm
emcc tools/codecbench.cpp src/vertexcodec.cpp src/vertexfilter.cpp src/overdrawanalyzer.cpp src/indexgenerator.cpp src/vcacheoptimizer.cpp src/indexcodec.cpp src/vfetchanalyzer.cpp src/spatialorder.cpp src/clusterizer.cpp src/allocator.cpp src/vcacheanalyzer.cpp src/vfetchoptimizer.cpp src/overdrawoptimizer.cpp src/simplifier.cpp src/stripifier.cpp -O3 -DNDEBUG -s TOTAL_MEMORY=268435456 -msimd128 -o codecbench-simd.wasm

# then generated wat and dump files with
ls *.wasm | xargs -I{} sh -c "wasm2wat --enable-all {} > {}.wat"
ls *.wasm | xargs -I{} sh -c "wasm-objdump -d {} > {}.dump"
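
The page counts quoted in the comments above follow directly from the 64 KiB WebAssembly page size. A minimal Python check, where `pages_for` is a hypothetical helper name:

```python
WASM_PAGE_SIZE = 64 * 1024  # bytes per WebAssembly page


def pages_for(memory_bytes):
    """Number of 64 KiB pages for a byte count; must divide evenly."""
    pages, remainder = divmod(memory_bytes, WASM_PAGE_SIZE)
    if remainder:
        raise ValueError("memory size must be a multiple of 64 KiB")
    return pages


print(pages_for(268435456))   # 2^28 bytes -> 4096
print(pages_for(1073741824))  # 2^30 bytes -> 16384
```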

emscripten-built.wasm.txt
emscripten-built.wasm.wat.txt
emscripten-built-for-js.js.txt
emscripten-built-for-js.wasm.dump.txt
emscripten-built-for-js.wasm.txt
emscripten-built-for-js.wasm.wat.txt
wasi-sdk-built.wasm.dump.txt
wasi-sdk-built.wasm.txt
wasi-sdk-built.wasm.wat.txt
[wasi-sdk-b
[message truncated]


Last updated: Oct 23 2024 at 20:03 UTC