Stream: wasmtime

Topic: wasm trap: uninitialized element


Juan Gómez (Oct 06 2023 at 15:40):

Hi!
I'm trying to port a fairly big project (llama.cpp) to WASI. I managed to compile a wasm binary after patching a few things, but when I run it with wasmtime, I get this error:

(base) jgomez@DESKTOP-ANTLE2E:~/llama.cpp$ wasmtime run --wasm-features=threads --wasi-modules=experimental-wasi-threads  --listenfd --dir=./  llama.wasm -- -m ./wizardcoder-python-34b-v1.0.Q3_K_M.gguf
Log start
main: warning: changing RoPE frequency base to 0 (default 10000.0)
main: warning: scaling RoPE frequency by 0 (default 1.0)
main: build = 1274 (ffe88a3)
main: built with clang version 16.0.0 for wasm32-unknown-wasi
main: seed  = 1696600350
Error: failed to run main module `llama.wasm`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x7790c - <unknown>!__fseeko
           1: 0x7794f - <unknown>!fseek
           2: 0xfa37b - <unknown>!llama_file::llama_file(char const*, char const*)
           3: 0x9ea93 - <unknown>!llama_model_loader::llama_model_loader(std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const&, bool)
           4: 0x9a9b0 - <unknown>!llama_load_model_from_file
           5: 0x878be - <unknown>!main
           6: 0x734b8 - <unknown>!__main_void
           7: 0x1482 - <unknown>!_start
       note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
    2: wasm trap: uninitialized element

If I run in debug mode (using the -g flag, since -D debug-info is not recognized), I get this panic instead:

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `1`,
 right: `0`: the memory base pointer may be incorrect due to sharing memory', crates/cranelift/src/compiler.rs:564:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

gdb is telling me:

Thread 1 "wasmtime" received signal SIGILL, Illegal instruction.
0x00007ffff749e08f in ?? ()
(gdb) bt
#0  0x00007ffff749e08f in ?? ()
#1  0x0000000000000000 in ?? ()

... which is not very informative.
I guess it's relevant to mention that I'm using wasi-threads (it's obvious from the args passed to wasmtime, though :))
Any pointers on how to move forward? It'd be great if there were a way to see which illegal instruction is actually causing this.
Thanks!!!

Juan Gómez (Oct 06 2023 at 15:58):

If I inspect the core file emitted after the trap with wasmgdb, I get this other error:

not implemented: value type 0

Juan Gómez (Oct 06 2023 at 16:03):

Ok, this looks like an error of wasmgdb itself (wasm-parser?):

(base) jgomez@DESKTOP-ANTLE2E:~/llama.cpp$ wasmgdb ./main ./core.file
thread 'main' panicked at 'not implemented: value type 0', /home/jgomez/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wasm-parser-0.1.22/src/coredump.rs:100:17

Alex Crichton (Oct 06 2023 at 16:46):

To set expectations a bit, the wasm threads proposal is still pretty green-field in wasmtime and it's not a well-trodden path, so not everything should be expected to work with the threads proposal. For example, the -g flag seems not to support shared memory, which isn't all that surprising. I don't know what "not implemented: value type 0" refers to, but wasmgdb is not a project I am familiar with.

All that being said, you probably don't need gdb here, and even if you could get it working it may not help a whole lot. I'd recommend setting WASMTIME_BACKTRACE_DETAILS=1 to try to get line numbers in your backtrace. You could then take a look at the source of the __fseeko function to see why it might be trapping.

At a high level though you're likely running into one of the limitations of the current implementation of threads which is that each thread's function table can get out of sync. I'm not sure if you're explicitly adding to the function table or if the module itself is modifying its function table, but modifications in one thread will not reach other threads. If you've only got one thread in your program though then this wouldn't be the issue.
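
To make the out-of-sync table limitation concrete, here is a toy Python model, not Wasmtime's implementation. It assumes, as described above, that each thread has its own copy of the function table; `Instance`, `handler`, and the table layout are made up for illustration.

```python
class Instance:
    """Toy stand-in for one thread's view of the module, with its own function table."""

    def __init__(self, table):
        self.table = table

    def call_indirect(self, index):
        func = self.table[index]
        if func is None:
            # Models the "uninitialized element" trap from the log above.
            raise RuntimeError("uninitialized element")
        return func()


def handler():
    return "handled"


# Both threads start from the same initial table contents,
# but each gets its own copy (assumption of this sketch).
initial_table = [None, None]
main_thread = Instance(list(initial_table))
worker_thread = Instance(list(initial_table))

# The main thread registers a function in its table at runtime...
main_thread.table[1] = handler

# ...which works for the main thread,
main_result = main_thread.call_indirect(1)

# ...but the worker's copy never saw the update.
try:
    worker_result = worker_thread.call_indirect(1)
except RuntimeError as exc:
    worker_result = f"trap: {exc}"

print(main_result)
print(worker_result)
```

In this model the failure only shows up when a second thread makes the call, which matches the caveat that a single-threaded program would not be affected.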

The trap itself is akin to calling a null function pointer in native code. This may also indicate it's simply a bug in the original source program. I'd recommend taking a look at __fseeko and working backwards from there to figure out why the indirect function call is calling null (or trapping).
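
As a rough illustration of the null-function-pointer analogy, the following Python sketch models an indirect call through a function table whose slots start out empty. It is a toy model rather than Wasmtime's code: `FuncTable`, `Trap`, and `seek_impl` are hypothetical names, and placing the null pointer at index 0 reflects how clang-built modules typically lay out the table, which is an assumption here.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this toy model."""


class FuncTable:
    """Toy function table: every slot starts out uninitialized (None)."""

    def __init__(self, size):
        self.slots = [None] * size

    def call_indirect(self, index, *args):
        if not 0 <= index < len(self.slots):
            # Wording is this sketch's own, not a real runtime message.
            raise Trap("table index out of bounds")
        func = self.slots[index]
        if func is None:
            # The situation reported in the backtrace above.
            raise Trap("uninitialized element")
        return func(*args)


def seek_impl(offset):
    # Hypothetical callee; simply echoes the requested offset.
    return offset


table = FuncTable(4)
table.slots[1] = seek_impl  # slot 0 stays empty, playing the role of NULL

# A valid function pointer (index 1) dispatches normally.
ok = table.call_indirect(1, 128)

# A null function pointer (index 0) traps instead of calling anything.
try:
    table.call_indirect(0, 128)
    trap_message = None
except Trap as exc:
    trap_message = str(exc)

print(ok, trap_message)
```

Working backwards from __fseeko then amounts to finding which function pointer held the empty index and why it was never set.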

Juan Gómez (Oct 06 2023 at 17:29):

Thanks Alex, that makes a lot of sense. Gonna focus on the __fseeko stuff and see where I can get from there :)

Alex Crichton (Oct 06 2023 at 17:41):

I should also mention, one possible culprit is that --dir / and --tcplisten don't work together, so this may be fixable by dropping --tcplisten. If you need TCP, however, that's something to fix on Wasmtime's side.

Test Case testcase.zip Steps to Reproduce Extract main.wasm from the test case zip file. Or compile it yourself using WASI SDK and the following source code, e.g. via ~/wasi-sdk/bin/clang main.c --...

Last updated: Dec 23 2024 at 13:07 UTC