Hi!
I'm trying to port a fairly big project (llama.cpp) to WASI. I have managed to compile a wasm binary after patching some stuff, but when I try to run it with wasmtime, I get this error:
(base) jgomez@DESKTOP-ANTLE2E:~/llama.cpp$ wasmtime run --wasm-features=threads --wasi-modules=experimental-wasi-threads --listenfd --dir=./ llama.wasm -- -m ./wizardcoder-python-34b-v1.0.Q3_K_M.gguf
Log start
main: warning: changing RoPE frequency base to 0 (default 10000.0)
main: warning: scaling RoPE frequency by 0 (default 1.0)
main: build = 1274 (ffe88a3)
main: built with clang version 16.0.0 for wasm32-unknown-wasi
main: seed = 1696600350
Error: failed to run main module `llama.wasm`
Caused by:
0: failed to invoke command default
1: error while executing at wasm backtrace:
0: 0x7790c - <unknown>!__fseeko
1: 0x7794f - <unknown>!fseek
2: 0xfa37b - <unknown>!llama_file::llama_file(char const*, char const*)
3: 0x9ea93 - <unknown>!llama_model_loader::llama_model_loader(std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const&, bool)
4: 0x9a9b0 - <unknown>!llama_load_model_from_file
5: 0x878be - <unknown>!main
6: 0x734b8 - <unknown>!__main_void
7: 0x1482 - <unknown>!_start
note: using the `WASMTIME_BACKTRACE_DETAILS=1` environment variable may show more debugging information
2: wasm trap: uninitialized element
If I run in debug mode (using the -g flag, as -D debug-info is not recognized), I get this error:
thread 'main' panicked at 'assertion failed: `(left == right)`
left: `1`,
right: `0`: the memory base pointer may be incorrect due to sharing memory', crates/cranelift/src/compiler.rs:564:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
gdb is telling me:
Thread 1 "wasmtime" received signal SIGILL, Illegal instruction.
0x00007ffff749e08f in ?? ()
(gdb) bt
#0 0x00007ffff749e08f in ?? ()
#1 0x0000000000000000 in ?? ()
... which is not very informative.
I guess it is relevant to mention that I'm using wasi-threads (it's obvious from the args passed to wasmtime, though :))
Any pointers on how to move forward? It'd be great if there's a way I can see what illegal instruction is actually causing this.
Thanks!!!
If I inspect the core file emitted after the trap with wasmgdb, I get this other error:
not implemented: value type 0
Ok, this looks like an error of wasmgdb itself (wasm-parser?):
(base) jgomez@DESKTOP-ANTLE2E:~/llama.cpp$ wasmgdb ./main ./core.file
thread 'main' panicked at 'not implemented: value type 0', /home/jgomez/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wasm-parser-0.1.22/src/coredump.rs:100:17
To set expectations a bit, the wasm threads proposal is still pretty green-field in wasmtime and it's not a well-trodden path, so not everything should be expected to work with the threads proposal. For example, the -g flag seems like it doesn't support shared memory, which isn't all that surprising. I don't know what "not implemented: value type 0" refers to, but wasmgdb is not a project I am familiar with.
All that being said, you probably don't need gdb here, and even if you could get it working it may not help a whole lot. I'd recommend setting WASMTIME_BACKTRACE_DETAILS=1 to try to get line numbers in your backtrace. You could then take a look at the source of the __fseeko function to see why it's possibly trapping.
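One way to narrow this down is to exercise the same open-then-seek sequence outside llama.cpp. The following is only a minimal sketch: file_size is a hypothetical helper, not something from llama.cpp or wasi-libc, and it assumes the loader does roughly "open, seek to the end, ask for the position". If a tiny program built from it with the same toolchain and run with the same wasmtime flags also traps in __fseeko, the problem is probably not in llama.cpp itself.

```c
#include <stdio.h>

/* Hypothetical helper (not part of llama.cpp): returns the size of the
   file at `path` in bytes, or -1 if it cannot be opened or measured. */
long file_size(const char *path) {
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return -1;                 /* never pass a NULL stream to fseek */
    }
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0) {
        size = ftell(fp);          /* ftell itself returns -1 on failure */
    }
    fclose(fp);
    return size;
}
```

Calling file_size with the model path from a small main and printing the result should be enough to see whether fopen fails (result -1) or whether the seek itself traps.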
At a high level though you're likely running into one of the limitations of the current implementation of threads which is that each thread's function table can get out of sync. I'm not sure if you're explicitly adding to the function table or if the module itself is modifying its function table, but modifications in one thread will not reach other threads. If you've only got one thread in your program though then this wouldn't be the issue.
The trap here itself is akin to calling the null function pointer in native code. This may also indicate it's simply a bug in the original source program. I'd recommend taking a look at __fseeko and working backwards from there to figure out why the indirect function call is calling null (or trapping).
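To make the native-code analogy concrete, here is a small sketch of an indirect call through a function pointer stored in a struct. The stream type, seek_absolute, and checked_seek are all hypothetical names invented for illustration; they only loosely mimic how a libc stream object may dispatch seeks through a hook, and they are not the actual wasi-libc definitions. In wasm, a call like s->seek(...) goes through the function table, so an unset or zero pointer would typically surface as a trap such as "uninitialized element" rather than a segfault.

```c
#include <stddef.h>

/* Hypothetical stand-in for a stream that dispatches through a hook. */
typedef struct stream {
    long pos;
    long (*seek)(struct stream *s, long off);  /* NULL if never initialized */
} stream;

/* Example hook: moves to an absolute offset and returns the new position. */
long seek_absolute(stream *s, long off) {
    s->pos = off;
    return s->pos;
}

/* Guarded indirect call: reports -1 instead of calling an unset pointer.
   Without this check, calling through a NULL hook is the native analogue
   of the trap described above. */
long checked_seek(stream *s, long off) {
    if (s == NULL || s->seek == NULL) {
        return -1;
    }
    return s->seek(s, off);
}
```

A guard like this does not fix the underlying cause, but temporarily adding one near the failing call can show whether the stream or its hook was never set up.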
Thanks Alex, that makes a lot of sense. Gonna focus on the __fseeko stuff and see where I can get from there :)
I should also mention, one possible culprit is that --dir / and --tcplisten don't work together, so this may be fixable by dropping --tcplisten. If you need TCP, however, that's something to fix on Wasmtime's side.
Last updated: Nov 22 2024 at 16:03 UTC