After patching Config
with:
diff --git a/crates/wasmtime/src/config.rs b/crates/wasmtime/src/config.rs
index 54073aa83..18e7f1750 100644
--- a/crates/wasmtime/src/config.rs
+++ b/crates/wasmtime/src/config.rs
@@ -188,7 +188,10 @@ impl Config {
wasm_backtrace: true,
wasm_backtrace_details_env_used: false,
native_unwind_info: true,
- features: WasmFeatures::default(),
+ features: WasmFeatures {
+ threads: true,
+ ..Default::default()
+ },
#[cfg(feature = "async")]
async_stack_size: 2 << 20,
async_support: false,
I got this far with a patched wasi-libc:
❯ cargo run -- ~/pthread_create_example
Finished dev [unoptimized + debuginfo] target(s) in 0.23s
Running `target/debug/wasmtime /home/harald/pthread_create_example`
Error: failed to run main module `/home/harald/pthread_create_example`
Caused by:
0: failed to instantiate "/home/harald/pthread_create_example"
1: unknown import: `wasi_snapshot_preview2::thread_spawn` has not been defined
ok, with a wasi_snapshot_preview2::thread_spawn
stub function I get:
Error: failed to run main module `/home/harald/pthread_create_example`
Caused by:
0: failed to invoke command default
1: wasm trap: out of bounds memory access
wasm backtrace:
0: 0x298 - <unknown>!__wasm_init_memory
@Andrew Brown ^^^
Good to see you are making some progress; not sure what to make of the wasm trap: out of bounds memory access
--is there something wrong in wasi-libc
? (BTW, you probably don't need to patch Wasmtime since you should be alter the configuration from the CLI; just use cargo run -- --wasm-features threads ...
or something like that).
E.g., if your Wasm file's memory is declared like (memory 0 ...)
, then somewhere in the Wasm file there needs to be a memory.grow
or all memory accesses will be out of bounds.
Another thought, unrelated to the above, is that the wasi-threads
implementation I'm working on is not the only piece to get a working end-to-end example in Wasmtime. Wasmtime only has stub implementations for wait
and notify
so those need to be implemented as well since they are used by, e.g., pthread_join
in wasi-libc
.
those need to be implemented
Many chickens, many eggs...
yeah, it's the --shared-memory
linker directive causing the "__wasm_init_memory" trap... set also by -pthread
CC option
The memory.init instructions that initialize these passive segments and the data.drop instructions that mark them collectible will be emitted into a synthetic function __wasm_init_memory that is made the WebAssembly start function and called automatically on instantiation but is not exported. __wasm_init_memory shall perform any synchronization necessary to ensure that no thread returns from instantiation until memory has been fully initialized, even if a module is instantiated on multiple threads simultaneously.
so, it's the start function created by the linker
@Andrew Brown
(;@2b6 ;) memory.atomic.notify
0: failed to invoke `_start`
1: wasm trap: out of bounds memory access
wasm backtrace:
0: 0x2b6 - <unknown>!__wasm_init_memory
for this one
this is linker generated code, so it feels like wasmtime
is missing something here?
Err(
anyhow::anyhow!("unimplemented: wasm atomics (fn memory_atomic_notify) unsupported",)
.into(),
)
Yeah, this is what I meant up above:
Wasmtime only has stub implementations for wait and notify so those need to be implemented as well since they are used by...
I will hopefully have the WASI bindings for wasi-threads
available in some form today but one of us needs to take a look at implementing wait
and notify
to make an end-to-end example work. (Also, you might be interested in @Dan Gohman's recent change to avoid running constructors twice in wasi-libc: https://github.com/WebAssembly/wasi-libc/pull/339).
@Harald Hoyer, the progress I have made on a WIP implementation of wasi-threads
is available on a branch of that name in my fork: https://github.com/bytecodealliance/wasmtime/compare/main...abrown:wasmtime:wasi-threads. The unstable feature get_mut_unchecked
is a particularly heinous hack that will need to get resolved before I consider putting this up as a PR, even as a draft. I have documented the issues I observed in implementing this in the commit comments (note that this change stacks on #5054, included in the branch). Please use this change with the understanding that it is highly experimental, unsafe (no host import locking), and can easily be broken.
(cc: @Alex Crichton, @Dan Gohman, who may be interested as well)
diff --git a/tests/all/cli_tests.rs b/tests/all/cli_tests.rs
index fa13f32c4..002800bf0 100644
--- a/tests/all/cli_tests.rs
+++ b/tests/all/cli_tests.rs
@@ -404,13 +404,16 @@ fn run_threads() -> Result<()> {
let wasm = build_wasm("tests/all/cli_tests/threads.wat")?;
let stdout = run_wasmtime(&[
"run",
+ "--wasi-modules",
+ "experimental-wasi-threads",
+ "--wasm-features",
+ "threads",
+ "--disable-cache",
"--",
wasm.path().to_str().unwrap(),
- "--wasi-modules experimental-wasi-threads",
- "--wasm-features threads",
- "--disable-cache",
])?;
@Andrew Brown ^^
@Andrew Brown here is an attempt for memory_atomic_wait32 and memory_atomic_notify
https://github.com/haraldh/wasmtime/commit/2341d1bb1ae2892992b977c4162d495233d1700b
I kind of can report success, although the new threads do not have the same SharedMemory it seems
@Andrew Brown also I have made a PR to fix validate_atomic_addr()
.. https://github.com/bytecodealliance/wasmtime/pull/5063
So, if I use clang -pthread ...
it generates code with:
(memory (;0;) 2 2 shared)
(export "memory" (memory 0))
unless I use -Xlinker --import-memory
, which produces:
(import "env" "memory" (memory (;0;) 2 2 shared))
but then:
Error: failed to run main module `/home/harald/pthread_create_example`
Caused by:
0: failed to invoke command default
1: missing required memory export
wasm backtrace:
0: 0x751 - <unknown>!__wasi_fd_write
1: 0x953 - <unknown>!writev
2: 0x9e4 - <unknown>!__stdio_write
3: 0x119d - <unknown>!vfprintf
4: 0x4089 - <unknown>!fprintf
5: 0x6fa8 - <unknown>!__pthread_create
6: 0x4ab - <unknown>!__original_main
7: 0x2d5 - <unknown>!_start
re: your initial patch to fix the test's CLI arguments, thanks! I think that is right. I still see test failures due to the WAT reading-writing though, so something else needs fixing
re: the shared memory, I believe that for now we must both import a shared memory (so that all of the thread instances share the same one) AND export that same memory (so that Wiggle can use it for buffers in host calls)
I can't tell from your last example if the emitted module had both the import and the export, but it seems as it does not due to the error. I side-stepped this by hand-crafting the WAT test. I would think there is a way to tell Clang to do both?
btw, I found in crates/wasmtime/src/memory.rs:695 :
/// let shared_memory = SharedMemory::new(&engine, MemoryType::shared(1, 2))?;
/// let module = Module::new(&engine, r#"(module (memory (import "" "") 1 2 shared))"#)?;
/// let instance = Instance::new(&mut store, &module, &[shared_memory.into()])?;
and I have the feeling, that the SharedMemory
returned from your crates/wasi-threads/src/lib.rs:add_to_linker()
has to be used the same somehow
Andrew Brown said:
I can't tell from your last example if the emitted module had both the import and the export, but it seems as it does not due to the error. I side-stepped this by hand-crafting the WAT test. I would think there is a way to tell Clang to do both?
It had only the import
❯ $CC $CFLAGS -Xlinker --help ~/pthread_create_example.c -o ~/pthread_create_example|fgrep memory
--import-memory Import memory from the environment
--initial-memory=<value>
Initial size of the linear memory
--max-memory=<value> Maximum size of the linear memory
--shared-memory Use shared linear memory
--stack-first Place stack at start of linear memory rather than after data
Isn't export
just fine, as clang
does it with -pthread
(which activates the --shared-memory
linker flag)?
Harald Hoyer said:
and I have the feeling, that the
SharedMemory
returned from yourcrates/wasi-threads/src/lib.rs:add_to_linker()
has to be used the same somehow
Ah, so here's how that happens using a Linker
, which in this case is a more indirect way of defining the imports: if wasi_threads::add_to_linker
finds a shared memory import it must satisfy, it generates a new SharedMemory
and adds it to the Linker
here. (add_to_linker
should only be called once, so there should only be one shared memory to speak of). Then when the Linker
is used to instantiate
the child module here, that shared memory will be imported to the child instance.
Harald Hoyer said:
Isn't
export
just fine, asclang
does it with-pthread
(which activates the--shared-memory
linker flag)?
At least for now, until several things change, we will need to have the shared memory be both imported and exported.
Andrew Brown said:
re: the shared memory, I believe that for now we must both import a shared memory (so that all of the thread instances share the same one) AND export that same memory (so that Wiggle can use it for buffers in host calls)
Andrew Brown said:
Andrew Brown said:
re: the shared memory, I believe that for now we must both import a shared memory (so that all of the thread instances share the same one) AND export that same memory (so that Wiggle can use it for buffers in host calls)
Why not just export it? Then the runtime can use the exported shared memory when it creates new threads. At least this is how we do it on the web currently.
Having the runtime create a memory and pass it in seems like extra work that the runtime doesn't need to do. In emscripten we avoid creating the memory in the loader in almost all cases these days.
But we want the memory to actually be shared between all the threads: if the parent thread creates it then the only way to make sure the child threads use that same linear memory is some engine magic. By importing the shared memory, both the parent and the child threads can use the same memory in a Wasm-friendly way.
Andrew Brown said:
But we want the memory to actually be shared between all the threads: if the parent thread creates it then the only way to make sure the child threads use that same linear memory is some engine magic. By importing the shared memory, both the parent and the child threads can use the same memory in a Wasm-friendly way.
Oh, you are completely correct, please ignore me :) In emscripten we create the memory (SAB) in the loader code in the case of the pthreads. Doh
On the Web, with Worker-based threads, does a trap in one thread automatically take down the other threads, or does Emscripten implement this behavior in JS?
Emscripten has to implement the behavior internally. Basically, any uncaught exception in a worker is proxied back to the main thread an results in an abort that kill all threads/workers.
@Andrew Brown So, I can confirm threading works ... just one big problem: If I read that correct WasiCtx
is not thread safe and is shared to all threads mutable, which leads to unwanted strange side effects. Or, if WasiCtx
is not shared mutable, how can they share state change?
@Dan Gohman ? ^^
@Dan Gohman and I were just discussing this a couple of days ago. I was thinking of placing a mutex or read-write lock around each WASI context but struggling to find a refactoring in which this can be done optionally, so that users who do not want threads can avoid the locking overhead.
Maybe for your use case it makes sense to always lock...
Some global locks defy the use of threading. So while one thread wants to listen for a connection and waits, another cannot write.
Then there is the problem of RwLockWriteGuard not being Send, which needs to be with async and await. And WasiCtx being created from non async code.
anyway, the whole Table
is a problem
Let me retry again tomorrow.. I might have forgotten to Arc<> the RwLock<>s
Yeah, this sounds like the issues I was running into; what did you do re: the get_cx
signature?
There are no shared tables in wasm (yet) so each instance/threads would get its own table.
(assuming you mean the indirection function table)
Sam Clegg said:
(assuming you mean the indirection function table)
I mean the WasiCtx.table
with the WasiFile
s and directories and sockets.
@Harald Hoyer WasiCtx
as a whole is fairly problematic. I have rather strong opinions about its current implementation.
So, I managed to Arc<RwLock>>
the Table
entries ... with RwLock
from async_lock
. Every now and then, I get:
thread 'wasi-thread-680788223' panicked at 'wasi-thread-680788223 exited unsuccessfully: Cannot wait on pending future: must enable wiggle "async" future and execute on an async Store
Although I enabled "wasmtime-wasi/tokio", "wasmtime/async"
Here the debug output of some experimental C-lib with TLS and a simple pthread_create with two threads and join at the end.
❯ cargo +nightly run --features wasi-threads -- run --disable-cache --wasm-features threads --wasi-modules experimental-wasi-threads tt
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/wasmtime run --disable-cache --wasm-features threads --wasi-modules experimental-wasi-threads tt`
m map = 0x11230
m &args->thread = 0x11228
m args->thread = 0x11230
m __wasi_thread_spawn(0x11220)
m ret = 28742
aligned_alloc sizeof(struct pthread) = 112 size=112
m map = 0x112c0
m &args->thread = 0x112b8
m args->thread = 0x112c0
m __wasi_thread_spawn(0x112b0)
wasi_thread_start start_args = 0x11220
wasi_thread_start &args->thread = 0x11228
wasi_thread_start args->thread = 0x11230
wasi_thread_start args->start_func = 0x1
wasi_thread_start args->start_arg = 0
First thread processing
Finished
__pthread_exit self = 0x11230
wasi_thread_start start_args = 0x112b0
wasi_thread_start &args->thread = 0x112b8
wasi_thread_start args->thread = 0x112c0
wasi_thread_start args->start_func = 0x1
wasi_thread_start args->start_arg = 0
Second thread processing
Finished
__pthread_exit self = 0x112c0
@Andrew Brown I got a version of your ./tests/all/cli_tests/threads.wat
, which reliably outputs:
Hello _start
Hello wasi_thread_start
Hello wasi_thread_start
Hello wasi_thread_start
Hello done
using memory.atomic.wait32
and memory.atomic.notify
let me rework stuff and make PRs for the memory.atomic
operations
Nice!
It's exciting that there is now a working example of all of this stuff working. Any more issues on the tool chain side? Now the trick is to see if we can find a way to make all of this upstreamable.
Do any of you want to meet today before going offline?
Harald Hoyer said:
Every now and then, I get:
thread 'wasi-thread-680788223' panicked at 'wasi-thread-680788223 exited unsuccessfully: Cannot wait on pending future: must enable wiggle "async" future and execute on an async Store
Although I enabled
"wasmtime-wasi/tokio", "wasmtime/async"
ok, managed to create a full async wasmtime version. No help needed anymore
Looking at the wat, I see
(global $__stack_pointer (;0;) (mut i32) i32.const 70000)
Is that something, which has to be TLS?
interesting ... https://github.com/kettle11/wasm_set_stack_pointer
ok, TIL global
is TLS
Andrew Brown said:
It's exciting that there is now a working example of all of this stuff working. Any more issues on the tool chain side? Now the trick is to see if we can find a way to make all of this upstreamable.
toolchain wise, the linker -Xlinker --import-memory
works, but the re-export (export "memory" (memory 0))
is missing then
other than that: CFLAGS=-pthread
seems to work fine
@Andrew Brown so, regarding the $__stack_pointer
I would love to hear your advice.. should it be a wasm assembler trampoline, which first allocates the stack, sets the $__stack_pointer
, then calls the real wasi_thread_start
func?
or can this be done from within wasmtime?
toolchain wise, the linker
-Xlinker --import-memory
works, but the re-export(export "memory" (memory 0))
is missing then
I looked into this and perhaps it is not yet possible in the toolchains. wasm-ld
(from wasi-sdk) has --export=<value>
to force a symbol to be exported but I don't think we can name memories with symbols yet (right, @Dan Gohman?). Perhaps there is some trick to convince wasm-ld
to both import the memory and then turn around and export it, but I don't know it and don't see any mention in the docs either.
The "import then export" pattern seems to me a temporary thing, but perhaps this is still a useful feature to have in wasm-ld
; what do you think @Dan Gohman, @Sam Clegg?
so, regarding the
$__stack_pointer
I would love to hear your advice
I think the linker is adding this, or perhaps you are triggering dynamic linking? I don't have the answer off-hand but my sense is that some modifications to __pthread_create
in wasi-libc should make this work?
To answer your earlier question about this, yes, I think the stack pointer should be part of the TLS data.
(rereading what I wrote, I think I would edit the above to read: I think the _stack_ should be part of the TLS data and the stack pointer global should be set as part of the TLS data setup in pthread_create
)
https://reviews.llvm.org/D135898 is a patch which I think should support "import than export"
please enlighten me and correct my statements.
$__stack_pointer
is a global
i32 pointer, which is thread local$__stack_pointer
points to memory in env.memory
env.memory
is shared across all threadsSo, what I am might see in my execution is $__stack_pointer
being used with the same index in threads (leading to UB), because nothing initializes $__stack_pointer
on thread start. And even, if, to what value should it be set???? Seems like the value of $__stack_pointer
is just the same on thread clone.
@Dan Gohman ^^
I guess, I will have to allocate memory and point the $__stack_pointer
to it, using hand written functions without $__stack_pointer
usage. The only remaining question is, how big should it be? Can I assume the current value of $__stack_pointer
is a good size?
Even, if I allocate new memory, would I need to clone the old one, because it could still reference memory from there?
I expect you'll want to have the parent allocate the memory for the new thread, before the new thread is created, so that you can just use regular malloc
and friends. Then the thread entrypoint can just have a small bit of code that assigns the thread-local stack pointer to the allocated stack before doing anything else.
got it working! :)
How do I allow memory.grow? Currently I am manually editing:
(import "env" "memory" (memory (;0;) 2 2 shared))
to (import "env" "memory" (memory (;0;) 100 100 shared))
@Dan Gohman ?
But doing that I can create and join 10 threads :)
❯ cargo +nightly run --features wasi-threads -- run --disable-cache --wasm-features bulk-memory,threads --wasi-modules experimental-wasi-threads tt
Finished dev [unoptimized + debuginfo] target(s) in 0.12s
Running `target/debug/wasmtime run --disable-cache --wasm-features bulk-memory,threads --wasi-modules experimental-wasi-threads tt`
Thread 70064 processing
Thread 70496 processing
Thread 70352 processing
Thread 70640 processing
Thread 70928 processing
Thread 69920 processing
Thread 70784 processing
Thread 70208 processing
Thread 71072 processing
Thread 71216 processing
Thread 71216 still processing
Thread 70928 still processing
Thread 70784 still processing
Thread 70208 still processing
Thread 69920 still processing
Thread 70496 still processing
Thread 71072 still processing
Thread 70352 still processing
Thread 70640 still processing
Thread 70064 still processing
Joined thread #0 69920
Joined thread #1 70064
Joined thread #2 70208
Joined thread #3 70352
Joined thread #4 70496
Joined thread #5 70640
Joined thread #6 70784
Joined thread #7 70928
Joined thread #8 71072
Joined thread #9 71216
All threads joined
Harald Hoyer said:
How do I allow memory.grow? Currently I am manually editing:
(import "env" "memory" (memory (;0;) 2 2 shared))
to(import "env" "memory" (memory (;0;) 100 100 shared))
I think you could use the --initial-memory
and --max-memory
flags in the wasm-ld
to do this when you compile the Wasm program (see here). (Sort of a separate issue, but this raises the question in my mind whether MUSL's stack limit heuristic here should change for Wasm.)
So, here is my wasi-libc PR:
https://github.com/WebAssembly/wasi-libc/pull/343
so, here is a notify/wait implementation https://github.com/bytecodealliance/wasmtime/pull/5255 with parking_lot_core
CC @Alex Crichton
CC @Amanieu
@Alex Crichton could you please give me some hints on how to construct a multi threaded test?
There are some basic examples here
but I don't have much to offer beyond combining the primitives that the wasmtime
crate API gives
Thanks!
@Alex Crichton first test added, more to come tomorrow
@Alex Crichton Is there a way to access shared memory from a guest compiled from rust? All the examples I've seen are in wat so I don't know if the tooling supports it. Some related questions:
std::sync::atomic
package be mapped into that shared memory?What build flags are needed to create a shared memory section?
I have not personally tried to get wasm32-wasi
working, but I know abrown has and may know more. Otherwise though this is sufficient for me: RUSTFLAGS=-Ctarget-feature=+atomics,+bulk-memory,+mutable-globals cargo +nightly build -Zbuild-std=panic_abort,std --target wasm32-unknown-unknown
Can the shared memory be accessed as a byte slice?
Yes and not, it's just "memory". Rust only works with a single linear memory and with +atomics
the singular linear memory is a shared memory.
Can Atomics from the std::sync::atomic package be mapped into that shared memory?
They can indeed, in +atomics
mode atomic instructions are used for all atomic operations.
Thanks. I think I'm still missing one or two pieces of information. How does the rust guest get a pointer to the shared memory region? It's clear how the host can get it, so the host could export a function that the guest could use to fetch the pointer, but I don't know if there's a way for the guest to obtain that pointer directly.
Then, can you use mem::transmute to tell rust that a set of bytes contain an Atomic? I did't know if llvm/wasm has a special way to access atomics (vs other rust data structures), so I wasn't sure if it would work for arbitrary memory locations
Rust doesn't really get a pointer to the region or anything specific like that, it just gets access to the whole thing. It's the same compilation model for Rust as before, it gets one linear memory, it's just that in this case it's a shared linear memory. If you mean whether the memory is defined locally or imported from the system then that's controlled by -Clink-arg=--import-memory
or not
@Alex Crichton @Andrew Brown first Draft for immutable WasiCtx
https://github.com/bytecodealliance/wasmtime/pull/5326
Still some things to do, of course
This time I managed to use std::sync::RwLock
by removing async
from WasiFile::num_ready_bytes()
, so no need to use tokio RwLock and go full tokio.
Most of us in the US are out due to Thanksgiving; I will take a look on Monday.
Hi @bstrie !
Hello!
@Andrew Brown @Alex Crichton meet @bstrie , he will take over my wasi threads work for now, as I was shifted to implement a page fault handler in our microkernel for lazy memory loading, to provide wasmtime with 2GB of mmaped static memory (default).
Last updated: Jan 24 2025 at 00:11 UTC