Introduction to WAMR WASI threads
Posted in introduction on July 3, 2023 by Marcin Kolny
One of the functionalities missing from WebAssembly for a long time was the ability to spawn new threads within a process. Various runtimes addressed this limitation by introducing non-standard APIs for thread creation. WAMR provides a WAMR pthread library that implements a wide range of pthread APIs (including synchronization primitives, the pthread_create() function and many more).
In 2022, the WASI threads proposal was introduced to establish a standardized API for thread creation in WebAssembly. In 2023, both WAMR (v1.2.0) and Wasmtime implemented this proposal. This article delves into the implementation details of the WASI threads proposal within WAMR and highlights the differences between the newly introduced WASI threads and the pre-existing WAMR pthread library implementation.
See the article on the Bytecode Alliance blog for the official WASI threads announcement.
WASI threads - overview
The WASI threads proposal (as of today) defines a single hostcall that the runtime needs to implement:
int32_t thread_spawn(uint32_t start_arg)
The purpose of this function is to start a new thread and return an identifier associated with that thread. The start_arg parameter denotes a value passed to the newly created thread (e.g. it can be used as a pointer to a complex data structure in memory). The thread_spawn function is responsible for invoking the following entry function in the new thread:
void wasi_thread_start(int32_t tid, uint32_t start_arg)
The wasi_thread_start function must be implemented and exported by the WASM module. The value of the start_arg parameter is the same as the one passed to the thread_spawn function.
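The contract between the two functions can be sketched natively. The block below is a minimal simulation, not WAMR or WASI libc code: linear_memory, task_t and run_task are hypothetical names, and thread_spawn is a mock that runs the entry function synchronously instead of starting a real thread. It shows how a 32-bit start_arg can act as a pointer, which in a WASM module is simply an offset into linear memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the module's linear memory (assumption for this sketch):
 * a 32-bit WASM pointer is an offset into this buffer. */
static uint8_t linear_memory[256];

/* Hypothetical payload the parent thread hands to the child. */
typedef struct {
    int32_t input;
    int32_t result;
    int32_t seen_tid;
} task_t;

/* Entry function the module must export; it receives the thread id
 * and the same start_arg that was passed to thread_spawn. */
void wasi_thread_start(int32_t tid, uint32_t start_arg)
{
    task_t task;
    memcpy(&task, &linear_memory[start_arg], sizeof(task));
    task.result = task.input * 2;
    task.seen_tid = tid;
    memcpy(&linear_memory[start_arg], &task, sizeof(task));
}

/* MOCK of the hostcall: a real runtime would run the entry function
 * in a new thread; here it is called synchronously. */
static int32_t next_tid = 1;
int32_t thread_spawn(uint32_t start_arg)
{
    int32_t tid = next_tid++;
    wasi_thread_start(tid, start_arg);
    return tid;
}

/* Places a task at `offset`, spawns the mock thread and returns the result. */
int32_t run_task(uint32_t offset, int32_t input)
{
    task_t task = { .input = input, .result = 0, .seen_tid = 0 };
    assert(offset + sizeof(task_t) <= sizeof(linear_memory));
    memcpy(&linear_memory[offset], &task, sizeof(task));
    int32_t tid = thread_spawn(offset);
    memcpy(&task, &linear_memory[offset], sizeof(task));
    assert(task.seen_tid == tid); /* the child saw the id returned to the parent */
    return task.result;
}
```

Calling run_task(16, 21) stores the task at offset 16, passes that offset as start_arg and reads back 42 once the entry function has finished.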
POSIX threads implementation
The WAMR pthread library exposes the POSIX Threads API to WASM code through native API functions. Each supported function within the library is implemented in WAMR native code and essentially serves as a wrapper for POSIX Threads.
In contrast, the WASI threads proposal requires the host environment to expose only a single function (thread_spawn), while the complete implementation of the POSIX Threads API must be included in the WebAssembly code. Many of the API functions, such as pthread_join and thread-specific data (TSD), have already been implemented in WASI libc and can be accessed through the WASI SDK starting from version 20.
Synchronization primitives
The WASI threads proposal focuses only on spawning new threads and does not cover synchronization primitives. However, WASI libc already provides functions such as pthread_mutex* or pthread_cond* that are built on top of another proposal (Threads and atomics). That proposal’s implementation is available in WAMR from version 01-18-2022.
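As an illustration of those primitives, here is a minimal sketch of a mutex-protected counter that uses only standard pthread calls, so the same source should build both natively and for a threaded WASM target. The names shared_state_t, worker, run_counter and MAX_WORKERS are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8

/* State shared by all workers; `lock` protects `counter`. */
typedef struct {
    pthread_mutex_t lock;
    long counter;
    long increments_per_thread;
} shared_state_t;

/* Each worker increments the shared counter under the mutex. */
static void *worker(void *arg)
{
    shared_state_t *state = arg;
    for (long i = 0; i < state->increments_per_thread; i++) {
        pthread_mutex_lock(&state->lock);
        state->counter++;
        pthread_mutex_unlock(&state->lock);
    }
    return NULL;
}

/* Runs `num_threads` workers and returns the final counter value,
 * or -1 if not all threads could be spawned. */
long run_counter(int num_threads, long increments_per_thread)
{
    shared_state_t state = { .counter = 0,
                             .increments_per_thread = increments_per_thread };
    pthread_t threads[MAX_WORKERS];
    int spawned = 0;

    assert(num_threads > 0 && num_threads <= MAX_WORKERS);
    pthread_mutex_init(&state.lock, NULL);

    while (spawned < num_threads
           && pthread_create(&threads[spawned], NULL, worker, &state) == 0) {
        spawned++;
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < spawned; i++) {
        pthread_join(threads[i], NULL);
    }
    pthread_mutex_destroy(&state.lock);
    return spawned == num_threads ? state.counter : -1;
}
```

Without the mutex, the increments from different threads could interleave and the final value would not be guaranteed to equal num_threads * increments_per_thread.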
Memory model
The WASI threads proposal does not provide specific details about the memory model and treats it as an implementation detail. This section highlights the differences between the memory models used in WASI libc and the WAMR pthread library.
The WAMR pthread library allocates stack memory for each thread from the auxiliary (AUX) stack. The AUX memory is divided into N+1 equal regions, where N represents the maximum number of threads that WAMR can spawn. This maximum thread count can be controlled either through the --max-threads=N flag in iwasm or by setting it programmatically using the wasm_cluster_set_max_thread_num() API.
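The arithmetic behind this partitioning can be sketched as follows. The helper names are hypothetical and the functions only illustrate the N+1 split described above; they are not WAMR's actual implementation, and which region goes to which thread is left open.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one region when the AUX stack is split into
 * max_threads + 1 equal parts (integer division drops any remainder). */
uint32_t aux_region_size(uint32_t aux_stack_size, uint32_t max_threads)
{
    return aux_stack_size / (max_threads + 1);
}

/* Offset of region `index` (0..max_threads) from the start of the AUX area. */
uint32_t aux_region_offset(uint32_t aux_stack_size, uint32_t max_threads,
                           uint32_t index)
{
    assert(index <= max_threads);
    return index * aux_region_size(aux_stack_size, max_threads);
}
```

For example, a 65536-byte AUX stack with a maximum of 4 threads yields 5 regions of 13107 bytes each, and every region stays reserved whether or not its thread is ever created.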
On the other hand, the WASI libc implementation dynamically allocates stack memory for each thread from the linear memory, utilizing malloc(). It also allocates memory for thread-local storage (TLS) and thread-specific data (TSD); note that TLS is not supported in the WAMR pthread library. The allocated memory is freed upon thread exit, making it available for reuse.
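The per-thread allocation pattern can be sketched like this. It is a simplified illustration rather than WASI libc's actual code: thread_mem_t and both helpers are hypothetical, TSD is left out, and the layout (a downward-growing stack followed by the TLS area in one malloc'ed block) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Memory owned by one thread (hypothetical layout: [ stack | TLS ]). */
typedef struct {
    unsigned char *base;      /* start of the malloc'ed block */
    unsigned char *stack_top; /* the stack grows downward from here */
    unsigned char *tls;       /* thread-local storage starts here */
} thread_mem_t;

/* Allocates stack and TLS for a new thread; returns 0 on success, -1 on failure. */
int thread_mem_alloc(thread_mem_t *mem, size_t stack_size, size_t tls_size)
{
    assert(mem != NULL && stack_size > 0);
    mem->base = malloc(stack_size + tls_size);
    if (mem->base == NULL) {
        return -1;
    }
    mem->stack_top = mem->base + stack_size;
    mem->tls = mem->base + stack_size;
    return 0;
}

/* Called on thread exit: the block returns to the allocator for reuse. */
void thread_mem_free(thread_mem_t *mem)
{
    free(mem->base);
    mem->base = NULL;
    mem->stack_top = NULL;
    mem->tls = NULL;
}
```

Because the block is only requested when a thread is created, memory consumption follows the number of live threads rather than the configured maximum.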
Dynamic memory allocation offers more efficient memory usage than the WAMR pthread library’s approach. If an application rarely uses the maximum number of threads specified by --max-threads, the memory pre-allocated for those threads remains unused and occupies space unnecessarily. Additionally, dynamic allocation eliminates the need to determine the number of threads at build time, which the pre-allocated approach requires because the AUX stack size is fixed by a linker flag.
WAMR implementation details
Underneath, the WASI threads feature leverages the existing thread manager utilized by both the WAMR pthread library and the WAMR embedded scenario. The thread manager keeps track of all the threads within a process, allowing them to interact with each other. For example, if one thread encounters an error, the thread manager can share that information with all the other running threads.
Conceptually, the process of requesting a new thread can be summarized at a high level as follows:
- Create a new instance of the module that requested a new thread
- Create a new execution environment with the newly created module instance and add it to the thread manager’s cluster
- Spawn a native thread; in the thread’s entry function, call the exported wasi_thread_start symbol from the module
- When the wasi_thread_start function completes (i.e. the thread is finished), destroy the module instance and execution environment
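The steps above can be sketched with plain pthreads. This is a conceptual simulation and not WAMR source code: module_instance_t, exec_env_t, spawn_thread and demo_thread_start are hypothetical stand-ins, the module instance is reduced to a function pointer, and the thread manager's cluster bookkeeping is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical module instance: only holds the exported entry function. */
typedef struct {
    void (*wasi_thread_start)(int32_t tid, uint32_t start_arg);
} module_instance_t;

/* Hypothetical execution environment bound to one module instance. */
typedef struct {
    module_instance_t *instance;
    int32_t tid;
    uint32_t start_arg;
} exec_env_t;

/* Step 3: native thread entry that calls the exported entry function. */
static void *native_thread_entry(void *arg)
{
    exec_env_t *env = arg;
    env->instance->wasi_thread_start(env->tid, env->start_arg);
    /* Step 4: the thread is finished, destroy instance and environment. */
    free(env->instance);
    free(env);
    return NULL;
}

/* Returns `tid` on success and -1 on failure. */
int32_t spawn_thread(const module_instance_t *parent, int32_t tid,
                     uint32_t start_arg, pthread_t *out_thread)
{
    /* Step 1: create a new instance of the requesting module. */
    module_instance_t *instance = malloc(sizeof(*instance));
    if (instance == NULL) {
        return -1;
    }
    *instance = *parent;

    /* Step 2: create an execution environment for the new instance. */
    exec_env_t *env = malloc(sizeof(*env));
    if (env == NULL) {
        free(instance);
        return -1;
    }
    env->instance = instance;
    env->tid = tid;
    env->start_arg = start_arg;

    /* Step 3: spawn the native thread. */
    if (pthread_create(out_thread, NULL, native_thread_entry, env) != 0) {
        free(env);
        free(instance);
        return -1;
    }
    return tid;
}

/* Demo entry function that records what it was called with. */
static int32_t observed_tid;
static uint32_t observed_arg;
static void demo_thread_start(int32_t tid, uint32_t start_arg)
{
    observed_tid = tid;
    observed_arg = start_arg;
}
```

A caller would fill a module_instance_t with demo_thread_start, call spawn_thread and then pthread_join the returned handle before reading observed_tid and observed_arg.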
Example
As an example, we’ll compile and run a simple hello world-like program on iwasm (the code comes from the WASI threads announcement article):
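#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NUM_THREADS 10

/* Thread entry point: the thread index is passed by value in the pointer argument. */
void *thread_entry_point(void *ctx) {
    int id = (int)(intptr_t)ctx;
    printf(" in thread %d\n", id);
    return NULL;
}

int main(int argc, char **argv) {
    pthread_t threads[NUM_THREADS];
    int spawned[NUM_THREADS] = { 0 };

    for (int i = 0; i < NUM_THREADS; i++) {
        int ret = pthread_create(&threads[i], NULL, &thread_entry_point, (void *)(intptr_t)i);
        if (ret) {
            printf("failed to spawn thread: %s\n", strerror(ret));
        } else {
            spawned[i] = 1;
        }
    }
    /* Join only the threads that were actually created. */
    for (int i = 0; i < NUM_THREADS; i++) {
        if (spawned[i]) {
            pthread_join(threads[i], NULL);
        }
    }
    return 0;
}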
Configure WAMR
To enable WASI threads support in WAMR, the flag WAMR_BUILD_LIB_WASI_THREADS needs to be set to 1. E.g. to build iwasm for the Linux platform, use:
cmake -Bbuild -Sproduct-mini/platforms/linux/ -DWAMR_BUILD_LIB_WASI_THREADS=1
In addition to that, the user can specify a maximum number of running threads:
- for iwasm, use the --max-threads CLI argument
- for embedded WAMR, use wasm_runtime_set_max_thread_num to set the max number of threads
If the max value is not specified, the default value (currently set to 4) is used.
Compile WASM code
The easiest way to compile a WASM application is to use WASI SDK, which supports threading from version 20:
/opt/wasi-sdk/bin/clang --target=wasm32-wasi-threads -Wl,--max-memory=1048576 -pthread hello_world.c -o hello_world.wasm
Run WASM code
The program can now be run, e.g. using iwasm:
./iwasm --max-threads=4 hello_world.wasm
And the output should be similar to:
Creating thread 0
Thread #262180, counter: 0
Creating thread 1
Thread #327860, counter: 0
Final counter value: 2
Benchmarking
Just to demonstrate the potential benefits of running multi-threaded applications, we ran iwasm with two different parallelizable WASM programs:
- sorting - parallel implementation of merge sort
- compression - pigz and gzip compiled to WASM
Number of threads | Compression | Sorting |
---|---|---|
1 | 67s | 11s |
2 | 39s | 6s |
As expected, the WASM programs run considerably faster with two threads than with one: roughly 1.7 times faster for compression and 1.8 times faster for sorting.
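The speedup implied by the table is the single-threaded time divided by the multi-threaded time; dividing that by the thread count gives the parallel efficiency. The helpers below are a small illustrative sketch with names chosen for this example.

```c
#include <assert.h>

/* Speedup: how many times faster the parallel run is than the single-threaded one. */
double speedup(double t_single, double t_parallel)
{
    assert(t_single > 0.0 && t_parallel > 0.0);
    return t_single / t_parallel;
}

/* Parallel efficiency: speedup per thread (1.0 means perfect scaling). */
double parallel_efficiency(double t_single, double t_parallel, int num_threads)
{
    assert(num_threads > 0);
    return speedup(t_single, t_parallel) / num_threads;
}
```

With the measurements above, compression gives 67 / 39 ≈ 1.72 (efficiency ≈ 0.86) and sorting gives 11 / 6 ≈ 1.83 (efficiency ≈ 0.92).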
Next steps
In the past few months, the team spent a massive amount of time validating the implementation by testing various scenarios. That resulted in many bug fixes and significant stability improvements in the latest WAMR 1.2.2 release. Thanks to a team effort, WAMR now supports multi-threading for interpreter (classic, fast), JIT (LLVM, fast) and AOT modes. However, we’re not done yet:
- Rust toolchain support - WASI threads are already available to C programs through WASI libc. Using threads in Rust was not straightforward, but there’s already work in progress to implement the std::thread module from the Rust standard library for the wasm32-wasi-threads target
- WASI threads API extensions - at the moment the WASI threads API consists of a single function; there are ongoing discussions on whether more functions should be added to the interface (e.g. the discussion about a pthread_exit-like interface)
- many other bigger and smaller discussions regarding the specification - see the GitHub page for all the open threads
Contributors
The development of WASI threads in WAMR was a team effort involving various folks who deserve a big thanks:
- Alexandru Ene, Andrew Brown and Takashi Yamamoto for shaping the WASI threads proposal.
- Sam Clegg and Dan Gohman for their awesome work on the toolchain to support threading.
- Enrico Loparco, Georgii Rylov, Hritik Gupta and Wenyong Huang for making the proposal a reality in WAMR.