I am curious about the option of providing "host" functionality to modules via other modules. Particularly interesting is the case where a "decorator" module satisfies an import of a "user" module (the function "func", for example), performs some operation on the argument provided by the user module, and then calls the "func" function provided by the host (i.e., the environment in which wasmtime is embedded). We could have similar functionality in the other direction, where the decorator "intercepts" the value "returned" by the host.
It would be really cool to be able to choose flexibly between instantiating the module linked directly against the host and adding one or more decorators in between.
I am aware of the component model, but would first like to look into the more straightforward and stable approach using imports/exports of modules.
For the case where the function takes and returns only primitive types, I got it to work by using different linkers for the "decorator" and the "user" module (the "func" import of the decorator is linked to the host function in the decorator-linker; the exports of the decorator module are provided as functions in the user-linker, where they satisfy the import of the user module) and a bit of aliasing to map between the function names and namespaces.
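The wiring described here can be modelled without any Wasm runtime. The following is a minimal sketch in plain Rust, not wasmtime API: boxed closures stand in for linked functions, and every name in it (`host_func`, `Decorator`, `decorate`, `link`) as well as the host's behaviour (adding 100) is a hypothetical chosen for illustration. It only shows how the user module could end up with either the host function directly or the same function wrapped by one or more decorators.

```rust
// Plain-Rust model of the linking idea; no Wasm runtime is involved.
type Func = Box<dyn Fn(i32) -> i32>;
type Hook = fn(i32) -> i32;

/// A decorator: `pre` runs on the argument, `post` on the returned value.
struct Decorator {
    pre: Hook,
    post: Hook,
}

/// Stand-in for the host's "func" (hypothetical behaviour: add 100).
fn host_func() -> Func {
    Box::new(|x: i32| x + 100)
}

/// Wraps `inner` with one decorator, keeping the same signature.
fn decorate(inner: Func, d: &Decorator) -> Func {
    let (pre, post) = (d.pre, d.post);
    Box::new(move |x: i32| post(inner(pre(x))))
}

/// Builds the function handed to the user module: the host function,
/// wrapped by each decorator in turn (the last one listed is outermost).
fn link(decorators: &[Decorator]) -> Func {
    let mut f = host_func();
    for d in decorators {
        f = decorate(f, d);
    }
    f
}

fn double(x: i32) -> i32 { x * 2 }
fn identity(x: i32) -> i32 { x }
fn negate(x: i32) -> i32 { -x }

fn main() {
    let direct = link(&[]);
    let one = link(&[Decorator { pre: double, post: identity }]);
    let two = link(&[
        Decorator { pre: double, post: identity },
        Decorator { pre: identity, post: negate },
    ]);
    // Prints "101 102 -102".
    println!("{} {} {}", direct(1), one(1), two(1));
}
```

Because every layer has the same signature as the host function, the user module does not need to know how many decorators sit in between.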
The more interesting case is the case where we want to "exchange" more complex data types. When interacting with the host, we would just allow it to read the information out of the module's memory and write the "result" back into it. I am reasonably sure that I could get the above flow to work by providing the "decorator" with functions offered by the host which would let it copy data between modules.
Super ideally, I would like to have a decorator which does not introduce additional copying. Since the "contract" between user module and decorator seems to be the same as between the user module and the host, it seems to be safe to let them share memory (the user blocks on each call which is served by the decorator/host).
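To make the "no additional copying" idea concrete, here is a small sketch in plain Rust under stated assumptions: a `Vec<u8>` stands in for the single linear memory that both modules would import, and the functions `host_func` and `decorated_func` and their operations (upper-casing, replacing spaces) are made up for illustration. It is a model of the pointer/length contract, not of any Wasm runtime API.

```rust
/// Stand-in for the host side of the contract: it reads the request
/// straight out of memory at `ptr..ptr + len` and writes its result
/// back in place (here: ASCII upper-casing, a made-up operation).
fn host_func(memory: &mut [u8], ptr: usize, len: usize) {
    memory[ptr..ptr + len].make_ascii_uppercase();
}

/// Decorator with the same signature. It edits the bytes in place
/// (replacing spaces with underscores) and forwards the same `ptr`/`len`,
/// so nothing is copied between buffers.
fn decorated_func(memory: &mut [u8], ptr: usize, len: usize) {
    for byte in &mut memory[ptr..ptr + len] {
        if *byte == b' ' {
            *byte = b'_';
        }
    }
    host_func(memory, ptr, len);
}

fn main() {
    // One memory shared by the user module and the decorator.
    let mut memory = vec![0u8; 64];
    let ptr = 16;
    let msg = b"hello wasm";
    memory[ptr..ptr + msg.len()].copy_from_slice(msg);

    decorated_func(&mut memory, ptr, msg.len());

    // Prints "HELLO_WASM".
    println!("{}", String::from_utf8_lossy(&memory[ptr..ptr + msg.len()]));
}
```

The sketch relies on the same assumption as the text: the user blocks during the call, so only one party touches the region at a time.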
So, what I was trying to do was to let both modules import memory and, upon instantiation, provide them with the same memory. The two problems I am running into there are:
wasm32-wasi target with cargo, while setting the shared flag (I could hack around that though)
Sorry for the long description. To formulate my question a bit shorter:
Thank you! :)
Here's the one weird trick to building Wasm .so modules which import their memory and function table with Rust today: https://bytecodealliance.zulipchat.com/#narrow/stream/235408-rust-toolchain/topic/Rust.20cdylib/near/424769116. Needless to say, we're planning to make that easier in the future.
To actually use such modules, your only option today is to use the component model via wasm-tools component link (which is what componentize-py does, and the CRuby folks are experimenting with doing likewise).
FWIW, I've had informal discussions with a few people about expanding https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md to describe a module-level host interface for linking Wasm .so files either at startup or dynamically (via dlopen) without necessarily using the component model. Emscripten already supports this, but presumably won't be able to fulfill WASI imports. Ideally, WASI-capable runtimes like Wasmtime would also support the same convention once we've documented it. I have no immediate plans to work on that myself (mostly because I'm all-in on the component model and have no reason to avoid it), but I do think it's a great idea.
I suppose I should add that you can use WASI .so files today without the component model, but you'd be responsible for writing a custom embedding of Wasmtime (or whatever your favorite WASI runtime is) and hooking everything up yourself, providing the imports, implementing dlopen and friends if applicable, etc.
Thank you for your answer Joel :) . Two additional questions if I may:
The .so terminology comes from UNIX .so files (meaning "shared object"), equivalent to .dll files in Windows. The convention we've been using so far is to name libraries lib<name>.so, e.g. libpython312.so, mirroring UNIX. You could instead name it e.g. python312.wasm, but then it might be confused with the statically-linked python312.wasm which is meant to be used as a CLI executable, not a shared library.
Yes, you can link a core module with a component by first wrapping the core module in a component using wasm-tools component new. If the core module targets WASI 0.1, you'll need to give wasm-tools an adapter (e.g. https://github.com/bytecodealliance/wasmtime/releases/download/v18.0.3/wasi_snapshot_preview1.reactor.wasm) which translates the 0.1 imports to 0.2 imports. Then you'll end up with two components which you can link together in a shared-nothing fashion using wasm-tools compose. The key to making that work is to define the interface which the two components use to communicate with each other using WIT such that one of the components imports the interface and the other one exports it.
However, from what you described earlier, I take it you want "shared-everything" linking instead, such that the modules share the same memory and can pass pointers back and forth. In that case, you don't want component-level composition, but rather https://github.com/WebAssembly/component-model/blob/main/design/mvp/examples/SharedEverythingDynamicLinking.md, in which case you'll want to leave the units of composition as core modules (not components) and then use wasm-tools component link to combine the modules into a single component. In that case you won't use WIT but rather the traditional C ABI. And in that case, you can even create a cyclical dependency where module A can import from module B and also vice versa -- wasm-tools component link will handle that automatically.
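As an illustration of what "the traditional C ABI" means for modules that share one memory, here is a minimal Rust sketch. The function name `sum_bytes` is hypothetical, and the example runs natively rather than as two linked Wasm modules; it only shows the shape of an interface that passes a pointer and a length instead of copying data through a WIT-defined boundary.

```rust
/// What a library module might export. In a real shared library this
/// would also carry `#[no_mangle]` so the symbol keeps its name; it is
/// left out here so the sketch stays a plain native program.
pub extern "C" fn sum_bytes(ptr: *const u8, len: usize) -> u32 {
    // SAFETY: the caller promises that `ptr` points at `len` readable
    // bytes in the memory both sides share.
    let bytes = unsafe { std::slice::from_raw_parts(ptr, len) };
    bytes.iter().map(|&b| u32::from(b)).sum()
}

fn main() {
    // The caller hands over a pointer into its own data; no copy is made.
    let data = [1u8, 2, 3, 4];
    let total = sum_bytes(data.as_ptr(), data.len());
    // Prints "10".
    println!("{}", total);
}
```

Passing raw pointers like this only works because both sides address the same memory, which is exactly what shared-everything linking provides and shared-nothing composition does not.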
I will dive into all of this and try it out.
Thank you for the great answer! :)
If I understood correctly, using the "shared-everything" linking would allow me to take two core modules: module A which imports a function (e.g., double_num) and module B which exports the function required by A while importing a function from the host (e.g., double_num_host). I then would use the wasm-tools component link command on these two modules to create a single component which would import double_num_host and export a method used to start the module (which it would take from module A).
I tried to implement a minimal example with this functionality. Module A is compiled from a bin crate with the following main.rs:
use std::time::Duration;

extern "C" {
    fn double_num(n: i32) -> i32;
}

fn main() {
    let mut num = 1;
    loop {
        std::thread::sleep(Duration::from_millis(500));
        num = unsafe { double_num(num) };
        println!("doubled the num; New value: {num}");
    }
}
Module B is compiled from a lib crate with the following lib.rs:
extern "C" {
    fn double_num_host(n: i32) -> i32;
}

#[no_mangle]
pub extern "C" fn double_num(input: i32) -> i32 {
    let doubled_once = input * 2;
    unsafe { double_num_host(doubled_once) }
}
I create core modules from these crates using cargo build --target=wasm32-wasi --release. When I then try to link the modules using wasm-tools component link -o linked_module.wasm module_a.wasm module_b.wasm I am getting the following error:
error: failed to encode a component from modules
Caused by:
    0: failed to extract linking metadata from module_a.wasm
    1: unsupported export kind for memory: Memory
(getting the same error for module_b.wasm if I omit module_a.wasm from the link command).
Could you please tell me what I am doing wrong? Is it a problem with how I compile the modules to WASM?
I am not an expert here but I believe wasm-tools component link is specific to dynamic libraries, following dynamic linking conventions; I'm not sure what extra steps that entails, possibly just different rustc options
Ah I think this is the magic: https://bytecodealliance.zulipchat.com/#narrow/stream/235408-rust-toolchain/topic/Rust.20cdylib/near/424769116
Ahh, I see, I have to compile them differently so that they can be dynamically linked. Thank you for the tip :)
Now, if I take module B as in the example above and follow the information you linked for creating a dynamically linked shared library module by:
(a) changing the crate-type to staticlib
(b) running RUSTFLAGS="-C relocation-model=pic" cargo +nightly build -Zbuild-std=panic_abort,std --release --target=wasm32-wasi (runs without complaining; creates the files libmodule_b.a and libmodule_b.d in target/wasm32-wasi/release)
(c) and then trying to run clang -shared -Wl,--whole-archive libmodule_b.a --Wl,--no-whole-archive
I get the error:
clang: error: unsupported option '--Wl,--no-whole-archive'
Could you maybe point me to what I am doing wrong here?
Perhaps an old version of clang? Or your OS is lying about it actually being clang? :smile: What does clang --version show?
Ubuntu clang version 14.0.0-1ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
You have an extra dash; it should just be -Wl,--no-whole-archive with only one leading dash.
Ah, I see. Should have figured that out since it is in the command twice :) . Now a new error:
/usr/bin/ld: libmodule_b.a: member libmodule_b.a(module_b-98a5ed2d35966c6a.module_b.4409c8c284b8a334-cgu.0.rcgu.o) in archive is not an object
clang: error: linker command failed with exit code 1 (use -v to see invocation)
/usr/bin/ld is probably your host system linker; you'll need wasm-ld to link a wasm binary. One way to get one is https://github.com/WebAssembly/wasi-sdk/
Installed the release binary version 20 and defined it as the CC variable as described in the repository. echo $CC now prints /root/wasi-sdk-20.0/bin/clang --sysroot=/root/wasi-sdk-20.0/share/wasi-sysroot.
When I then run $CC -shared -Wl,--whole-archive libmodule_b.a -Wl,--no-whole-archive, I am getting
clang-16: warning: argument unused during compilation: '-shared' [-Wunused-command-line-argument]
wasm-ld: error: /root/wasi-sdk-20.0/share/wasi-sysroot/lib/wasm32-wasi/libc.a(__main_void.o): undefined symbol: main
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
Support for shared libraries is new in wasi-sdk version 21; so it looks like you need a newer version.
Thank you! :)
I think we are nearly there :) . With wasi-sdk version 21, the above command runs through (with just a warning about -shared being unstable) and creates a .wasm file.
I have then changed module A to also be a static lib, which then exports a run function. When I now try to link these modules running wasm-tools component link -o combined.wasm module_a.wasm module_b.wasm, I am getting an error about the modules requiring libc.so:
error: failed to encode a component from modules
Caused by:
    0: missing libraries:
        module_a.wasm needs libc.so
        module_b.wasm needs libc.so
Do I need libc as a module as well? How would I provide it to the modules?
Yes, libc.so is shipped as part of wasi-sdk 21, so you can use that one. You'll need to tell wasm-tools component link about it explicitly as one of the parameters.
On my system, it lives in /opt/wasi-sdk-21.0/share/wasi-sysroot/lib/wasm32-wasi/libc.so
Great :) . So, if I give it the libc.so by specifying the --dl-openable <PATH-TO-libc.so> of the wasm-tools component link command, it stops complaining about libc.so, but now is not happy that it doesn't know how to provide the import of module_b:
error: failed to encode a component from modules
Caused by:
    0: unresolved symbol(s):
        module_b.wasm needs double_num_host (function [I32] -> [I32])
I understand that this import is not fulfilled, but this is kind of the point in my case: This is the import which would be satisfied by the host when the module created by linking is deployed. Would it be possible to adjust the configuration, such that not defined imports would not cause an error but could be linked to functions provided by the host/runtime?
Is the --adapt parameter part of the solution? Would I need some kind of adapter explaining that this import will be available during deployment?
Keep in mind that wasm-tools component link doesn't generate a module -- it generates a component. That means any undefined symbols must be imported at the component level (i.e. via an interface defined in WIT), not at the module level.
Alternatively, you could generate a component that imports a module, and use that imported module to provide the symbol, but wasm-tools component link does not yet have support for that, unfortunately. It's definitely been something I've been planning to add, but haven't gotten around to it yet. I don't think it would be super hard to do, but I haven't even experimented with module imports in the component model yet, so I'm not sure how they work.
I see. Maybe a stupid question: As I compile the modules, they are using WASI, right (at the very least for the printing)? Why doesn't wasm-tools have the same problem there? Aren't the wasi functions provided to the component by the host at the point of its deployment, the same way I would like to provide the double_num_host function? How come they don't cause the same kind of problem for wasm-tools?
If we're talking about WASI 0.2 imports, those all happen at the component level, so they're imported by the component as described above. If we're talking about WASI 0.1 imports, they are transformed into WASI 0.2 imports via an adapter.
Either way, wasm-tools component link discovers information about component-level imports via custom sections attached to some or all of the modules it is given as parameters and uses that to determine what the component will import. Likewise for any component-level exports.
Those custom sections are normally added by wit-bindgen when compiling the module(s).
If none of the modules have such custom sections, then either the component won't import or export anything (which would be kind of useless) or it needs the WASI 0.1->0.2 adapter, which will have the custom section with the component type information.
Last updated: Nov 22 2024 at 17:03 UTC