Stream: wasmtime

Topic: undefined symbol wasmtime_tls_get


view this post on Zulip Julien Cretin (ia0) (Feb 18 2025 at 12:16):

I'm missing wasmtime_tls_{get,set} when linking for thumbv7em-none-eabi. Am I supposed to implement them like the min-platform example? Here's the code: https://github.com/google/wasefire/compare/main...ia0:wasefire:pulley (single core where wasmtime is not accessed from interrupt handlers)

Context: I'm trying to use Pulley on embedded platforms after the recent progress on https://github.com/bytecodealliance/wasmtime/issues/7311

Follow-up question: When I implement those functions myself, I get a panic at src/runtime/vm/mmap_vec.rs:72 "Allocation of MmapVec storage failed" trying to allocate 264912 bytes, which is too much. Is there a way to control this particular allocation to be below 128K?

Thanks!

my co-worker told me that wasmtime project wanted to hear use-cases/demands/requirements for embedded scenarios. here is FYI about our requirements. Feature support embedded environment by "embedde...

view this post on Zulip Julien Cretin (ia0) (Feb 18 2025 at 12:22):

For the last question, it seems that it's the pulley bytecode which is huge (264920 bytes versus 7771 for the wasm module, i.e. 34x bigger). It seems the pulley bytecode is a full ELF file. Is there a way to have a binary format on par with WASM bytecode? The pulley module will be written to flash which is also a limited resource, also I'm surprised it needs to be loaded to RAM at runtime. The pulley interpreter needs to modify it? Can't it read it from the provided read-only slice?

view this post on Zulip Julien Cretin (ia0) (Feb 18 2025 at 14:08):

(seems like you can't edit your messages, or I fail to find how) The code was merged (and branch deleted), so here it is now: https://github.com/google/wasefire/pull/753

Target  Runtime   Perf          Flash   RAM
Linux   Base      33.1
Linux   Wasm3     2531 (76x)
Linux   Wasmi     1556 (47x)
Linux   Wasmtime  22652 (684x)
Nordic  Base      0.0925        144920  5384
Nordic  Wasmi     4.6 (50x)     8...

view this post on Zulip Alex Crichton (Feb 18 2025 at 17:20):

Am I supposed to implement them like the min-platform example?

Yes, there's some more documentation here on that but the tl;dr is that you need to implement this header file (and that's released per-version of Wasmtime)
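For the single-core case described in the question (Wasmtime never entered from interrupt handlers), a minimal sketch of the two missing symbols could look like the following. The signatures are an assumption based on the platform header at the time of this thread, so check them against the header shipped with your Wasmtime version. `TLS_SLOT` is a hypothetical name, and older Rust toolchains spell the attribute `#[no_mangle]`.

```rust
use core::ptr;
use core::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical single global slot. With one core and no re-entrancy from
// interrupt handlers, "thread-local" storage degenerates to a single pointer.
static TLS_SLOT: AtomicPtr<u8> = AtomicPtr::new(ptr::null_mut());

// Assumed C signature: uint8_t *wasmtime_tls_get(void);
// Returns the last value stored, or null if nothing was stored yet.
#[unsafe(no_mangle)]
pub extern "C" fn wasmtime_tls_get() -> *mut u8 {
    TLS_SLOT.load(Ordering::Relaxed)
}

// Assumed C signature: void wasmtime_tls_set(uint8_t *ptr);
// Stores the opaque pointer handed over by the runtime.
#[unsafe(no_mangle)]
pub extern "C" fn wasmtime_tls_set(value: *mut u8) {
    TLS_SLOT.store(value, Ordering::Relaxed);
}
```

The atomic is only there to avoid `static mut`. It does not make this safe on a multi-core target, which would need real per-thread storage.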

A lightweight WebAssembly runtime that is fast, secure, and standards-compliant - bytecodealliance/wasmtime

view this post on Zulip Alex Crichton (Feb 18 2025 at 17:22):

Is there a way to control this particular allocation to be below 128K?

Yes and no. If the wasm module itself is asking for more than 128K of memory, for example if its initial linear memory is 3+ pages, then there's nothing that can be done from the embedder about that. You'll instead need to build the wasm module differently such that it requires 2 or fewer pages. In the future the custom-page-sizes proposal should help with this but that's not integrated into toolchains yet (although I think it will be soon-ish)
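The page arithmetic behind the "2 or fewer pages" figure can be sketched as below, assuming the default 64 KiB WebAssembly page size. The helper names are hypothetical and only illustrate the calculation.

```rust
/// Size of one linear-memory page with the default WebAssembly page size.
const WASM_PAGE_SIZE: u64 = 64 * 1024;

/// Bytes needed to back `pages` pages of linear memory.
fn linear_memory_bytes(pages: u64) -> u64 {
    pages * WASM_PAGE_SIZE
}

/// Largest initial page count whose memory fits within `budget_bytes`.
fn max_pages_within(budget_bytes: u64) -> u64 {
    budget_bytes / WASM_PAGE_SIZE
}
```

With a 128 KiB budget, `max_pages_within(128 * 1024)` gives 2, while a module declaring 3 initial pages already needs 192 KiB.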

Otherwise you can also explore various configuration options such as Config::memory_reservation_for_growth where the defaults may not be suitable for your embedding (for example you might want to set that to zero)

view this post on Zulip Alex Crichton (Feb 18 2025 at 17:26):

it seems that it's the pulley bytecode which is huge

My guess is that a large part of this is padding with zeros. We emit object files that are suitable for mmap-ing to virtual memory to assist with copy-on-write initialization. For Pulley we don't know the target platform so we conservatively assume the target page size is 64k (which corresponds to some arm64 platforms) which greatly increases the size of the ELF output file. This naturally doesn't make sense in an embedded context, however, and such padding shouldn't be present at all. Basically one big optimization here is going to be plumbing the config knob for "CoW is disabled, don't align things".
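To see why page alignment alone can dominate the file size, here is a rough sketch of how padding accumulates when every section must start on a 64 KiB boundary. The section sizes used in the note below are made up for illustration and are not taken from the actual cwasm. `align_up` and `padded_size` are hypothetical helpers, not Wasmtime APIs.

```rust
/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Total file size when each section starts at a multiple of `align`.
/// The gaps between sections are the zero padding.
fn padded_size(section_sizes: &[u64], align: u64) -> u64 {
    let mut end = 0;
    for &size in section_sizes {
        end = align_up(end, align) + size;
    }
    end
}
```

For five hypothetical sections of 1000, 3000, 2000, 500, and 2000 bytes, `padded_size` returns 264144 with 64 KiB alignment versus 8500 with no alignment (`align = 1`), so nearly all of the output is padding.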

Otherwise Pulley doesn't modify bytecode, so it's possible to read it all from a read-only slice provided in ROM. The missing piece here is that internally Wasmtime needs to expose an API to read this from an external location rather than trying to copy it around.

Basically I think the issues you're seeing here should be fixable but will likely require modifications to Wasmtime. We'd be more than happy to help guide such changes and review, too!

view this post on Zulip Ralph (Feb 18 2025 at 17:55):

those would be cool wasmtime advances, frankly.....

view this post on Zulip Julien Cretin (ia0) (Feb 18 2025 at 19:48):

Thanks for the answers!

view this post on Zulip Alex Crichton (Feb 18 2025 at 20:18):

Ah ok yeah 262k tracks with the cwasm you're loading so that makes sense. Agreed that's probably what's happening here. And no worries on contributing, I'll take some time soon to file issues for these improvements regardless.

For binary size there's notes here on various rust compile flags you can use to build a minimally-sized binary. It's one where we've tried to optimize for size/dependencies in Wasmtime but we got to a point where we can't reasonably push it further without something concrete to work towards. If you're able to have a standalone "example embedding" that would be extremely helpful for us to have something to target (e.g. "this crate" should compile down to something smaller than XX kilobytes or something like that)

view this post on Zulip Alex Crichton (Feb 18 2025 at 20:26):

Also for binary size even ballpark numbers would be super helpful. For example if you're aiming for 10k vs aiming for 1M we don't have many existing users with constraints like that so we've just been shooting in the dark historically

view this post on Zulip Julien Cretin (ia0) (Feb 18 2025 at 22:48):

Actually for binary size, I forgot to subtract the embedded ELF, so using Wasmtime is only 120KiB bigger (making the final binary 2 to 3 times bigger depending on features) instead of the 380KiB with ELF. So it's not as bad as I initially thought (in particular because I expect the perf to be around 100x if not more). To give an order of magnitude for numbers:

view this post on Zulip Alex Crichton (Feb 18 2025 at 23:08):

Oh wow those are fantastic numbers, thank you for those!

When I've looked at Wasmtime historically Module::deserialize was a huge portion of the size, specifically the serde deserialization and validation that happens of the loaded module/Config. My guess is that it wouldn't be too too hard to optimize all that and cut it down to a much smaller size, probably shaving off ~20k at least from the base. That being said I haven't done any sort of rigorous testing in the past.

I realize that this may be a bit of a stretch, but if you're able to describe what your embedding does (or even better have a fork/project that can be built) and/or describe what the wasm is doing (or even better share a sample wasm) that'd be awesome. I'd love to take some time in the future myself to dig in and see what can be improved (in addition to the cwasm improvements above)

view this post on Zulip Alex Crichton (Feb 19 2025 at 20:35):

Ok I've filed https://github.com/bytecodealliance/wasmtime/issues/10244 and https://github.com/bytecodealliance/wasmtime/issues/10245 for follow-ups on this

Currently Wasmtime will allocate space in *.cwasm files to ensure that sections are page-aligned on disk. This is done for two reasons: When mmap-ing a *.cwasm into the address space the .text sect...
Currently in Wasmtime we have Module::deserialize and Module::deserialize_file. Given these APIs though it has the fundamental requirements that deserialize will copy the bytes into Wasmtime (e.g. ...

view this post on Zulip Julien Cretin (ia0) (Feb 20 2025 at 18:22):

Thanks, I've added some background in https://github.com/google/wasefire/issues/458 and answered your last questions there. The next update from my side might take some time (on vacations with kid).

Now that Wasmtime has no-std support, it becomes a possible alternative for the platform WASM runtime. This task should track the feasibility of using Wasmtime, since many roadblocks are expected (...

view this post on Zulip Paul Osborne (Feb 24 2025 at 21:28):

I've opened https://github.com/bytecodealliance/wasmtime/pull/10285 for #10244 and plan to take a look at #10245, will update if it looks like I'll be landing an impl for that as well.

When targeting pulley we aren't directly emitting executable code in the .text section and we don't have a good idea of the true target page size so we end up with ELF files that can have a...

Last updated: Feb 27 2025 at 22:03 UTC