Stream: git-wasmtime

Topic: wasmtime / issue #8341 Add no_std support to Wasmtime


view this post on Zulip Wasmtime GitHub notifications bot (Apr 11 2024 at 20:54):

alexcrichton added the wasmtime:platform-support label to Issue #8341.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 11 2024 at 20:54):

alexcrichton opened issue #8341:

In this issue I'd like to propose officially adding support for Rust's #![no_std] mode and crate attribute to the wasmtime crate and runtime. Notably, I'm proposing that we refute current documentation about "why not no_std". This would additionally revert earlier work in https://github.com/bytecodealliance/wasmtime/pull/554 and https://github.com/bytecodealliance/wasmtime/pull/2024.

Before I go into more depth, this is not a new issue to Wasmtime. This has been discussed in a number of places such as https://github.com/bytecodealliance/wasmtime/issues/1158, https://github.com/bytecodealliance/wasmtime/issues/3451, https://github.com/bytecodealliance/wasmtime/issues/3495, https://github.com/bytecodealliance/wasmtime/issues/6424, https://github.com/bytecodealliance/wasmtime/issues/7700, and I'm sure I'm missing others as well. I'll also point out that I was personally one of the ones previously advocating for specifically not supporting no_std, and my opinion has changed in the intervening years.

Why support no_std?

As far as I know the benefits of no_std haven't really ever been in question. After all, who doesn't want a project to be able to run in as many places as possible? To me I would personally rephrase this question as why to use no_std to support platform as opposed to alternative strategies. I'd answer this with the fact that no_std is the most idiomatic and well-supported solution in the Rust ecosystem. Rust developers in the embedded space are already used to no_std and what it entails. Additionally there are community idioms/expectations around the no_std feature to follow which set precedent.

Why now?

For a number of years now Wasmtime has had a page in its documentation for "What about #[no_std]?", so I think a fair question is why would we revert this and reconsider this previous decision at this point in time. The embedded space has become more interested in WebAssembly over time and there are a fair number of users today. The general feedback is that Wasmtime is not suitable in these environments, and one primary reason is that it's difficult to get Wasmtime working in these environments. Work such as https://github.com/bytecodealliance/wasmtime/pull/7995 is to obscure for others to productively use as-is without being able to slot more idiomatically into the no_std ecosystem. More-or-less there's interest to use WebAssembly on platforms Wasmtime cannot easily target, and no_std support is intended to be a step towards making Wasmtime support on these platforms easier.

Was the previous rationale wrong?

In my opinion, the current documentation Wasmtime has for not supporting no_std is both right and wrong. Supporting no_std is not "free", it's something we'll have to maintain over time as a project and explicitly test in CI and consider for all new features. This cost wasn't necessarily justifiable earlier on in Wasmtime's development. Nowadays, though, Wasmtime has many more compile-time features and it's more clear what a no_std build of Wasmtime might look like. Supporting no_std is, however, idiomatic and the best way to support non-standard platforms in Rust. While it's theoretically possible to change how the standard library looks in upstream rust-lang/rust that's basically not happening any time soon.

What does no_std for Wasmtime mean?

With some of the more rationale-facing questions out of the way, I want to dive into some more details as to what no_std means for Wasmtime. If you read the title of this issue and say "wow finally it's about time Wasmtime had support" you're probably not going to be too interested in this section. This section is intended to outline what I learned from my experience of porting Wasmtime to no_std. I have two branches which ports the wasmparser crate to no_std as well as the runtime feature of the wasmtime crate:

Features of Wasmtime and no_std

Primarily the first part to mention is that I don't think we'll want to support every single feature of Wasmtime in the no_std configuration. Instead we'll want a few features explicitly listed as "this supports no_std" but everything else will implicitly enable a dependency on the std feature and the standard library. An example of this is that the runtime feature, the ability to execute WebAssembly modules, won't depend on std but the cache feature, which writes to a filesystem, would depend on the standard library. Notably, for the initial implementation, I'd also expect that cranelift itself would require the standard library. There's no inherent reason to do so, but that's how I wrote things originally.

Usage of the alloc crate

I also personally think it's worth clarifying that no_std for Wasmtime means core and alloc will be used. Notably this means that Wasmtime will still assume that all allocations succeed at all times and a failing allocation will cause a process abort and/or unrecoverable error. Wasmtime can't really feasibly be ported to an alloc-less build at this time, although if that's of interest to folks then it's something that might be worth investigating later on. I'm not sure how we'd support this in Wasmtime, though, so I'd prefer to defer such a question to a future issue. In the meantime the rest here will assume that no_std means that alloc can be used.

Technical details of what support looks like to users

To end users and consumers of Wasmtime what I'd propose is that the wasmtime crate will grow a compile-time Cargo feature called std. This feature is enabled by default and is additionally enabled by any other feature which depends on standard library support, such as cache. If all of Wasmtime's features are disabled, however, then a small set of features, such as runtime and component-model, can be enabled without enabling the std feature of Wasmtime. This will prevent use of the std crate on some platforms.

The wasmtime-runtime crate will, by default, use the standard library on platforms that are known to have the standard library. For example on Linux the wasmtime-runtime crate will continue to use std as necessary. If wasmtime-runtime is compiled for an unknown platform, however, then the custom platform support added in https://github.com/bytecodealliance/wasmtime/pull/7995 will be enabled by default. This means that Wasmtime will not work out-of-the-box since this header file will need to be implemented. This will be mentioned as part of Wasmtime's documentation and will be the escape hatch to enable embedders to implement custom logic to implement Wasmtime's runtime needs.

Put together, the general workflow for embedders using Wasmtime in no_std mode would look like:

My intent would be that we'd have at least one or two examples in the repository doing all this, like the current min-platform example.

Technical details of what support looks like to Wasmtime developers

To those who develop Wasmtime this issue is effectively asking more development cost to be taken on by the project. Despite having community norms and expectations the no_std development mode in Rust is not the standard development mode and may serve as a road bump to contributions and/or developers. Here I'd like to outline the specific issues I've found when porting Wasmtime to no_std:

  1. Imports now come from core and alloc, not std. This is mostly a muscle memory thing to get over, but it's a downside in that only some parts of the project will do this. For example in wasmtime-cache we'll probably still have use std::mem; whereas in wasmtime-runtime that would be spelled use core::mem;.
  2. The prelude is different for #![no_std] mode. Notably collections like Vec and String are not present. To work around this I've created a mod prelude { ... } in the wasmtime-environ crate which is reexported at all other crate roots. Most modules in crate then have use crate::prelude::*; instead of individually importing common types like String and Vec.
  3. Hash maps are different. The Rust standard library does not provide HashMap in the alloc
    [message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (Apr 11 2024 at 22:40):

fitzgen commented on issue #8341:

I support this general direction, although I do have some bikes to shed on a few particular details. I'll leave those nitpicks for a little bit later on in the discussion.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 11 2024 at 23:18):

sunfishcode commented on issue #8341:

I support this direction as well.

As the smallest of bikesheds, I'm curious if there's a reason for using

#![no_std]
#[cfg(feature = "std")]
extern crate std;

instead of

#![cfg_attr(not(feature = "std"), no_std)]

view this post on Zulip Wasmtime GitHub notifications bot (Apr 12 2024 at 05:46):

bjorn3 commented on issue #8341:

The former consistently imports libcore and uses the libcore prelude, thus reducing the need for #[cfg(feature = "std")] elsewhere. Ot also makes rust-analyzer prefer importing from libcore over libstd.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 12 2024 at 18:12):

alexcrichton commented on issue #8341:

Carring over some thoughts from this comment I've looked more closely now at the hash map and seeding situation. As background the Rust standard library uses a DoS-resistant hashing algorithm which randomizes the seed for hashes for each thread. GIven wasmparser's ubiquitous use of hash maps with user-supplied inputs (wasms) it seems prudent to retain this protection where possible.

What I would propose as a solution for now is:

The consequence of this is that when std is enabled we should have the exact same implementation/guarantees as today. This should cover "big" use cases such as FaaS which take lots of wasms. Without std the ahash::RandomState type uses the address of a constant as a seed input as well as a stack-based address. This relies on ASLR for security and isn't as strong as libstd's seeding. That being said there is set_random_state which can be used to provide a higher quality seed. I think this is probably less efficient than the creation of hash maps in the standard library but that shouldn't affect Wasmtime too much.

There's two drawbacks to this. First is that the HashMap types in the public API of wasmtime and wasmparser will not be the same as hashbrown::HashMap (or IndexMap) because of the use of a custom hasher. That's more common for wasmparser than wasmtime but it's hopefully not too onerous (type aliases will be provided in wasmparser). Second is that in no-std mode the hashing DoS guarantees will not be as strong and the performance of creating hash maps will not get as much scrutiny. With ahash-based seeding though that should help improve the DoS posture if necessary and performance can always be improved over time as necessary too.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:47):

Robbepop commented on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good (*) and fast without all this hassle?
I can say with confidence that for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no way to make use of set_random_state.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:48):

Robbepop edited a comment on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good and fast (*) without all this hassle?
I can say with confidence that for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no way to make use of set_random_state.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:48):

Robbepop edited a comment on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good and fast (*) without all this hassle - at least where easily possible.
I can say with confidence that for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no way to make use of set_random_state.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:48):

Robbepop edited a comment on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good and fast (*) without all this hassle - at least where easily possible.
Unfortunately for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no way to make use of set_random_state.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:49):

Robbepop edited a comment on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good and fast (*) without all this hassle - at least where easily possible.
Unfortunately for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no reliable way to make use of set_random_state.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 08:52):

Robbepop edited a comment on issue #8341:

I appreciate the overall direction.

I am the maintainer of the Wasmi crate and am using BTreeMap instead of HashMap pervasively there. This is also true for its dependencies such as wasmparser-nostd as well as indexmap-nostd. So far I have seen no significant impacts on performance.
Therefore I wonder, why bother with all this initialization proceudure for HashMap in Wasmtime and its dependencies if there is another data structure just as good and fast (*) without all this hassle - at least where easily possible.
Unfortunately for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds, no ASLR and no reliable way to make use of set_random_state and the criticality of being resilient to malicious actors.

(*) I am aware that generally a well-tuned HashMap might outperform BTreeMap but this heavily depends on the usage.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 13:17):

bjorn3 commented on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 14:10):

bjorn3 commented on issue #8341:

Unfortunately for some Wasmi users using HashMap is simply not a possibility even with the ahash solution due to deterministic builds

Do you mean that the output of compilation and the execution of the resulting compiled wasm module needs to be deterministic? That should already be the case even with our current HashMap usage. Or do you mean that the execution environment in which wasmtime would run must be deterministic itself too?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 14:54):

Robbepop commented on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

Are you referring to this forum post from 2 days ago?
https://users.rust-lang.org/t/mystery-why-would-std-btreemap-be-slower-inserting-in-ascending-order-than-reverse/109720/4
I'd hope that the BTreeMap insertions still reside in O(log(n)) complexity even in the worst-case where it is significantly slower whereas HashMap could be exploited to have O(n) instead of O(1) insertion complexity. But maybe I am underinformed about BTreeMap implementations.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 14:56):

Robbepop edited a comment on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

Are you referring to this forum post from 2 days ago?
https://users.rust-lang.org/t/mystery-why-would-std-btreemap-be-slower-inserting-in-ascending-order-than-reverse/109720/4
I'd hope that the BTreeMap insertions still reside in O(log(n)) complexity even in the worst-case where it is significantly slower whereas HashMap could be exploited to have O(n) instead of O(1) insertion complexity. But maybe I am underinformed about BTreeMap implementations.

Or do you mean that the execution environment in which wasmtime would run must be deterministic itself too?

That's what I meant but just for Wasmi and not Wasmtime. Wasmi is a wasmparser (fork) user. Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 15:00):

Robbepop edited a comment on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

Are you referring to this forum post from 2 days ago?
https://users.rust-lang.org/t/mystery-why-would-std-btreemap-be-slower-inserting-in-ascending-order-than-reverse/109720/4
I'd hope that the BTreeMap insertions still reside in O(log(n)) complexity even in the worst-case where it is significantly slower whereas HashMap could be exploited to have O(n) instead of O(1) insertion complexity. But maybe I am underinformed about BTreeMap implementations.

Or do you mean that the execution environment in which wasmtime would run must be deterministic itself too?

That's what I meant but just for Wasmi and not Wasmtime. Wasmi is a wasmparser (fork) user. Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime. There is no random source for set_random_state and I am not aware of any way to get the address of a global/constant in Wasm, too. Maybe via shadow stack global it is possible to get the address of an item on the stack? But it doesn't sound very solid to me.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 15:00):

Robbepop edited a comment on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

Are you referring to this forum post from 2 days ago?
https://users.rust-lang.org/t/mystery-why-would-std-btreemap-be-slower-inserting-in-ascending-order-than-reverse/109720/4
I'd hope that the BTreeMap insertions still reside in O(log(n)) complexity even in the worst-case where it is significantly slower whereas HashMap could be exploited to have O(n) instead of O(1) insertion complexity. But maybe I am underinformed about BTreeMap implementations.

Or do you mean that the execution environment in which wasmtime would run must be deterministic itself too?

That's what I meant but just for Wasmi and not Wasmtime. Wasmi is a wasmparser (fork) user. Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime. There is no random source for set_random_state and I am not aware of any way to get the address of a global/constant in Wasm, too. Maybe via shadow stack global it is possible to get the address of an item on the stack? But it doesn't sound very solid to me.

To clarify: my remakrs are mostly about the wasmparser crate.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 15:01):

Robbepop edited a comment on issue #8341:

BTreeMap has a non-trivial impact on compile time performance and it too can have worse than average performance under adversarial circumstances (inssrting elements in ascending order is supposedly slower than inserting in descending order according to a recent topic on users.rust-lang.org.)

Are you referring to this forum post from 2 days ago?
https://users.rust-lang.org/t/mystery-why-would-std-btreemap-be-slower-inserting-in-ascending-order-than-reverse/109720/4
I'd hope that the BTreeMap insertions still reside in O(log(n)) complexity even in the worst-case where it is significantly slower whereas HashMap could be exploited to have O(n) instead of O(1) insertion complexity. But maybe I am underinformed about BTreeMap implementations.

Or do you mean that the execution environment in which wasmtime would run must be deterministic itself too?

That's what I meant but just for Wasmi and not Wasmtime. Wasmi is a wasmparser (fork) user. Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime. There is no random source for set_random_state and I am not aware of any way to get the address of a global/constant in Wasm, too. Maybe via shadow stack global it is possible to get the address of an item on the stack? But it doesn't sound very solid to me.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 16:40):

bjorn3 commented on issue #8341:

Are you referring to this forum post from 2 days ago?

Yes

Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime.

I see. That only seems to be relevant to the wasmparser changes, and not the wasmtime changes as Wasmtime can't run in wasm anyway. It can compile wasm modules to machine code (wasmtime compile), but not run the resulting code.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 13 2024 at 17:15):

Robbepop commented on issue #8341:

Are you referring to this forum post from 2 days ago?

Yes

Some users run Wasmi within Wasm (wasm32-unknown-unknown) executed via Wasmtime.

I see. That only seems to be relevant to the wasmparser changes, and not the wasmtime changes as Wasmtime can't run in wasm anyway. It can compile wasm modules to machine code (wasmtime compile), but not run the resulting code.

Yes that is correct. I should have said that.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 15 2024 at 15:28):

alexcrichton commented on issue #8341:

I don't really have benchmarks one way or another, but at the same time I would be hesitant to replace all usage of all hash maps in Wasmtime with BTree maps because randomness/seeding may be hard to come by in some niche situations. Hash maps feel tailor made for most of what Wasmtime needs in various purposes, so if they're problematic I would imagine that a refactoring to remove the need for a map completely would be necessary. In that sense I would prefer to not use no_std support as a reason to change how data structures are done in Wasmtime, but instead consider it as motivation to refactor and rewrite preexisting algorithms to avoid types that are less suitable in a no_std context.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 15:17):

alexcrichton commented on issue #8341:

I was talking with @tschneidereit yesterday about this issue and one point he brought up is that it would be reasonable to have a compile-time Cargo feature on the wasmparser crate for whether to use b-tree maps or hash maps for your use case @Robbepop perhaps. That would probably be off-by-default but opting-in to it I think would be reasonable to do. Doing so would mostly look like "enhance the wrapper map types to be their own structure to abstract internally on hash-map-or-not" which seems reasonable to maintain in wasmparser.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 15:34):

Robbepop commented on issue #8341:

I was talking with @tschneidereit yesterday about this issue and one point he brought up is that it would be reasonable to have a compile-time Cargo feature on the wasmparser crate for whether to use b-tree maps or hash maps for your use case @Robbepop perhaps. That would probably be off-by-default but opting-in to it I think would be reasonable to do. Doing so would mostly look like "enhance the wrapper map types to be their own structure to abstract internally on hash-map-or-not" which seems reasonable to maintain in wasmparser.

That would be amazing for me! :heart:

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 15:36):

Robbepop edited a comment on issue #8341:

I was talking with @tschneidereit yesterday about this issue and one point he brought up is that it would be reasonable to have a compile-time Cargo feature on the wasmparser crate for whether to use b-tree maps or hash maps for your use case @Robbepop perhaps. That would probably be off-by-default but opting-in to it I think would be reasonable to do. Doing so would mostly look like "enhance the wrapper map types to be their own structure to abstract internally on hash-map-or-not" which seems reasonable to maintain in wasmparser.

That would be amazing for me! :heart:
Though what is the intended solution for indexmap which uses hashmaps internally? So far I have used my own no_std reimplementation of it: https://crates.io/crates/indexmap-nostd

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 15:36):

Robbepop edited a comment on issue #8341:

I was talking with @tschneidereit yesterday about this issue and one point he brought up is that it would be reasonable to have a compile-time Cargo feature on the wasmparser crate for whether to use b-tree maps or hash maps for your use case @Robbepop perhaps. That would probably be off-by-default but opting-in to it I think would be reasonable to do. Doing so would mostly look like "enhance the wrapper map types to be their own structure to abstract internally on hash-map-or-not" which seems reasonable to maintain in wasmparser.

That would be amazing for me! :heart:
Though what is the intended solution for indexmap which uses hashmaps internally? So far I have used my own no_std reimplementation of it in my wasmparser-nostd fork: https://crates.io/crates/indexmap-nostd

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 15:37):

Robbepop edited a comment on issue #8341:

I was talking with @tschneidereit yesterday about this issue and one point he brought up is that it would be reasonable to have a compile-time Cargo feature on the wasmparser crate for whether to use b-tree maps or hash maps for your use case @Robbepop perhaps. That would probably be off-by-default but opting-in to it I think would be reasonable to do. Doing so would mostly look like "enhance the wrapper map types to be their own structure to abstract internally on hash-map-or-not" which seems reasonable to maintain in wasmparser.

That would be amazing for me! :heart:

Though what is the intended solution for indexmap which uses hashmaps internally? So far I have used my own no_std reimplementation of it in my wasmparser-nostd fork: https://crates.io/crates/indexmap-nostd

If we plan to actually use this indexmap-nostd fork then we'd have to give it a proper cleanup first.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 18:41):

alexcrichton commented on issue #8341:

My thinking was that the type aliases in wasmparser would become type definitions which internally #[cfg] for hash maps or tree-based maps. For the tree-based-indexmap-version I was thinking we'd implement it directly in the wasmparser crate itself as opposed to using a fork of indexmap.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 18:42):

alexcrichton edited a comment on issue #8341:

My thinking was that the type aliases in wasmparser would become type definitions which internally #[cfg] for hash maps or tree-based maps. For the tree-based-indexmap-version I was thinking we'd implement it directly in the wasmparser crate itself as opposed to using a fork of indexmap (under the assumption it's not too too big)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 17 2024 at 19:40):

Robbepop commented on issue #8341:

My thinking was that the type aliases in wasmparser would become type definitions which internally #[cfg] for hash maps or tree-based maps. For the tree-based-indexmap-version I was thinking we'd implement it directly in the wasmparser crate itself as opposed to using a fork of indexmap (under the assumption it's not too too big)

Okay that could work. We could use the indexmap-nostd implementation as basis. Its implementation focuses on simplicity and from my own experiences with Wasmi I never saw perf problems because of it.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 06:26):

JonasKruckenberg commented on issue #8341:

As someone who has maintained their own no_std fork of cranelift for a number of months now I'm overjoyed to see you considering the usecase!
I have a couple features related to riscv on my fork that Id love to contribute back upstream

I'd also expect that cranelift itself would require the standard library. There's no inherent reason to do so, but that's how I wrote things originally.

If I can help out with this in any way I would love to do so. I see no_std cranelift as being more useful in embedded/low-level anyway

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 06:27):

JonasKruckenberg edited a comment on issue #8341:

As someone who has maintained their own no_std fork of cranelift for a number of months now I'm overjoyed to see you considering the usecase!
I have a couple features related to riscv on my fork that Id love to contribute back upstream

I'd also expect that cranelift itself would require the standard library. There's no inherent reason to do so, but that's how I wrote things originally.

If I can help out with this in any way I would love to do so. I see no_std cranelift as being more useful in embedded/low-level as in those environments one needs greater control over execution & memory (I do anyways)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 14:33):

alexcrichton commented on issue #8341:

@Robbepop sounds good! And also to make sure I clarify I don't plan on personally pursuing the BTreeMap approach of wasmparser, but I wanted to outline how I think it'd be reasonable to add so if you're up for making a PR once the no_std work starts landing I'd be happy to review.

@JonasKruckenberg oh nice! I think that PRs to add no_std to Cranelift after this work starts landing would indeed be welcome. I asked some folks at the Cranelift meeting yesterday and there were no objections and I suspect most won't object to this. One point brought up was that if Wasmtime is "doing the no_std thing" then it's pretty easy for Cranelift to do it as well as it'll just be an extension of what Wasmtime is already doing.


Also, to clarify, I'm not sure precisely how best to have a "go no go" decision on this issue. My rough plan is to bring it up at the next Wasmtime meeting (a week from today) and assuming there are no objections start landing work at that point. My plan in landing the work is to do it piecemeal crate-by-crate so reviewers have an easier time seeing the impact as things progress.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 16:06):

Robbepop commented on issue #8341:

@Robbepop sounds good! And also to make sure I clarify I don't plan on personally pursuing the BTreeMap approach of wasmparser, but I wanted to outline how I think it'd be reasonable to add so if you're up for making a PR once the no_std work starts landing I'd be happy to review.

@alexcrichton Yes, I'd be willing to file that PR once no_std support landed in the wasmparser crate. :)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 18 2024 at 16:07):

Robbepop edited a comment on issue #8341:

@Robbepop sounds good! And also to make sure I clarify I don't plan on personally pursuing the BTreeMap approach of wasmparser, but I wanted to outline how I think it'd be reasonable to add so if you're up for making a PR once the no_std work starts landing I'd be happy to review.

@alexcrichton Yes, I'd be willing to file that PR once no_std support landed in the wasmparser crate. :)

I am already looking forward to the day I can finally put the wasmparser-nostd fork to rest.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 08:37):

JonasKruckenberg commented on issue #8341:

For reference I pulled my no_std changes out of vendored folders and into a proper fork here: https://github.com/JonasKruckenberg/wasmtime/tree/no_std.
I've changed thiserror to onlyerror (which implements core::error::Error) which in turn requires error_in_core for no_std builds. I'm personally fine with that (most no_std applications probably use nightly anyway) but I can also see the opposite and would be open to just disable the error implementation on no_std platforms like I've done across cranelift-codegen already

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 08:56):

bjorn3 commented on issue #8341:

I'm a bit worried about the use of spin::Once. Spinlocks are terrible for throughput in case of contention as the spinning task/thread/... will starve the task/thread/... which would unlock the spinlock until the next preemption, which in case of a real time OS may never come if the thread waiting for the lock has a higher priority than the thread which locked it. I believe the critical-section crate is frequently used for locks in a way that allows the user to specify how locks should actually be implemented internally. Looks like it doesn't have a ready made Once implementation though. How hard would it be to make it possible to create the MachineEnv's in a const fn such that no runtime initialization is needed?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 09:25):

JonasKruckenberg commented on issue #8341:

Agreed, the addition of spin for such a small use feels pretty suboptimal. I'm not sure if contention is a concern here though (from my understanding of the code this is essentially homegrown lazy_static to cache the created MachineEnv) so you'd need multiple threads to hit the same "once state is uninitialised" codepath (which may happen a lot in typical cranelift operation idk)

If we come up with a better solution I'd be more than happy ripping out spin (on a dependency minimalist basis alone)

maybe on no_std targets we just don't cache the MachineEnv?
or change Callee::machine_env to take &mut self and cache the created MachineEnv not in s static but in its own state?

from looking at the regalloc2 docs it seems MachineEnv has a field that is Vec<[PReg]> so a const fn is out of the question I think :/

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 09:26):

JonasKruckenberg edited a comment on issue #8341:

Agreed, the addition of spin for such a small use feels pretty suboptimal. I'm not sure if contention is a concern here though (from my understanding of the code this is essentially homegrown lazy_static to cache the created MachineEnv) so you'd need multiple threads to hit the same "once state is uninitialised" codepath (which may happen a lot in typical cranelift operation idk)
but on a dependency minimalist basis alone I'd be more than happy ripping out spin if we come up with a better solution

maybe on no_std targets we just don't cache the MachineEnv?
or change Callee::machine_env to take &mut self and cache the created MachineEnv not in s static but in its own state?

from looking at the regalloc2 docs it seems MachineEnv has a field that is Vec<[PReg]> so a const fn is out of the question I think :/

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 09:26):

JonasKruckenberg edited a comment on issue #8341:

Agreed, the addition of spin for such a small use feels pretty suboptimal. I'm not sure if contention is a concern here though (from my understanding of the code this is essentially homegrown lazy_static to cache the created MachineEnv) so you'd need multiple threads to hit the same "once state is uninitialised" codepath (which may happen a lot in typical cranelift operation idk)
but on a dependency minimalist basis alone I'd be more than happy ripping out spin if we come up with a better solution

maybe on no_std targets we just don't cache the MachineEnv?
or change Callee::machine_env to take &mut self and cache the created MachineEnv not in a static but in its own state?

from looking at the regalloc2 docs it seems MachineEnv has a field that is Vec<[PReg]> so a const fn is out of the question I think :/

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 09:37):

bjorn3 commented on issue #8341:

from looking at the regalloc2 docs it seems MachineEnv has a field that is Vec<[PReg]> so a const fn is out of the question I think :/

That could theoretically be replaced by a Cow<'static, [PReg]> or even &'static [PReg] I think, which would allow const initialization by providing a const &'static [PReg] as value.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 26 2024 at 10:03):

JonasKruckenberg commented on issue #8341:

That could theoretically be replaced by a Cow<'static, [PReg]> or even &'static [PReg] I think, which would allow const initialization by providing a const &'static [PReg] as value.

Ah yeah, that might be worthwhile on its own actually. In the meantime I updated my branch with this https://github.com/JonasKruckenberg/wasmtime/commit/eab337c71a3955339105dc9c01aeae9b118c516f (and cherry picked into it's PR #8489)

view this post on Zulip Wasmtime GitHub notifications bot (May 06 2024 at 19:48):

alexcrichton closed issue #8341:

In this issue I'd like to propose officially adding support for Rust's #![no_std] mode and crate attribute to the wasmtime crate and runtime. Notably, I'm proposing that we refute current documentation about "why not no_std". This would additionally revert earlier work in https://github.com/bytecodealliance/wasmtime/pull/554 and https://github.com/bytecodealliance/wasmtime/pull/2024.

Before I go into more depth, this is not a new issue to Wasmtime. This has been discussed in a number of places such as https://github.com/bytecodealliance/wasmtime/issues/1158, https://github.com/bytecodealliance/wasmtime/issues/3451, https://github.com/bytecodealliance/wasmtime/issues/3495, https://github.com/bytecodealliance/wasmtime/issues/6424, https://github.com/bytecodealliance/wasmtime/issues/7700, and I'm sure I'm missing others as well. I'll also point out that I was personally one of the ones previously advocating for specifically not supporting no_std, and my opinion has changed in the intervening years.

Why support no_std?

As far as I know the benefits of no_std haven't really ever been in question. After all, who doesn't want a project to be able to run in as many places as possible? To me I would personally rephrase this question as why to use no_std to support platform as opposed to alternative strategies. I'd answer this with the fact that no_std is the most idiomatic and well-supported solution in the Rust ecosystem. Rust developers in the embedded space are already used to no_std and what it entails. Additionally there are community idioms/expectations around the no_std feature to follow which set precedent.

Why now?

For a number of years now Wasmtime has had a page in its documentation for "What about #[no_std]?", so I think a fair question is why would we revert this and reconsider this previous decision at this point in time. The embedded space has become more interested in WebAssembly over time and there are a fair number of users today. The general feedback is that Wasmtime is not suitable in these environments, and one primary reason is that it's difficult to get Wasmtime working in these environments. Work such as https://github.com/bytecodealliance/wasmtime/pull/7995 is to obscure for others to productively use as-is without being able to slot more idiomatically into the no_std ecosystem. More-or-less there's interest to use WebAssembly on platforms Wasmtime cannot easily target, and no_std support is intended to be a step towards making Wasmtime support on these platforms easier.

Was the previous rationale wrong?

In my opinion, the current documentation Wasmtime has for not supporting no_std is both right and wrong. Supporting no_std is not "free", it's something we'll have to maintain over time as a project and explicitly test in CI and consider for all new features. This cost wasn't necessarily justifiable earlier on in Wasmtime's development. Nowadays, though, Wasmtime has many more compile-time features and it's more clear what a no_std build of Wasmtime might look like. Supporting no_std is, however, idiomatic and the best way to support non-standard platforms in Rust. While it's theoretically possible to change how the standard library looks in upstream rust-lang/rust that's basically not happening any time soon.

What does no_std for Wasmtime mean?

With some of the more rationale-facing questions out of the way, I want to dive into some more details as to what no_std means for Wasmtime. If you read the title of this issue and say "wow finally it's about time Wasmtime had support" you're probably not going to be too interested in this section. This section is intended to outline what I learned from my experience of porting Wasmtime to no_std. I have two branches which ports the wasmparser crate to no_std as well as the runtime feature of the wasmtime crate:

Features of Wasmtime and no_std

Primarily the first part to mention is that I don't think we'll want to support every single feature of Wasmtime in the no_std configuration. Instead we'll want a few features explicitly listed as "this supports no_std" but everything else will implicitly enable a dependency on the std feature and the standard library. An example of this is that the runtime feature, the ability to execute WebAssembly modules, won't depend on std but the cache feature, which writes to a filesystem, would depend on the standard library. Notably, for the initial implementation, I'd also expect that cranelift itself would require the standard library. There's no inherent reason to do so, but that's how I wrote things originally.

Usage of the alloc crate

I also personally think it's worth clarifying that no_std for Wasmtime means core and alloc will be used. Notably this means that Wasmtime will still assume that all allocations succeed at all times and a failing allocation will cause a process abort and/or unrecoverable error. Wasmtime can't really feasibly be ported to an alloc-less build at this time, although if that's of interest to folks then it's something that might be worth investigating later on. I'm not sure how we'd support this in Wasmtime, though, so I'd prefer to defer such a question to a future issue. In the meantime the rest here will assume that no_std means that alloc can be used.

Technical details of what support looks like to users

To end users and consumers of Wasmtime what I'd propose is that the wasmtime crate will grow a compile-time Cargo feature called std. This feature is enabled by default and is additionally enabled by any other feature which depends on standard library support, such as cache. If all of Wasmtime's features are disabled, however, then a small set of features, such as runtime and component-model, can be enabled without enabling the std feature of Wasmtime. This will prevent use of the std crate on some platforms.

The wasmtime-runtime crate will, by default, use the standard library on platforms that are known to have the standard library. For example on Linux the wasmtime-runtime crate will continue to use std as necessary. If wasmtime-runtime is compiled for an unknown platform, however, then the custom platform support added in https://github.com/bytecodealliance/wasmtime/pull/7995 will be enabled by default. This means that Wasmtime will not work out-of-the-box since this header file will need to be implemented. This will be mentioned as part of Wasmtime's documentation and will be the escape hatch to enable embedders to implement custom logic to implement Wasmtime's runtime needs.

Put together, the general workflow for embedders using Wasmtime in no_std mode would look like:

My intent would be that we'd have at least one or two examples in the repository doing all this, like the current min-platform example.

Technical details of what support looks like to Wasmtime developers

To those who develop Wasmtime this issue is effectively asking more development cost to be taken on by the project. Despite having community norms and expectations the no_std development mode in Rust is not the standard development mode and may serve as a road bump to contributions and/or developers. Here I'd like to outline the specific issues I've found when porting Wasmtime to no_std:

  1. Imports now come from core and alloc, not std. This is mostly a muscle memory thing to get over, but it's a downside in that only some parts of the project will do this. For example in wasmtime-cache we'll probably still have use std::mem; whereas in wasmtime-runtime that would be spelled use core::mem;.
  2. The prelude is different for #![no_std] mode. Notably collections like Vec and String are not present. To work around this I've created a mod prelude { ... } in the wasmtime-environ crate which is reexported at all other crate roots. Most modules in crate then have use crate::prelude::*; instead of individually importing common types like String and Vec.
  3. Hash maps are different. The Rust standard library does not provide HashMap in the alloc
    [message truncated]

view this post on Zulip Wasmtime GitHub notifications bot (May 06 2024 at 19:48):

alexcrichton commented on issue #8341:

With https://github.com/bytecodealliance/wasmtime/pull/8555 I believe that all the necessary pieces are now in place, so I'm going to close this.


Last updated: Dec 23 2024 at 13:07 UTC