Stream: git-wasmtime

Topic: wasmtime / PR #10321 Add load_code_raw for platforms w/o ...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 16:46):

posborne opened PR #10321 from posborne:copyless-deserialize-no-std to bytecodealliance:main:

For targets without virtual memory support, the only mechanism for load code previously was to copy bytes from a provided slice into a separate, owned, allocated buffer containing a copy of the serialized module contents. This is expensive for constrained targets and prohibits embedded runtimes like doing things like running code directly out of memory-mapped devices such as NOR flash.

The tradeoff for directly using the externally owned memory is that the caller must ensure that the code memory will not be written to for the lifetime of the CodeMemory.

Issue #10245


This code has limited testing right now as the min-platform example is only able to run in-tree and CI when virtual memory is enabled via the custom feature in the embedding. A potentially interesting enhancement would be to support another form of the example which targets a "true" embedded target such as a cortex-m processor (cortex-m4/thumbv7-none-eabihf or similar -- such as would target an STM32 MCU). This could be setup to run on a developers machine and assert expected output in CI via QEMU.

CC @alexcrichton @ia0 @sunfishcode

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 16:46):

posborne requested fitzgen for a review on PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 16:46):

posborne requested wasmtime-core-reviewers for a review on PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 16:47):

posborne submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 16:47):

posborne created PR review comment:

Reusing NonNull/SendSyncPtr here and casting the const ptr to mut definitely feels hacky so feedback very welcome on a better approach if one would make sense for this.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 17:21):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 17:50):

fitzgen created PR review comment:

There is no difference between *mut T and *const T in Rust's semantics, so it isn't a big deal.

Soft preference for using .cast::<T>() and .cast_{const,mut}() methods for conversions, since they are a little more explicit and future proof, since they cast just one thing at a time, where as can change multiple things.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 17:50):

fitzgen submitted PR review:

Looks great! Just a couple nitpicks below.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 17:50):

fitzgen created PR review comment:

Maybe also mention that the pointer must be valid for size bytes and must point to the contents of a .cwasm file or result of {Module,Component}::serialize-style APIs.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 18:53):

posborne submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 18:53):

posborne created PR review comment:

I'll change the cast; the ptr semantics between *const and *mut matched my understanding as well (at least within this context), but it always throws me off a bit. Casting to mut and then mutating would potentially be a source of UB I believe.

The usage of NonNull in this way would seem to essentially fall under this note from the docs:

Notice that NonNull<T> has a From instance for &T. However, this does not change the fact that mutating through a (pointer derived from a) shared reference is undefined behavior unless the mutation happens inside an UnsafeCell<T>. The same goes for creating a mutable reference from a shared reference. When using this From instance without an UnsafeCell<T>, it is your responsibility to ensure that as_mut is never called, and as_ptr is never used for mutation.

Since we are prohibiting mutation (via as_mut()), the usage should be sound. Some of the UB details may be specific to data originally owned by rust, but that could be the case here as well, even though the API is taking a raw *const.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 20:02):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 20:04):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 20:30):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 20:30):

alexcrichton created PR review comment:

A small bikeshed on this, one thing we could do here is take NonNull<[u8]> which (a) bundles ptr/len into one, and (b) helps clarify the contract that it must be non-null

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 21:21):

posborne submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 21:21):

posborne created PR review comment:

I'm definitely not against this; I think the counterpoint is that, at least with the type signature, you lose the ability to document that the usage of the memory is going to be immutable (which I do believe would be an important invariant for the intended use case) and we end up pushing the *mut transmute to callers who have a reference to an immutable chunk of memory.

So, we gain the ability to model with types that we require a non-null pointer but we lose the ability to inform the caller that we're using the memory immutably. :shrug:

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 22:07):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 22:07):

fitzgen created PR review comment:

That's already true as soon as you're using raw pointers instead of safe references though, so we are already in that world here.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 23:18):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 23:19):

posborne submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 03 2025 at 23:19):

posborne created PR review comment:

Pushed a new commit making the change to NonNull<[u8]>.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 15:41):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 15:41):

alexcrichton created PR review comment:

Could this unconditionally use the new_externally_owned variant here? I think it'd be best to keep the semantics of this API independent of activated crate features since deserialize_raw is documented as not copying but here using from_slice_with_alignment I think would do a copy?

If that chang is made I think it's also fine to drop alignment here and basically document that as part of the unsafe contract.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 15:42):

alexcrichton submitted PR review:

Oh one last comment, for testing could something be added to tests/all/module.rs perhaps? It doesn't need to be super comprehensive but I think it'd be good to have at least a small test (maybe Pulley-only?) which loads a module, asserts the image is pointing within the original pointer, and maybe does some basic arithmetic in-wasm to ensure it works.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 17:22):

posborne submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 17:22):

posborne created PR review comment:

Could this unconditionally use the new_externally_owned variant here?

Yeah, should be doable. I'll do some testing to ensure we don't simply panic or crash but are able to nicely reject modules that will not be able to be execute without mmap but agree that it's probably best to provide this on all platforms in a consistent fashion.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 23:14):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 23:17):

posborne commented on PR #10321:

@alexcrichton Hopefully the updates address your latest comments, I did a little bit of extra in order to ensure that we bailed out with hopefully reasonable errors rather than panics for cases where virtual memory is required.

I wasn't able to find a clear way to assert in the test context that the memory offsets were identical as a lot of the APIs that actually make it through to the CodeMemory and mmap are either not public or they work on copy types. It's likely that I missed something that I can assert on but I did at least test that we can do some basic math.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 04 2025 at 23:59):

alexcrichton commented on PR #10321:

:+1: tests and changes look great to me, no need to test what I was thinking if it's difficult

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 00:28):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 00:29):

posborne commented on PR #10321:

Fixed the compile error, missed doing a final compile check on a target with virtual memory.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 00:30):

alexcrichton has enabled auto merge for PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 01:42):

posborne commented on PR #10321:

Ok, I missed that for platforms like i686 that the default engine config falls back on pulley so those wouldn't expect to fail -- I should be able to handle that just fine. I'll try to do some testing for those cases and fix that up tomorrow morning. Will probably just do a check around the assertion on the engine target triple containing "pulley".

With that small change I'll probably amend the original commit and squash down as the initial title about only adding this support for platforms without virtual memory support is no longer true as well.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 15:55):

posborne updated PR #10321.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 05 2025 at 15:59):

posborne edited PR #10321:

For targets without virtual memory support, the only mechanism for
load code previously was to copy bytes from a provided slice into
a separate, owned, allocated buffer containing a copy of the serialized
module contents. This is expensive for constrained targets and
prohibits embedded runtimes like doing things like running code
directly out of memory-mapped devices such as NOR flash.

The tradeoff for directly using the externally owned memory is that
the caller must ensure that the code memory will not be written
to for the lifetime of the CodeMemory.

Loading code from externally owned memory is supported for all
configurations but is expected to fail to deserialize on platforms
that support virtual memory and are attempting to load modules
or components that require that the memory be made executable
(native code rather than pulley).

https://github.com/bytecodealliance/wasmtime/issues/10245


Last updated: Apr 17 2025 at 01:31 UTC