alexcrichton opened PR #3350 from `expose-raw` to `main`:
This commit is what is hopefully going to be my last installment within
the saga of optimizing function calls in/out of WebAssembly modules in
the C API. This is yet another alternative approach to #3345 (sorry) but
also contains everything necessary to make the C API fast. As in #3345
the general idea is just moving checks out of the call path in the same
style of `TypedFunc`.

This new strategy takes inspiration from previous attempts and
effectively "just" exposes how we previously passed `*mut u128` through
trampolines for arguments/results. This storage format is formalized
through a new `ValRaw` union that is exposed from the `wasmtime` crate.
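To make the storage format a bit more concrete, here is a minimal sketch of passing arguments and results through a buffer of fixed-width union slots. The names `RawVal`, `add_raw`, and `run_add` are hypothetical and exist only for illustration; this is not the actual `ValRaw` definition from the `wasmtime` crate, just the general shape of the idea.

```rust
// Hypothetical stand-in for a raw value slot; NOT the real `wasmtime::ValRaw`.
// Every slot is as wide as the largest variant (a 128-bit value), mirroring
// the `*mut u128` buffers mentioned above.
#[derive(Clone, Copy)]
#[repr(C)]
union RawVal {
    i32: i32,
    i64: i64,
    f32: u32, // floats are kept as raw bits in this sketch
    f64: u64,
    v128: u128,
}

/// Hypothetical trampoline-style callee: reads two `i32` arguments from the
/// buffer and writes an `i64` result back into slot 0.
///
/// Safety: `slots` must point to at least two slots initialized as `i32`.
unsafe fn add_raw(slots: *mut RawVal) {
    unsafe {
        let a = (*slots).i32;
        let b = (*slots.add(1)).i32;
        // Results reuse the same storage as the arguments.
        (*slots).i64 = i64::from(a) + i64::from(b);
    }
}

/// Safe wrapper that builds the buffer, calls through it, and reads slot 0.
fn run_add(a: i32, b: i32) -> i64 {
    let mut slots = [RawVal { i32: a }, RawVal { i32: b }];
    // The buffer holds exactly the two `i32` slots that `add_raw` expects.
    unsafe {
        add_raw(slots.as_mut_ptr());
        slots[0].i64
    }
}
```

Nothing in the buffer records which union field is live, so the caller and callee have to agree on the signature out of band. That is the sense in which such an interface is unchecked.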
By doing this it made it relatively easy to expose two new APIs:

- `Func::new_unchecked`
- `Func::call_unchecked`

These are the same as their checked equivalents except that they're
`unsafe` and they work with `*mut ValRaw` rather than safe slices of
`Val`. Working with these eschews type checks and such and requires
callers/embedders to do the right thing.

These two new functions are then exposed via the C API with new
functions, enabling C to have a fast-path of calling/defining functions.
This fast path is akin to `Func::wrap` in Rust, although that API can't
be built in C due to C not having generics in the same way that Rust
has.

The benchmarks here are:

- `nop` - Call a wasm function from the host that does nothing and
  returns nothing.
- `i64` - Call a wasm function from the host, the wasm function calls a
  host function, and the host function returns an `i64` all the way out to
  the original caller.
- `many` - Call a wasm function from the host, the wasm calls a
  host function with 5 `i32` parameters, and then an `i64` result is
  returned back to the original host.
- `i64` host - just the overhead of the wasm calling the host, so the
  wasm calls the host function in a loop.
- `many` host - same as `i64` host, but calling the `many` host function.

All numbers in this table are in nanoseconds, and this is just one
measurement as well so there's bound to be some variation in the precise
numbers here.

| Name      | Rust | C (before) | C (after) |
|-----------|------|------------|-----------|
| nop       | 19   | 112        | 25        |
| i64       | 22   | 207        | 32        |
| many      | 27   | 189        | 34        |
| i64 host  | 2    | 38         | 5         |
| many host | 7    | 75         | 8         |

The main conclusion here is that the C API is significantly faster than
before when using the `*_unchecked` variants of APIs. The Rust
implementation is still the ceiling (or floor I guess?) for performance.
The main reason that C is slower than Rust is that a little bit more has
to travel through memory where on the Rust side of things we can
monomorphize and inline a bit more to get rid of that. Overall though
the costs are way way down from where they were originally and I don't
plan on doing a whole lot more myself at this time. There's various
things we theoretically could do that I've considered but implementation-wise
I think they'll be much more weighty.
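
The difference between the checked and `*_unchecked` call paths described in this PR can be sketched roughly as follows. All names here (`HostFunc`, `Val`, `Ty`, `RawVal`, `sum_two`, `sum_two_func`) are hypothetical and simplified for illustration. This is not how `Func::call` or `Func::call_unchecked` are implemented in `wasmtime`, only an assumption-laden model of where the per-call checks go.

```rust
// Hypothetical, simplified types for illustration only; not wasmtime's API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Val {
    I32(i32),
    I64(i64),
}

#[derive(Clone, Copy, PartialEq)]
enum Ty {
    I32,
    I64,
}

#[derive(Clone, Copy)]
#[repr(C)]
union RawVal {
    i32: i32,
    i64: i64,
}

struct HostFunc {
    params: Vec<Ty>,
    result: Ty,
    raw: unsafe fn(*mut RawVal),
}

impl HostFunc {
    /// Checked path: validates arity and types on every call, lowers the
    /// arguments into raw slots, then lifts the result back into a `Val`.
    fn call(&self, args: &[Val]) -> Result<Val, String> {
        if args.len() != self.params.len() {
            return Err("wrong number of arguments".to_string());
        }
        // At least one slot is needed to hold the result.
        let mut slots = vec![RawVal { i64: 0 }; args.len().max(1)];
        for (i, (arg, ty)) in args.iter().zip(&self.params).enumerate() {
            slots[i] = match (arg, ty) {
                (Val::I32(x), Ty::I32) => RawVal { i32: *x },
                (Val::I64(x), Ty::I64) => RawVal { i64: *x },
                _ => return Err("argument type mismatch".to_string()),
            };
        }
        unsafe {
            (self.raw)(slots.as_mut_ptr());
            Ok(match self.result {
                Ty::I32 => Val::I32(slots[0].i32),
                Ty::I64 => Val::I64(slots[0].i64),
            })
        }
    }

    /// Unchecked path: no validation and no conversion.
    ///
    /// Safety: `slots` must match this function's signature exactly.
    unsafe fn call_unchecked(&self, slots: *mut RawVal) {
        unsafe { (self.raw)(slots) }
    }
}

/// Raw callee: two `i32` arguments in, one `i64` result out in slot 0.
unsafe fn sum_two(slots: *mut RawVal) {
    unsafe {
        let a = (*slots).i32;
        let b = (*slots.add(1)).i32;
        (*slots).i64 = i64::from(a) + i64::from(b);
    }
}

fn sum_two_func() -> HostFunc {
    HostFunc { params: vec![Ty::I32, Ty::I32], result: Ty::I64, raw: sum_two }
}
```

In this model, calling `sum_two_func().call(&[Val::I32(40), Val::I32(2)])` pays for the arity check, the type match, and a temporary buffer on each call, while `call_unchecked` only forwards a pointer the caller already prepared.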
alexcrichton updated PR #3350 from `expose-raw` to `main`.
alexcrichton updated PR #3350 from `expose-raw` to `main`.
alexcrichton requested peterhuene for a review on PR #3350.
alexcrichton updated PR #3350 from `expose-raw` to `main`.
peterhuene submitted PR review.
peterhuene submitted PR review.
peterhuene created PR review comment:
Do we need a comment regarding the ownership of the returned value? Caller is responsible for calling `wasmtime_externref_delete`, right?
peterhuene created PR review comment:
Should `ref` be `const` (seems like other functions in this file could use `const` for their parameters as well, e.g. `wasmtime_externref_clone`, `wasmtime_externref_data`, etc) so that there's no confusion regarding ownership?
peterhuene created PR review comment:
Is it more accurate to say that the returned raw value is not tracked by the garbage collector and the underlying `externref` _may_ be collected if a GC occurs, thereby leaving the raw value dangling?

From this wording it sounds like the dangling is guaranteed, but if, for example, the `wasmtime_externref_t` given to this function hasn't been deleted yet, it should still have a strong reference even after a GC occurs and the raw value would still be valid, correct?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Indeed!
alexcrichton updated PR #3350 from `expose-raw` to `main`.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Nah yeah that makes sense, I've pushed some tweaks to the wording.
peterhuene submitted PR review.
alexcrichton merged PR #3350.
Last updated: Jan 24 2025 at 00:11 UTC