Stream: rfc-notifications

Topic: rfcs / PR #43 Add support for defining builtin host funct...


view this post on Zulip RFC notifications bot (May 28 2025 at 23:21):

fitzgen opened PR #43 from fitzgen:compile-time-builtins to bytecodealliance:main:

Add support for defining builtin host functions at compile-time. Because these functions are early-bound at compile-time -- rather than late-bound at instantiation-time, like regular imports -- the Wasm's compilation can be specialized for these exact imports, enabling inlining without just-in-time compilation, for example. This is the rough equivalent of the [js-string-builtins proposal][js-string-builtins] but for Wasmtime's API and Wasmtime embedding environments rather than JavaScript's WebAssembly API and JavaScript execution environments.

[js-string-builtins]: https://github.com/WebAssembly/js-string-builtins/blob/main/proposals/js-string-builtins/Overview.md

view this post on Zulip RFC notifications bot (May 29 2025 at 03:18):

cfallin commented on PR #43:

Thanks for writing this up -- I think this will be a really powerful ability once we have it, and will be extremely important for certain kinds of applications!

When I was originally mulling over this design space for the zero-copy buffer use-case, I had been imagining something like a raw CLIF interface, but I agree that that's got a lot of downsides and is pretty much fully subsumed by the other options. I'm happy to see we're moving toward the "just define the logic in Wasm" idea (and also not the "special sublanguage" idea, though I liked that when we talked about it too) -- this cleans up a lot of duplication.

I think my input here comes in two major lines:

  1. I suspect that the ability to make slow-path calls is going to be essential to many of the real use-cases we have imagined for this feature (certainly for zero-copy buffers: they have the "append and maybe grow" behavior that Vec does, so a mutable API will need this, and even read-only accessors will have more complex paths for e.g. moving to the next rope segment that may or may not be practical to write inline). So I think this

    Should we allow self-hosted Wasm to import and call non-intrinsic functions?

    might not be punt-able as

    I think this is something we will want eventually, but I'd like to get something basic working first before tackling this more-complicated use case.

    if we want to use the thing in our embedding.

  2. I find myself going back and forth on whether the compile-time-builtin Wasm module should be "special" (with constraints as this RFC describes) or more general and which is actually more in the spirit of the standards -- I'll address this one first as I think it addresses the first question if we resolve it another way.

Preliminarily: yes 100% to

We must not deviate from the WebAssembly language semantics.

and also to the more subjective "let's not encourage people to use a nonstandard extension" (and to clarify to any readers: yes that concern is still in scope even if we are literally standards-compliant by presenting function imports, because a sufficiently tempting set of function imports can become a de-facto standard).

However I find myself wondering whether we shouldn't do something a little more general and provide

  1. The load/store imports this RFC defines, and
  2. The ability to fix a module import (of any general Wasm module) while compiling another Wasm module, and implement some (simple one-level?) inlining from that module, with none of the other restrictions this RFC names.

I'll call this the "privileged adapter module" approach.


On the "negative motivation" side first (against a special Wasm subset for "self-hosted Wasm"): defining restrictions on which subset of Wasm modules is acceptable for a compile-time-builtin module could be seen as restricting/subsetting the standard somewhat. I can absolutely see the logic for it (as the RFC says, discouraging general use, but also in particular: it's simpler if we don't have a VMContext for the special module that is separate from the module that is using it) but isn't it also defining a "Wasm-prime" in the other direction?

On the "positive side" now (for fully general Wasm for "self-hosted Wasm"): the spirit of virtualization and precedent elsewhere (e.g., WASI virtualization, and also places that we talk about adapter modules, such as the debug RFC) suggests to me at least that there isn't too much danger in defining a more privileged interface (here: "load/store host memory") and then allowing any general Wasm module to be an adapter module that provides a higher-level "safe" service on top of it. (In fact we know of folks doing this in production with components.) It is still standard Wasm -- it just has a particular API available to it, and one that the embedder must enable for a specific module. If someone wanted to implement this "peek/poke API" today as ordinary host functions, they could; we are saying that we recognize the need for it because we want to move more logic into Wasm for inlinability. Philosophically, I think the notion that we can never have privileged interfaces imported into a Wasm module (because someone might write Wasm that always requires the privileged interfaces and misuse the specialized environment as a general environment) sits less well with me: it says that Wasm is somehow not universal, and can't be used to implement some parts of the system.

Said another way: "root privilege" (arbitrary load/store) then virtualization is more or less directly aligned, I think, with how capability systems are supposed to work. The idea is that the danger is in plumbing the wrong capabilities to the wrong modules, and safety requires us to put the right access-filtering or -subsetting modules inline with powerful capabilities. But this is already true, and we already trust our embedders to "wire things up right" because one can write arbitrary hostcall implementations or grant the wrong pre-opens or whatnot. Right now in Wasmtime I think we haven't seen this situation much because we have host-native filtering of most of the privileges we grant (e.g. WASI APIs) but I think there's nothing fundamental about that.


And finally, if we restrict ourselves to (i) these load/store intrinsics, provided when configured to a privileged adapter module, and (ii) early binding of this adapter module in a way that enables inlining, then all of the open design questions are addressed, as far as I can tell:


Anyway -- all strong opinions, relatively weakly held -- happy to discuss further!

view this post on Zulip RFC notifications bot (Jun 05 2025 at 23:14):

abrown commented on PR #43:

I don't have a strong opinion on whether we need two-level inlining (intrinsics -> self-hosted -> component) or just a single level (intrinsics -> component) — @cfallin is already saying some of the things I was thinking. But I do want to point out how much inlining we're doing and make a plug for making that easier. In either case, we want to inline some CLIF instructions for each intrinsic, right? I'm not sure I caught where the intrinsics were to be specified (are they proposed additions to the component model?) but, in any case, I was imagining there would be some compiler code that converted a call to these special imports into some CLIF, like we currently do for CM built-ins and trampolines (?). And this is what I was hoping could become easier: I know you were kind of discarding the first idea, "exposing CLIF", but it seems helpful for this kind of problem: we tell the compiler "here are the CLIF instructions for calls to this import". I understood from your RFC the danger of misuse, so perhaps it should not be a public, embedder-accessible API, but just having an easy way to inline intrinsics could make it easier to pursue the self-hosted functions?

view this post on Zulip RFC notifications bot (Jun 12 2025 at 14:12):

saulecabrera submitted PR review.

view this post on Zulip RFC notifications bot (Jun 12 2025 at 14:12):

saulecabrera created PR review comment:

I believe the Winch topic was (briefly?) discussed during the Wasmtime 06-05 meeting. However, I wanted to mention that I agree with the no-inlining direction for Winch: even though I believe that some sort of inlining is feasible, especially for the types of functions outlined in this RFC, I'm also very confident that doing so has the potential to go against Winch's simplicity and compilation-performance design principles.

view this post on Zulip RFC notifications bot (Jun 13 2025 at 19:04):

fitzgen submitted PR review.

view this post on Zulip RFC notifications bot (Jun 13 2025 at 19:04):

fitzgen created PR review comment:

Makes total sense, thanks for weighing in.

view this post on Zulip RFC notifications bot (Jun 13 2025 at 19:17):

fitzgen commented on PR #43:

@abrown

I don't have a strong opinion on whether we need two-level inlining (intrinsics -> self-hosted -> component) or just a single level (intrinsics -> component) — @cfallin is already saying some of the things I was thinking. But I do want to point out how much inlining we're doing and make a plug for making that easier. In either case, we want to inline some CLIF instructions for each intrinsic, right? I'm not sure I caught where the intrinsics were to be specified (are they proposed additions to the component model?) but, in any case, I was imagining there would be some compiler code that converted a call to these special imports into some CLIF, like we currently do for CM built-ins and trampolines (?). And this is what I was hoping could become easier: I know you were kind of discarding the first idea, "exposing CLIF", but it seems helpful for this kind of problem: we tell the compiler "here are the CLIF instructions for calls to this import". I understood from your RFC the danger of misuse, so perhaps it should not be a public, embedder-accessible API, but just having an easy way to inline intrinsics could make it easier to pursue the self-hosted functions?

Yes, there will be two kinds of inlining:

  1. The compile-time builtins need access to intrinsics for reading/writing native memory, and these intrinsics need to be inlined to meet our performance goals. This will happen with a very ham-fisted approach during Wasm-to-CLIF translation where we immediately turn calls to these intrinsics into the relevant CLIF instructions.
  2. We need to inline the compile-time builtins into the Wasm application that imports them. Depending on the approach we take, and the constraints we put on the shape of compile-time builtins, this could be done with the same ham-fisted approach used for (1). My thinking recently, reflected in the last Wasmtime meeting's discussion but not yet in this RFC, has been to instead make a more general inliner-as-a-library kind of thing for Cranelift that still allows Cranelift embedders to drive overall compilation like they do today but provides hooks for them to do inlining and use their own inlining heuristics. This would allow us to also do things like inlining calls across components and their fused adapters.
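
As a rough illustration of kind (1), the "turn the call into an instruction right away" step might look like the sketch below. Everything here is an assumption made for illustration: `Inst` is a toy stand-in for CLIF rather than Cranelift's actual API, and the intrinsic names are hypothetical, not the ones the RFC defines.

```rust
/// Toy instruction set standing in for CLIF; this is not Cranelift's real API.
#[derive(Debug, PartialEq)]
enum Inst {
    /// Load a 64-bit value from a native address.
    Load64,
    /// Store a 64-bit value to a native address.
    Store64,
    /// An ordinary call to an imported function.
    Call(String),
}

/// Translate a call to an import. Calls to known intrinsics become a
/// single instruction in place; every other import stays a call.
fn translate_import_call(name: &str) -> Inst {
    match name {
        // Hypothetical intrinsic names, chosen for illustration only.
        "native-load-u64" => Inst::Load64,
        "native-store-u64" => Inst::Store64,
        other => Inst::Call(other.to_string()),
    }
}
```

Kind (2) is a different problem: it needs a real inliner that copies a callee's body into its caller, which a name-based match like this cannot do.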

view this post on Zulip RFC notifications bot (Jun 13 2025 at 19:39):

saulecabrera edited PR review comment.

view this post on Zulip RFC notifications bot (Jul 29 2025 at 19:14):

fitzgen updated PR #43.

view this post on Zulip RFC notifications bot (Jul 29 2025 at 19:23):

fitzgen commented on PR #43:

Heads up: I've updated this RFC with a more-concrete proposal after the discussion in this issue and at various Cranelift and Wasmtime meetings.

I've also implemented general function inlining for Wasmtime in https://github.com/bytecodealliance/wasmtime/pull/11283. Right now, that is useful for inlining calls between components (including their generated adapters) but with compile-time builtins can also be reused to inline the definitions of compile-time builtins into their callers. And we shouldn't really have to do anything special to get that inlining, it should just happen For Free when inlining is enabled. (I do expect we will need to tweak the inlining heuristics as time goes on, but that is a separate discussion.)

I think there is just one last open question we need to resolve before moving forward: exactly how to implement the resource.address intrinsic (or something similar / equally powerful).

Please take a look at the updated RFC and let me know what you think and any ideas you have for that last open question!

view this post on Zulip RFC notifications bot (Jul 29 2025 at 20:45):

programmerjake submitted PR review.

view this post on Zulip RFC notifications bot (Jul 29 2025 at 20:45):

programmerjake created PR review comment:

you could use LLVM intrinsics as inspiration and have the import's name tell you which type and resources to use, e.g. have resource.address.t1 for a resource in table 1 or something.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 17:26):

fitzgen commented on PR #43:

Okay so I actually had flawed assumptions with the way that resources work in the component model, and this nicely nullifies that last open question.

The resource-definer gets to use an arbitrary u32 as their internal representation of a resource. Resource tables are only involved for other components, and their handles to that resource. The resource-definer always gets access to their u32 representation directly.

So given all that, the u32 resource representation can be an index into some embedder-defined table in the T in a Store<T> and compile-time builtins can inline accesses to those embedder-defined tables if we give them an intrinsic like store.data_address that returns a *mut T (and the embedder's T is repr(C)).
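
A minimal sketch of that idea, written as plain Rust rather than Wasm so that it can run on its own. The `HostState` type, its fields, and the `native_load_u64` helper are all hypothetical; `native_load_u64` only stands in for a native-memory load intrinsic, and `state_addr` plays the role of the address an intrinsic like `store.data_address` would return.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical embedder state (the `T` in `Store<T>`). `repr(C)` keeps
/// the fields in declaration order with C layout rules.
#[repr(C)]
struct HostState {
    /// Number of live entries in `lens`.
    count: u64,
    /// Embedder-defined table, indexed by the u32 resource representation.
    lens: [u64; 4],
}

/// Stand-in for a native-load intrinsic: read a u64 from a raw host address.
unsafe fn native_load_u64(addr: u64) -> u64 {
    unsafe { (addr as usize as *const u64).read_unaligned() }
}

/// What a compile-time builtin could do with the state address: bounds-check
/// the resource representation, then load its table entry.
///
/// Safety: `state_addr` must be the address of a live `HostState`.
unsafe fn buffer_len(state_addr: u64, rep: u32) -> Option<u64> {
    let count_addr = state_addr + offset_of!(HostState, count) as u64;
    let count = unsafe { native_load_u64(count_addr) };
    if u64::from(rep) >= count {
        return None;
    }
    let entry_addr = state_addr
        + offset_of!(HostState, lens) as u64
        + u64::from(rep) * size_of::<u64>() as u64;
    Some(unsafe { native_load_u64(entry_addr) })
}
```

The bounds check matters here: the builtin is the only thing standing between a guest-supplied `u32` and arbitrary host memory.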

Will update the RFC shortly.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 18:22):

fitzgen updated PR #43.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 18:24):

fitzgen commented on PR #43:

Will update the RFC shortly.

RFC updated accordingly.

I think that resolves all open questions for this RFC. I will give it a little bit of time for some more discussion before officially starting the motion to merge, just to give people an extra chance to read the updated RFC first and provide feedback.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 20:10):

alexcrichton submitted PR review.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 22:59):

programmerjake submitted PR review.

view this post on Zulip RFC notifications bot (Jul 30 2025 at 22:59):

programmerjake created PR review comment:

you also have to watch out for primitive type alignment changing offsets, e.g. in #[repr(C)] struct S(u8, f64) the f64 field is at offset 4 on i686-unknown-linux-gnu but at offset 8 on x86_64-unknown-linux-gnu
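
One way to observe this, and to guard against it, is to compute the offset at compile time on each target and compare it with the offset the builtins were written against. This is only a sketch: the struct mirrors the `S(u8, f64)` example but uses named fields, and `check_layout` is a hypothetical helper, not part of any proposed API.

```rust
use std::mem::offset_of;

/// Same layout as the `S(u8, f64)` example, with named fields for readability.
#[repr(C)]
struct S {
    tag: u8,
    value: f64,
}

/// Offset of `value` on whichever target this code is compiled for.
fn value_offset() -> usize {
    offset_of!(S, value)
}

/// Reject a layout that differs from the offset the builtins assume.
fn check_layout(expected_value_offset: usize) -> Result<(), String> {
    let actual = value_offset();
    if actual == expected_value_offset {
        Ok(())
    } else {
        Err(format!(
            "`value` is at offset {actual}, but the builtins assume {expected_value_offset}"
        ))
    }
}
```

Because `tag` is a single byte, `value` lands at the first offset that satisfies the alignment of `f64`, so the result tracks `align_of::<f64>()` for the target.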

view this post on Zulip RFC notifications bot (Aug 05 2025 at 20:09):

fitzgen submitted PR review.

view this post on Zulip RFC notifications bot (Aug 05 2025 at 20:09):

fitzgen created PR review comment:

That's true and I expect we will need to add some wording to our documentation for this feature around what exactly the safety conditions will need to be, such that the compile-time builtins can portably access the host data. My plan right now is to document these things in detail as I prototype and dog-food the feature.

view this post on Zulip RFC notifications bot (Aug 05 2025 at 20:10):

fitzgen commented on PR #43:

Okay, I think this is ready and we've had a bit of time for folks to look it over, so let's officially start the process.

Motion to Finalize

Disposition: Merge


As always, details on the RFC process can be found here: https://github.com/bytecodealliance/rfcs/blob/main/accepted/rfc-process.md#making-a-decision-merge-or-close

view this post on Zulip RFC notifications bot (Aug 06 2025 at 00:06):

alexcrichton commented on PR #43:

I second! (don't think I can re-approve)

view this post on Zulip RFC notifications bot (Aug 06 2025 at 18:59):

fitzgen commented on PR #43:

As there has been sign-off from representatives of two different BA stakeholder organizations, this RFC is now entering its 10-day Final Comment Period, and the last day to raise concerns before this RFC merges is 2025-08-16.

Thanks everyone!

view this post on Zulip RFC notifications bot (Aug 12 2025 at 16:58):

cfallin submitted PR review:

LGTM!

view this post on Zulip RFC notifications bot (Aug 12 2025 at 16:58):

cfallin created PR review comment:

This seems to be in conflict with the later section describing intrinsics as always taking u64 addresses (and ignoring the upper 32 bits on 32-bit targets). I like the latter a lot more and it seems nice to provide this portability boon, so perhaps this is just outdated text?
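
For concreteness, "always take a u64 and ignore the upper 32 bits on 32-bit targets" could be modeled as below. This is an illustrative sketch only; `narrow_address` and its `pointer_bits` parameter are hypothetical and do not come from the RFC text.

```rust
/// Narrow a 64-bit intrinsic address to a target's pointer width by
/// discarding the upper bits. Only 32-bit and 64-bit targets are modeled.
fn narrow_address(addr: u64, pointer_bits: u32) -> u64 {
    match pointer_bits {
        // 32-bit target: keep the low 32 bits, ignore the rest.
        32 => addr & 0xFFFF_FFFF,
        // 64-bit target: the address is used unchanged.
        64 => addr,
        other => panic!("unsupported pointer width: {other}"),
    }
}
```

With a single u64 signature, the same builtin Wasm module can be used on either kind of target without changing its import types.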

view this post on Zulip RFC notifications bot (Aug 12 2025 at 17:24):

fitzgen submitted PR review.

view this post on Zulip RFC notifications bot (Aug 12 2025 at 17:24):

fitzgen created PR review comment:

Ah yes, sorry this is outdated text, will update momentarily.

view this post on Zulip RFC notifications bot (Aug 12 2025 at 21:19):

fitzgen updated PR #43.

view this post on Zulip RFC notifications bot (Aug 18 2025 at 16:01):

fitzgen commented on PR #43:

The final comment period has passed without any concerns being raised, so I will go ahead and merge this. Thanks again, everyone!

view this post on Zulip RFC notifications bot (Aug 18 2025 at 16:01):

fitzgen merged PR #43.


Last updated: Dec 06 2025 at 06:05 UTC