alexcrichton requested abrown for a review on PR #9651.
alexcrichton opened PR #9651 from alexcrichton:backend-intrinsics to bytecodealliance:main:
This commit is an initial stab at implementing interpreter-to-host communication in Pulley. The basic problem is that Pulley needs the ability to call back into Wasmtime to implement tasks such as `memory.grow`, imported functions, etc. For native platforms this is a simple `call_indirect` operation in Cranelift but the story for Pulley must be different because it's effectively switching from interpreted code to native code.

The solution I've ended up settling on looks pretty similar to native platforms but with a few important tweaks:
- A new `call_indirect_host` opcode is added to Pulley.
- Function signatures that can be called from Pulley bytecode are statically enumerated at build-time.
  - This enables the implementation of `call_indirect_host` to take an immediate of which signature is being used and cast the function pointer to the right type.
- A new "backend intrinsic" concept is added to Cranelift.
  - This is a new variant of `ExternalName`.
  - The intention is that this has backend-specific meaning.
  - For Pulley, this means that the Nth function signature is being called.
- Code generation for Pulley in `wasmtime-cranelift` now has Pulley-specific handling of the wasm-to-host transition where all previous `call_indirect` instructions are replaced with a call to a "backend intrinsic" which gets lowered to a `call_indirect_host`.

Note that most of this still isn't hooked up everywhere in Wasmtime. That means that the testing here is pretty light at this time. It'll require a fair bit more work to get everything fully integrated from Wasmtime in Pulley. This is expected to be one of the significant remaining chunks of work and should help unblock future testing (or make those diffs smaller ideally).
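The opcode-plus-signature-immediate idea described above can be sketched roughly as follows. This is only an illustration under assumptions: the `HostSignature` enum, its two variants, and the handler function are hypothetical and are not taken from the PR; the real set of signatures and the real interpreter loop in Pulley will look different.

```rust
use std::mem;

/// Hypothetical, statically enumerated set of signatures callable from bytecode.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum HostSignature {
    /// extern "C" fn(i64) -> i64
    I64ToI64 = 0,
    /// extern "C" fn(i64, i64) -> i64
    I64I64ToI64 = 1,
}

impl HostSignature {
    /// Decode the signature immediate carried by the (hypothetical) opcode.
    fn from_immediate(imm: u8) -> Option<Self> {
        match imm {
            0 => Some(Self::I64ToI64),
            1 => Some(Self::I64I64ToI64),
            _ => None,
        }
    }
}

/// Sketch of an interpreter-side handler for a `call_indirect_host`-style opcode.
///
/// Safety: `func` must point to an `extern "C"` function whose real type
/// matches the signature selected by `sig_imm`.
unsafe fn call_indirect_host(sig_imm: u8, func: *const (), args: &[i64]) -> Option<i64> {
    match HostSignature::from_immediate(sig_imm)? {
        HostSignature::I64ToI64 => {
            // Cast the untyped pointer to the type named by the immediate.
            let f: extern "C" fn(i64) -> i64 = unsafe { mem::transmute(func) };
            Some(f(*args.first()?))
        }
        HostSignature::I64I64ToI64 => {
            let f: extern "C" fn(i64, i64) -> i64 = unsafe { mem::transmute(func) };
            Some(f(*args.first()?, *args.get(1)?))
        }
    }
}

// Two stand-in "host" functions, one per enumerated signature.
extern "C" fn host_double(x: i64) -> i64 {
    x * 2
}

extern "C" fn host_add(a: i64, b: i64) -> i64 {
    a + b
}

fn main() {
    let doubled = unsafe { call_indirect_host(0, host_double as *const (), &[21]) };
    let summed = unsafe { call_indirect_host(1, host_add as *const (), &[40, 2]) };
    println!("{doubled:?} {summed:?}");
}
```

Because the signature list is closed and known at build time, the interpreter never has to construct a call for an arbitrary type at runtime; it only selects one of a fixed number of casts.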
alexcrichton requested wasmtime-compiler-reviewers for a review on PR #9651.
alexcrichton requested pchickey for a review on PR #9651.
alexcrichton requested wasmtime-core-reviewers for a review on PR #9651.
alexcrichton requested wasmtime-default-reviewers for a review on PR #9651.
alexcrichton updated PR #9651.
alexcrichton commented on PR #9651:
cc @cfallin as you're no doubt interested in this as well
cfallin submitted PR review:
LGTM on the Cranelift bits -- happy to dive deeper into the Pulley and Wasmtime bits too if you'd like but I'm not as deep into that context at the moment.
cfallin created PR review comment:
tiny preference nit but could we call this `intrinsic{i}`? `backend{i}` makes me think it's the name of a backend or something like that
cfallin created PR review comment:
s/are use to/are used to/
alexcrichton updated PR #9651.
github-actions[bot] commented on PR #9651:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:module", "pulley"

Thus the following users have been cc'd because of the following labels:
- fitzgen: pulley
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
bjorn3 commented on PR #9651:
Instead of adding a whole new instruction, would it be possible to use a `UserExternalName` with a reserved namespace like `u32::MAX`? Or the Pulley interpreter could have a set of `UserExternalName`s for which to call the specified host function, without reserving any namespace.
alexcrichton commented on PR #9651:
I was hoping originally to do that yeah but one part I couldn't figure out was how to translate a `UserExternalNameRef` to a `UserExternalName` in the backend. I don't think that the tables are easily accessible, but I could very well have missed something too
bjorn3 commented on PR #9651:
Each `Function` consists of a `FunctionStencil`, which holds things like all the instructions, and tables in `FunctionParameters` that map e.g. `UserExternalNameRef` to `UserExternalName`. The backend is only supposed to look at the stencil, so that the stencil can be used as the cache key for a compilation cache. The user of the generated machine code then uses these tables to map the relocations emitted together with the machine code to the actual symbols that are referenced, for example using `func.params.ensure_user_func_name(func_name_ref)`.
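The split between name references and name tables described above can be modelled with a small standalone sketch. The types below are simplified stand-ins written for illustration, not Cranelift's actual `UserExternalName`, `UserExternalNameRef`, or `FunctionParameters`; the `NameTable` type, its methods, and the `HOST_NAMESPACE` constant (borrowing the `u32::MAX` reserved-namespace idea from the thread) are all assumptions.

```rust
/// Simplified stand-in for a user-defined external name: namespace plus index.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct UserExternalName {
    namespace: u32,
    index: u32,
}

/// Simplified stand-in for a name reference: just an index into a side table.
/// This is all that the instructions (and thus the backend) get to see.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct UserExternalNameRef(u32);

/// Hypothetical per-function side table kept outside the cacheable part.
#[derive(Default)]
struct NameTable {
    names: Vec<UserExternalName>,
}

impl NameTable {
    /// Return the existing reference for `name`, or register a new one.
    fn declare(&mut self, name: UserExternalName) -> UserExternalNameRef {
        if let Some(pos) = self.names.iter().position(|n| *n == name) {
            return UserExternalNameRef(pos as u32);
        }
        self.names.push(name);
        UserExternalNameRef((self.names.len() - 1) as u32)
    }

    /// Map a reference back to the full name; done by the consumer of the
    /// generated code, not by the backend.
    fn resolve(&self, name_ref: UserExternalNameRef) -> Option<UserExternalName> {
        self.names.get(name_ref.0 as usize).copied()
    }
}

/// Hypothetical reserved namespace marking host functions.
const HOST_NAMESPACE: u32 = u32::MAX;

/// Decide, after compilation, whether a referenced name is a host function.
fn is_host_call(table: &NameTable, name_ref: UserExternalNameRef) -> bool {
    matches!(table.resolve(name_ref), Some(n) if n.namespace == HOST_NAMESPACE)
}

fn main() {
    let mut table = NameTable::default();
    let wasm_fn = table.declare(UserExternalName { namespace: 0, index: 7 });
    let host_fn = table.declare(UserExternalName { namespace: HOST_NAMESPACE, index: 2 });
    for name_ref in [wasm_fn, host_fn] {
        let name = table.resolve(name_ref).expect("declared above");
        println!(
            "ref {} -> namespace {} index {} host={}",
            name_ref.0,
            name.namespace,
            name.index,
            is_host_call(&table, name_ref)
        );
    }
}
```

The point of the indirection is that two functions with identical instructions but different referenced symbols can share one cached compilation, since only the side table differs.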
alexcrichton updated PR #9651.
alexcrichton commented on PR #9651:
The backend itself has no way of knowing whether to generate such a relocation though, right? In that it can't look at the `UserExternalName` to determine whether it's a pulley->pulley call or a pulley->host call. One possible option is the `RelocDistance` though where "near" means pulley->pulley and "far" means pulley->host.

To confirm though you're thinking that something could look like:
- Somehow (e.g. via `RelocDistance`) the backend recognizes a difference between pulley->pulley and pulley->host calls.
- The `enc::call_indirect_host` pulley opcode is emitted for pulley->host calls, but the actual signature number is left "blank" with a relocation
- Later on the relocation processing cross-references the `UserExternalName` (which it now has access to) and fills in the number

And that'd skip the idea of intrinsics altogether?
bjorn3 commented on PR #9651:
> The backend itself has no way of knowing whether to generate such a relocation though, right?

All calls to other functions should result in relocations, whether they target another Pulley function or a host function. It is then up to whatever consumes the `MachBuffer` to apply the right relocations as indicated, which in the case of call relocations would include filling in the actual index of the `UserExternalName` and patching the call instruction into a `call_indirect_host` instruction if the target is a host function.
alexcrichton closed without merge PR #9651.
alexcrichton commented on PR #9651:
Ah ok that makes sense to me yeah. That requires that the opcodes have the same length but I think that's reasonable to arrange. I'll work on that route!
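As a rough sketch of the relocation route discussed above: the backend always emits a plain call with a relocation, and whoever consumes the buffer later rewrites the opcode and immediate once it knows what the target is. Everything concrete here is an assumption made for illustration: the opcode byte values, the 5-byte instruction layout, and the `Target` and `Reloc` types are hypothetical and do not reflect Pulley's real encoding. The sketch does show why the two opcodes need the same length, since the patch happens in place.

```rust
/// Hypothetical opcode bytes; real Pulley encodings are not shown in this thread.
const OP_CALL: u8 = 0x01;
const OP_CALL_INDIRECT_HOST: u8 = 0x02;
/// Assumed layout for both call forms: one opcode byte plus a 4-byte
/// little-endian immediate, so one can be patched over the other in place.
const CALL_LEN: usize = 5;

/// What a referenced name turns out to be once the name tables are consulted.
#[derive(Clone, Copy)]
enum Target {
    /// Another bytecode function, identified here by a code offset.
    Pulley { offset: u32 },
    /// A host function, identified by its signature number.
    Host { signature: u32 },
}

/// A relocation recorded by the backend: where the call is, and which name it uses.
struct Reloc {
    offset: usize,
    name: usize,
}

/// Backend side: always emit a plain call with a blank immediate and a relocation.
fn emit_call(code: &mut Vec<u8>, relocs: &mut Vec<Reloc>, name: usize) {
    relocs.push(Reloc { offset: code.len(), name });
    code.push(OP_CALL);
    code.extend_from_slice(&[0u8; CALL_LEN - 1]);
}

/// Consumer side: resolve each name and patch opcode and immediate in place.
fn apply_relocs(code: &mut [u8], relocs: &[Reloc], targets: &[Target]) -> Result<(), String> {
    for reloc in relocs {
        let target = targets
            .get(reloc.name)
            .ok_or_else(|| format!("unknown name {}", reloc.name))?;
        let (opcode, imm) = match *target {
            Target::Pulley { offset } => (OP_CALL, offset),
            Target::Host { signature } => (OP_CALL_INDIRECT_HOST, signature),
        };
        let end = reloc.offset + CALL_LEN;
        let slot = code
            .get_mut(reloc.offset..end)
            .ok_or_else(|| format!("relocation at {} is out of bounds", reloc.offset))?;
        slot[0] = opcode;
        slot[1..].copy_from_slice(&imm.to_le_bytes());
    }
    Ok(())
}

fn main() {
    let targets = [Target::Pulley { offset: 0x40 }, Target::Host { signature: 3 }];
    let (mut code, mut relocs) = (Vec::new(), Vec::new());
    emit_call(&mut code, &mut relocs, 0);
    emit_call(&mut code, &mut relocs, 1);
    apply_relocs(&mut code, &relocs, &targets).expect("relocations apply");
    println!("{code:02x?}");
}
```

With this arrangement the backend never needs to inspect the name tables, which keeps the compiled output usable as a cache entry.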
alexcrichton commented on PR #9651:
I like how the relocation idea worked out better in https://github.com/bytecodealliance/wasmtime/pull/9665, thanks @bjorn3 for the suggestion!
Last updated: Jan 24 2025 at 00:11 UTC