Stream: git-wasmtime

Topic: wasmtime / PR #12061 Cranelift: add a "patchable call" ABI.


Wasmtime GitHub notifications bot (Nov 20 2025 at 23:47):

cfallin opened PR #12061 from cfallin:patchable-abi-is-happy-to-preserve-all-your-registers to bytecodealliance:main:

This ABI is intended for use in scenarios where we want a very lightweight callsite that can be turned on and off by patching in one instruction. (The actual patchable call instruction is not in this PR; that will be a separate PR.)

The idea is that we define a call to clobber no registers -- not even the arguments! And we restrict signatures such that on all of our supported architectures, all arguments go into registers only. Those two requirements together mean that all callsites for this ABI should have only a raw call instruction, with no loads/stores to stackslots; and have the minimum possible impact on regalloc, by only imposing constraints on args to ensure they are in certain registers but not altering those registers.
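
As a rough illustration of why a no-clobber call is cheap for the register allocator, the sketch below models the values that must be saved around a callsite as the intersection of "live across the call" and "clobbered by the call". `RegSet`, `values_to_spill`, and the register numbering are hypothetical names for this example, not Cranelift's actual types.

```rust
/// A set of physical registers as a bitmask (bit i = register i, i < 32).
/// Hypothetical stand-in for illustration; not Cranelift's real register-set type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RegSet(pub u32);

impl RegSet {
    /// The empty set: what a call under a no-clobber ABI overwrites.
    pub const EMPTY: RegSet = RegSet(0);

    /// Build a set from a list of register numbers.
    pub fn of(regs: &[u8]) -> RegSet {
        RegSet(regs.iter().fold(0, |bits, &r| bits | (1u32 << r)))
    }

    /// Registers present in both sets.
    pub fn intersect(self, other: RegSet) -> RegSet {
        RegSet(self.0 & other.0)
    }

    /// Number of registers in the set.
    pub fn len(self) -> u32 {
        self.0.count_ones()
    }
}

/// Registers whose values must be saved around a call: those that are live
/// across the call and that the call may overwrite.
pub fn values_to_spill(live_across: RegSet, clobbers: RegSet) -> RegSet {
    live_across.intersect(clobbers)
}
```

With an ordinary convention the clobber set is non-empty, so any live value sitting in a clobbered register has to be moved or spilled. With an empty clobber set, `values_to_spill` is always empty, which is the property the PR description relies on.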

Given this, we could implement, e.g., breakpoints with patchable callsites (off by default) at every sequence point in compiled code. In a typical use-case with Wasmtime-compiled Wasm, that would put a bunch of uses of vmctx constrained to the first argument register in every code path, but vmctx likely already sits there most of the time anyway (for any call to other Wasm functions or for libcalls). Thus, the impact is just the one instruction and nothing else.
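
The patchable instruction itself is not part of this PR, so the following is only a guess at what "turning a callsite on and off" could look like, using x86-64 as the example target: a 5-byte slot holds either a 5-byte NOP or a `call rel32`. The function names and the buffer-offset model are assumptions made for this sketch, and it ignores concerns such as instruction-cache flushing and threads executing the code while it is patched.

```rust
/// Size of the patchable slot: an x86-64 `call rel32` is 5 bytes long.
pub const SLOT_LEN: usize = 5;

/// A 5-byte x86-64 NOP (`nop dword ptr [rax + rax*1 + 0]`).
pub const NOP5: [u8; SLOT_LEN] = [0x0f, 0x1f, 0x44, 0x00, 0x00];

/// Encode `call rel32` located at offset `site`, targeting offset `target`
/// in the same code buffer. The displacement is relative to the end of the
/// call instruction. Returns `None` if it does not fit in 32 bits.
pub fn encode_call(site: usize, target: usize) -> Option<[u8; SLOT_LEN]> {
    let next = site as i64 + SLOT_LEN as i64;
    let disp = target as i64 - next;
    if disp < i32::MIN as i64 || disp > i32::MAX as i64 {
        return None;
    }
    let d = (disp as i32).to_le_bytes();
    Some([0xe8, d[0], d[1], d[2], d[3]])
}

/// Overwrite the slot at `site` with a call (enabled) or a NOP (disabled).
/// Returns `false` if the slot is out of bounds or the call is not encodable.
pub fn set_enabled(code: &mut [u8], site: usize, target: usize, enabled: bool) -> bool {
    let bytes = if enabled {
        match encode_call(site, target) {
            Some(b) => b,
            None => return false,
        }
    } else {
        NOP5
    };
    match code.get_mut(site..site + SLOT_LEN) {
        Some(slot) => {
            slot.copy_from_slice(&bytes);
            true
        }
        None => false,
    }
}
```

Because the ABI places no other instructions around the call, toggling a breakpoint in this model touches only the five bytes of the slot.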

This PR adds the calling convention itself and tests that show that two consecutive callsites can be compiled with no register setup re-occurring from one call to the next (thus demonstrating no clobbers).

Wasmtime GitHub notifications bot (Nov 20 2025 at 23:47):

cfallin requested wasmtime-compiler-reviewers for a review on PR #12061.

Wasmtime GitHub notifications bot (Nov 20 2025 at 23:47):

cfallin requested alexcrichton for a review on PR #12061.

Wasmtime GitHub notifications bot (Nov 20 2025 at 23:53):

cfallin commented on PR #12061:

(Switching to draft actually -- want to implement the callee side in this one too, sorry)

Wasmtime GitHub notifications bot (Nov 21 2025 at 00:26):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 21 2025 at 00:27):

cfallin has marked PR #12061 as ready for review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 00:27):

cfallin commented on PR #12061:

OK, good to go now!

Wasmtime GitHub notifications bot (Nov 21 2025 at 04:54):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 21 2025 at 07:17):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 21 2025 at 07:17):

cfallin requested wasmtime-core-reviewers for a review on PR #12061.

Wasmtime GitHub notifications bot (Nov 21 2025 at 07:21):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 07:21):

cfallin created PR review comment:

One thing to note here: riscv64 now counts upward rather than downward when saving clobbers, because it has to worry about 16-aligning vector regs (which were never previously callee-saves) and this is simpler to do (and to match the size computation) with a forward order. This is why a bunch of riscv64 tests were reblessed.

Pulley copied this logic but (i) I didn't want to alter the special "pulley-managed clobber saves" stuff and (ii) I think its vector stores/loads allow unaligned access, unlike riscv64 in the base case.
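
To illustrate the forward-order layout described in this comment, here is a minimal sketch that assigns clobber-save slots counting upward and aligns each slot to its own size. The slot sizes (8 bytes per integer register, 16 per vector register), the 16-byte rounding of the total, and all names are assumptions for the example rather than the riscv64 backend's actual code. A later comment in this thread notes that the switch to bottom-up ordering was reverted before merge.

```rust
/// Round `offset` up to the next multiple of `align` (a power of two).
pub fn align_up(offset: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Register classes with different save-slot sizes (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RegClass {
    Int,
    Vector,
}

impl RegClass {
    /// Assumed slot size in bytes; each slot is also aligned to this size.
    pub fn size(self) -> u32 {
        match self {
            RegClass::Int => 8,
            RegClass::Vector => 16,
        }
    }
}

/// Assign offsets counting upward from the base of the clobber-save area.
/// Returns the per-register offsets and the total area size, rounded up to
/// 16 bytes. The same loop yields both, so the size computation and the
/// store offsets cannot drift apart.
pub fn layout_clobbers(regs: &[RegClass]) -> (Vec<u32>, u32) {
    let mut offsets = Vec::with_capacity(regs.len());
    let mut cur = 0;
    for &rc in regs {
        cur = align_up(cur, rc.size());
        offsets.push(cur);
        cur += rc.size();
    }
    (offsets, align_up(cur, 16))
}
```

For the sequence integer, vector, integer, the vector slot is pushed from offset 8 up to 16 to satisfy its alignment, leaving 8 bytes of padding after the first integer slot.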

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Question for you, which I realize is basically unrelated to this PR -- this function feels like it's duplicated logic relative to get_regs_clobbered_by_call where this more-or-less returns the inverse of the other. How come in the backends this has its own implementation vs having one function delegate to the other?

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Similar question to aarch64 (or feel free to resolve with no comment if you respond to aarch64)

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Similar to my question about aarch64, would it be possible to implement this method in terms of get_regs_clobbered_by_call?

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Similar question to aarch64 (or feel free to resolve with no comment if you respond to aarch64)

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Similar question to aarch64 (or feel free to resolve with no comment if you respond to aarch64)

Wasmtime GitHub notifications bot (Nov 21 2025 at 14:26):

alexcrichton created PR review comment:

Is it worth clarifying here that the return-address register, if applicable, is clobbered? I realize that's sort of like a "system" register which isn't tracked by regalloc, but I believe that's the one register that's clobbered with this ABI? (not that it matters, we always save it in the prologue anyway)

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:13):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:13):

fitzgen created PR review comment:

Okay great, this is exactly what I was hoping would be here. We should document these constraints in the ir::CallConv doc comments for cranelift embedders tho.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:13):

fitzgen created PR review comment:

Alternatively it probably makes sense to restrict this calling convention to only as many arguments as fit in registers (portably?) and disallow returns. We can enforce this in CLIF validation.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:25):

cfallin created PR review comment:

Right, yep, this is documented in this doc-comment (see 11 lines up): "only up to four arguments of integer type, and no return values".

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:25):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:26):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:26):

cfallin created PR review comment:

Yes, I believe it is documented: from line 55 in call_conv.rs I have

      /// The ABI is based on the native register-argument ABI on each
      /// respective platform, but puts severe restrictions on allowable
      /// signatures: only up to four arguments of integer type, and no
      /// return values. It does not support tail-calls, and disallows
      /// any extension modes on arguments.
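
The restrictions quoted in this doc comment could be checked along the following lines. This is a sketch under assumptions: `Ty`, `Ext`, `Param`, `Sig`, `SigError`, and `check_patchable_sig` are hypothetical types and names made up for the example, not Cranelift's `ir::Signature` API or its actual validation code. The tail-call restriction is a property of the call instruction rather than the signature, so it is not modeled here.

```rust
/// Value types in a toy IR (hypothetical, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Ty {
    I32,
    I64,
    F32,
    F64,
    V128,
}

/// Argument extension modes in the toy IR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Ext {
    None,
    Uext,
    Sext,
}

pub struct Param {
    pub ty: Ty,
    pub ext: Ext,
}

pub struct Sig {
    pub params: Vec<Param>,
    pub returns: Vec<Ty>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SigError {
    TooManyArgs(usize),
    NonIntegerArg(usize),
    ExtendedArg(usize),
    HasReturns,
}

/// "Only up to four arguments of integer type", per the doc comment.
pub const MAX_ARGS: usize = 4;

/// Reject any signature that falls outside the quoted restrictions.
pub fn check_patchable_sig(sig: &Sig) -> Result<(), SigError> {
    if sig.params.len() > MAX_ARGS {
        return Err(SigError::TooManyArgs(sig.params.len()));
    }
    for (i, p) in sig.params.iter().enumerate() {
        if !matches!(p.ty, Ty::I32 | Ty::I64) {
            return Err(SigError::NonIntegerArg(i));
        }
        if p.ext != Ext::None {
            return Err(SigError::ExtendedArg(i));
        }
    }
    if !sig.returns.is_empty() {
        return Err(SigError::HasReturns);
    }
    Ok(())
}
```

Capping the count at four integer arguments is what lets every argument land in a register on each supported architecture, so no callsite needs stack stores.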

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:29):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 19:29):

cfallin created PR review comment:

That's a great question! They are mostly but not entirely the same: witness the beauty that is the aarch64 calling convention, where it is specified that the low (f64) half of each callee-saved vector register is preserved by the callee while the upper half is caller-saved. The result of that is that we have the special "caller and callee conventions match so ignore callsites" logic that gives us the right code in the common case, but we have to conservatively over-approximate otherwise.

In non-aarch64 cases I agree it could be refactored, though that's another cleanup project I'd rather defer till after this PR if that's OK :-)
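
A small model may help show why the two queries are not exact inverses once a register can be partially preserved. The enum and function names below are hypothetical and do not correspond to the backend's real functions. The only ABI fact assumed is the one mentioned above: a callee preserves just the low 64 bits of certain aarch64 vector registers.

```rust
/// How a callee treats one register under some calling convention
/// (hypothetical model for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Preservation {
    /// The callee restores the whole register before returning.
    Full,
    /// The callee restores only the low 64 bits.
    Low64Only,
    /// The callee may overwrite the register freely.
    NotPreserved,
}

/// Caller-side view: must a value of `live_bits` bits kept in this register
/// be treated as clobbered by a call?
pub fn clobbered_at_callsite(p: Preservation, live_bits: u32) -> bool {
    match p {
        Preservation::Full => false,
        Preservation::Low64Only => live_bits > 64,
        Preservation::NotPreserved => true,
    }
}

/// Callee-side view: must a function that writes this register save and
/// restore (at least part of) it?
pub fn callee_must_save(p: Preservation) -> bool {
    p != Preservation::NotPreserved
}
```

For a `Low64Only` register holding a 128-bit value, both functions return `true`: the caller has to assume the upper half is lost, and the callee still has to save the lower half. A clobber set that tracks whole registers therefore cannot be obtained by simply complementing the callee-save set.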

Wasmtime GitHub notifications bot (Nov 21 2025 at 20:01):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Nov 21 2025 at 20:01):

fitzgen created PR review comment:

D'oh -- thanks!

Wasmtime GitHub notifications bot (Nov 22 2025 at 01:34):

cfallin commented on PR #12061:

(There is something weird going on with riscv64 which I spent a bit of time trying to debug with qemu+gdb today -- will resolve that before this moves further)

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:02):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:07):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:08):

cfallin updated PR #12061.

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:14):

cfallin commented on PR #12061:

The riscv64 issue was some weirdness with non-aligned stackslot area sizes causing problems with the new alignment of clobber-saves with the bottom-up ordering. I've reverted the changes that flipped the order to bottom-up.

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:17):

cfallin has enabled auto merge for PR #12061.

Wasmtime GitHub notifications bot (Nov 22 2025 at 02:54):

cfallin merged PR #12061.


Last updated: Dec 06 2025 at 06:05 UTC