wasmtime / PR #12148 Cranelift: s390x: fix patchable call... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #12148 Cranelift: s390x: fix patchable call...

Wasmtime GitHub notifications bot (Dec 10 2025 at 08:41):

cfallin opened PR #12148 from cfallin:clobber-not-lest-ye-be-clobbered to bytecodealliance:main:

It turns out that the s390x ABI is special wrt our others: the s390x System-V ABI provides an area to all callees to save clobbered GPRs. However this is only large enough for r6-r15, the callee-saves in the standard ABI.

The implementation implicitly assumed this, and when I adjusted the definition of clobbered registers for the patchable ABI, stating that r0-r15 are clobbered instead, the code happily generated a store-multiple (stmg) instruction that saved too much data in too little space, overwriting other bits of the stack.

Also, the clobber-save/restore sequence code only saved the bottom 64 bits of clobbered vector/float registers (which are 128 bits), implicitly encoding the fact that the SysV ABI specifies only the bottom half as callee-saved.

This PR implements patchable properly on s390x by open-coding the sequences to save all vector registers and r0-r5 in the explicit clobber-save area (still using the SysV-defined one for r6-r15).

Wasmtime GitHub notifications bot (Dec 10 2025 at 08:41):

cfallin requested alexcrichton for a review on PR #12148.

Wasmtime GitHub notifications bot (Dec 10 2025 at 08:41):

cfallin requested wasmtime-compiler-reviewers for a review on PR #12148.

Wasmtime GitHub notifications bot (Dec 10 2025 at 08:49):

cfallin commented on PR #12148:

Note that this is spun out of #12133, and makes tests pass there, in lieu of hacking up a Cranelift runtest harness for editing patchable code here (the failure would only be observable with large enough code to cause unexpected register clobbers anyway).

Wasmtime GitHub notifications bot (Dec 10 2025 at 15:53):

alexcrichton submitted PR review:

cc @uweigand on this as well, you're probably a more apt reviewer than I.

I'll go ahead and approve though to help unblock https://github.com/bytecodealliance/wasmtime/pull/12133 and I'm assuming follow-ups here are reasonable and feasible

Wasmtime GitHub notifications bot (Dec 10 2025 at 16:10):

uweigand submitted PR review:

We do have 32 VRs, so if the point is to save them all, this is not correct (it only saves 16). Apart from that LGTM from a correctness perspective.

To improve performance, we could use a second STORE MULTIPLE (STMG) for GPRs 0..5, and in fact we could use a VECTOR STORE MULTIPLE (VSTM) - not yet used in cranelift elsewhere - for VRs 0..31. That would certainly reduce code size, and also improve execution time a bit.

Wasmtime GitHub notifications bot (Dec 10 2025 at 23:35):

cfallin updated PR #12148.

Wasmtime GitHub notifications bot (Dec 10 2025 at 23:38):

cfallin commented on PR #12148:

Ah, yep, sorry, I forgot that we have 32 vector registers on this platform. Thanks for the review!

I've updated to use STMG / LMG; opting not to try to add vector store/load multiple for now since the encoding isn't there but we can do that as followup if needed. (As we discussed in the Cranelift weekly, this isn't too performance-critical on the callee side; in Wasmtime, at least, the save/restore sequence will execute only when hitting a breakpoint.)

Wasmtime GitHub notifications bot (Dec 10 2025 at 23:38):

cfallin has enabled auto merge for PR #12148.

Wasmtime GitHub notifications bot (Dec 11 2025 at 00:37):

cfallin merged PR #12148.

Last updated: Feb 24 2026 at 05:28 UTC