cfallin opened PR #12585 from cfallin:pulley-little-endian-vecs-on-big-endian-when-debugging to bytecodealliance:main:
When running Pulley on an s390x (or other big-endian) host, and enabling guest-debugging instrumentation, a very strange confluence of events occurs:
- Pulley uses "native endian" of the host by default for loads and stores.
- Patchable calls to debug hooks use the `preserve_all` ABI, which spills all registers in the trampoline adapter (callee in this ABI), including vector registers.
- Saving vector-typed locals/operand stack values to the debugger state slot also uses vector stores.
- All of these stores were thus big-endian on big-endian hosts.
- Pulley's bytecode only supports little-endian vector loads/stores.
We were thus hitting an assert in Pulley codegen (the Cranelift backend) when encountering a `VStore` VCode instruction with a big-endian mode.
This PR makes two changes that avoid this issue:
- The ABI code for Pulley is careful to specify little-endian mode explicitly for any vector load/store.
- The debug instrumentation code is refactored to use little-endian explicitly for vector types only.
  - (Why not for all types? Because we GC-root GC ref values, and these need to be provided to the collector as mutable storage cells, so need to be in native endianness.)
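The rule described in the list above (vectors always little-endian, everything else in host-native order) can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `Endianness`, `SlotTy`, `endianness_for_debug_slot`, and `store_v128` are hypothetical stand-ins invented for illustration, and the real change works in terms of Cranelift's memory flags rather than these types.

```rust
/// Byte order of a memory access (hypothetical stand-in for the
/// endianness carried by Cranelift's memory flags).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endianness {
    Little,
    Big,
}

/// Hypothetical, simplified set of value types that could be saved
/// to a debugger state slot.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SlotTy {
    I32,
    I64,
    GcRef,
    V128,
}

impl SlotTy {
    /// Only 128-bit vectors count as vector types in this sketch.
    fn is_vector(self) -> bool {
        matches!(self, SlotTy::V128)
    }
}

/// Endianness of the machine this sketch is compiled for.
fn host_endianness() -> Endianness {
    if cfg!(target_endian = "big") {
        Endianness::Big
    } else {
        Endianness::Little
    }
}

/// Pick the byte order for a debug-slot access: vectors are forced to
/// little-endian (the only vector mode Pulley bytecode supports), while
/// all other types, including GC refs read by the collector, stay in
/// the host's native order.
fn endianness_for_debug_slot(ty: SlotTy, host: Endianness) -> Endianness {
    if ty.is_vector() {
        Endianness::Little
    } else {
        host
    }
}

/// Serialize a 128-bit vector value in the requested byte order, to
/// show what the choice means for the bytes that land in memory.
fn store_v128(value: u128, endianness: Endianness) -> [u8; 16] {
    match endianness {
        Endianness::Little => value.to_le_bytes(),
        Endianness::Big => value.to_be_bytes(),
    }
}
```

Passing the host endianness as a parameter, rather than reading it inside `endianness_for_debug_slot`, keeps the big-endian case checkable on a little-endian development machine; a real caller would pass `host_endianness()`.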
Test will come as part of #12575, incorporating a Pulley-with-guest-debugging test and running on s390x amongst our platforms.
cfallin requested fitzgen for a review on PR #12585.
cfallin requested wasmtime-compiler-reviewers for a review on PR #12585.
cfallin requested wasmtime-core-reviewers for a review on PR #12585.
cfallin edited PR #12585.
cfallin updated PR #12585.
fitzgen submitted PR review.
fitzgen created PR review comment:
}
fn memflags_for_debug_slot_value_clif_ty(&self, ty: ir::Type) -> MemFlags {
cfallin updated PR #12585.
cfallin submitted PR review.
cfallin created PR review comment:
Fixed, thanks!
cfallin has enabled auto merge for PR #12585.
cfallin updated PR #12585.
cfallin added PR #12585 Cranelift/Wasmtime/Pulley/Debugging: use little-endian mode to spill/reload vectors in guest-debugging slot and ABI clobbers. to the merge queue
cfallin merged PR #12585.
cfallin removed PR #12585 Cranelift/Wasmtime/Pulley/Debugging: use little-endian mode to spill/reload vectors in guest-debugging slot and ABI clobbers. from the merge queue
Last updated: Feb 24 2026 at 04:36 UTC