cfallin requested fitzgen for a review on PR #8635.
cfallin requested wasmtime-compiler-reviewers for a review on PR #8635.
cfallin requested wasmtime-core-reviewers for a review on PR #8635.
cfallin opened PR #8635 from cfallin:stackslot-alignment
to bytecodealliance:main
:
Fixes #6716.
Currently, stack slots on the stack are aligned only to a machine-word boundary. This is insufficient for some use-cases: for example, storing SIMD data or structs that require a larger alignment.
This PR adds a parameter to the
StackSlotData
to specify alignment, and the associated logic to the CLIF parser and printer. It updates the shared ABI code to compute the stackslot layout taking the alignment into account. In order to ensure the alignment is always a power of two, it is stored as a shift amount (log2 of actual alignment) in the IR.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
cfallin submitted PR review.
cfallin created PR review comment:
(Just below this point) the old
alignment
function didn't seem to be used in the ABI code or anywhere else; it's likely left over from some earlier use-case. The "largest power of two that divides..." logic is a little weird (it can result in an alignment less than requested?) so I removed it and pushed the alignment computation to thestd::cmp::max
inabi.rs
.
cfallin updated PR #8635.
cfallin commented on PR #8635:
(for the record: work on this now spawned by this thread)
elliottt submitted PR review:
Awesome!
elliottt submitted PR review:
Awesome!
elliottt created PR review comment:
Do we have a way of enabling an explicit stack check in clif tests? That would be a nice way of verifying that the total stack space required is accounted for.
elliottt created PR review comment:
u32::try_from(align).map_err(|_| self.error("alignment must be a 32-bit unsigned integer"))?
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Can this be set to 4? The ValRaw bits in Wasmtime require a 16-byte alignment
elliottt submitted PR review.
elliottt created PR review comment:
Ah!
function %f0() -> i64, i64, i64, i64 { gv0 = vmctx ss0 = explicit_slot 8, align=16 ss1 = explicit_slot 8, align=16 ss2 = explicit_slot 4 ss3 = explicit_slot 4 ss4 = stack_limit gv0
elliottt edited PR review comment.
cfallin updated PR #8635.
cfallin updated PR #8635.
cfallin updated PR #8635.
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin updated PR #8635.
cfallin has enabled auto merge for PR #8635.
cfallin updated PR #8635.
cfallin merged PR #8635.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
If the alignment is bigger than the ABI stack alignment it would be necessary to adjust the stack pointer to correctly align it. This is missing here. Rust allows arbitrary power of two alignments for stack variables, even something crazy like a 1GB alignment. More realistically though, cache line alignment is used by crates like rayon for performance reasons. Without correctly aligning, rustc will trigger a debug assertion in some cases like for the aforementioned rayon crate.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, yes, that's a good point -- to summarize: alignment of offset within the stack frame to alignment A does not imply alignment to A overall, because SP itself may be aligned less (e.g. only to A/4).
I don't have time to work further on this at the moment, but I'd be happy to review a PR that addressed this.
Last updated: Jan 24 2025 at 00:11 UTC