elliottt edited PR #4826 from trevor/fix-4812
to main
:
- Add function alignment to the TargetIsa trait
- Align functions by their ISA's requirements
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt edited PR #4826 from trevor/fix-4812
to main
:
Add a
function_alignment
function to theTargetIsa
trait, and use it to align functions when generating objects. This fixes a bug on x86_64 where rip-relative loads to sse registers could cause a segfault, as functions weren't always guaranteed to be aligned to 16-byte addresses.Fixes #4812
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
elliottt requested alexcrichton for a review on PR #4826.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I think this may have actually been the only place that
None
was passed in, so with this changing would it be possible to switch the alignment parameter tou32
instead ofOption<u32>
?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Could you leave a comment here for why this is 16? (reading this in isolation I'd naively expect it to be 1)
bjorn3 created PR review comment:
Can you also use this method in cranelift-object and cranelift-jit?
bjorn3 submitted PR review.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
I'm wondering if this is enough?
Specifically, couldn't we hit the same bug on aarch64, where:
- We have a constant that needs to be 16-byte-aligned (else fault);
- We 16-align it in the function body;
- But the function is only 4-aligned, and has a starting offset != 0 mod 16.
So I think we need to either set
function_alignment
to the highest alignment statically that we know we use, or compute it dynamically based on the alignments actually present in the constant pool, on every architecture; otherwise we don't satisfy the alignment constraints.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
elliottt submitted PR review.
elliottt created PR review comment:
Great point, added!
elliottt has marked PR #4826 as ready for review.
elliottt created PR review comment:
cranelift-object already has support for specifying function alignment as the
function_alignment
field onObjectBuilder
, is that sufficient?cranelift-jit takes an argument that implements
TargetIsa
already, do we need to modify it to ensure that it's emitting aligned function bodies?
elliottt submitted PR review.
elliottt created PR review comment:
That's a good point. I'm going to include the minimum alignment requirement in the output when compiling the function, and then take the max of that and the isa function_alignment when emitting instructions.
elliottt submitted PR review.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
cfallin created PR review comment:
Can we call this just
alignment
? From a consumer point of view, theEmitResult
represents a blob of machine code that has to be placed somewhere, with an alignment specified here; the fact that this alignment requirement comes from our constant pool in this instance is sort of an incidental detail (and there could be other reasons later).
cfallin submitted PR review.
cfallin submitted PR review.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
elliottt created PR review comment:
What do you think about removing the
function_alignment
field fromObjectBuilder
and calling thefunction_alignment()
function inTargetIsa
instead? Also it looks like we'll need to slightly adapt theModule
trait to pass through the individual function alignment requirements so that we can pick the largest one when emitting machine code. I've made that change on this branch, but held off on removingfunction_alignment
fromObjectBuilder
.Looking through cranelift-jit, it seems like it always 16-byte aligns functions, so there might not be any work needed there.
elliottt submitted PR review.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
I don't think that this is much of a problem because the only instructions I can think of off the top of my head that have strict alignment requirements are the atomic instructions (or SP-relative operations, but they are not applicable in this case), and it sounds a bit unusual to apply them to literals in the instruction stream. Yes, it is possible to enable alignment checking for almost all relevant instructions, but I am not aware of any platform that does that, and in fact unaligned accesses by default is pretty much an aspect of the ABI at this point.
However, stricter function alignment is in fact recommended for performance reasons, i.e. to maximize fetch bandwidth. The recommendation for some of Arm's recent processors is 32 bytes.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
elliottt created PR review comment:
@akirilov-arm is it worth changing this to
32
here, or should we leave that for another PR?
elliottt submitted PR review.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
Doing it in this PR is fine, but you should add a comment that the value has been chosen for performance reasons; otherwise the correctness requirement is 4 bytes, as in your current code.
elliottt updated PR #4826 from trevor/fix-4812
to main
.
elliottt created PR review comment:
I changed this to
32
and added a comment, thanks!
elliottt submitted PR review.
elliottt merged PR #4826.
Last updated: Dec 23 2024 at 12:05 UTC