Stream: cranelift

Topic: Retrieve alignment of architecture


view this post on Zulip Terts Diepraam (May 13 2024 at 14:32):

Hi! My colleagues are telling me that I should align my fields in the structs of my language and I should probably listen to them. I can do the alignment myself, but I'd need to know what the alignment should be on a given architecture. Does cranelift expose this information anywhere?

view this post on Zulip bjorn3 (May 13 2024 at 14:45):

Alignments for various types are not just architecture specific, but also OS specific. Cranelift doesn't have any knowledge about this. TargetFrontendConfig only contains the default calling convention and the pointer size. target_lexicon::CDataModel only contains the sizes for various primitive C types on the given target, not the alignments.

view this post on Zulip bjorn3 (May 13 2024 at 14:48):

In rustc_codegen_cranelift I'm getting all this information from rustc itself which parses the LLVM data layout specifications stored in the target spec, but the relevant crate (rustc_target) doesn't compile on stable, so it probably won't be too useful for you.

view this post on Zulip bjorn3 (May 13 2024 at 14:49):

rustc also handles a fair part of the calling convention for me. (cranelift doesn't have a way to represent struct arguments, instead expecting that the lowering to primitive types is done by the producer of clif ir)

view this post on Zulip Chris Fallin (May 13 2024 at 15:06):

@Terts Diepraam a pretty good heuristic to start with in a new design IMHO is "natural alignment": align each type to its own size (so u32s are 4-aligned, u64s are 8-aligned, etc). Most platforms/OSes are pretty close to that too, with weirdness occurring primarily around larger types (u128s, SIMD vectors) or unexpectedly smaller-than-usual alignments.
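
For illustration, a minimal sketch of the "natural alignment" heuristic applied to struct layout. The helper names `align_up` and `natural_layout` are hypothetical and not part of Cranelift; the sketch assumes every field size is a power of two and that fields are laid out in declaration order.

```rust
/// Round `offset` up to the next multiple of `align` (a power of two).
fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Lay out fields in declaration order using natural alignment, i.e. each
/// field is aligned to its own size. Field sizes must be powers of two.
/// Returns (offset of each field, total struct size, struct alignment).
fn natural_layout(field_sizes: &[u64]) -> (Vec<u64>, u64, u64) {
    let mut offsets = Vec::with_capacity(field_sizes.len());
    let mut offset = 0;
    let mut struct_align = 1;
    for &size in field_sizes {
        let align = size; // natural alignment: alignment == size
        offset = align_up(offset, align);
        offsets.push(offset);
        offset += size;
        struct_align = struct_align.max(align);
    }
    // Pad the tail so that arrays of this struct keep every element aligned.
    (offsets, align_up(offset, struct_align), struct_align)
}
```

Called as `natural_layout(&[1, 4, 8, 2])`, this places the fields at offsets 0, 4, 8 and 16, with a total size of 24 bytes and a struct alignment of 8.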

view this post on Zulip Terts Diepraam (May 13 2024 at 15:08):

Ah I see, that's unfortunate. Is there any interest in porting some of that information from that rust crate over to cranelift? I could maybe work on that with a bit of help, since I might need this anyway.

Also, I'm curious, what part of alignment is up to the OS?

Natural alignment is a good start. I'll start out with that! Thanks!

view this post on Zulip Chris Fallin (May 13 2024 at 15:12):

it can be up to the OS in the same sense that calling convention can be -- just a standard set for all programs/libraries to interop (e.g. under Windows the standard calling convention is fastcall and so DLLs and EXEs agree on that interface, whereas SysV is used on Linux). A "tool historian" could probably say more about why but my sense is it's a combination of some backward compat (platforms that evolved from smaller word sizes, etc) and arbitrary choices :-)

view this post on Zulip Chris Fallin (May 13 2024 at 15:13):

That does raise the point that it's completely up to you on the internal part of your system but one has to be a little careful if you have direct calling (FFI) to functions in e.g. libc or your runtime -- to match the layout expected by the system compiler

view this post on Zulip Terts Diepraam (May 13 2024 at 15:15):

Right, I guess then mostly what I'm asking about is the FFI-less part, where it's mostly about performance of memory accesses. Nonetheless, good to know that this is something to keep in mind if I start doing FFI.

view this post on Zulip Terts Diepraam (May 14 2024 at 12:11):

The natural alignment worked great! At least, my code now runs with the aligned memflags set, so seems to be alright. I could still work on porting rustc_target (in some form) to cranelift if there's demand for that. Just let me know. I could open an issue if we need to discuss this further (where that could live, which parts are relevant etc.).

view this post on Zulip Terts Diepraam (May 16 2024 at 12:49):

I've got another question :) I just realized that I hadn't thought about the alignment of the stack slot itself. Looking at the docs[1], I would assume that I would be able to pass the desired alignment to cranelift, but I don't see that option anywhere. But it also says:

For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.

Does that mean that it will be aligned automatically?

1: https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/docs/ir.md#explicit-stack-slots

view this post on Zulip Chris Fallin (May 16 2024 at 14:34):

Stack slots should be aligned according to natural alignment, I believe

view this post on Zulip Chris Fallin (May 16 2024 at 14:34):

(And if not in some case, that'd be a Cranelift bug)

view this post on Zulip Chris Fallin (May 16 2024 at 14:36):

Ah, interesting, they're actually only machine word-aligned: https://github.com/bytecodealliance/wasmtime/blob/91ec9a589cc6c7f031ef4cacdb295331c07b6063/cranelift/codegen/src/machinst/abi.rs#L1181-L1183 (so 8 bytes on all our 64-bit targets)

view this post on Zulip Chris Fallin (May 16 2024 at 14:36):

@bjorn3 does this cause problems for cg_clif with 128-bit values?

view this post on Zulip Chris Fallin (May 16 2024 at 14:36):

we could easily fix this if so

view this post on Zulip Chris Fallin (May 16 2024 at 14:37):

(or, probably should regardless)

view this post on Zulip Afonso Bordado (May 16 2024 at 14:37):

We've had an issue open for a while to add the capability of supporting stack slot alignment
https://github.com/bytecodealliance/wasmtime/issues/6716

Feature See title. Benefit Correct alignment are necessary to avoid crashes on some architectures and code may depend on correct alignment. In addition rustc checks that the right alignment is used...

view this post on Zulip Afonso Bordado (May 16 2024 at 14:39):

cg_clif gets around this by overallocating stack slots to align the values internally
https://github.com/rust-lang/rustc_codegen_cranelift/blob/64c73d0b3c4b91fee0e9a840be30e1d6faac7957/src/abi/pass_mode.rs#L187-L195

view this post on Zulip Chris Fallin (May 16 2024 at 14:40):

ah, there we go! so the full idea there is to take alignment as a parameter, but I see no reason not to do natural alignment as a baseline

view this post on Zulip Chris Fallin (May 16 2024 at 14:40):

which would solve the issue as well I think?

view this post on Zulip Afonso Bordado (May 16 2024 at 14:41):

I think it would solve this particular instance, yes! But there are some other open issues that require, for example, 32-byte alignment. At least the 16-byte alignment would help here, I think.

view this post on Zulip Afonso Bordado (May 16 2024 at 14:41):

(I haven't looked at this in a while, so @bjorn3 would probably be better qualified to answer)

view this post on Zulip bjorn3 (May 16 2024 at 14:42):

Chris Fallin said:

bjorn3 does this cause problems for cg_clif with 128-bit values?

Yes! I am over allocating and manually aligning to work around this.
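
A minimal sketch of the general "over-allocate and align manually" technique mentioned here, not the actual cg_clif code. The helper names are hypothetical, and the sketch assumes the worst case, where nothing is known about the alignment of the slot's start address.

```rust
/// Round `addr` up to the next multiple of `align` (a power of two).
fn align_up(addr: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (addr + align - 1) & !(align - 1)
}

/// Worst-case number of bytes to reserve so that `size` bytes aligned to
/// `align` always fit, whatever address the slot happens to start at.
fn overallocated_size(size: usize, align: usize) -> usize {
    size + align - 1
}

/// Offset from the slot start `base` at which the aligned region begins.
fn aligned_offset(base: usize, align: usize) -> usize {
    align_up(base, align) - base
}
```

The producer reserves `overallocated_size(size, align)` bytes, takes the slot address, and then adds `aligned_offset(addr, align)` at run time to get the pointer it actually uses. The cost is up to `align - 1` wasted bytes per slot.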

view this post on Zulip Terts Diepraam (May 16 2024 at 14:45):

I guess natural alignment would work, but also might not be very precise. If I use a stack slot to create a record with a couple of i8 for example, the only alignment that I need is 1 byte right? (Still new to this, please correct me if I'm wrong :sweat_smile: )
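
As a quick check of that intuition, a small sketch using Rust's own `#[repr(C)]` layout rules, where an aggregate's alignment is the largest alignment among its fields. The struct and function names are made up for illustration.

```rust
use std::mem::{align_of, size_of};

/// A record made only of single bytes.
#[allow(dead_code)]
#[repr(C)]
struct ThreeBytes {
    a: i8,
    b: i8,
    c: i8,
}

/// A record mixing a byte with a 64-bit integer.
#[allow(dead_code)]
#[repr(C)]
struct ByteAndWord {
    a: i8,
    b: i64,
}

/// (size, alignment) of the byte-only record.
fn three_bytes_layout() -> (usize, usize) {
    (size_of::<ThreeBytes>(), align_of::<ThreeBytes>())
}

/// Alignment of the mixed record; it follows its most-aligned field.
fn byte_and_word_align() -> usize {
    align_of::<ByteAndWord>()
}
```

The byte-only record is 3 bytes with alignment 1, while the mixed record inherits the alignment of `i64` on the target.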

view this post on Zulip Chris Fallin (May 16 2024 at 14:46):

Right, actually I'm just paging this bit of CLIF back in now and seeing the size is arbitrary, not one CLIF type (obviously in hindsight, for aggregates etc)

view this post on Zulip Chris Fallin (May 16 2024 at 14:46):

so really it does call for an alignment parameter

view this post on Zulip Chris Fallin (May 16 2024 at 15:23):

https://github.com/bytecodealliance/wasmtime/pull/8635

Fixes #6716. Currently, stack slots on the stack are aligned only to a machine-word boundary. This is insufficient for some use-cases: for example, storing SIMD data or structs that require a large...

view this post on Zulip Terts Diepraam (Sep 16 2024 at 08:37):

Hi! It's me again :)

I've finally hit a case where I need to align stack slots for 128 bit values. Is it correct that this is not (easily) possible yet because of https://github.com/bytecodealliance/wasmtime/issues/6716? If so, I could try to pick that up.

view this post on Zulip Chris Fallin (Sep 16 2024 at 14:54):

I believe it should be correct up to 16-byte (128-bit) alignment; it's limited by the stackframe alignment, as we align the offset from start of stackframe per the user's request but stackframe may not be (e.g.) 64-byte-aligned if that's what the user requests. But both x86-64 and aarch64 16-align stackframes
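
To make the limitation concrete, a small arithmetic sketch. The frame base below is a made-up address that is 16-byte aligned but not 64-byte aligned, and the helper names are hypothetical.

```rust
/// Address of a stack slot: frame base plus the slot's offset in the frame.
fn slot_addr(frame_base: u64, slot_offset: u64) -> u64 {
    frame_base + slot_offset
}

/// True if `addr` is a multiple of `align` (a power of two).
fn is_aligned(addr: u64, align: u64) -> bool {
    addr & (align - 1) == 0
}

/// Hypothetical frame base: 16-byte aligned, but not 64-byte aligned.
const FRAME_BASE: u64 = 0x7ffc_3867_4a90;
```

With a slot offset of 64, which is itself 64-aligned, `slot_addr(FRAME_BASE, 64)` is still only 16-byte aligned, because the frame base contributes a remainder of 16 modulo 64.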

view this post on Zulip Terts Diepraam (Sep 17 2024 at 14:54):

It seems like 16-byte alignment is not working. I have the following CLIF code as a reproduction:

function %foo() -> i64 {
    ss0 = explicit_slot 8
    ss1 = explicit_slot 32, align=16

block0():
    v1 = stack_addr.i64 ss1
    return v1
}
; run: %foo() == 0

(I don't actually want to check whether it's zero; I just want the error message so I can inspect the pointer.)

When I run that with clif-util, I see (for example) that the pointer is 140735794843864, which is 0x7ffc38674aa8 in hex. That's not aligned to 16 bytes right? I might very well be using this wrong, so please correct me if I'm wrong. This is on main by the way, I just pulled with git.

In my actual application, I give pointers to stack slots to some Rust code, which panics with a message about unaligned data, which is why I'm interested in this in the first place.
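
A minimal sketch of the alignment check being done by hand here, applied to the pointer values quoted in the message. The helper name `misalignment` is hypothetical.

```rust
/// Number of bytes by which `addr` overshoots the previous multiple of
/// `align` (a power of two). Zero means the address is aligned.
fn misalignment(addr: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two());
    addr & (align - 1)
}
```

Both `misalignment(140735794843864, 16)` and `misalignment(0x7ffc38674aa8, 16)` return 8, so the slot address is 8-byte aligned but not 16-byte aligned.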

view this post on Zulip Chris Fallin (Sep 17 2024 at 15:16):

hmm, indeed, that address isn't properly aligned... I don't have spare cycles to look at this at the moment but perhaps someone else does, or if you're interested in diving into the ABI code yourself...

view this post on Zulip Terts Diepraam (Sep 19 2024 at 11:05):

I dove into it :big_smile: I think this should fix it: https://github.com/bytecodealliance/wasmtime/pull/9279

Hi! I think that the stackslots weren't properly aligned, because the padding for the alignment is being added to the end of the slot instead of at the start, meaning that the slot itself isn&#...
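
A simplified model of the bug described in that PR summary, not the actual Cranelift code. It assumes slots are assigned increasing offsets from an already 16-aligned base and that a slot without an explicit `align` gets 8-byte alignment; both function names are hypothetical.

```rust
/// Round `x` up to the next multiple of `align` (a power of two).
fn align_up(x: u32, align: u32) -> u32 {
    (x + align - 1) & !(align - 1)
}

/// Buggy variant: the padding is added after each slot, so the slot's own
/// start offset is never rounded up to its alignment.
fn offsets_pad_after(slots: &[(u32, u32)]) -> Vec<u32> {
    let mut next = 0;
    slots
        .iter()
        .map(|&(size, align)| {
            let offset = next;
            next = align_up(offset + size, align);
            offset
        })
        .collect()
}

/// Fixed variant: the padding is added before each slot, so every slot
/// starts at a multiple of its alignment.
fn offsets_pad_before(slots: &[(u32, u32)]) -> Vec<u32> {
    let mut next = 0;
    slots
        .iter()
        .map(|&(size, align)| {
            let offset = align_up(next, align);
            next = offset + size;
            offset
        })
        .collect()
}
```

For the reproduction above, modelled as `[(8, 8), (32, 16)]`, the buggy variant puts `ss1` at offset 8 and the fixed variant puts it at offset 16. An offset of 8 from a 16-aligned base is consistent with the observed pointer ending in `...8`.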

view this post on Zulip Terts Diepraam (Sep 19 2024 at 11:44):

I'm working on a separate fix for larger alignments, where I truncate rsp down to the largest requested alignment. I think that should work for x86_64, since rbp is used to restore the frame there, but on aarch64 (and maybe others) that doesn't seem to be the case, so I'll need some help there. I'll post a draft PR for that in a bit.
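
The truncation step can be sketched as plain arithmetic. This is only an illustration of the idea, with a hypothetical helper name, and it reuses the pointer value from the earlier message as a stand-in stack pointer.

```rust
/// Round a stack pointer down to a multiple of `align` (a power of two).
/// The stack grows downward, so rounding down only ever reserves extra
/// space and never releases any.
fn truncate_sp(sp: u64, align: u64) -> u64 {
    assert!(align.is_power_of_two());
    sp & !(align - 1)
}
```

For example, `truncate_sp(0x7ffc38674aa8, 64)` gives `0x7ffc38674a80`. Because the amount subtracted depends on the run-time value of the stack pointer, the original value has to be recoverable some other way, which is why the frame pointer matters here.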

view this post on Zulip Chris Fallin (Sep 19 2024 at 15:40):

I'll take a look, thanks a ton for diving into this!

re: SP alignment on non-x86, I believe we do the "leaf function" optimization where if no other calls happen and no stack storage / spill slots need to be allocated, we don't save SP in FP; but I think/suspect that if we have stackslots, that should already be disabled, unless we've further optimized this since I last looked :-)


Last updated: Dec 23 2024 at 13:07 UTC