afonso360 commented on issue #7030:
One of the WASI tests found an issue where we were wrongly specifying the maximum range for
c.j
relocations. I'm going to let this run on the fuzzer for a while to check if it finds any other such issues.Question though: is it possible to remove the usage of emit_uncompressed? It seems like the usage here is primarily related to relocations and how PCRelHi20 for example expects a 4-byte instruction. Are there alternative relocations for the 16-bit compressed instructions we could use instead with some refactoring? Or is this matching what other compilers do where for these constructs uncompressed instructions are always emitted?
I don't fully know the answer to that question.
There's a few cases here that we are forcing to be uncompressed:
Calls
We force emit
auipc+jalr
, but we could emit justjalr
,c.jalr
or evenc.j
if we knew that the call target was in range.The way the normal toolchains deal with this is that they emit the uncompressed variant and add two relocations on the call
R_RISCV_CALL + R_RISCV_RELAX
. The second relax relocation lets the linker know that it can transform the call instruction.The RISC-V ABI does not specify any relocations for compressed versions of
jalr
. But that's just because they specify all of the allowed relaxations, and expect the compilers to emit the uncompressed versions.I think we don't have any mechanism for doing this in cranelift. Although adding the relax relocation could be useful for
cg_clif
and other users that pass the resulting code through a normal linker.Loads
I think this is the only place where
PCRelHi20
specifically is used. We don't have a compressed equivalent (i.e. there is noc.auipc
).We only use these for loads & stores from labels (i.e. from the constant pool).
Here we always emit
auipc+ld
with the two relocations, we could emit justld
orc.ld
if we knew that the constants were in range.I don't think we have a way of performing those relaxations, but it would be pretty beneficial.
We also can't mix for example
auipc+c.ld
since due to the reduced range inc.ld
we could end up in a situation where we can't encode the address if it ends up at a certain offset from theauipc
address.I did a short test to check what LLVM emits for loads from the constant pool. And they don't seem to compress those loads. Which I find slightly weird, so I might have just not fed it the proper flags.
<details>
<summary> LLVM pool loading asm</summary>
afonso@DESKTOP-1AHKMV2:/tmp/asd$ cat ./test.rs #[no_mangle] pub extern "C" fn square(_num: u64) -> u64 { 0xABCDEF01234ABCD } afonso@DESKTOP-1AHKMV2:/tmp/asd$ rustc --target=riscv64gc-unknown-linux-gnu -O --emit=obj --crate-type=lib ./test.rs afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./test.o ./test.o: file format elf64-littleriscv Disassembly of section .text.square: 0000000000000000 <square>: 0: 17 05 00 00 auipc a0, 0 0000000000000000: R_RISCV_PCREL_HI20 .LCPI0_0 4: 03 35 05 00 ld a0, 0(a0) 0000000000000004: R_RISCV_PCREL_LO12_I .Lpcrel_hi0 8: 82 80 ret afonso@DESKTOP-1AHKMV2:/tmp/asd$ ld.lld ./test.o ld.lld: warning: cannot find entry symbol _start; not setting start address afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./a.out ./a.out: file format elf64-littleriscv Disassembly of section .text: 000000000001118c <square>: 1118c: 17 15 00 00 auipc a0, 1 11190: 03 35 c5 00 ld a0, 12(a0) 11194: 82 80 ret
</details>
afonso360 edited a comment on issue #7030:
One of the WASI tests found an issue where we were wrongly specifying the maximum range for
c.j
relocations. I'm going to let this run on the fuzzer for a while to check if it finds any other such issues.Question though: is it possible to remove the usage of emit_uncompressed? It seems like the usage here is primarily related to relocations and how PCRelHi20 for example expects a 4-byte instruction. Are there alternative relocations for the 16-bit compressed instructions we could use instead with some refactoring? Or is this matching what other compilers do where for these constructs uncompressed instructions are always emitted?
I don't fully know the answer to that question.
There's a few cases here that we are forcing to be uncompressed:
Calls
We force emit
auipc+jalr
, but we could emit justjalr
,c.jalr
or evenc.j
if we knew that the call target was in range.The way the normal toolchains deal with this is that they emit the uncompressed variant and add two relocations on the call
R_RISCV_CALL + R_RISCV_RELAX
. The second relax relocation lets the linker know that it can transform the call instruction.The RISC-V ABI does not specify any relocations for compressed versions of
jalr
. But that's just because they specify all of the allowed relaxations, and expect the compilers to emit the uncompressed versions and the linker to perform the relaxation.I think we don't have any mechanism for doing this in cranelift. Although adding the relax relocation could be useful for
cg_clif
and other users that pass the resulting code through a normal linker.Loads
I think this is the only place where
PCRelHi20
specifically is used. We don't have a compressed equivalent (i.e. there is noc.auipc
).We only use these for loads & stores from labels (i.e. from the constant pool).
Here we always emit
auipc+ld
with the two relocations, we could emit justld
orc.ld
if we knew that the constants were in range.I don't think we have a way of performing those relaxations, but it would be pretty beneficial.
We also can't mix for example
auipc+c.ld
since due to the reduced range inc.ld
we could end up in a situation where we can't encode the address if it ends up at a certain offset from theauipc
address.I did a short test to check what LLVM emits for loads from the constant pool. And they don't seem to compress those loads. Which I find slightly weird, so I might have just not fed it the proper flags.
<details>
<summary> LLVM pool loading asm</summary>
afonso@DESKTOP-1AHKMV2:/tmp/asd$ cat ./test.rs #[no_mangle] pub extern "C" fn square(_num: u64) -> u64 { 0xABCDEF01234ABCD } afonso@DESKTOP-1AHKMV2:/tmp/asd$ rustc --target=riscv64gc-unknown-linux-gnu -O --emit=obj --crate-type=lib ./test.rs afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./test.o ./test.o: file format elf64-littleriscv Disassembly of section .text.square: 0000000000000000 <square>: 0: 17 05 00 00 auipc a0, 0 0000000000000000: R_RISCV_PCREL_HI20 .LCPI0_0 4: 03 35 05 00 ld a0, 0(a0) 0000000000000004: R_RISCV_PCREL_LO12_I .Lpcrel_hi0 8: 82 80 ret afonso@DESKTOP-1AHKMV2:/tmp/asd$ ld.lld ./test.o ld.lld: warning: cannot find entry symbol _start; not setting start address afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./a.out ./a.out: file format elf64-littleriscv Disassembly of section .text: 000000000001118c <square>: 1118c: 17 15 00 00 auipc a0, 1 11190: 03 35 c5 00 ld a0, 12(a0) 11194: 82 80 ret
</details>
alexcrichton commented on issue #7030:
Ok makes sense and sounds like we should leave this as is.
While I'm not super familiar with relaxed relocations my hunch is that Cranelift isn't quite ready for that. I believe we'd have to plumb through all
MachLabel
relocations rather than resolving them during emission because the linker is the only one that would be able to fully resolve everything. Either that or if we wanted to implement relaxation in theMachBuffer
that would probably be a significant chunk of work probably not worth it for our use case.In any case having uncompressed forms to start off with seems fine and we can always revisit and figure out solutions if that becomes a problem.
afonso360 commented on issue #7030:
I've left the fuzzer running overnight on this, and it didn't find any issues with these instructions!
Last updated: Nov 22 2024 at 16:03 UTC