Stream: git-wasmtime

Topic: wasmtime / issue #7030 riscv64: Add more compressed instr...


view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2023 at 13:25):

afonso360 commented on issue #7030:

One of the WASI tests found an issue where we were wrongly specifying the maximum range for c.j relocations. I'm going to let this run on the fuzzer for a while to check if it finds any other such issues.

Question though: is it possible to remove the usage of emit_uncompressed? It seems like the usage here is primarily related to relocations and how PCRelHi20 for example expects a 4-byte instruction. Are there alternative relocations for the 16-bit compressed instructions we could use instead with some refactoring? Or is this matching what other compilers do where for these constructs uncompressed instructions are always emitted?

I don't fully know the answer to that question.

There's a few cases here that we are forcing to be uncompressed:

Calls

We force emit auipc+jalr, but we could emit just jalr, c.jalr or even c.j if we knew that the call target was in range.

The way the normal toolchains deal with this is that they emit the uncompressed variant and add two relocations on the call R_RISCV_CALL + R_RISCV_RELAX. The second relax relocation lets the linker know that it can transform the call instruction.

The RISC-V ABI does not specify any relocations for compressed versions of jalr. But that's just because they specify all of the allowed relaxations, and expect the compilers to emit the uncompressed versions.

I think we don't have any mechanism for doing this in cranelift. Although adding the relax relocation could be useful for cg_clif and other users that pass the resulting code through a normal linker.

Loads

I think this is the only place where PCRelHi20 specifically is used. We don't have a compressed equivalent (i.e. there is no c.auipc).

We only use these for loads & stores from labels (i.e. from the constant pool).

Here we always emit auipc+ld with the two relocations, we could emit just ld or c.ld if we knew that the constants were in range.

I don't think we have a way of performing those relaxations, but it would be pretty beneficial.

We also can't mix for example auipc+c.ld since due to the reduced range in c.ld we could end up in a situation where we can't encode the address if it ends up at a certain offset from the auipc address.

I did a short test to check what LLVM emits for loads from the constant pool. And they don't seem to compress those loads. Which I find slightly weird, so I might have just not fed it the proper flags.

<details>

<summary> LLVM pool loading asm</summary>

afonso@DESKTOP-1AHKMV2:/tmp/asd$ cat ./test.rs
#[no_mangle]
pub extern "C" fn square(_num: u64) -> u64 {
    0xABCDEF01234ABCD
}

afonso@DESKTOP-1AHKMV2:/tmp/asd$ rustc --target=riscv64gc-unknown-linux-gnu -O --emit=obj --crate-type=lib ./test.rs
afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./test.o

./test.o:       file format elf64-littleriscv

Disassembly of section .text.square:

0000000000000000 <square>:
       0: 17 05 00 00   auipc   a0, 0
                0000000000000000:  R_RISCV_PCREL_HI20   .LCPI0_0
       4: 03 35 05 00   ld      a0, 0(a0)
                0000000000000004:  R_RISCV_PCREL_LO12_I .Lpcrel_hi0
       8: 82 80         ret
afonso@DESKTOP-1AHKMV2:/tmp/asd$ ld.lld ./test.o
ld.lld: warning: cannot find entry symbol _start; not setting start address
afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./a.out

./a.out:        file format elf64-littleriscv

Disassembly of section .text:

000000000001118c <square>:
   1118c: 17 15 00 00   auipc   a0, 1
   11190: 03 35 c5 00   ld      a0, 12(a0)
   11194: 82 80         ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2023 at 13:26):

afonso360 edited a comment on issue #7030:

One of the WASI tests found an issue where we were wrongly specifying the maximum range for c.j relocations. I'm going to let this run on the fuzzer for a while to check if it finds any other such issues.

Question though: is it possible to remove the usage of emit_uncompressed? It seems like the usage here is primarily related to relocations and how PCRelHi20 for example expects a 4-byte instruction. Are there alternative relocations for the 16-bit compressed instructions we could use instead with some refactoring? Or is this matching what other compilers do where for these constructs uncompressed instructions are always emitted?

I don't fully know the answer to that question.

There's a few cases here that we are forcing to be uncompressed:

Calls

We force emit auipc+jalr, but we could emit just jalr, c.jalr or even c.j if we knew that the call target was in range.

The way the normal toolchains deal with this is that they emit the uncompressed variant and add two relocations on the call R_RISCV_CALL + R_RISCV_RELAX. The second relax relocation lets the linker know that it can transform the call instruction.

The RISC-V ABI does not specify any relocations for compressed versions of jalr. But that's just because they specify all of the allowed relaxations, and expect the compilers to emit the uncompressed versions and the linker to perform the relaxation.

I think we don't have any mechanism for doing this in cranelift. Although adding the relax relocation could be useful for cg_clif and other users that pass the resulting code through a normal linker.

Loads

I think this is the only place where PCRelHi20 specifically is used. We don't have a compressed equivalent (i.e. there is no c.auipc).

We only use these for loads & stores from labels (i.e. from the constant pool).

Here we always emit auipc+ld with the two relocations, we could emit just ld or c.ld if we knew that the constants were in range.

I don't think we have a way of performing those relaxations, but it would be pretty beneficial.

We also can't mix for example auipc+c.ld since due to the reduced range in c.ld we could end up in a situation where we can't encode the address if it ends up at a certain offset from the auipc address.

I did a short test to check what LLVM emits for loads from the constant pool. And they don't seem to compress those loads. Which I find slightly weird, so I might have just not fed it the proper flags.

<details>

<summary> LLVM pool loading asm</summary>

afonso@DESKTOP-1AHKMV2:/tmp/asd$ cat ./test.rs
#[no_mangle]
pub extern "C" fn square(_num: u64) -> u64 {
    0xABCDEF01234ABCD
}

afonso@DESKTOP-1AHKMV2:/tmp/asd$ rustc --target=riscv64gc-unknown-linux-gnu -O --emit=obj --crate-type=lib ./test.rs
afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./test.o

./test.o:       file format elf64-littleriscv

Disassembly of section .text.square:

0000000000000000 <square>:
       0: 17 05 00 00   auipc   a0, 0
                0000000000000000:  R_RISCV_PCREL_HI20   .LCPI0_0
       4: 03 35 05 00   ld      a0, 0(a0)
                0000000000000004:  R_RISCV_PCREL_LO12_I .Lpcrel_hi0
       8: 82 80         ret
afonso@DESKTOP-1AHKMV2:/tmp/asd$ ld.lld ./test.o
ld.lld: warning: cannot find entry symbol _start; not setting start address
afonso@DESKTOP-1AHKMV2:/tmp/asd$ llvm-objdump -dr ./a.out

./a.out:        file format elf64-littleriscv

Disassembly of section .text:

000000000001118c <square>:
   1118c: 17 15 00 00   auipc   a0, 1
   11190: 03 35 c5 00   ld      a0, 12(a0)
   11194: 82 80         ret

</details>

view this post on Zulip Wasmtime GitHub notifications bot (Sep 14 2023 at 15:22):

alexcrichton commented on issue #7030:

Ok makes sense and sounds like we should leave this as is.

While I'm not super familiar with relaxed relocations my hunch is that Cranelift isn't quite ready for that. I believe we'd have to plumb through all MachLabel relocations rather than resolving them during emission because the linker is the only one that would be able to fully resolve everything. Either that or if we wanted to implement relaxation in the MachBuffer that would probably be a significant chunk of work probably not worth it for our use case.

In any case having uncompressed forms to start off with seems fine and we can always revisit and figure out solutions if that becomes a problem.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 15 2023 at 09:07):

afonso360 commented on issue #7030:

I've left the fuzzer running overnight on this, and it didn't find any issues with these instructions!


Last updated: Oct 23 2024 at 20:03 UTC