Hi! Just to preface this, I’m a complete compiler newbie.
I was wondering if there might be value in adding a RISC-V 32-bit (RISCV32) backend to Cranelift. Since a Riscv64Backend already exists, I’m assuming it could be adapted to generate RV32 binaries. However, I’d like to confirm whether a dedicated 32-bit backend might actually produce more efficient binaries.
If so, I’ve been exploring the source code, and here is my initial plan. I’d appreciate it if someone could take a look and let me know if this approach seems reasonable.
Just double-checking to avoid potential overlaps or gotchas – does it make sense to add a 32-bit RISC-V backend to Cranelift?
I don't speak for the rest of cranelift folks by any means, but at least personally I'd love to see a 32-bit backend for Cranelift for risc-v. It's perhaps worth cautioning though that this is going to be a relatively significant undertaking since there is no preexisting 32-bit backend that's complete (there's a 32-bit "pulley" backend but it's not fleshed out yet).
Some of the things you'll have to grapple with are:
The ValueRegs type only supports 2 registers, not the 4 needed for riscv32.
Those are some things off the top of my head but it's probably not a complete list. We've talked in Cranelift about features to make things like this easier in the past, such as better target-specific legalization support to lower, for example, 64-bit operations to 32-bit operations in the mid-end instead of the backend. That work never finished though and has large-ish remaining open questions. I say this as an example of open-ended design work that doesn't already have an answer in Cranelift and would probably want to be fleshed out along the way.
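As a rough illustration of what lowering a 64-bit operation to 32-bit operations means, here is a minimal sketch of a 64-bit add carried out on two 32-bit halves. The names Pair32 and add64_via_32 are made up for this example and are not Cranelift types; the instruction sequence in the comment is just one plausible RV32I expansion.

```rust
/// A 64-bit value held as two 32-bit "registers" (low half first).
/// Hypothetical type for illustration only, not part of Cranelift.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Pair32 {
    lo: u32,
    hi: u32,
}

impl Pair32 {
    fn from_u64(v: u64) -> Self {
        Pair32 { lo: v as u32, hi: (v >> 32) as u32 }
    }

    fn to_u64(self) -> u64 {
        ((self.hi as u64) << 32) | self.lo as u64
    }
}

/// A 64-bit add using 32-bit operations only. One possible RV32I sequence:
///   add  lo, a_lo, b_lo
///   sltu carry, lo, a_lo   ; 1 if the low add wrapped around
///   add  hi, a_hi, b_hi
///   add  hi, hi, carry
fn add64_via_32(a: Pair32, b: Pair32) -> Pair32 {
    let lo = a.lo.wrapping_add(b.lo);
    // The low add overflowed exactly when the result is smaller than an input.
    let carry = (lo < a.lo) as u32;
    let hi = a.hi.wrapping_add(b.hi).wrapping_add(carry);
    Pair32 { lo, hi }
}
```

An i64 fits in two such halves; by the same logic a 128-bit value would need four, which is where the 2-register limit mentioned above becomes a problem.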
To be clear though I don't say this to dissuade you, I'd still at least personally love to see this! If you'd like to continue to pursue this though what I might recommend is to attend a Cranelift meeting (they happen weekly on Wednesdays) and we can chat more about it. For example we might want to figure out how to review your work to get it all landed as well (it's hard to get a whole backend in one go). The review part may be sort of hard since cranelift folks are stretched pretty thin right now though.
Thank you for the detailed insights! I was thinking it might be simpler to start with a 32-bit backend for Cranelift that supports only up to 32-bit types (ints, floats, atomics) and excludes Rust’s higher bit types, such as u64, f64, and AtomicI64, as part of an initial implementation. Unsupported types could trigger a compiler error, making the backend’s limitations clear. I hope this approach is acceptable.
I’d also be interested in joining the Cranelift meeting to discuss this further. Is there a way for me to add myself to the invite?
P.S. supporting types with higher bit precision at a higher level could simplify the introduction of additional backends, like Arm-Thumb2, without overcomplicating the backend itself.
I'd also love to see a RISC-V 32 bit backend! I have some time to review PR's but not as much as I once had.
If we can find a way to lower the 64-bit instructions to 32-bit instructions in the mid-end, I think we can share a lot of rules with the RISC-V 64 backend with minimal changes. But doing that looks like it's going to be hard. It would also solve the 128-bit operations problem.
I think we can reuse pretty much all of the instruction encodings that I recently started moving into a separate file. That file isn't complete, but it should help out.
Starting with just supporting 32bit ops in the backend seems like a good idea.
Afonso Bordado said:
I think we can reuse pretty much all of the instruction encodings that I recently started moving into a separate file. That file isn't complete, but it should help out.
My current plan is to begin with the base instruction set, RV32I, and progressively add the mfac extensions. Most of the encode.rs content should be reusable, as it already includes R, I, and S-type encodings. For RV32I, I guess I would only need to add the remaining types: B, U, and J.
My understanding of the instruction formats is as follows:
(attached image: diagram of the instruction formats)
Quick question - I am assuming RV64I and RV32I share the same instruction formats and a very similar base instruction set architecture (ISA). Should encode.rs in riscv64 also include the B, U, and J formats when it’s complete?
Starting with just supporting 32bit ops in the backend seems like a good idea.
Thank you for clarifying. :folded_hands:
Yeah, the instruction formats are the same, they just aren't present in encode.rs since that is a fairly recent addition, and I haven't had the time to migrate the rest of the instruction formats to that file.
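For reference, the three missing formats could be sketched roughly like this. The function names and signatures below are invented for this example and are not taken from encode.rs; the bit layouts follow the base RISC-V instruction formats, where branch and jump offsets are byte offsets whose bit 0 is implicit.

```rust
/// B-type (conditional branches): imm[12|10:5] rs2 rs1 funct3 imm[4:1|11] opcode.
/// Hypothetical helper; `imm` is a signed byte offset (must be even).
fn encode_b_type(opcode: u32, funct3: u32, rs1: u32, rs2: u32, imm: i32) -> u32 {
    let imm = imm as u32;
    (((imm >> 12) & 0x1) << 31)
        | (((imm >> 5) & 0x3f) << 25)
        | ((rs2 & 0x1f) << 20)
        | ((rs1 & 0x1f) << 15)
        | ((funct3 & 0x7) << 12)
        | (((imm >> 1) & 0xf) << 8)
        | (((imm >> 11) & 0x1) << 7)
        | (opcode & 0x7f)
}

/// U-type (lui, auipc): imm[31:12] rd opcode.
/// `imm20` is the 20-bit value that ends up in the upper bits of rd.
fn encode_u_type(opcode: u32, rd: u32, imm20: u32) -> u32 {
    ((imm20 & 0xf_ffff) << 12) | ((rd & 0x1f) << 7) | (opcode & 0x7f)
}

/// J-type (jal): imm[20|10:1|11|19:12] rd opcode.
/// `imm` is a signed byte offset (must be even).
fn encode_j_type(opcode: u32, rd: u32, imm: i32) -> u32 {
    let imm = imm as u32;
    (((imm >> 20) & 0x1) << 31)
        | (((imm >> 1) & 0x3ff) << 21)
        | (((imm >> 11) & 0x1) << 20)
        | (((imm >> 12) & 0xff) << 12)
        | ((rd & 0x1f) << 7)
        | (opcode & 0x7f)
}
```

With these, encode_u_type(0x37, 10, 0x12345) yields 0x12345537 (lui a0, 0x12345) and encode_j_type(0x6f, 1, 8) yields 0x008000ef (jal ra, 8). Nothing here depends on the register width, which matches the point that the same encoders serve both RV32 and RV64.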
I was thinking it might be simpler to start with a 32-bit backend for Cranelift that supports only up to 32-bit types
Makes sense to me!
Is there a way for me to add myself to the invite?
There's a bit more info here but the tl;dr is: DM Chris Fallin on Zulip
Nick owns the calendar event now fwiw (fitzgen on zulip). And +1 to the above in general; have more thoughts I’ll try to write out later
To add a bit more: the tracking issue for this is #8768, and last time this came up I left a comment that also links four previous times I braindumped a bit on the general state of things for adding new backends
In this case in particular I think there is a strong argument for reusing almost all of the encoding machinery. I wonder even how close we could get to "64-bitness is a backend option", and share the backend altogether? Then essentially we disable all rules that assume 64-bit registers under that flag (and replace them with a lowering in midend as suggested above). Might need a little more parameterization around things like constants but if we can pull it off, that'd be more maintainable than the duplication implied by separate riscv32/riscv64 backends, IMHO
I think that might be doable, I'm going to have a quick peek at our current rules to see which cases are incompatible, but I expect that as long as we don't ever see 128/64bit ops it might not be too bad
Quick question: are MInst variants (defined in inst.isle) meant to strictly represent instruction formats for the target ISA, or is there more to it? I was reviewing the RV64 implementation and noticed several variants using the same instruction format along with some pseudo-instructions. For example:
;; I-type Layout:
;; 0-------6-7-------11-12------14-15------19-20------------------31
;; | Opcode | rd | width | rs1 | Offset[11:0] |
;; The I-type instruction format uses one source register, one immediate, and a destination register.
(AluRRImm12
  (alu_op AluOPRRI)
  (rd WritableReg)
  (rs Reg)
  (imm12 Imm12))

;; Loads use the I-type instruction format.
;; Each load instruction in RV32I takes two operands:
;; - A destination register (e.g., rd), where the data will be loaded.
;; - A base register (e.g., rs1) and an immediate offset, which together specify the memory address to load from.
(Load
  (rd WritableReg)
  (op LoadOP)
  (flags MemFlags)
  (from AMode))

;; Uses the I-type instruction format. In non-immediate CSR instructions (CSRRW, CSRRS, CSRRC), rs1 is used to specify
;; the register with the value to write.
(CsrReg
  (op CsrRegOP)
  (rd WritableReg)
  (rs Reg)
  (csr CSR))

;; Uses the I-type instruction format. In immediate CSR instructions (CSRRWI, CSRRSI, CSRRCI), rs1 is replaced by a 5-bit
;; immediate value.
(CsrImm
  (op CsrImmOP)
  (rd WritableReg)
  (imm UImm5)
  (csr CSR))
In short, how should I read MInst (just for my understanding)?
Note: I have added my comments to the above just to highlight my point.
MInst is a bit of a mish-mash but one of the main guiding principles of it is how shapes affect register allocation. For example AluRRImm12 has a destination and source register while CsrImm doesn't (I think?).
Overall though it's sort of what works best in the ISLE code, afaik there's not a hard-and-fast rule one way or another
yes, definitely that, and also emission. Think of the base case as "every inst is separate" and then we group together instructions that are all the same except for details we can plumb through, like opcode bits
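To make the two criteria concrete (register-allocation shape and shared emission), here is a toy Rust model. It is only a sketch: the enum, the operands helper, and the i_type helper are invented for this example and do not mirror Cranelift's real MInst or its emission code. Going by the definitions quoted above, CsrImm does carry a destination register but has no source register, so its shape differs from AluRRImm12.

```rust
/// Opcode details plumbed through a single variant (toy version).
#[derive(Clone, Copy, Debug)]
enum AluOpRRI { Addi, Xori, Ori, Andi }

impl AluOpRRI {
    fn funct3(self) -> u32 {
        match self {
            AluOpRRI::Addi => 0b000,
            AluOpRRI::Xori => 0b100,
            AluOpRRI::Ori => 0b110,
            AluOpRRI::Andi => 0b111,
        }
    }
}

#[derive(Clone, Copy, Debug)]
enum CsrImmOp { Csrrwi, Csrrsi, Csrrci }

impl CsrImmOp {
    fn funct3(self) -> u32 {
        match self {
            CsrImmOp::Csrrwi => 0b101,
            CsrImmOp::Csrrsi => 0b110,
            CsrImmOp::Csrrci => 0b111,
        }
    }
}

/// Two variants that share the I-type bit layout but differ in register shape.
#[derive(Clone, Copy, Debug)]
enum Inst {
    AluRRImm12 { op: AluOpRRI, rd: u32, rs: u32, imm12: i32 },
    CsrImm { op: CsrImmOp, rd: u32, imm: u32, csr: u32 },
}

/// Generic I-type packing: imm[11:0] | rs1 field | funct3 | rd | opcode.
fn i_type(opcode: u32, funct3: u32, rd: u32, rs1_field: u32, imm12: u32) -> u32 {
    ((imm12 & 0xfff) << 20) | ((rs1_field & 0x1f) << 15) | (funct3 << 12) | ((rd & 0x1f) << 7) | opcode
}

impl Inst {
    /// (defs, uses): what a register allocator would need to know.
    fn operands(&self) -> (Vec<u32>, Vec<u32>) {
        match *self {
            Inst::AluRRImm12 { rd, rs, .. } => (vec![rd], vec![rs]),
            // The rs1 field holds an immediate here, so there is no register use.
            Inst::CsrImm { rd, .. } => (vec![rd], vec![]),
        }
    }

    fn emit(&self) -> u32 {
        match *self {
            Inst::AluRRImm12 { op, rd, rs, imm12 } => i_type(0x13, op.funct3(), rd, rs, imm12 as u32),
            Inst::CsrImm { op, rd, imm, csr } => i_type(0x73, op.funct3(), rd, imm, csr),
        }
    }
}
```

In this model addi, xori, ori, and andi all go through one variant because only funct3 differs, while CsrImm gets its own variant because its operand shape is different even though the bit layout is the same.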
Quick question: The prelude_lower.isle file contains an internal extractor named has_type. I was trying to find its corresponding Rust implementation in the generated isle-riscv64.rs file in the build-out directory but couldn’t locate it. Am I missing something?
from prelude_lower.isle
;; Extract the type of the instruction's first result and pass along the
;; instruction as well.
(spec (has_type ty arg)
  (provide (= result arg))
  (require (= ty (widthof arg))))
(decl has_type (Type Inst) Inst)
(extractor (has_type ty inst)
  (and (result_type ty)
       inst))
If I understand correctly, internal extractors don’t actually exist (in the sense that they don't generate Rust code); they simply map to external extractors. In this case, has_type maps to:
(and (result_type ty)
     inst)
where result_type itself is an internal extractor that maps to first_result, which is the actual external extractor:
;; Extract the first result value of the given instruction.
(decl first_result (Value) Inst)
(extern extractor first_result first_result)
;; Extract the type of the instruction's first result.
(decl result_type (Type) Inst)
(extractor (result_type ty)
  (first_result (value_type ty)))
Am I right? I’m still not sure, however, whether "and" is a constructor or an ISLE keyword.
For the "and" part, I believe it’s part of the ISLE language grammar, i.e. it's part of the syntax.
Did I get this right or am I way off in my understanding?
Yep, that's all correct! Internal extractors are expanded (inlined) in-place, and "and" is a language keyword. All of this results in a call to the value_type and first_result Rust function implementations
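As a mental model only, the inlined has_type match could be pictured like the toy Rust below. This is not the generated isle-riscv64.rs code; ToyCtx, its fields, and match_has_type are hypothetical stand-ins, and only the names first_result and value_type come from the discussion above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Type { I32, I64 }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Value(usize);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Inst(usize);

/// Hypothetical lowering context: result values per instruction, type per value.
struct ToyCtx {
    results: Vec<Vec<Value>>,
    types: Vec<Type>,
}

impl ToyCtx {
    /// Stand-in for the external extractor `first_result`: fails if there is no result.
    fn first_result(&self, inst: Inst) -> Option<Value> {
        self.results.get(inst.0)?.first().copied()
    }

    /// Stand-in for `value_type`.
    fn value_type(&self, value: Value) -> Type {
        self.types[value.0]
    }
}

/// What `(has_type ty inst)` amounts to once `has_type` and `result_type`
/// are inlined: both sub-patterns of `and` are matched against the same Inst.
fn match_has_type(ctx: &ToyCtx, inst: Inst) -> Option<(Type, Inst)> {
    let value = ctx.first_result(inst)?; // (first_result ...)
    let ty = ctx.value_type(value);      // (value_type ty)
    Some((ty, inst))                     // second arm of `and` binds `inst` itself
}
```

There is no separate has_type function in this picture, which is consistent with not finding one in the generated file.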
Quick question: Does it make sense to introduce a new immediate type (e.g., Imm32) specifically for 32-bit architectures? From my understanding, it seems like we could just use Imm64. I wanted to double-check if I’m overlooking anything here.
For instance, could we use Imm64 in such cases without running into any issues? The following are some examples of what I have in mind:
u64_from_imm64 -> u32_from_imm64 // Extract a u32 from an Imm64
u64_uextend_imm64 -> u32_uextend_imm64 // Zero-extend an Imm64 to a u32
Imm64 is mostly used for iconst, which should keep accepting 64bit immediates even on 32bit platforms. If you do iconst.i64 on a 32bit target it would just store both halves of the immediate in separate registers.
Sorry for the delay; I got pulled into something else.
My plan is to start with RV32I (just the base instruction set), without support for types or values larger than 32 bits (i.e., no 64-bit or higher). For this target, I assume it’s acceptable to explicitly error out if we encounter an iconst.i64, correct?
As I understand it, lowering an Iconst Opcode results in emitting an 8-byte value into an in-memory constant pool. Would it make sense to add a 32-bit variant to VCodeConstantData (like a VCodeConstantData::U32), or is that unnecessary?
P.S. Compiler engineering is quite new to me, so please let me know if I’m overlooking something obvious.
Sure, a 32-bit constant-pool entry could make sense. It might be worth looking at how other compilers handle loading arbitrary 32-bit values too: for example, is it possible to do it with two immediate-form instructions (load high bits then OR in low bits or similar)? Usually RISC ISAs try to make this fast without going to dcache via a memory load and so have a "somewhat canonical" way of loading constants
(To explore that, it might be worthwhile writing some C functions like uint32_t foo() { return 0x12345678; } and compiling with a RISC-V 32 toolchain, or using Compiler Explorer (godbolt.org, add --target riscv32-unknown-linux-gnu to the Clang or rustc command line))
You might want to look at the rules we have for the RV64 backend, the immediate loading instructions are exactly the same ones used in RV32.
This now has a lot of rules but it essentially boils down to this: a combination of addi and/or lui can produce all values up to 32 bits, and for larger stuff we use a load from a constant pool unless we find a shorter pattern.
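The lui/addi combination can be sketched as follows. This is a minimal illustration of the usual split for a 32-bit target, not the actual rules from the RV64 backend; split_hi_lo and materialize are names made up for this example. The one subtlety is that addi sign-extends its 12-bit immediate, so the upper part has to be adjusted whenever bit 11 of the constant is set.

```rust
/// Split a 32-bit constant into (hi20, lo12) such that
///   lui  rd, hi20
///   addi rd, rd, lo12
/// produces `value` in a 32-bit register. Hypothetical helper for illustration.
fn split_hi_lo(value: u32) -> (u32, i32) {
    // Sign-extend the low 12 bits, since addi treats its immediate as signed.
    let lo12 = ((value << 20) as i32) >> 20;
    // Compensate in the upper part: subtracting a negative lo12 bumps hi20 by one.
    let hi20 = value.wrapping_sub(lo12 as u32) >> 12;
    (hi20, lo12)
}

/// Model of what the two instructions compute on RV32 (32-bit wrapping arithmetic).
fn materialize(hi20: u32, lo12: i32) -> u32 {
    let after_lui = hi20 << 12;               // lui places hi20 in bits 31:12
    after_lui.wrapping_add(lo12 as u32)       // addi adds the sign-extended lo12
}
```

For 0x12345678 the split is (0x12345, 0x678). For 0x12345FFF it is (0x12346, -1), because the low 12 bits are read as a negative immediate. When hi20 is zero a single addi from the zero register is enough, and when lo12 is zero a single lui is enough.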
Chris Fallin said:
(To explore that, it might be worthwhile writing some C functions like uint32_t foo() { return 0x12345678; } and compiling with a RISC-V 32 toolchain, or using Compiler Explorer (godbolt.org, add --target riscv32-unknown-linux-gnu to the Clang or rustc command line))
thanks for this. I'll try this.
Afonso Bordado said:
You might want to look at the rules we have for the RV64 backend, the immediate loading instructions are exactly the same ones used in RV32.
This now has a lot of rules but it essentially boils down to this: a combination of addi and/or lui can produce all values up to 32 bits, and for larger stuff we use a load from a constant pool unless we find a shorter pattern.
I am primarily reusing the RV64 backend, with the main difference being that I’m starting with support for RV32I (excluding all standard extensions). My focus is on creating a bare-minimum RV32 backend to avoid the complexity of handling the entire architecture upfront and to submit a small, manageable PR.
A working implementation can be found here. It reuses the addi and lui rules (both part of the RV32I set) from the RV64 implementation. However, when it came to loading from a constant pool, I used a workaround, which I have yet to fully test.
From your explanation, it looks like the RV64 approach to loading constants might be sufficient for RV32 as well.
Last updated: Dec 23 2024 at 12:05 UTC