@Chris Fallin I'm poking around at the stack limit issue, and I don't think that cranelift has an encoding already for something like subq $8, (%rsi)
, so I'm trying to add one
as
as expected I'm lost in a jungle of encodings and recipes, and was wondering if you'd be able to help out?
I also have no idea how x86 encoding works, so that probably doesn't help
Ah, I am not as much an expert in the old system (I grokked it enough to take the useful parts and build the redesign) but I can certainly help to dig in!
One issue though is that an encoding should correspond to a single CLIF op; i.e. one can't have many-to-one matches (many CLIF ops to one machine instruction)
So the above sub-from-memory is really a load-sub-store which... would be difficult to support if it arrives in that form.
yeah that's fine, this is very raw so far so I'm basically generating raw instructions
So I think the best option would be to invent a purpose-built CLIF op just for this, and give it the right encoding
yeah that's what I've got so far
and I added a new format too for "has output condition codes" (I think)
stack checks just generate the raw instruction
what I'm stuck on is the encoding
in encodings.rs
and recipes.rs
I'm sort of just copying push/pop right now (which have x86-specific things)
but I have no idea how to fill out recipes.rs
and tbh I don't know how to encode this instruction either
I just know that gas gives me 0: 48 83 6e 04 03 subq $3, 4(%rsi)
OK, so in the recipe -- I don't fully grok why we have EvexContext
and all of that or why the input and output are FP regs? But that's aside from the actual question... I'm refilling my L1 cache wrt encodings infra now :-)
oh sorry that's just copy/pasted from above
the above recipe that is
the string should be "TODO":
Ah yes, OK, so we'll need to add the encoding bytes in opcodes.rs
; then follow for example how iadd
is tied to a recipe and opcode at line 1468: e.enc_i32_i64(inst, recipe.opcodes(&OPCODE))
the Rust-code-as-string in the recipe then needs to do ... things ... to emit the immediate following the opcode bytes
and properly embed the register number in the ModRM (?) byte
so, maybe weird question, how do I figure out the opcode?
some gas stuff gives:
0: 48 83 2e 03 subq $3, (%rsi)
4: 48 83 29 03 subq $3, (%rcx)
8: 48 83 69 03 03 subq $3, 3(%rcx)
d: 48 83 a9 2c 01 00 00 03 subq $3, 300(%rcx)
15: 48 83 a9 ff 00 00 00 03 subq $3, 255(%rcx)
1d: 48 83 69 7f 03 subq $3, 127(%rcx)
22: 83 2e 03 subl $3, (%rsi)
25: c3 retq
does that mean the opcode here is 0x83
?
(I have no idea how x86 encoding works)
according to this it claims I want REX.W + 83 /5 ib
so I guess REX.W
is 0x48, but I don't know what /5
orib
are
Oh, yes, 0x83 is the opcode there; 0x48 is the REX byte; 0x2e (in the first instruction) is the ModRM
ModRM?
the register number for the sub will be in the modrm and the high bit (try e.g. r12) will be in the REX byte
ModRM is... modifier, register, memory? Something like that; basically "operands to the x86 instruction". x86 is weird
aha ok
let me know if you still run into issues and I can try to help; I had to mess with those recipes when I first started in Cranelift and it was not easy to figure out
(the /5 above is the "modifier" I think; and ib
is "immediate byte", I am guessing)
ok yeah I';m starting to actually read the manual
"/digit — A digit between 0 and 7 indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand. The reg field contains the digit that provides an extension to the instruction's opcode."
"ib, iw, id, io — A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing bytes. The opcode determines if the operand is a signed value. All words, doublewords and quadwords are given with the low-order byte first."
oh man ok I'm still very lost
on recipes.rs
ok I'm throwing things at the wall, filecheck errors are inscrutable to me though
FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile
Caused by:
filecheck failed:
#0 check: function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
#1 nextln: ss0 = incoming_arg 16, offset -16
#2 nextln: block0(v0: i64 [%rbp]):
#3 nextln: x86_push v0
#4 nextln: copy_special %rsp -> %rbp
#5 nextln: v1 = x86_pop.i64
#6 nextln: return v1
#7 nextln: }
> function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matched #0: \bfunction %empty\(i64 fp \[%rbp\]\) \-> i64 fp \[%rbp\] fast \{
> ss0 = incoming_arg 16, offset -16
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matched #1: \bss0 = incoming_arg 16, offset \-16\b
>
Missed #2: \bblock0\(v0: i64 \[%rbp\]\):
> block0(v0: i64 [%rbp]):
> [Op1pushq#50] x86_push v0
> [RexOp1copysp#8089] copy_special %rsp -> %rbp
> [Op1popq#58,%rbp] v1 = x86_pop.i64
> [Op1ret#c3] return v1
> }
1 tests
Error: 1 failure
what's happening there?
Hmm, so this filetest isn't expecting the stackslot (ss0
) created to name the incoming arg
If this is part of your patch, you can just add it, or alter nextln
to check
to allow content between the function
line and the start of the BB
I think the ss0
is fine (matched), it may be a new line that cranelift-reader inserts before block headers (see the empty >
)
but I think the same thing applies: check: block0...
will reset the matcher so that it skips the newline
oh I remember this now
my editor strips trailing whitespace
and that's significant here...
yeah I would really like filecheck not to be trailing whitespace significant...
ok I've thrown things at the wall
and now I get
[RexOp1umr#8089,%rax] v1 = copy v0
[RexOp1sub_mem#83,%rflags] v2 = x86_sub_mem notrap aligned 184, v1
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; error: inst2: RexOp1sub_mem#83 constraints not satisfied in: v2 = x86_sub_mem.i64 notrap aligned 184, v1
function %stack_limit(i64 stack_limit [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
ss0 = explicit_slot 168, offset -184
ss1 = incoming_arg 16, offset -16
where are constraints configured here? is that also in recipes.rs?
I think the constraints originate (or at least can originate) from the operands_in
and operands_out
builder methods invoked in recipes.rs
ok cool, thanks!
those are indeed garbage values, I should think on those
Probably you want vec![gpr]
for ins and vec![]
for outs?
if I have iflags going out
is it vec![rflags]
?
ok cool making progress, now have
FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile
Caused by:
Expected code size 30, got 29
1 tests
Error: 1 failure
looks like I need to actually fill out the encoding now
ah, yeah, I think you want rflags
out for completeness (to denote that the reg is clobbered), but I don't recall exactly how the flags checker works. Though it shouldn't matter if this is only being generated explicitly in a prologue
ok so I got to the point where the test is failing for the reason I expect it to be failing, basically I need to update the clif for the new instruction
before I do that though I want to actually implement the encoding
so I'm adding a binemit
test for this?
yup
hm ok, so I added a simple one
and I get
thread 'worker #0' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/filetests/src/test_binemit.rs:143:37
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
FAIL foo.clif: panicked in worker #0: called `Option::unwrap()` on a `None` value
1 tests
Error: 1 failure
those pesky None
s... hmm
where the panic is here
I guess I need to fill that in in the backend
somewhere...
Does your patch add new stackslots?
I don't think so
(I am mostly no better than ripgrep
to help you here, though, sorry :-/ )
I just moved it to a filter_map
temporarily
nah it's ok
I will soldier on
fwiw I'll be happy to help bring this up in the new backend :-)
(which I imagine we'll need to do shortly if this becomes the normal translation for stack checks)
bah it seems binemit doesn't match the prologue epilogue
what does your foo.clif look like?
ah but yeah I will need to implement this in the new backend before landing
current the clif is
test binemit
set opt_level=speed_and_size
set is_pic
set enable_probestack=false
target x86_64 haswell
function %stack_limit(i64 stack_limit) {
; asm: xxx
; bin: 30
ss0 = explicit_slot 168
block0(v0: i64):
return
}
but it seems to want that directive to be attached to an instruction
oh I guess I can manually type in the instruction
yup, that sounds right
now I must learn to write clif...
what is the op you need to add?
I'm adding sub $imm, (%r12)
basically
I haven't even gotten to try out the encoding
trying to write up a test which gives me "your encoding is weird" so far
(hm, scanning my brain to see if I remember if (%r12)
addressing is even possible in the old backend...)
I mean syntactically... if you write an encoding for a new instruction it should be possible
yeah I'm just adding a whole new recipe/instruction
time to acquire llvm-mc...
ok so I now have:
function %foo() {
block0:
[-,%rax] v0 = iconst.i32 1
; asm: sub $123, (%rax)
[-,%rbx] v1 = x86_sub_mem 123, v0 ; bin: 30
return
}
which yields
FAIL foo.clif: binemit
Caused by:
No encodings found for: v1 = x86_sub_mem.i32 123, v0
1 tests
Error: 1 failure
if you run into issues with llvm-mc I've been using XED recently and it can disassemble machine code
what does your encodings.rs have in it?
oh that should be iconst.i64
I left it as
recipes.add_template(
Template::new(
EncodingRecipeBuilder::new("sub_mem", &formats.store_imm, 0)
.operands_in(vec![gpr])
.operands_out(vec![reg_rflags])
.emit(
r#"
{{PUT_OP}}(bits | (in_reg0 & 7), rex1(in_reg0), sink);
"#,
),
regs,
)
.rex_kind(RecipePrefixKind::AlwaysEmitRex),
);
which should be the encoding of pushing a register I think
no, I mean where you bind the CLIF to the recipe
it says e.enc_x86_64(x86_sub_mem.bind(I64), rec_x86_sub_mem.opcodes(&SUB_MEM));
like, if you didn't bind x86_sub_mem
to the right type then that can cause the "no encodings" error
so you are binding to I64
there but in foo.clif you use x86_sub_mem.i32
oops yeah, now I get a little different
I suspect one should change to match the other
No matching encodings for v1 = x86_sub_mem.i64 123, v0 in [RexOp1sub_mem#83, RexOp1sub_mem#83]
after changing to iconst.i64
Hm... weird that you are getting two of the same in that list
aha got it!
I needed the output to be %rflags
ok I think I have almost wrangled everything, thanks again for your help @Andrew Brown and @Chris Fallin !
ok so turns out this is all folly, the decrement of the stack limit isn't atomic but it's also being updated from other threads
back to the drawing board!
Last updated: Dec 23 2024 at 12:05 UTC