new recipe · cranelift · Zulip Chat Archive

Stream: cranelift

Topic: new recipe

Alex Crichton (Nov 02 2020 at 17:46):

@Chris Fallin I'm poking around at the stack limit issue, and I don't think that cranelift has an encoding already for something like subq $8, (%rsi), so I'm trying to add one

Alex Crichton (Nov 02 2020 at 17:46):

as expected I'm lost in a jungle of encodings and recipes, and was wondering if you'd be able to help out?

Alex Crichton (Nov 02 2020 at 17:46):

I also have no idea how x86 encoding works, so that probably doesn't help

Chris Fallin (Nov 02 2020 at 17:47):

Ah, I am not as much an expert in the old system (I grokked it enough to take the useful parts and build the redesign) but I can certainly help to dig in!

Chris Fallin (Nov 02 2020 at 17:47):

One issue though is that an encoding should correspond to a single CLIF op; i.e. one can't have many-to-one matches (many CLIF ops to one machine instruction)

Chris Fallin (Nov 02 2020 at 17:48):

So the above sub-from-memory is really a load-sub-store which... would be difficult to support if it arrives in that form.

Alex Crichton (Nov 02 2020 at 17:48):

yeah that's fine, this is very raw so far so I'm basically generating raw instructions

Chris Fallin (Nov 02 2020 at 17:48):

So I think the best option would be to invent a purpose-built CLIF op just for this, and give it the right encoding

Alex Crichton (Nov 02 2020 at 17:48):

yeah that's what I've got so far

wat · alexcrichton/wasmtime@3a421c3

Standalone JIT-style runtime for WebAssembly, using Cranelift - alexcrichton/wasmtime

Alex Crichton (Nov 02 2020 at 17:49):

and I added a new format too for "has output condition codes" (I think)

Alex Crichton (Nov 02 2020 at 17:49):

stack checks just generate the raw instruction

Alex Crichton (Nov 02 2020 at 17:49):

what I'm stuck on is the encoding

Alex Crichton (Nov 02 2020 at 17:49):

in encodings.rs and recipes.rs

Alex Crichton (Nov 02 2020 at 17:50):

I'm sort of just copying push/pop right now (which have x86-specific things)

Alex Crichton (Nov 02 2020 at 17:50):

but I have no idea how to fill out recipes.rs

Alex Crichton (Nov 02 2020 at 17:50):

and tbh I don't know how to encode this instruction either

Alex Crichton (Nov 02 2020 at 17:50):

I just know that gas gives me 0: 48 83 6e 04 03 subq $3, 4(%rsi)

Chris Fallin (Nov 02 2020 at 17:53):

OK, so in the recipe -- I don't fully grok why we have EvexContext and all of that or why the input and output are FP regs? But that's aside from the actual question... I'm refilling my L1 cache wrt encodings infra now :-)

Alex Crichton (Nov 02 2020 at 17:54):

oh sorry that's just copy/pasted from above

Alex Crichton (Nov 02 2020 at 17:54):

the above recipe that is

Alex Crichton (Nov 02 2020 at 17:54):

the string should be "TODO":

Chris Fallin (Nov 02 2020 at 17:55):

Ah yes, OK, so we'll need to add the encoding bytes in opcodes.rs; then follow for example how iadd is tied to a recipe and opcode at line 1468: e.enc_i32_i64(inst, recipe.opcodes(&OPCODE))

Chris Fallin (Nov 02 2020 at 17:56):

the Rust-code-as-string in the recipe then needs to do ... things ... to emit the immediate following the opcode bytes

Chris Fallin (Nov 02 2020 at 17:56):

and properly embed the register number in the ModRM (?) byte

Alex Crichton (Nov 02 2020 at 17:58):

so, maybe weird question, how do I figure out the opcode?

Alex Crichton (Nov 02 2020 at 17:58):

some gas stuff gives:

  0: 48 83 2e 03                   subq    $3, (%rsi)
  4: 48 83 29 03                   subq    $3, (%rcx)
  8: 48 83 69 03 03                subq    $3, 3(%rcx)
  d: 48 83 a9 2c 01 00 00 03       subq    $3, 300(%rcx)
 15: 48 83 a9 ff 00 00 00 03       subq    $3, 255(%rcx)
 1d: 48 83 69 7f 03                subq    $3, 127(%rcx)
 22: 83 2e 03                      subl    $3, (%rsi)
 25: c3                            retq

Alex Crichton (Nov 02 2020 at 17:58):

does that mean the opcode here is 0x83?

Alex Crichton (Nov 02 2020 at 17:58):

(I have no idea how x86 encoding works)

Alex Crichton (Nov 02 2020 at 17:59):

according to this it claims I want REX.W + 83 /5 ib

Alex Crichton (Nov 02 2020 at 17:59):

so I guess REX.W is 0x48, but I don't know what /5 or ib are

Chris Fallin (Nov 02 2020 at 18:01):

Oh, yes, 0x83 is the opcode there; 0x48 is the REX byte; 0x2e (in the first instruction) is the ModRM

Alex Crichton (Nov 02 2020 at 18:01):

ModRM?

Chris Fallin (Nov 02 2020 at 18:01):

the register number for the sub will be in the modrm and the high bit (try e.g. r12) will be in the REX byte
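
As a rough illustration of the split described here (low three bits of the register number in ModRM, high bit in REX), the REX prefix layout can be sketched in Rust. The helper names `rex_prefix` and `split_reg` are hypothetical and are not part of Cranelift; this only mirrors the x86-64 bit layout 0100WRXB.

```rust
/// Build an x86-64 REX prefix byte, laid out as 0b0100_WRXB.
/// `w` selects a 64-bit operand size; `r`, `x` and `b` carry the high
/// (fourth) bit of ModRM.reg, SIB.index and ModRM.rm / SIB.base.
fn rex_prefix(w: bool, r: bool, x: bool, b: bool) -> u8 {
    0x40 | ((w as u8) << 3) | ((r as u8) << 2) | ((x as u8) << 1) | (b as u8)
}

/// Split a hardware register number (0..=15) into the low three bits
/// that go into ModRM and the high bit that goes into REX.
fn split_reg(reg: u8) -> (u8, bool) {
    (reg & 7, reg & 8 != 0)
}
```

With these, `%rsi` (register 6) needs only REX.W, giving the 0x48 seen in the gas output, while `%r12` (register 12) splits into low bits 4 plus a set REX.B bit.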

Chris Fallin (Nov 02 2020 at 18:01):

ModRM is... modifier, register, memory? Something like that; basically "operands to the x86 instruction". x86 is weird

Alex Crichton (Nov 02 2020 at 18:02):

aha ok

Andrew Brown (Nov 02 2020 at 18:03):

let me know if you still run into issues and I can try to help; I had to mess with those recipes when I first started in Cranelift and it was not easy to figure out

Chris Fallin (Nov 02 2020 at 18:03):

(the /5 above is the "modifier" I think; and ib is "immediate byte", I am guessing)

Alex Crichton (Nov 02 2020 at 18:04):

ok yeah I'm starting to actually read the manual

Alex Crichton (Nov 02 2020 at 18:04):

"/digit — A digit between 0 and 7 indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand. The reg field contains the digit that provides an extension to the instruction's opcode."

Alex Crichton (Nov 02 2020 at 18:04):

"ib, iw, id, io — A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing bytes. The opcode determines if the operand is a signed value. All words, doublewords and quadwords are given with the low-order byte first."
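
To make the `/5` and `ib` notation concrete, here is a minimal Rust sketch that splits a ModR/M byte into its three fields. The `ModRm` struct and `decode_modrm` function are hypothetical names used only for illustration; the byte values come from the gas output quoted earlier in the thread.

```rust
/// The three fields of a ModR/M byte: mod (2 bits), reg (3 bits), rm (3 bits).
#[derive(Debug, PartialEq)]
struct ModRm {
    mode: u8, // addressing mode: 0 = no displacement, 1 = disp8, 2 = disp32, 3 = register
    reg: u8,  // register operand, or the opcode extension for "/digit" instructions
    rm: u8,   // register-or-memory operand (low three bits of the register number)
}

/// Decode a ModR/M byte into its fields.
fn decode_modrm(byte: u8) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: (byte >> 3) & 7,
        rm: byte & 7,
    }
}
```

For `48 83 2e 03` (`subq $3, (%rsi)`), the ModR/M byte 0x2e decodes to mode 0, reg 5, rm 6: the reg field holds the `/5` opcode extension that selects SUB, rm 6 is `%rsi`, and the trailing 0x03 is the `ib` immediate.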

Alex Crichton (Nov 02 2020 at 18:08):

oh man ok I'm still very lost

Alex Crichton (Nov 02 2020 at 18:08):

on recipes.rs

Alex Crichton (Nov 02 2020 at 18:20):

ok I'm throwing things at the wall, filecheck errors are inscrutable to me though

Alex Crichton (Nov 02 2020 at 18:20):

FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile

Caused by:
    filecheck failed:
    #0 check: function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
    #1 nextln: ss0 = incoming_arg 16, offset -16
    #2 nextln: block0(v0: i64 [%rbp]):
    #3 nextln: x86_push v0
    #4 nextln: copy_special %rsp -> %rbp
    #5 nextln: v1 = x86_pop.i64
    #6 nextln: return v1
    #7 nextln: }
    > function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Matched #0: \bfunction %empty\(i64 fp \[%rbp\]\) \-> i64 fp \[%rbp\] fast \{
    >     ss0 = incoming_arg 16, offset -16
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Matched #1: \bss0 = incoming_arg 16, offset \-16\b
    >
    Missed #2: \bblock0\(v0: i64 \[%rbp\]\):
    >                                 block0(v0: i64 [%rbp]):
    > [Op1pushq#50]                       x86_push v0
    > [RexOp1copysp#8089]                 copy_special %rsp -> %rbp
    > [Op1popq#58,%rbp]                   v1 = x86_pop.i64
    > [Op1ret#c3]                         return v1
    > }

1 tests
Error: 1 failure

Alex Crichton (Nov 02 2020 at 18:20):

what's happening there?

Chris Fallin (Nov 02 2020 at 18:25):

Hmm, so this filetest isn't expecting the stackslot (ss0) created to name the incoming arg

Chris Fallin (Nov 02 2020 at 18:26):

If this is part of your patch, you can just add it, or change nextln: to check: to allow content between the function line and the start of the BB

Andrew Brown (Nov 02 2020 at 18:30):

I think the ss0 is fine (matched), it may be a new line that cranelift-reader inserts before block headers (see the empty > )

Andrew Brown (Nov 02 2020 at 18:31):

but I think the same thing applies: check: block0... will reset the matcher so that it skips the newline

Alex Crichton (Nov 02 2020 at 18:33):

oh I remember this now

Alex Crichton (Nov 02 2020 at 18:33):

my editor strips trailing whitespace

Alex Crichton (Nov 02 2020 at 18:33):

and that's significant here...

fitzgen (he/him) (Nov 02 2020 at 18:39):

yeah I would really like filecheck not to be trailing whitespace significant...

Alex Crichton (Nov 02 2020 at 18:40):

ok I've thrown things at the wall

Alex Crichton (Nov 02 2020 at 18:40):

and now I get

[RexOp1umr#8089,%rax]               v1 = copy v0
[RexOp1sub_mem#83,%rflags]          v2 = x86_sub_mem notrap aligned 184, v1
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; error: inst2: RexOp1sub_mem#83 constraints not satisfied in: v2 = x86_sub_mem.i64 notrap aligned 184, v1
function %stack_limit(i64 stack_limit [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
    ss0 = explicit_slot 168, offset -184
    ss1 = incoming_arg 16, offset -16

Alex Crichton (Nov 02 2020 at 18:40):

where are constraints configured here? is that also in recipes.rs?

Chris Fallin (Nov 02 2020 at 18:41):

I think the constraints originate (or at least can originate) from the operands_in and operands_out builder methods invoked in recipes.rs

Alex Crichton (Nov 02 2020 at 18:42):

ok cool, thanks!

Alex Crichton (Nov 02 2020 at 18:42):

those are indeed garbage values, I should think on those

Chris Fallin (Nov 02 2020 at 18:42):

Probably you want vec![gpr] for ins and vec![] for outs?

Alex Crichton (Nov 02 2020 at 18:43):

if I have iflags going out

Alex Crichton (Nov 02 2020 at 18:43):

is it vec![rflags]?

Alex Crichton (Nov 02 2020 at 18:44):

ok cool making progress, now have

FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile

Caused by:
    Expected code size 30, got 29
1 tests
Error: 1 failure

looks like I need to actually fill out the encoding now

Chris Fallin (Nov 02 2020 at 18:46):

ah, yeah, I think you want rflags out for completeness (to denote that the reg is clobbered), but I don't recall exactly how the flags checker works. Though it shouldn't matter if this is only being generated explicitly in a prologue

Alex Crichton (Nov 02 2020 at 18:49):

ok so I got to the point where the test is failing for the reason I expect it to be failing, basically I need to update the clif for the new instruction

Alex Crichton (Nov 02 2020 at 18:49):

before I do that though I want to actually implement the encoding

Alex Crichton (Nov 02 2020 at 18:50):

so I'm adding a binemit test for this?

Chris Fallin (Nov 02 2020 at 18:50):

yup

Alex Crichton (Nov 02 2020 at 18:51):

hm ok, so I added a simple one

Alex Crichton (Nov 02 2020 at 18:51):

and I get

Alex Crichton (Nov 02 2020 at 18:51):

thread 'worker #0' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/filetests/src/test_binemit.rs:143:37
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
FAIL foo.clif: panicked in worker #0: called `Option::unwrap()` on a `None` value
1 tests
Error: 1 failure

Chris Fallin (Nov 02 2020 at 18:51):

those pesky Nones... hmm

Alex Crichton (Nov 02 2020 at 18:51):

where the panic is here

bytecodealliance/wasmtime

Standalone JIT-style runtime for WebAssembly, using Cranelift - bytecodealliance/wasmtime

Alex Crichton (Nov 02 2020 at 18:51):

I guess I need to fill that in in the backend

Alex Crichton (Nov 02 2020 at 18:52):

somewhere...

Chris Fallin (Nov 02 2020 at 18:53):

Does your patch add new stackslots?

Alex Crichton (Nov 02 2020 at 18:53):

I don't think so

Chris Fallin (Nov 02 2020 at 18:53):

(I am mostly no better than ripgrep to help you here, though, sorry :-/ )

Alex Crichton (Nov 02 2020 at 18:53):

I just moved it to a filter_map temporarily

Alex Crichton (Nov 02 2020 at 18:53):

nah it's ok

Alex Crichton (Nov 02 2020 at 18:53):

I will soldier on

Chris Fallin (Nov 02 2020 at 18:55):

fwiw I'll be happy to help bring this up in the new backend :-)

Chris Fallin (Nov 02 2020 at 18:55):

(which I imagine we'll need to do shortly if this becomes the normal translation for stack checks)

Alex Crichton (Nov 02 2020 at 18:56):

bah it seems binemit doesn't match the prologue epilogue

Andrew Brown (Nov 02 2020 at 18:56):

what does your foo.clif look like?

Alex Crichton (Nov 02 2020 at 18:56):

ah but yeah I will need to implement this in the new backend before landing

Alex Crichton (Nov 02 2020 at 18:57):

currently the clif is

test binemit
set opt_level=speed_and_size
set is_pic
set enable_probestack=false
target x86_64 haswell

function %stack_limit(i64 stack_limit) {
; asm: xxx
; bin: 30
    ss0 = explicit_slot 168
block0(v0: i64):
    return
}

Alex Crichton (Nov 02 2020 at 18:57):

but it seems to want that directive to be attached to an instruction

Alex Crichton (Nov 02 2020 at 18:57):

oh I guess I can manually type in the instruction

Andrew Brown (Nov 02 2020 at 18:58):

yup, that sounds right

Alex Crichton (Nov 02 2020 at 18:58):

now I must learn to write clif...

Andrew Brown (Nov 02 2020 at 18:58):

what is the op you need to add?

Alex Crichton (Nov 02 2020 at 18:59):

I'm adding sub $imm, (%r12) basically

Alex Crichton (Nov 02 2020 at 18:59):

I haven't even gotten to try out the encoding

Alex Crichton (Nov 02 2020 at 18:59):

trying to write up a test which gives me "your encoding is weird" so far

Andrew Brown (Nov 02 2020 at 19:01):

(hm, scanning my brain to see if I remember if (%r12) addressing is even possible in the old backend...)

Andrew Brown (Nov 02 2020 at 19:01):

I mean syntactically... if you write an encoding for a new instruction it should be possible
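
On the question of `(%r12)` addressing: in the x86-64 encoding, a base register whose low three bits are 100 (`%rsp`, `%r12`) needs a SIB byte, and one whose low bits are 101 (`%rbp`, `%r13`) cannot use the no-displacement mode. The following Rust function is a minimal sketch of a byte-level encoder for `subq $imm8, disp(%base)` that handles both cases. It is not the Cranelift recipe, and the name `encode_sub_imm8_mem` is hypothetical.

```rust
/// Encode `subq $imm, disp(%base)` using the form REX.W + 83 /5 ib.
/// `base` is a hardware register number 0..=15
/// (rax=0, rcx=1, rsp=4, rbp=5, rsi=6, r12=12, r13=13).
fn encode_sub_imm8_mem(base: u8, disp: i32, imm: i8) -> Vec<u8> {
    assert!(base < 16, "register number out of range");
    let low = base & 7;
    let mut out = Vec::new();

    // REX.W, plus REX.B when the base register is r8..r15.
    out.push(0x48 | (base >> 3));
    // Opcode: group-1 arithmetic with a sign-extended 8-bit immediate.
    out.push(0x83);

    // Pick the addressing mode. mod=00 with rm=101 does not mean "(%rbp)",
    // so rbp/r13 fall through to mod=01 with an explicit zero disp8.
    let mode: u8 = if disp == 0 && low != 5 {
        0b00
    } else if (-128..=127).contains(&disp) {
        0b01
    } else {
        0b10
    };

    // ModR/M: the reg field carries the /5 opcode extension for SUB.
    out.push((mode << 6) | (5 << 3) | low);

    // rm=100 means "SIB byte follows"; 0x24 encodes base=100 with no index.
    if low == 4 {
        out.push(0x24);
    }

    // Displacement, little-endian, sized according to the mode.
    match mode {
        0b01 => out.push(disp as i8 as u8),
        0b10 => out.extend_from_slice(&disp.to_le_bytes()),
        _ => {}
    }

    // The ib immediate comes last.
    out.push(imm as u8);
    out
}
```

Called as `encode_sub_imm8_mem(1, 300, 3)` this reproduces the `48 83 a9 2c 01 00 00 03` line from the gas listing above, and `encode_sub_imm8_mem(12, 0, 3)` produces `49 83 2c 24 03` for `subq $3, (%r12)`.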

Alex Crichton (Nov 02 2020 at 19:02):

yeah I'm just adding a whole new recipe/instruction

Alex Crichton (Nov 02 2020 at 19:02):

time to acquire llvm-mc...

Alex Crichton (Nov 02 2020 at 19:04):

ok so I now have:

function %foo() {
block0:
    [-,%rax]            v0 = iconst.i32 1
    ; asm: sub $123, (%rax)
    [-,%rbx]            v1 = x86_sub_mem 123, v0  ; bin: 30
    return
}

which yields

FAIL foo.clif: binemit

Caused by:
    No encodings found for: v1 = x86_sub_mem.i32 123, v0
1 tests
Error: 1 failure

Andrew Brown (Nov 02 2020 at 19:05):

if you run into issues with llvm-mc I've been using XED recently and it can disassemble machine code

Andrew Brown (Nov 02 2020 at 19:05):

what does your encodings.rs have in it?

Alex Crichton (Nov 02 2020 at 19:05):

oh that should be iconst.i64

Alex Crichton (Nov 02 2020 at 19:05):

I left it as

recipes.add_template(
    Template::new(
        EncodingRecipeBuilder::new("sub_mem", &formats.store_imm, 0)
            .operands_in(vec![gpr])
            .operands_out(vec![reg_rflags])
            .emit(
                r#"
                    {{PUT_OP}}(bits | (in_reg0 & 7), rex1(in_reg0), sink);
                "#,
            ),
        regs,
    )
    .rex_kind(RecipePrefixKind::AlwaysEmitRex),
);

Alex Crichton (Nov 02 2020 at 19:05):

which should be the encoding of pushing a register I think

Andrew Brown (Nov 02 2020 at 19:06):

no, I mean where you bind the CLIF to the recipe

Alex Crichton (Nov 02 2020 at 19:06):

it says e.enc_x86_64(x86_sub_mem.bind(I64), rec_x86_sub_mem.opcodes(&SUB_MEM));

Andrew Brown (Nov 02 2020 at 19:06):

like, if you didn't bind x86_sub_mem to the right type then that can cause the "no encodings" error

Andrew Brown (Nov 02 2020 at 19:07):

so you are binding to I64 there but in foo.clif you use x86_sub_mem.i32

Alex Crichton (Nov 02 2020 at 19:07):

oops yeah, now I get something a little different

Andrew Brown (Nov 02 2020 at 19:07):

I suspect one should change to match the other

Alex Crichton (Nov 02 2020 at 19:07):

No matching encodings for v1 = x86_sub_mem.i64 123, v0 in [RexOp1sub_mem#83, RexOp1sub_mem#83]

Alex Crichton (Nov 02 2020 at 19:07):

after changing to iconst.i64

Andrew Brown (Nov 02 2020 at 19:10):

Hm... weird that you are getting two of the same in that list

Alex Crichton (Nov 02 2020 at 19:10):

aha got it!

Alex Crichton (Nov 02 2020 at 19:10):

I needed the output to be %rflags

Alex Crichton (Nov 02 2020 at 19:58):

ok I think I have almost wrangled everything, thanks again for your help @Andrew Brown and @Chris Fallin !

Alex Crichton (Nov 02 2020 at 23:06):

ok so turns out this is all folly, the decrement of the stack limit isn't atomic but it's also being updated from other threads
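
The underlying problem is that a plain `sub` on memory is a read-modify-write that is not atomic with respect to other cores unless it carries a `lock` prefix. The Rust sketch below illustrates the hazard in general terms; it is an assumption-laden illustration, not how wasmtime manages its stack limit, and the function names and values are hypothetical. `lost_update` replays one bad interleaving deterministically, and `atomic_decrements` shows the same kind of update done with an atomic read-modify-write.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::thread;

/// Replay a non-atomic load/sub/store that interleaves with a store from
/// "another thread". Returns the final value of the limit.
fn lost_update() -> i64 {
    let limit = AtomicI64::new(1000);
    let seen = limit.load(Ordering::SeqCst); // thread A: load the limit
    limit.store(500, Ordering::SeqCst); // thread B: installs a new limit
    limit.store(seen - 8, Ordering::SeqCst); // thread A: writes back stale value - 8
    limit.load(Ordering::SeqCst) // thread B's update has been overwritten
}

/// Decrement a shared counter from several threads using an atomic
/// read-modify-write, so no decrement is lost.
fn atomic_decrements(threads: usize, per_thread: usize) -> i64 {
    let limit = Arc::new(AtomicI64::new(1_000_000));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let limit = Arc::clone(&limit);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    limit.fetch_sub(8, Ordering::SeqCst);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    limit.load(Ordering::SeqCst)
}
```

In `lost_update` the final value is 992 rather than anything derived from 500, which is the kind of overwritten update that makes a non-atomic in-memory decrement unsafe when other threads also write the limit.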

Alex Crichton (Nov 02 2020 at 23:06):

back to the drawing board!

Last updated: Apr 09 2025 at 00:13 UTC