Stream: cranelift

Topic: new recipe


view this post on Zulip Alex Crichton (Nov 02 2020 at 17:46):

@Chris Fallin I'm poking around at the stack limit issue, and I don't think that cranelift has an encoding already for something like subq $8, (%rsi), so I'm trying to add one

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:46):

as

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:46):

as expected I'm lost in a jungle of encodings and recipes, and was wondering if you'd be able to help out?

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:46):

I also have no idea how x86 encoding works, so that probably doesn't help

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:47):

Ah, I am not as much an expert in the old system (I grokked it enough to take the useful parts and build the redesign) but I can certainly help to dig in!

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:47):

One issue though is that an encoding should correspond to a single CLIF op; i.e. one can't have many-to-one matches (many CLIF ops to one machine instruction)

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:48):

So the above sub-from-memory is really a load-sub-store which... would be difficult to support if it arrives in that form.

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:48):

yeah that's fine, this is very raw so far so I'm basically generating raw instructions

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:48):

So I think the best option would be to invent a purpose-built CLIF op just for this, and give it the right encoding

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:48):

yeah that's what I've got so far

Standalone JIT-style runtime for WebAsssembly, using Cranelift - alexcrichton/wasmtime

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:49):

and I added a new format too for "has output condition codes" (I think)

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:49):

stack checks just generate the raw instruction

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:49):

what I'm stuck on is the encoding

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:49):

in encodings.rs and recipes.rs

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:50):

I'm sort of just copying push/pop right now (which have x86-specific things)

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:50):

but I have no idea how to fill out recipes.rs

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:50):

and tbh I don't know how to encode this instruction either

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:50):

I just know that gas gives me 0: 48 83 6e 04 03 subq $3, 4(%rsi)

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:53):

OK, so in the recipe -- I don't fully grok why we have EvexContext and all of that or why the input and output are FP regs? But that's aside from the actual question... I'm refilling my L1 cache wrt encodings infra now :-)

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:54):

oh sorry that's just copy/pasted from above

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:54):

the above recipe that is

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:54):

the string should be "TODO":

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:55):

Ah yes, OK, so we'll need to add the encoding bytes in opcodes.rs; then follow for example how iadd is tied to a recipe and opcode at line 1468: e.enc_i32_i64(inst, recipe.opcodes(&OPCODE))

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:56):

the Rust-code-as-string in the recipe then needs to do ... things ... to emit the immediate following the opcode bytes

view this post on Zulip Chris Fallin (Nov 02 2020 at 17:56):

and properly embed the register number in the ModRM (?) byte

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:58):

so, maybe weird question, how do I figure out the opcode?

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:58):

some gas stuff gives:

  0: 48 83 2e 03                   subq    $3, (%rsi)
  4: 48 83 29 03                   subq    $3, (%rcx)
  8: 48 83 69 03 03                subq    $3, 3(%rcx)
  d: 48 83 a9 2c 01 00 00 03       subq    $3, 300(%rcx)
 15: 48 83 a9 ff 00 00 00 03       subq    $3, 255(%rcx)
 1d: 48 83 69 7f 03                subq    $3, 127(%rcx)
 22: 83 2e 03                      subl    $3, (%rsi)
 25: c3                            retq

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:58):

does that mean the opcode here is 0x83?

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:58):

(I have no idea how x86 encoding works)

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:59):

according to this it claims I want REX.W + 83 /5 ib

view this post on Zulip Alex Crichton (Nov 02 2020 at 17:59):

so I guess REX.W is 0x48, but I don't know what /5 or ib are

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:01):

Oh, yes, 0x83 is the opcode there; 0x48 is the REX byte; 0x2e (in the first instruction) is the ModRM

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:01):

ModRM?

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:01):

the register number for the sub will be in the modrm and the high bit (try e.g. r12) will be in the REX byte

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:01):

ModRM is... modifier, register, memory? Something like that; basically "operands to the x86 instruction". x86 is weird
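
For reference, the ModRM byte packs three fields: a 2-bit mod (addressing mode), a 3-bit reg, and a 3-bit r/m. Below is a minimal Rust sketch that pulls those fields out of a byte such as the 0x2e from the first gas line. The name `split_modrm` is made up for illustration and is not part of Cranelift.

```rust
/// Split a ModRM byte into its (mod, reg, rm) fields.
/// Hypothetical helper for illustration only.
fn split_modrm(byte: u8) -> (u8, u8, u8) {
    let mode = byte >> 6;          // top two bits: addressing mode
    let reg = (byte >> 3) & 0b111; // middle three bits: register or opcode extension
    let rm = byte & 0b111;         // low three bits: register or memory base
    (mode, reg, rm)
}
```

Applied to 0x2e this gives mod 0 (register-indirect, no displacement), reg 5, and r/m 6, where 6 is the hardware number of %rsi. For 0x69 (the `3(%rcx)` line) the mod field becomes 1, meaning an 8-bit displacement follows.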

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:02):

aha ok

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:03):

let me know if you still run into issues and I can try to help; I had to mess with those recipes when I first started in Cranelift and it was not easy to figure out

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:03):

(the /5 above is the "modifier" I think; and ib is "immediate byte", I am guessing)
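
Putting the pieces of `REX.W + 83 /5 ib` together, here is a rough standalone Rust sketch that reproduces the bytes gas printed above. It is not Cranelift recipe code: `encode_subq_imm8_mem` is a hypothetical name, and it only handles a plain base register plus displacement (no index register, no RIP-relative addressing).

```rust
/// Encode `subq $imm, disp(%base)`, where `base` is the hardware register
/// number (0 = rax, 1 = rcx, ..., 6 = rsi, ..., 12 = r12).
/// Hypothetical helper for illustration only.
fn encode_subq_imm8_mem(base: u8, disp: i32, imm: i8) -> Vec<u8> {
    assert!(base < 16, "x86-64 has 16 general-purpose registers");
    let mut out = Vec::new();

    // REX prefix is 0100WRXB: W=1 selects 64-bit operands, B carries
    // bit 3 of the base register (set for r8..r15).
    out.push(0x48 | (base >> 3));
    // Opcode 0x83: group-1 ALU operation with a sign-extended 8-bit immediate.
    out.push(0x83);

    let rm = base & 7;
    // mod: 00 = no displacement, 01 = disp8, 10 = disp32.
    // r/m = 5 with mod = 00 does not mean (%rbp), so rbp/r13 need a disp8 of 0.
    let mode: u8 = if disp == 0 && rm != 5 {
        0b00
    } else if (-128..=127).contains(&disp) {
        0b01
    } else {
        0b10
    };
    // ModRM = mod | reg | r/m; the reg field holds the /5 that selects SUB.
    out.push((mode << 6) | (5 << 3) | rm);
    // r/m = 4 means a SIB byte follows; 0x24 is "no index, base = rsp/r12".
    if rm == 4 {
        out.push(0x24);
    }
    match mode {
        0b01 => out.push(disp as i8 as u8),
        0b10 => out.extend_from_slice(&disp.to_le_bytes()),
        _ => {}
    }
    // ib: the one-byte immediate comes last.
    out.push(imm as u8);
    out
}
```

With `base = 6` (%rsi), `disp = 0`, `imm = 3` this yields `48 83 2e 03`, matching the first gas line; `base = 1`, `disp = 300` yields `48 83 a9 2c 01 00 00 03`. Passing `base = 12` (%r12) flips the REX byte to 0x49 and adds the SIB byte.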

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:04):

ok yeah I'm starting to actually read the manual

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:04):

"/digit — A digit between 0 and 7 indicates that the ModR/M byte of the instruction uses only the r/m (register or memory) operand. The reg field contains the digit that provides an extension to the instruction's opcode."

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:04):

"ib, iw, id, io — A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing bytes. The opcode determines if the operand is a signed value. All words, doublewords and quadwords are given with the low-order byte first."

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:08):

oh man ok I'm still very lost

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:08):

on recipes.rs

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:20):

ok I'm throwing things at the wall, filecheck errors are inscrutable to me though

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:20):

FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile

Caused by:
    filecheck failed:
    #0 check: function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
    #1 nextln: ss0 = incoming_arg 16, offset -16
    #2 nextln: block0(v0: i64 [%rbp]):
    #3 nextln: x86_push v0
    #4 nextln: copy_special %rsp -> %rbp
    #5 nextln: v1 = x86_pop.i64
    #6 nextln: return v1
    #7 nextln: }
    > function %empty(i64 fp [%rbp]) -> i64 fp [%rbp] fast {
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Matched #0: \bfunction %empty\(i64 fp \[%rbp\]\) \-> i64 fp \[%rbp\] fast \{
    >     ss0 = incoming_arg 16, offset -16
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Matched #1: \bss0 = incoming_arg 16, offset \-16\b
    >
    Missed #2: \bblock0\(v0: i64 \[%rbp\]\):
    >                                 block0(v0: i64 [%rbp]):
    > [Op1pushq#50]                       x86_push v0
    > [RexOp1copysp#8089]                 copy_special %rsp -> %rbp
    > [Op1popq#58,%rbp]                   v1 = x86_pop.i64
    > [Op1ret#c3]                         return v1
    > }

1 tests
Error: 1 failure

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:20):

what's happening there?

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:25):

Hmm, so this filetest isn't expecting the stackslot (ss0) created to name the incoming arg

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:26):

If this is part of your patch, you can just add it to the expected output, or change the nextln directive to check so that content is allowed between the function line and the start of the BB

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:30):

I think the ss0 is fine (matched), it may be a new line that cranelift-reader inserts before block headers (see the empty > )

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:31):

but I think the same thing applies: check: block0... will reset the matcher so that it skips the newline
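
To make the `check:` versus `nextln:` distinction concrete, here is a toy Rust model of the two directives. It is a simplification under stated assumptions, not the real filecheck implementation: patterns are plain substrings rather than regexes, and `Directive` and `first_miss` are invented names.

```rust
/// Toy stand-ins for two filecheck directives (hypothetical types).
#[derive(Debug, Clone, Copy)]
enum Directive<'a> {
    /// Matches the first line at or after the current position.
    Check(&'a str),
    /// Must match the line immediately after the previous match.
    NextLn(&'a str),
}

/// Returns the index of the first directive that fails to match,
/// or `None` if every directive matched.
fn first_miss(directives: &[Directive<'_>], lines: &[&str]) -> Option<usize> {
    let mut next = 0usize; // index of the first line not yet consumed
    for (i, d) in directives.iter().enumerate() {
        match *d {
            Directive::Check(pat) => {
                // Scan forward; any lines in between are skipped.
                match lines[next..].iter().position(|l| l.contains(pat)) {
                    Some(offset) => next += offset + 1,
                    None => return Some(i),
                }
            }
            Directive::NextLn(pat) => {
                // No skipping allowed: only the very next line may match.
                if next < lines.len() && lines[next].contains(pat) {
                    next += 1;
                } else {
                    return Some(i);
                }
            }
        }
    }
    None
}
```

Fed the output from the failure above (function line, `ss0` line, an empty line, then `block0(...)`), three directives `Check`, `NextLn`, `NextLn` miss at index 2 because the empty line sits where `block0` was expected. Turning the last one into `Check` lets the matcher skip past it.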

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:33):

oh I remember this now

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:33):

my editor strips trailing whitespace

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:33):

and that's significant here...

view this post on Zulip fitzgen (he/him) (Nov 02 2020 at 18:39):

yeah I would really like filecheck not to be trailing whitespace significant...

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:40):

ok I've thrown things at the wall

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:40):

and now I get

[RexOp1umr#8089,%rax]               v1 = copy v0
[RexOp1sub_mem#83,%rflags]          v2 = x86_sub_mem notrap aligned 184, v1
;~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
; error: inst2: RexOp1sub_mem#83 constraints not satisfied in: v2 = x86_sub_mem.i64 notrap aligned 184, v1
function %stack_limit(i64 stack_limit [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
    ss0 = explicit_slot 168, offset -184
    ss1 = incoming_arg 16, offset -16

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:40):

where are constraints configured here? is that also in recipes.rs?

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:41):

I think the constraints originate (or at least can originate) from the operands_in and operands_out builder methods invoked in recipes.rs

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:42):

ok cool, thanks!

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:42):

those are indeed garbage values, I should think on those

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:42):

Probably you want vec![gpr] for ins and vec![] for outs?

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:43):

if I have iflags going out

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:43):

is it vec![rflags]?

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:44):

ok cool making progress, now have

FAIL filetests/filetests/isa/x86/prologue-epilogue.clif: compile

Caused by:
    Expected code size 30, got 29
1 tests
Error: 1 failure

looks like I need to actually fill out the encoding now

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:46):

ah, yeah, I think you want rflags out for completeness (to denote that the reg is clobbered), but I don't recall exactly how the flags checker works. Though it shouldn't matter if this is only being generated explicitly in a prologue

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:49):

ok so I got to the point where the test is failing for the reason I expect it to be failing, basically I need to update the clif for the new instruction

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:49):

before I do that though I want to actually implement the encoding

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:50):

so I'm adding a binemit test for this?

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:50):

yup

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:51):

hm ok, so I added a simple one

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:51):

and I get

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:51):

thread 'worker #0' panicked at 'called `Option::unwrap()` on a `None` value', cranelift/filetests/src/test_binemit.rs:143:37
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
FAIL foo.clif: panicked in worker #0: called `Option::unwrap()` on a `None` value
1 tests
Error: 1 failure

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:51):

those pesky Nones... hmm

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:51):

where the panic is here

Standalone JIT-style runtime for WebAssembly, using Cranelift - bytecodealliance/wasmtime

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:51):

I guess I need to fill that in in the backend

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:52):

somewhere...

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:53):

Does your patch add new stackslots?

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:53):

I don't think so

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:53):

(I am mostly no better than ripgrep to help you here, though, sorry :-/ )

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:53):

I just moved it to a filter_map temporarily

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:53):

nah it's ok

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:53):

I will soldier on

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:55):

fwiw I'll be happy to help bring this up in the new backend :-)

view this post on Zulip Chris Fallin (Nov 02 2020 at 18:55):

(which I imagine we'll need to do shortly if this becomes the normal translation for stack checks)

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:56):

bah it seems binemit doesn't match the prologue epilogue

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:56):

what does your foo.clif look like?

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:56):

ah but yeah I will need to implement this in the new backend before landing

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:57):

currently the clif is

test binemit
set opt_level=speed_and_size
set is_pic
set enable_probestack=false
target x86_64 haswell

function %stack_limit(i64 stack_limit) {
; asm: xxx
; bin: 30
    ss0 = explicit_slot 168
block0(v0: i64):
    return
}

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:57):

but it seems to want that directive to be attached to an instruction

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:57):

oh I guess I can manually type in the instruction

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:58):

yup, that sounds right

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:58):

now I must learn to write clif...

view this post on Zulip Andrew Brown (Nov 02 2020 at 18:58):

what is the op you need to add?

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:59):

I'm adding sub $imm, (%r12) basically

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:59):

I haven't even gotten to try out the encoding

view this post on Zulip Alex Crichton (Nov 02 2020 at 18:59):

trying to write up a test which gives me "your encoding is weird" so far

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:01):

(hm, scanning my brain to see if I remember if (%r12) addressing is even possible in the old backend...)

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:01):

I mean syntactically... if you write an encoding for a new instruction it should be possible

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:02):

yeah I'm just adding a whole new recipe/instruction

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:02):

time to acquire llvm-mc...

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:04):

ok so I now have:

function %foo() {
block0:
    [-,%rax]            v0 = iconst.i32 1
    ; asm: sub $123, (%rax)
    [-,%rbx]            v1 = x86_sub_mem 123, v0  ; bin: 30
    return
}

which yields

FAIL foo.clif: binemit

Caused by:
    No encodings found for: v1 = x86_sub_mem.i32 123, v0
1 tests
Error: 1 failure

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:05):

if you run into issues with llvm-mc I've been using XED recently and it can disassemble machine code

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:05):

what does your encodings.rs have in it?

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:05):

oh that should be iconst.i64

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:05):

I left it as

recipes.add_template(
    Template::new(
        EncodingRecipeBuilder::new("sub_mem", &formats.store_imm, 0)
            .operands_in(vec![gpr])
            .operands_out(vec![reg_rflags])
            .emit(
                r#"
                    {{PUT_OP}}(bits | (in_reg0 & 7), rex1(in_reg0), sink);
                "#,
            ),
        regs,
    )
    .rex_kind(RecipePrefixKind::AlwaysEmitRex),
);

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:05):

which should be the encoding of pushing a register I think

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:06):

no, I mean where you bind the CLIF to the recipe

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:06):

it says e.enc_x86_64(x86_sub_mem.bind(I64), rec_x86_sub_mem.opcodes(&SUB_MEM));

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:06):

like, if you didn't bind x86_sub_mem to the right type then that can cause the "no encodings" error

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:07):

so you are binding to I64 there but in foo.clif you use x86_sub_mem.i32

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:07):

oops yeah, now I get something a little different

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:07):

I suspect one should change to match the other

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:07):

No matching encodings for v1 = x86_sub_mem.i64 123, v0 in [RexOp1sub_mem#83, RexOp1sub_mem#83]

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:07):

after changing to iconst.i64

view this post on Zulip Andrew Brown (Nov 02 2020 at 19:10):

Hm... weird that you are getting two of the same in that list

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:10):

aha got it!

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:10):

I needed the output to be %rflags

view this post on Zulip Alex Crichton (Nov 02 2020 at 19:58):

ok I think I have almost wrangled everything, thanks again for your help @Andrew Brown and @Chris Fallin !

view this post on Zulip Alex Crichton (Nov 02 2020 at 23:06):

ok so turns out this is all folly, the decrement of the stack limit isn't atomic but it's also being updated from other threads
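
A plain `sub $imm, (mem)` without a `lock` prefix is a load, a subtract, and a store as far as other cores are concerned, so a write from another thread can land in the middle and then be overwritten. The Rust sketch below replays one such interleaving by hand on a single thread so the outcome is deterministic. The function names and the idea that the other thread stores a fixed value are assumptions for illustration; the log does not say what the other threads write.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Replays an unlucky interleaving of a non-atomic decrement:
/// the other thread's write lands between our load and our store.
/// Returns the final value. Hypothetical helper for illustration only.
fn racy_decrement_interleaving(limit: &AtomicI64, delta: i64, other_write: i64) -> i64 {
    let seen = limit.load(Ordering::SeqCst);      // our load
    limit.store(other_write, Ordering::SeqCst);   // other thread's update
    limit.store(seen - delta, Ordering::SeqCst);  // our store clobbers it
    limit.load(Ordering::SeqCst)
}

/// With an atomic read-modify-write the other thread's write can only land
/// before or after the decrement, never inside it. This replays the
/// "before" case. Hypothetical helper for illustration only.
fn atomic_decrement_interleaving(limit: &AtomicI64, delta: i64, other_write: i64) -> i64 {
    limit.store(other_write, Ordering::SeqCst);   // other thread's update
    limit.fetch_sub(delta, Ordering::SeqCst);     // indivisible load-sub-store
    limit.load(Ordering::SeqCst)
}
```

Starting from 1000 with a decrement of 8 and another thread writing 0, the racy replay ends at 992, as if the other write never happened. The atomic replay ends at -8, so the other thread's update is still reflected in the result.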

view this post on Zulip Alex Crichton (Nov 02 2020 at 23:06):

back to the drawing board!


Last updated: Jan 24 2025 at 00:11 UTC