Stream: wasmtime

Topic: table.set and dtors


view this post on Zulip Alex Crichton (Jun 25 2020 at 20:55):

@fitzgen (he/him) hm I'm not entirely sure what you mean by RefCell and deferred destruction?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:55):

TableElements has a RefCell<Vec<VMExternRef>>

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:56):

and deferred destruction would be just saving all the ref_count == 0 references in a side table, and running their dtors at a known-safe point in time

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:56):

I like your (more obvious) idea of just swapping it out of the table first. don't know why I didn't think of that >.<

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:56):

although it is one more basic block. I don't think thats a big deal tho

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:56):

ah ok

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:57):

to be clear, this is what I'm thinking -- https://gist.github.com/alexcrichton/fe9a5aff25cdcf943234f433babbb55e

GitHub Gist: instantly share code, notes, and snippets.

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:57):

using your pseudocode

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:57):

if that makes sense

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:57):

so we shouldn't need deferred destruction or need to worry about ref cells and such

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:57):

ah nice, yeah that doesn't need a new basic block either

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:57):

also, out of curiosity, does cranelift still do the ebb thing?

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:58):

where you can have multiple exits from a block?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:58):

not anymore

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:58):

b/c that could simplify the blocks in a few places I think

view this post on Zulip Alex Crichton (Jun 25 2020 at 20:58):

ah ok, that makes more sense

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:59):

fwiw, I was adding fallthroughs manually, instead of jumps, but

  1. sometimes blocks get shuffled around, and these would break
  2. there is eventually a pass that turns jumps to the next block into a fallthrough, and it works well

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 20:59):

so I think we don't need to worry much about "lots of little blocks with many jumps between them" as long as we insert them into the layout in the right order

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:01):

makes sense

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:01):

I should learn more about what optimizations cranelift does

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:01):

@fitzgen (he/him) so the frob-the-refcount, does that turn into one add instruction on x86 with addressing modes?

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:01):

is that an optimization that cranelift does?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:02):

it should be, but I'm not sure it does

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:02):

ah ok, wasn't sure if we needed to represent that in IR or not

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:02):

I should dig in at some point in the future, just not now lol

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

@fitzgen (he/him) so here's another question, I don't know how stackmaps work, so I'm unsure. So the gist I sent, what happens if the dtor calls Store::gc?

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

we loaded the pointer to drop from the table

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

which presumably has the r64 type

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

but we're dropping it so it shouldn't be traced

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

even though it's "live" as a function argument

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:04):

because it's a function argument does that mean it's not included in the stack map?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:06):

with the modified barrier pseudocode?

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:06):

yeah

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:06):

so like if cranelift sees function_call(some_r64_type)

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:07):

does the stack map at that function call include some_r64_type?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:07):

it would, yes

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:08):

hm that may also be problematic

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:08):

this may mean we have to do deffered destruction

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:08):

and means we'll need to cast to i64 to call the drop intrinsic I think?

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:08):

or could we load the pointer as i64 instead?

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:08):

perhaps

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:08):

this may also be an issue for table.get where we're manually frobbing the reference count

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:09):

not sure I follow

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:10):

I sort of forget how the reference counts work out during gc

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:10):

er wait

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:10):

nevermind

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:10):

I'm worried about call activations_table_insert_with_gc(elem) but that's guaranteed to not call user code

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:11):

so we're ok there, it's just call drop_externref_in_place(current_elem) I'm worried about in terms of stack maps

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:11):

er, no, scratch that too, activations_table_insert_with_gc will do a GC

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:12):

it's at least something to consider, what happens when we gc during that call and we insert it from the stack plus manually

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:12):

and same thing for the drop, what happens if we gc while dropping, making sure we don't accidentally try to keep it in the activations table

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:13):

we should be fine with activations_table_insert_with_gc, since both the reference will be in the stack maps, and the first thing we do is in the impl is VMExternRef::clone_from_raw

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:13):

but yeah, asking these questions is great, and has already found a few bugs (thanks!)

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:13):

the wonders of code review :P

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:14):

Alex Crichton said:

fitzgen (he/him) so the frob-the-refcount, does that turn into one add instruction on x86 with addressing modes?

on x86 we would want to turn this into an xadd right?

I don't see any mention of xadd in the codebase

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:14):

xadd I thought was atomic_add

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:14):

and I don't remember it from when I ported simple_preopt to peepmatic

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:15):

its atomic if it is prefixed with lock

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:15):

I'm thinking of addl $1, (%rdi)

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:15):

https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=e8f2c6373f89f7cf087e973544b598f3

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:16):

hmm....

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:18):

I think you're right about activations_table_insert_with_gc as well, that'll clone_from_raw the pointer it reads from the stack and shove it in a set. The intrinsic also clone_from_raws, however, and since the ownership is in the original table anyway I think that all works out just fine

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:21):

<removed>

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:21):

wait I forgot to enable opts

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:21):

$ cat ~/scratch/add.clif; cargo run --manifest-path cranelift/Cargo.toml -- compile --target x86_64 -dp ~/scratch/add.clif -D --set opt_level=speed
function %add(i64) {
    block0(v0: i64):
        v1 = load.i32 aligned notrap v0
        v2 = iadd_imm v1, 1
        store aligned notrap v2, v1
        return
}
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/clif-util compile --target x86_64 -dp /home/fitzgen/scratch/add.clif -D --set opt_level=speed`
function %add(i64 [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
    ss0 = incoming_arg 16, offset -16

                                block0(v0: i64 [%rdi], v4: i64 [%rbp]):
[RexOp1pushq#50]                    x86_push v4
[RexOp1copysp#8089]                 copy_special %rsp -> %rbp
[RexOp1ld#8b,%rax]                  v1 = load.i32 notrap aligned v0
[DynRexOp1umr#89,%rcx]              v3 = copy v1
[DynRexOp1r_ib#83,%rcx]             v2 = iadd_imm v3, 1
[RexOp1st#89]                       store notrap aligned v2, v1
[RexOp1popq#58,%rbp]                v5 = x86_pop.i64
[Op1ret#c3]                         return v5
}

.byte 64, 85, 72, 137, 229, 64, 139, 7, 137, 193, 131, 193, 1, 64, 137, 8, 64, 93, 195

Disassembly of 19 bytes:
   0:   40 55                   push    rbp
   2:   48 89 e5                mov rbp, rsp
   5:   40 8b 07                mov eax, dword ptr [rdi]
   8:   89 c1                   mov ecx, eax
   a:   83 c1 01                add ecx, 1
   d:   40 89 08                mov dword ptr [rax], ecx
  10:   40 5d                   pop rbp
  12:   c3                      ret


trap: stk_ovf at 0

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:22):

@Alex Crichton looks like cranelift can't do it

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:23):

aww :(

view this post on Zulip Alex Crichton (Jun 25 2020 at 21:23):

it'd be neat if we could easily look at the asm for wasm modules too at some point

view this post on Zulip fitzgen (he/him) (Jun 25 2020 at 21:28):

filed https://github.com/bytecodealliance/wasmtime/issues/1925

Right now, this function %add(i64) { block0(v0: i64): v1 = load.i32 aligned notrap v0 v2 = iadd_imm v1, 1 store aligned notrap v2, v1 return } compiles down into Disassembly of 19 bytes: 0: 40 55 p...

view this post on Zulip Yury Delendik (Jun 25 2020 at 22:36):

Alex Crichton said:

it'd be neat if we could easily look at the asm for wasm modules too at some point

Will wasmtime wasm2obj .. + objdump -d ... work?

view this post on Zulip Alex Crichton (Jun 26 2020 at 14:03):

@Yury Delendik I haven't used that too too much but it probably works just fine, that's a good poitn!

view this post on Zulip Yury Delendik (Jun 26 2020 at 14:05):

I'm trying to move compiler to emit ELF image directly, that's will improve wasm2obj further. (I'm using objdump regularly)

view this post on Zulip Alex Crichton (Jun 26 2020 at 14:12):

oh nice

view this post on Zulip fitzgen (he/him) (Jun 26 2020 at 16:10):

FWIW, RUST_LOG=debug dumps a bunch of cranelift ir too, but not the final disassembly

view this post on Zulip Alex Crichton (Jul 02 2020 at 13:59):

@fitzgen (he/him) I think the fuzz corpus may need to be updated for the new tables ops fuzz test? -- https://oss-fuzz-build-logs.storage.googleapis.com/log-5a0b58fe-4a35-46d5-b4d2-221905e7d208.txt

view this post on Zulip fitzgen (he/him) (Jul 02 2020 at 16:06):

we should fix the build to not require a corpus ahead of time, and just make an empty one if necessary; I'll make a PR

view this post on Zulip fitzgen (he/him) (Jul 02 2020 at 16:15):

https://github.com/google/oss-fuzz/pull/4065

This will prevent build failures like https://oss-fuzz-build-logs.storage.googleapis.com/log-5a0b58fe-4a35-46d5-b4d2-221905e7d208.txt in the future.

Last updated: Jan 24 2025 at 00:11 UTC