Stream: wasmtime

Topic: Function References PR


view this post on Zulip Ivan Mikushin (Apr 12 2023 at 17:28):

I've noticed this PR (https://github.com/bytecodealliance/wasmtime/pull/5288) hasn't had any activity since Mar 3. Function references is listed as a pre-requisite to Wasm GC, so seems pretty important to get this in. Is anybody working on this? Any idea if this is still getting merged any time soon?

This patch, written by @dhil and I, implements the (complete) function references proposal, forming the wasmtime side of the existing pull request bytecodealliance/wasm-tools#701. It is missing s...

view this post on Zulip fitzgen (he/him) (Apr 12 2023 at 17:58):

@Luna Phipps-Costin @Daniel Hillerström: are you still working on that pull request? anything we can do to unblock you?

view this post on Zulip Daniel Hillerström (Apr 12 2023 at 19:11):

@fitzgen (he/him) yes, I am actively working on it. I am currently blocked on a paper deadline this Saturday :-)) I think what I need help with addressing is where to add the error paths for the embedder side of things (c.f. https://github.com/bytecodealliance/wasmtime/pull/5288#discussion_r1126569577). I am quite eager to land the patch as it will make it much easier for me to keep up with changes to wasmtime on my continuations branch

This patch, written by @dhil and I, implements the (complete) function references proposal, forming the wasmtime side of the existing pull request bytecodealliance/wasm-tools#701. It is missing s...

view this post on Zulip fitzgen (he/him) (Apr 12 2023 at 20:28):

I've been thinking a lot about the embedder API side of things as I work on the RFC for wasm-gc, will share more soon

unfortunately, I don't think Alex's suggestion to skip the embedder API as a temporary stopgap to unblock other work is actually feasible since we will need the embedder API to get .wast tests running

view this post on Zulip Daniel Hillerström (Apr 12 2023 at 20:33):

OK, what path forward do you suggest?

view this post on Zulip fitzgen (he/him) (Apr 12 2023 at 21:18):

I think we will need to properly address the embedder API, I have a couple ideas, but need to talk it over with folks to decide which path is best. should have more guidance by the time you're done with your paper deadline :)

view this post on Zulip Daniel Hillerström (Apr 13 2023 at 07:29):

Excellent thanks! Maybe we can meet at the end of next week, or the week after, such that I can learn more about your ideas?

view this post on Zulip fitzgen (he/him) (Apr 13 2023 at 15:53):

how would monday at 11:30am pacific work?

view this post on Zulip Daniel Hillerström (Apr 13 2023 at 20:43):

@fitzgen (he/him) that ought to work for me, thanks! (Monday April 17)

view this post on Zulip Ivan Mikushin (Apr 13 2023 at 22:34):

(deleted)

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:08):

@fitzgen (he/him) re https://github.com/bytecodealliance/wasmtime/pull/7943 and the FuncType field of TypedFunc, might be faster to do this here -- two comments of yours indicate we can remove it via WasmTy for TypedFunc<...>, but isn't the type still needed to thread through to call_raw?

While we supported the function references proposal inside Wasm, we didn't support it on the "outside" in the Wasmtime embedder APIs. So much of the work here is exposing typed function references,...

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:09):

e.g. somehow I thought we needed to have type information to flow into the runtime type-checks of the arguments

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:10):

the only reason the type is needed to thread through to call_raw is for runtime checks, which would no longer be needed if we stop doing runtime checks :)

and we could stop doing runtime checks, but still allow static typing of functions that use concrete type refernces via impl WasmTy for TypedFunc

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:10):

we'll still need runtime checks for gc types though right?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:10):

e.g. (ref $struct_type)

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:10):

so not for anything currently but needed in the future soon

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:10):

I think we could have TypedStruct<(...)>

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:11):

oh you're thinking we can skip the derive?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:11):

or make it much simpler?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:12):

I suppose you're thinking like TypedStruct<(i32, Mut<i32>)> for mutable fields?

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:12):

yeah, it would not be turning repr(C) rust structs into GC types and vice versa

it would be "I have type checked that the instance of this struct has the type parameters fields"

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:12):

something like that

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:12):

unclear what to do wrt recursion groups and nominal types

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:12):

I was just thinking that heh

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:12):

e.g. all the canonicalization that wasm-gc does

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:13):

that'd be hard to reflect into Rust

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:13):

I guess even if we restricted it to structural stuff, and had cyclic edges be supertypes, which would allow reading offsets of the underlying object directly, we would still need a dynamic type check when passing into a typed function

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:14):

ok so maybe this isn't worth pursuing

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:14):

or at least, not worth undoing what we have now

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:14):

so while TypedFunc typechecks could go away we'll still need the type information in the future

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:14):

might still be worth it even if it doesn't 100% solve the problem

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:14):

Alex Crichton said:

so while TypedFunc typechecks could go away we'll still need the type information in the future

right

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:14):

it might be useful to have it available for simple cases yeah

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:15):

right, but we can cross that bridge when we get to it

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:15):

so back to this PR

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:15):

realloc doesn't have an Instance

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:15):

do you think there's any way we could like pass around the type index and then lazily turn that into a whole function type?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:15):

I should go look at realloc again

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:15):

it has a *mut wasmtime_runtime::ComponentInstance

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:15):

oh wait though realloc is a vmfuncref?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:16):

so there's a type in there?

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:16):

we could stash the raw u32 of the VMSharedTypeIndex in there

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:16):

oh is it?

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:16):

checking

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:16):

er wait hang on

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:16):

this is true of all call_raw

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:16):

we pass in vmfuncref so we can probably drop the &FuncType parameter

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:16):

b/c we can load it lazily?

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:17):

well going from VMSharedTypeIndex to FuncType still involves locking

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:17):

I wonder if that may be the way to go though so long as it's lazy

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:17):

to increment the associated registered reference count

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:18):

keep integers/etc on the fast path

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:19):

it is really annoying that wasmtime::FuncType and wasmtime_types::WasmFuncType are not the same thing

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:19):

is there anything actually preventing them from being the sa,me?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:19):

we could probably make them the same at this point

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:19):

or well

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:19):

I think ValType and WasmValType would still have to be different

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:19):

FuncType has ValType which recursively includes FuncType

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:19):

yeah

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:20):

time to whip out ValType<T> again

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:20):

but that's okay because we only expose params/returns publicly via impl ExactSizeIterator

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:20):

so we could lazily do the translation at that level

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:20):

I'm always down for merging the layers of abstration

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:20):

it always feels like we have too many

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:21):

(also it is annoying that we don't have VMSharedTypeIndex in the wasm_types crate for use in EngineOrModuleInternalIndex

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:21):

)

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:21):

okay, I agree this should be explored, but

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:21):

we could move that though?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:21):

that one's easy to move

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:21):

I don't really want to do that in this PR

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:21):

true

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:21):

Alex Crichton said:

we could move that though?

Yeah I think so

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:22):

fitzgen (he/him) said:

so we could lazily do the translation at that level

lmao we already do this lazily

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:22):

we get the referenced WasmFuncType via our Arc and then lazily translate

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:23):

so if it works for this PR I think it would be best to drop the &FuncType from call_raw

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:23):

and somehow lazily load the type info from Engine if needed

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:24):

alternatively you could eagerly load and see if you can measure a perf hit

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:24):

okay so maybe I can:

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:24):

or your idea, but:

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:24):

do we have a script or something for measuring RPS of wasmtime serve?

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:24):

nah

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:24):

I've never acrtually done that

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:24):

"pick your favorite benchmarking tool"

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:25):

you could also measure the call benchmark here too

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:25):

that will probably show it being disproportionately bad though given how streamlined it is today

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:25):

yeah, should do that either way (I suspect the way I implemented the dynamic checks could be improved to help the compiler boil some stuff away)

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:26):

I think the call benchmarks aren't sufficient here tho

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:26):

I feel like it's best though that we start with an index+engine combo and then derive info from there

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:26):

because they are single threaded

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:26):

oh you'd be surprised

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:26):

uncontended locks are surprisingly expensive in that context

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:26):

and if we are lazily constructing FuncTypes from VMSharedTypeIndexs that involves locking at the engine level

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:26):

"how good is your rwlock with only readers"

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:27):

right, it might show some overhead, but not the "full" overhead

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:27):

agreed

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:27):

we could in theory build a fancier data structure though

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:28):

e.g. VMSharedTypeIndex could be proof of "this slot in this data structure is unchanging"

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:28):

and then we could go from &Engine to &Slot via that proof

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:28):

so with some elbow grease I think we could get the locks out of the way

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:28):

I don't think so: that would require RAII which isn't possible with integrating with JIT code

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:28):

I'm thinking the Module is "the RAII"

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:29):

where e.g. we keep Module in a Store to keep it alive today

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:29):

we could remove the locking by putting the registration count behind the Arc, and then having some other thing sweep the registry's hash consing map

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:29):

(remove the locking for cloning and dropping and such, at least)

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:29):

ish

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:30):

not look up

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:30):

I haven't thought this through fully

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:30):

but like if we allowed 1K types maximum we could just have a Vec preallocated and you index into that

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:30):

obviously not suitable but ... hm...

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:30):

well anyway I think we should try the route of "don't pass in type info"

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:30):

and see how that fares perf-wise

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:31):

we can at least be perf-neutral with scalars

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:31):

Alex Crichton said:

I'm thinking the Module is "the RAII"

but then VMSharedTypeIndex itself isn't proof of anything, re are relying on the module existing and that uses have an associated module or whatever, which is pretty much teh situation we are already in

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:31):

well I don't want to get too distracted with a half-baked idea I have

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:32):

brb

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:32):

Alex Crichton said:

we can at least be perf-neutral with scalars

we do have to call .next() on the param types iter for every param, so to be actually neutral it would require LLVM to be able to boil that away, which I am not sure it can or not

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:34):

okay well lets start with looking up the funcref's FuncType and doing the read lock and measuring perf of that

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:34):

and then we can decide what to do from there

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:37):

(and I'm not gonna get to the benchmarking until after I finish writing these missing embedder API subtyping tests)

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:44):

I'm imagining we'd pass in &LazyParam which has &Engine, usize, and VMSharedTypeIndex and then .get() loads everything

view this post on Zulip Alex Crichton (Feb 16 2024 at 17:44):

so that way outside the loop is "just" bumping usize

view this post on Zulip fitzgen (he/him) (Feb 16 2024 at 17:45):

yeah that makes sense


Last updated: Nov 22 2024 at 16:03 UTC