Stream: git-wasmtime

Topic: wasmtime / PR #6262 wasmtime: Overhaul trampolines


view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen opened PR #6262 from fitzgen:trampoline-overhaul to bytecodealliance:main:

This commit splits VMCallerCheckedFuncRef::func_ptr into three new function pointers: VMCallerCheckedFuncRef::{wasm,array,native}_call. Each one has a dedicated calling convention, so callers just choose the version that works for them. This is as opposed to the previous behavior where we would chain together many trampolines that converted between calling conventions, sometimes up to four on the way into Wasm and four more on the way back out. See [0] for details.

[0] https://github.com/bytecodealliance/rfcs/blob/main/accepted/tail-calls.md#a-review-of-our-existing-trampolines-calling-conventions-and-call-paths

Thanks to @bjorn3 for the initial idea of having multiple function pointers for different calling conventions.

This is generally a nice ~5-10% speed up to our call benchmarks across the board: both Wasm-to-host and host-to-Wasm. The one exception is typed calls from Wasm to the host, which have a minor regression. We hypothesize that this is because the old hand-written assembly trampolines did not maintain a call frame and do a tail call, but the new Cranelift-generated trampolines do maintain a call frame and do a regular call. The regression is only a couple nanoseconds, which seems well-explained by these differences explain, and ultimately is not a big deal.

However, this does lead to a ~5% code size regression for compiled modules. Before, we compiled a trampoline per escaping function's signature and we deduplicated these trampolines by signature. Now we compile two trampolines per escaping function: one for if the host calls via the array calling convention and one for it the host calls via the native calling convention. Additionally, we compile a trampoline for every type in the module, in case there is a native calling convention function from the host that we call_indirect of that type. Much of this is in the .eh_frame section in the compiled module, because each of our trampolines needs an entry there. Note that the .eh_frame section is not required for Wasmtime's correctness, and you can disable its generation to shrink compiled module code size; we just emit it to play nice with external unwinders and profilers. We believe there are code size gains available for follow up work to offset this code size regression in the future.

Backing up a bit: the reason each Wasm module needs to provide these Wasm-to-native trampolines is because wasmtime::Func::wrap and friends allow embedders to create functions even when there is no compiler available, so they cannot bring their own trampoline. Instead the Wasm module has to supply it. This in turn means that we need to look up and patch in these Wasm-to-native trampolines during roughly instantiation time. But instantiation is super hot, and we don't want to add more passes over imports or any extra work on this path. So we integrate with wasmtime::InstancePre to patch these trampolines in ahead of time.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested abrown for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested pchickey for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested wasmtime-core-reviewers for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested wasmtime-default-reviewers for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:21):

fitzgen requested alexcrichton for a review on PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:23):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 18:31):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 19:04):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 19:30):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 20:03):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

I feel like I'm surely going to forget to write this down, so could you update RELEASES.md in this PR with the various changes made? I'm imagining the ## Changed section for this for example.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Could this construction of the callee be deduplicated with the above? (and perhaps with call too)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Mind adding a comment why these are added? Or otherwise leaving them out of this PR?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Should this perhaps be #[doc(hidden)]? (this'll end up in the wasmtime crate)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Could this be preemptively extracted to a helper function?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

I think this may be the s390x issue as this'll want to be a native-endian load

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Mind adding a note here about this being sorted as well?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

TODO

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

To cut down on the encoding space necessary for this, could these two arrays get fused? And to go a bit further could this perhaps be stored in the funcs array directly?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

It's probably ok to jettison this now

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Could this and the two types below have some more documentation about the ABI? E.g. for the structs below a definition of the ABI and for this function pointer a definition of what each of the arguments are?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Could the inner field here be made private?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Looks like this file became executable?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

The large doc block above about why this can be None I think might be good to insert here instead?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Could the reformatting here and below be undone?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Shouldn't this always be Some(...) by the time it reaches the VMFunctionImport phase?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

I worry about this because it can accidentally paper over non-send/sync in kind: FuncKind and ty: Option<Box<FuncType>>. The latter is probably tested elsewhere but there's not a ton of protection here against FuncKind.

I don't have a great way to solve this other than unergonomic wrappers, but I'd probably still prefer that to the impl here on the whole struct.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

I know we split this into a two-phase operation, but given that the push/fill is always back-to-back it may be better to just remove this fill_func_refs perhaps? (folding it into the push operation that is)

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

In retrospect given that this isn't really perf critical or used all that often, perhaps this would be better as a simple Vec<Box<....>>?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

I think the only use case for this is the profilers, and they don't have a great use for this. The profilers already have a lot of duplication of looking at all of these trampolines which isn't great, too.

This is fine to defer to another PR, but I wanted to write this down: these should all get removed and all the profilers should switch to using this logic which generically reads an ELF file and registers everything. There's specific support for DWARF in the jitdump profiler but I think that's all best removed at this point.

Again, future PR, just want to leave a reminder to myself to do this or open an issue later.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Like with FuncData, this feels a bit too broad to me. Could these impls be placed on VMCallerCheckedFuncRef instead?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

cc @saulecabrera we opted to leave these as todo!() in winch, but this effectively breaks compilation with winch in Wasmtime (and tests are additionally disabled lower in this PR).

Is that ok with you though?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Like the above comments, I think this should be removed in favor of something more targeted ideally.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Also, could you add documentation about this parameter? Mostly that this isn't "number of arguments" but rather "capacity of the args_and_results area" which I realize is perhaps obvious given the name but is not necessarily obvious given that this is generally a "call a function" function where one typically expects to pass a list of arguments.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 22:46):

alexcrichton created PR review comment:

Or should we perhaps abort() if this ever gets seen?

view this post on Zulip Wasmtime GitHub notifications bot (Apr 21 2023 at 23:02):

saulecabrera created PR review comment:

That's ok with me, thanks for the heads up. I'll work on the new version of the trampolines for winch once this PR lands.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 19:28):

fitzgen created PR review comment:

Because we don't have a trampoline for every function, these have to become Options in the funcs array, and while it is a code size win, it is only 0.0160605% savings on spidermonkey.wasm. Nonetheless I kind of like the rearranged structure anyways, since looking up a trampoline is an O(1) operation now.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 19:29):

fitzgen created PR review comment:

I'll do this in a follow up.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 19:31):

fitzgen created PR review comment:

Eh, I'd rather just keep it around until we completely remove the asm trampolines.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 19:46):

fitzgen created PR review comment:

Oh good call, because the case that the above comment is talking about can't happen with function imports, only funcrefs.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 20:24):

fitzgen created PR review comment:

Ah unfortunately the borrows don't really work out becuase store is already borrowed immutably for the func_refs() method and so we can't also get and pass in the store's module registry.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 21:15):

fitzgen created PR review comment:

Eh, it is still used whenever passing funcrefs into Wasm via any method other than function imports that are pre-resolved in InstancePre. That is a lot of paths, even if it isn't the hottest path. Since we've already got it, I'm inclined to leave it.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 21:46):

fitzgen created PR review comment:

Ah yes, sorry I meant to tag you @saulecabrera but I forgot. Thanks for your help getting this PR landed, and willingness to work on follow ups!

view this post on Zulip Wasmtime GitHub notifications bot (Apr 24 2023 at 22:25):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 25 2023 at 20:35):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 25 2023 at 20:35):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 25 2023 at 20:35):

fitzgen has enabled auto merge for PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2023 at 17:32):

fitzgen updated PR #6262.

view this post on Zulip Wasmtime GitHub notifications bot (Apr 27 2023 at 18:45):

fitzgen merged PR #6262.


Last updated: Oct 23 2024 at 20:03 UTC