dicej requested alexcrichton for a review on PR #10106.
dicej opened PR #10106 from dicej:async-fused-test
to bytecodealliance:main
:
This is another piece of #9582 which I'm splitting out to make review easier.
This test exercises fused adapter generation for various flavors of intercomponent async->async, async->sync, and sync->async calls.
The remaining changes fill in some TODOs to make the test pass.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
dicej requested wasmtime-core-reviewers for a review on PR #10106.
dicej updated PR #10106.
dicej updated PR #10106.
dicej updated PR #10106.
alexcrichton submitted PR review:
I think this is as far as I'm gonna get today. I haven't gotten to most of
trampoline.rs
yet which I realize is most of the guts of this PR, but I'll do that on Monday
alexcrichton created PR review comment:
Just confirming this isn't a typo, but you're working with
MAX_FLAT_PARAMS
andty.params
, but storing the result inresults
alexcrichton created PR review comment:
Also nowadays if you'd like I think it'd be reasonable to move this to close to the impl of the trait for
StoreInner<T>
.Also in terms of
#[cfg]
by moving this to a location that's already guarded by this feature you won't have to use the fully qualified paths for all types below to avoid unused import warnings.
alexcrichton created PR review comment:
Could this impl move to some location which is already gated by
feature = "component-model-async"
? I suspect this'll want to import things and/or work with things only present when the feature is enabled, so keeping everything localized might help cut down on#[cfg]
.
alexcrichton created PR review comment:
Mind leaving some comments as to what this is doing? I suspect the
if async { ... }
is intentional but natively it looks like this is mixing up results/params (which I suspect it isn't, but a comment would help dispel any doubt)
alexcrichton created PR review comment:
This is going to be a little tricky, but ideally we'd move these
#[cfg]
to the function level to have these functions entirely disappear when the feature is disabled.I say this is tricky because it's a bit subtle how this works. The corresponding
foreach_builtin_function!
macro for core wasm has support for this but it requires updating callers to expand their macro-matching syntax to include macro-matchers for cfgs.The other gotcha is that the numbering of libcalls isn't affected by
#[cfg]
. This helps ensure that one Wasmtime with a certain set of features can load compiled code into a different Wasmtime with a different set of features. This means thatVMComponentLibcalls
, for example, has the same structure regardless of#[cfg]
. The way this is solved for core wasm is that theVMBuiltinFunctions
structure ignores the#[cfg]
on each item, but the macro-generated trampoline is where the do-the-call-or-unreachable!()
lives.In the end the goal though is to be able to avoid defining this function entirely when it's not configured.
alexcrichton created PR review comment:
Huh I completely forgot about this... Another option would be to add more callbacks here, but having a separate trait as well I think is also totally reasonable. Probably best to keep everything packaged up in a single trait to help with the
#[cfg]
alexcrichton created PR review comment:
Like above, the
call_libcall
helper might be able to help clean this up a little further
alexcrichton created PR review comment:
I suspect you probably have a preference for this style of import but at some point in the future we're going to have to have a reckoning about this. It's not great that we have multiple styles of imports throughout the codebase in the sense that it's a bit jarring to see the inconsistency. While the true fix is probably along the lines "wait for rustfmt to format imports" in the interim if you're working near other imports mind matching the preexisting style?
alexcrichton created PR review comment:
This looks pretty similar to some code in
signature.rs
(and I'm similarly a bit confused about params/results a little). Is this similar enough that it might warrant a helper of some kind to do this internally?(similar for
dst_flat
below, some sort of helper that takeslower_opts
orasync_
which handles async-vs-not)
alexcrichton created PR review comment:
Was this an intentional decision, or a copy/paste? The comment is true for anything that is exclusively imported by fused adapter modules (e.g.
*_transfer
) but for normal component model intrinsics (which I think this represents) this is definitely a case we'll need to handle.
alexcrichton created PR review comment:
You might be able to replace
host::task_return
+load_libcall
+ this with thecall_libcall
helper perhaps?
alexcrichton created PR review comment:
Question about this: how come these exports are here? Accessing extra exports of adapter modules seems like it'd be a pretty big change and I'm also not entirely sure what they'd be used for. Not really necessary to put a comment here, but I'm curious to peel the curtain back a bit more on how the exports are going to be used in future PRs.
(I also have a theory that this is here to get past wasm validation to
ref.func
the function or something like that, and if that's what's happening here there are other alternatives to solve that)
alexcrichton created PR review comment:
Two things on this:
- For
let params = ...
above, it's unclear to me what this is doing. For example I'm not sure why this would switch toNone
for unknown types as opposed to return an error or panic or something. Additionally this looks like something that's better done in perhaps atranslate.rs
orinline.rs
step?- For
let ty = ...
this looks like it's doing a hash map lookup of a calculated type. Would it be possible to calculate that when the type is originally created early on intranslate.rs
and/orinline.rs
and then thread that through to here? For example I'd expect that this function here would havety: Option<SomeIndex>
as an argument and then that's what's encoded into au32
here.
alexcrichton created PR review comment:
Perhaps something like
raise_if_negative_one
? It looks a bit odd above that an i64 return value is passed into a method that saysif_i32
.
alexcrichton created PR review comment:
How come the
::record
constructor couldn't be used like before?
dicej submitted PR review.
dicej created PR review comment:
Yeah, it's a little confusing since
task.return
allows you to return up toMAX_FLAT_PARAMS
on the stack. I'll leave a comment to explain.
dicej submitted PR review.
dicej created PR review comment:
Sure, no problem. I originally _did_ match the existing style, but then found it convenient to use this style so I could put all the imports under a single
#[cfg(...)]
annotation. I'll switch it back.
dicej submitted PR review.
dicej created PR review comment:
Yeah, I was wondering if there was a cleaner way to do this. I'm open to suggestions.
dicej submitted PR review.
dicej created PR review comment:
Correct, it's all about
ref.func
validation. If there's a better way to address that, I'm definitely open to it.
dicej submitted PR review.
dicej created PR review comment:
Yeah, I've already done that (and removed other duplication) in a follow-up PR I'm currently finishing up. Can it wait for that PR?
dicej created PR review comment:
I believe we discussed this at some point when pairing in a Slack huddle and decided then to use the trap approach. So far it seems to be working for every test I throw at it (including cases that don't involve fused adapters at all), so if there's a scenario where this is the wrong behavior, I'd like to understand what it is and add a test case for it.
dicej submitted PR review.
dicej created PR review comment:
Agreed about the
None
default; I don't remember why I did that. I'll switch it to a panic.And yes, I can move this code to an earlier stage, e.g.
translate.rs
orinline.rs
.
dicej submitted PR review.
dicej created PR review comment:
Per my response above: I've already done this in a follow-up PR.
dicej submitted PR review.
dicej created PR review comment:
Ah, I think this is an artifact of when Luke changed the async lift ABI; previously parameter passing worked differently for async vs. sync lifts, and now they're the same. In the churn I neglected to unify the code. I'll do that.
dicej submitted PR review.
dicej created PR review comment:
OK, now I remember the conversation. I guess the scenario where this would be a problem is if a guest exported its
task.return
intrinsic to the host and the host called it. I'll see if I can create a test case for that.
dicej submitted PR review.
dicej created PR review comment:
I won't be able to actually test this until I've filled more of the runtime code (coming in future PRs). Would it be okay if I left a TODO comment here for the purposes of this PR?
I'd also like to propose changing the component model spec to prohibit re-exporting imported intrinsics. I'm guessing we can't catch that reliably during validation due to e.g.
call_indirect
shenanigans, but we'd at least be justified in trapping at runtime. @lukewagner do you have an opinion?
dicej submitted PR review.
dicej created PR review comment:
Sounds like we're agreed having a separate trait makes sense, so I'm leaving this as-is.
dicej submitted PR review.
dicej created PR review comment:
Alternatively, if it's straightforward to support
Abi::Array
properly here we can just do that; I'll probably need some guidance on how to do that, though.
dicej edited PR review comment.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ah ok, if it's only about
ref.func
, then what you'll want to use is a "declare" element segment. You'll call this function in wasm-encoder basically. That way you won't have to export the functions, but it will still be valid toref.func
the functions.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ah yeah I think you booted up on the context here. It's the reexport case that I'm worried about (and you're right that I don't think we can reasonably forbid it at the embedder level). I think it's probably pretty easy to add
Abi::Array
support, it's mostly just around refactoring to use helpers to access the arguments/store returns, otherwise it's all the same internals.I think it'd be reasonable to leave this as a TODO, though, in the interim. Mind filing an issue about that? (can probably start a
component-model-async
label too perhaps)
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Sounds good :+1:
alexcrichton submitted PR review.
alexcrichton created PR review comment:
On these functions you can add
#[cfg(feature = "component-model-async")]
. After doing that you'll need to:
alexcrichton submitted PR review:
Ok I'm trying to decipher
trampoline.rs
but if it's ok with you I'd prefer to ask for more documentation first before diving further into these methods. Currently there's very little documentation on how things are set up and understanding enough to be able to review this PR I think would require cross-referencing both all the details in the spec along with Wasmtime internal implementation details. A lot of this is also pretty specific to Wasmtime itself in the sense that it's all internal adapter details and we're implementing halves of the spec in some places.Would you be ok writing up some docs for this? Ideally I'd find it most helpful to have a high-level description of how the adapters piece together and what the expected flow between them is. My hope is that such documentation would also be pretty valuable for future readers to understand this as well.
alexcrichton created PR review comment:
Mind throwing "async" in the name of these local variables (and
return_adapter
below) to clarify they're async-related things?
alexcrichton created PR review comment:
Mind adding a private
Compiler::new
constructor? That should help deduplicate this and theCompiler
constructions above too
alexcrichton created PR review comment:
Mind leaving some high-level comments on each of these cases about what sort of adapters are being created? Mostly to help guide what adapters are calling what and what the purpose of the various adapter functions are.
alexcrichton created PR review comment:
I'm a bit perplexed by this and I've been trying to think about this but to no avail.
High-level request: mind writing up some longer-form documentation in this module or on a function here about adapters in the async world? For example I see the use of globals here and it makes me wonder:
- Why are globals needed here? For example what function will pick up these globals later?
- What prevents these globals from being set again before the "other side" picks up the results?
I suspect there's answers to these and it's where I think some docs could help. I realize as well that I didn't do a great job of documenting most of this file the first time I wrote it, so I'm also sort of using this opportunity to request some docs while it's still fresh in your head as well.
At a lower-level: why is
param_locals
passed twice here? It feels surprising and I'm not sure the code below was meant to handle the same values in/out, but I suspect you've been testing with this as well. Mind leaving some comments as to why it's ok to do that here? I'm lacking the high-level picture of what these adapters are doing so that might also help piece together why it's ok to pass the two sides in here too.(also if you're up for it adding docs to preexisting methods like
translate_results
would be most welcome, but I understand if you'd prefer to not or defer that to later)
alexcrichton created PR review comment:
Mind adding a helper method like
self.global_set(u32)
which automatically does theflush_code
plusBody::GlobalSet
?
alexcrichton created PR review comment:
(I'm also not sure at-a-glance, for example, what the
param_globals
/result_globals
variables are, so either docs here or on the methods below I think would be useful)
dicej requested abrown for a review on PR #10106.
dicej updated PR #10106.
dicej requested wasmtime-default-reviewers for a review on PR #10106.
dicej submitted PR review.
dicej created PR review comment:
I just pushed an update which uses https://github.com/bytecodealliance/wasm-tools/pull/1989 and makes this code obsolete. Now we compare
TypeTupleIndex
es, so no need to intern core function types anymore.
dicej updated PR #10106.
dicej submitted PR review.
dicej created PR review comment:
dicej updated PR #10106.
dicej updated PR #10106.
dicej submitted PR review.
dicej created PR review comment:
I just pushed an update that adds some comments. Let me know if anything needs to be clarified or added.
dicej submitted PR review.
dicej created PR review comment:
Yeah, it's confusing and needs docs; I'll add them.
Why are globals needed here? For example what function will pick up these globals later?
Quoting the comment I just pushed:
// Like the async->async case above, for the sync->async case we // also need `async-start` and `async-return` helper functions to // allow the callee to asynchronously "pull" the parameters and // "push" the results when it is ready. // // However, since the caller is using the synchronous ABI, the // parameters may have been passed via the stack rather than linear // memory. In that case, we use global variables to store them such // that they can be retrieved by the `async-start` function. // Similarly, the `async-return` function may write its result to // global variables from which the adapter function can read and // return them via the stack to the caller.
What prevents these globals from being set again before the "other side" picks up the results?
Since these globals are only used for sync-lowered imports, and the caller instance can only call one of those at a time, we know nobody will touch them.
At a lower-level: why is param_locals passed twice here?
Hmm, I think you found a bug (and a hole it the test coverage). I think my _intention_ was that
translate_results
would use the last element of itsparam_locals
parameter as the destination pointer in the case where the result does not fit on the stack. But that doesn't make sense because it's a pointer to the callee's memory -- it only makes sense as a source pointer. I think I was just confused, and the code is wrong.I added tests to tests/all/component_model/import.rs to cover the various combinations of passing parameters and returning results via the stack and the heap, but am now realizing I'm missing component composition tests for those scenarios.
Also now realizing that testing async-lifted exports that return more than MAX_FLAT_RESULTS is not sufficient. Since
task.return
accepts up toMAX_FLAT_PARAMS
, I need to exceed that also.I'll add those tests and fix any issues I find.
dicej submitted PR review.
dicej created PR review comment:
What prevents these globals from being set again before the "other side" picks up the results?
Since these globals are only used for sync-lowered imports, and the caller instance can only call one of those at a time, we know nobody will touch them.
Actually, I'm going to take this back. For sync-lowered imports called from an async-without-callback-lifted export, we _could_ have more than one running concurrently, in which case we'd have a problem. The good news is that we've decided not to include the async-without-callback ABI in WASIp3, so I don't need to solve that right away, but we do need to make sure we reject such exports until we've solved it.
This shouldn't be a problem for sync-lowered imports called from sync-lifted exports or async-with-callback-lifted exports since neither can be reentered until the import returns.
dicej edited PR review comment.
dicej edited PR review comment.
dicej submitted PR review.
dicej created PR review comment:
Are you concerned at all about the cost of calling
flush_code
for eachGlobalSet
rather than once for all of them?
dicej submitted PR review.
dicej created PR review comment:
Oh, I see -- it doesn't matter because pushing the
GlobalSet
s won't add anything toself.code
, soflush_code
will be a no-op anyway.
dicej updated PR #10106.
dicej submitted PR review.
dicej created PR review comment:
FYI, I'm currently tackling the arduous but inevitable task of rebasing my
async
branch onto this one. Tons of conflicts, as expected. Once I've finished that, I'll make sure all the existing tests pass, then add new tests and docs per the above discussion, and finally copy any relevant changes/fixes back into this PR.
dicej edited PR review comment.
alexcrichton submitted PR review:
I'm sorry I know I sound like a broken record but I'm personally still very lost trying to understand this. On one hand one way I can fix this is to go study the component-model specification, try to piece together what an expected implementation would be, and then try to connect those dots back to this implementation. On the other hand though I also think it'd be valuable to have enough local documentation to not require that because although I can do that it would also be required for any future readers as well.
If you feel like I'm requesting too much documentation though or there's something that is well outside the purview of this module's documentation and it's more basic fundamental understanding please let me know though. I don't think we should just mirror the entire specification in comments into this repository, but at the same time I still think there are critical implementation pieces lacking comments such as the protocol between the async-start/return adapters as well as the host-provided async-enter/exit functions.
alexcrichton created PR review comment:
Mind also leaving a comment here too?
alexcrichton created PR review comment:
No need to do this here but for the future might be good to add this attribute to the future/stream/error transfer down below too
alexcrichton created PR review comment:
Mind leaving a small comment here explaining a bit about what's going on?
alexcrichton created PR review comment:
Mind leaving a comment here for why this is starting from scratch when allocating globals? (e.g. why it's not reusing
counts
from above when parameters were handled)
alexcrichton created PR review comment:
Could this iterator be a method on
lower_sig
itself? That'd help move out some of this complexity and makes it a bit easier to see that theif
here is unrelated to async/sync and it's just about signatures.
alexcrichton created PR review comment:
These constants seem like they might be used by the host runtime as well, is that right? If so should they be
pub
and somewhere accessible from the outside of this crate?
alexcrichton created PR review comment:
This in theory can move to a "declare"
elem
segment, right?If no, mind commenting why it's exported?
If so, feel free to leave this here but can you add a comment/TODO saying it should become an element segment?
alexcrichton created PR review comment:
Is this a copy/paste typo with
task.return
when this function is about async start?
alexcrichton created PR review comment:
One question I'm left with after reading these docs is what's done with async-{start,return}, e.g. who's calling them. Definitely feel free to say "go read the CM docs Alex" for this, but it's at least to me still a big missing piece of the puzzle in understanding these trampolines.
alexcrichton created PR review comment:
I fear I'm personally still pretty lost here. In a sync->async call what I was expecting is that the results would be stored in globals and async-start would read from those globals when called by somone (the host? unsure). The results would then be stored-to in async-return. Here I was expecting something to block (as the caller is sync) until the nested async call was finished and then the globals would be read.
That doesn't appear to be what's happening though? I commented above that's it's not clear what async-start/async-return are doing but the setting of globals here rather than reading is what really threw me astray... Mind leaving more comments as to what the host intrinsics are doing and the various protocol that this is adhering to?
alexcrichton created PR review comment:
Mind adding a comment here as a small recap of what these intrinsics are doing? For example it looks a little odd to immediately call one then the other but I presume the host is doing something as part of enter/exit which may continue to execute wasm code?
alexcrichton created PR review comment:
Mind leaving comments as to what these two locals are?
alexcrichton created PR review comment:
I'm personally pretty nervous about the use of
global
s here because it seems like it'll be easy to get wires crossed by accident and overwrite globals or use uninitialized globals. To help alleviate this concern though, what do you think about some debug-only code?In wasmtime-cranelift we perform some extra debug checks in
cfg!(debug_assertions)
in the generated code itself. I'm wondering if we could do something similar here where indebug_assertions
mode there's an "in_use" global which is asserted to be zero when we store to it and then it's asserted to be 1 when we read from it. (and stores switch from 0->1 where reads switch from 1->0). That'd help make me more confident that although I don't fully understand everything here we have at least a layer of defense against bugs in debug mode for testing/fuzzing.
alexcrichton created PR review comment:
Ok I'll admit I'm still very lost as to what's going on here. My current understanding of async-start is that an async task calls it to get the parameters of the original async call. Here it looks like the adapter is performing the "translate from one component to another call", but I'm pretty confused why this signature's locals would ever be used. Why would a parameter be passed in if you're trying to learn the original parameters of the initial caller?
The globals bits make sense to me where the initial caller put some stuff in globals and that's read here, but I'm not sure how this works without the globals?
alexcrichton created PR review comment:
Mind leaving a comment as to what these are doing? For example why nparams is being passed and expanding a bit on what it means to leave room for the guest context?
To be clear: I absolutely agree that more docs and comments are needed -- I've only added a bit of that so far but plan to add more. Feel free to ignore this PR until that's done.
Last updated: Feb 28 2025 at 03:10 UTC