dicej edited PR #4537 from component-fuzz-generator to main:
This is the first part of my work to address #4307. We now generate 1000
arbitrary types and tests for those types at build time. Each test includes a
component which imports and exports functions that take and return its
respective type. The exported function calls the imported function, which is
implemented by the host, and the host verifies that both the host function
argument and the guest function return value match the original input value.In terms of #4307, this includes the test case generator and the static API
oracle. I'll follow up with a dynamic API oracle in a subsequent PR.Signed-off-by: Joel Dice <joel.dice@fermyon.com>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
dicej edited PR #4537 from component-fuzz-generator to main:
This addresses #4307.
For the static API we generate 100 arbitrary test cases at build time, each of
which includes 0-5 parameter types, a result type, and a WAT fragment containing
an imported function and an exported function. The exported function calls the
imported function, which is implemented by the host. At runtime, the fuzz test
selects a test case at random and feeds it zero or more sets of arbitrary
parameters and results, checking that values which flow host-to-guest and
guest-to-host make the transition unchanged.The fuzz test for the dynamic API follows a similar pattern, the only difference
being that test cases are generated at runtime.Signed-off-by: Joel Dice <joel.dice@fermyon.com>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
dicej updated PR #4537 from component-fuzz-generator to main.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
It's ok to omit this if it's empty
alexcrichton created PR review comment:
Is this necessary for this PR? (I think this might affect
cargo fuzz build --no-default-featuresas still trying to build the spec interpreter)
alexcrichton created PR review comment:
Like you mentioned in the issue this can lead to too-deep recursion in the text format. I think a better strategy might be to generate a
(type $name (record ...))for each layer of aTypeperhaps? That would avoid the*.wastlimitations while still helping to fuzz super-deep type hierarchies in Wasmtime (which I think would actually be quite useful, so I'd rescind my other suggestion of limiting the depth here)
alexcrichton created PR review comment:
Was this for correctnes or a refactoring? Fine either way of course, just curious for my own edification
alexcrichton created PR review comment:
Currently the general style of this crate is that the
oracleshere don't takeUnstructuredbut rather take a list of inputs. I think that would be possible to do here? This would take aTestCaseas input as well as aVec<(...)>for inputs/outputs of each run.
alexcrichton created PR review comment:
as a stylistic nit could this
]go on a separate line?
alexcrichton created PR review comment:
Currently the way time allotment on oss-fuzz works is that we're given a fixed time slice to balance between all of our fuzzing executables. In that sense the more fuzzers we have the more they take time away from other fuzzers. Since these are fuzzing the same thing in spirit could these two fuzzers be folded into one? The fuzz input could be used to determine whether the dynamic or static API could be fuzzed then I think.
alexcrichton created PR review comment:
Could this perhaps be replaced with something that dynamically fills out more of the
bytesvector when an error is encountered? Otherwise this could run into a possible issue where unrelated changes to the fuzz generator could cause this build to fail if just-the-wrong-seed is chosen.
alexcrichton created PR review comment:
I'm a little worried about the impact that this will have on build times because it means that
cargo fuzz buildwill build all of Wasmtime once for the host and then once again for the "target". This is a case where from Cargo's perspective when you executecargo build --target $the_host_targetit's considered a cross-compilation and packages shared between the two are built twice.Improving this situation though is likely to not really be that easy. The "easy" thing to do is to split out the bits you need from
wasmtime-fuzzinginto their own separate crate. Splitting crates though is not always the easiest and ideally the pieces you're using to generate test cases here would ideally reuse the bits fromwasmtime-environas well to avoid duplicating things.Overall though I do think that this is something that should be solved before landing due to how heavyweight this dependency is when fuzzing. This'll link in
v8, for example, to the build script which is no small feat.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej submitted PR review.
dicej created PR review comment:
Oops, I didn't see this until after I wrote https://github.com/bytecodealliance/wasmtime/pull/4537/commits/8fceba5be1f00c72582b5deaf6b490e421dfc887. Should I undo that and use this approach instead?
dicej submitted PR review.
dicej created PR review comment:
Not sure where that came from, honestly. I should be able to remove it.
dicej submitted PR review.
dicej created PR review comment:
Correctness. The original code meant that each argument would overwrite any that came before it in the list. This was the first bug the fuzz tests found.
alexcrichton created PR review comment:
Aah yes indeed I see the issue now!
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ah sorry for giving conflicting advice! I think it would be best to exercise as interesting type hierarchies as we can (including weirdly deep ones) so I think it'd be better to skip the
*.wast-based limits. That being said if you'd prefer to get something working first I'm definitely ok having a follow-up issue for this.
dicej created PR review comment:
Ugh, https://github.com/bytecodealliance/wasmtime/commit/8fceba5be1f00c72582b5deaf6b490e421dfc887 didn't really address the issue anyway, so I'll try your approach.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
I'm not sure how to do that, given that we can't create an instance of
Valwithout first getting theTypes from aFuncfrom anInstance, and the fuzzer has no way to do that.
dicej created PR review comment:
I guess I do know a way -- I used to have a
Valueenum ingenerator::component_typeswhich could be constructed independently and then later converted to aValwhen theTypewas available. I had removed that because it was redundant in the current approach, but I could restore it.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
@alexcrichton How strongly do you feel about this? I think it would require adding a substantial amount of code.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej submitted PR review.
dicej created PR review comment:
This wasn't too painful to address. See my latest commit.
dicej updated PR #4537 from component-fuzz-generator to main.
alexcrichton created PR review comment:
Given recent spec clarifications it's ok for thi sto have 0 flags, so I think it's ok to remove the "non empty" part here
alexcrichton created PR review comment:
I think we'll want to allow this to have zero fields as well as
Flagsbelow to exercise the zero-size case.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Similar to the cases below I think that we'll want to remove this to allow testing zero-sized tuples as well.
alexcrichton created PR review comment:
Ah sorry I forgot to review this in the initial first pass. We won't want to have this implementation though because the
IntoComponentFuncimpl is only intended for host functions where the component-type of the function can be statically inferred. As you have probably surmised with the lock/etc in typechecking for these functions the "dynamic" interface here doesn't really match the interface for static functions.Instead what we'll want to do is something along the lines of the difference between
wasmtime::Func::{wrap,new}where here you're defining thenewequivalent. That will grow a new functioncomponent::Linker::func_newwhich will take the type of the function as its argument as well as a closure. The return values of the closure are then type-checked against the original type at runtime before yielding the result back to the original component.This may be worth splitting to its own PR if you'd prefer. I'm also happy to review it here as well though.
alexcrichton created PR review comment:
Could this be updated to match the canonical definition which aligns-up the summation of the discriminant and the max variant size rather than aligning-up each individual summation? I think it probably ends up being the same thing but figure it may be best to align (heh) with the spec where we can.
alexcrichton created PR review comment:
Similar comment as to above about aligning the result of the summation instead of each individual component to match the spec (although again I don't think it affects the results)
alexcrichton created PR review comment:
One thing I did for the
fact-valid-modulefuzzer was to just specify this as ausizeas well.
alexcrichton created PR review comment:
Could the seed be printed here to assist debugging in case the build script later returns an error?
alexcrichton created PR review comment:
Oh right sorry yes I forgot that we need the component on-hand to generate the values. Yeah in that case the refactor isn't worth it and the current structure is good :+1:
alexcrichton submitted PR review.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej created PR review comment:
I was thinking a zero-sized tuple was just a
unitand thus redundant. Should we also allow zero-sized records, variants, unions, etc.? I.e. should I remove theNonEmptyArraytype entirely?
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
We should allow zero-size records, yeah, but variants/unions/enums all require at least one case so the
NonEmptyArraythere is correct. You're right that in Rust an empty tuple will translate to a(), but for fuzzing here I think it's best to exercise as many cases as possible.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej created PR review comment:
Would you mind sketching out an example of how your proposed API would be used? I'm specifically interested in where the user would get the type of the function to pass to
component::Linker::func_new. Would they query theComponentfor the things it imports, find the function of interest, and get the type from that?
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Yeah that is indeed a good question. In core wasm users would do
FuncType::new(...)and create their own function type. In the fullness of time I'd expect the same operation to be available to components where function types could be manufactured in addition to being learned from components. That's a big missing piece of the embedding API for components right now though so, yeah, getting it from a component seems the most reasonable for now.
dicej updated PR #4537 from component-fuzz-generator to main.
dicej created PR review comment:
@alexcrichton I've just pushed an update per your suggested API. Let me know what you think.
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Personally I think that this needs to change in the future to taking the function type as the argument. That being said this works well enough for now and it's ok to extend this later, so I think this should stay as-is.
Otherwise though eventually this will require filling out more of the
component::typessubmodule in the same manner aswasmtime::typeswhich is able to describe imports/exports/etc. I think the implementation will be roughly the same as what you have already withArc<Types>andFooIndexpaired together, but again it's ok to defer this to later.
alexcrichton created PR review comment:
Eventually I think this will need to be filled out when the function type is an argument to
func_new. Additionally for now I think it would be good to go ahead and fill out to have an "extra layer" of defense against bugs.For now I think the
new_dynamichere could take aTypeFuncIndexand the implementation could basically test that the two function indices are the same (since everything comes from within the same component right now)
alexcrichton created PR review comment:
Given that
func_newis somewhat rare and already fairly verbose I think it's ok to skip the variant-without-StoreContextMutand go straight to requiring the argument duringfunc_new
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Similar to
TypedFunc::callvsFunc::call, is it possible to share this code with the existing entrypoint above? Given how different the parameters are represented it may be too onerous though.
alexcrichton created PR review comment:
Above the 2x2 matrix of "params direct/indirect" x "results direct/indirect" is expressed with 4 different cases but the main reason for that is to get all the fiddly bits around the types to work out. Otherwise here though this may be more amenable to a straightforward "load the params, do the call, store the result" without having to specify 4 different cases (which leads to a fair bit of duplication between everything)
alexcrichton created PR review comment:
Are these new
'staticbounds still required?
alexcrichton created PR review comment:
This looks like it's basically copying the results of
ty, but couldtybe used as-is? The pairing ofindex: TypeFuncIndexand&Arc<Types>I think should be enough to type-check everything
alexcrichton created PR review comment:
Could this specify both the rhs and the lhs of the transmute? (stylistically I prefer to do that given how dangerous
transmuteis in general)
dicej created PR review comment:
We need to represent them as
component::Types incall_host_dynamicin order to useType::flatten_count,Type::next_field,Val::lift,Val::store, etc. I figured since we'd eventually convert them tocomponent::Types anyway, might as well do it once here rather than every timecall_host_dynamicis invoked.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
That was my first thought, but I couldn't see an elegant way to do it. With
Func::callI was able to reusecall_rawby using[ValRaw: MAX_FLAT_PARAMS]as theLowerParamsgeneric parameter. I can't use that trick withcall_host, though, since the space for the parameters has already been allocated by the caller.There probably is a way to refactor
call_hostto work in both contexts, but I expect it would be pretty ugly.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ok that seems reasonable, this can be iterated on later if necessary.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ok no worries, but in that case can you duplicate some of the more interesting test cases for imports for
func_new-defined imports as well?
dicej created PR review comment:
It will be required now that we're making
HostFunc::typecheckmore than a no-op for the dynamic case, since I'll need to convert the type of that field back to aBox<dyn Fn...>so it can store a closure which captures theTypeFuncIndex.
dicej submitted PR review.
dicej updated PR #4537 from component-fuzz-generator to main.
alexcrichton submitted PR review.
dicej updated PR #4537 from component-fuzz-generator to main.
alexcrichton merged PR #4537.
Last updated: Dec 06 2025 at 06:05 UTC