dicej edited PR #4537 from component-fuzz-generator
to main
:
This is the first part of my work to address #4307. We now generate 1000
arbitrary types and tests for those types at build time. Each test includes a
component which imports and exports functions that take and return its
respective type. The exported function calls the imported function, which is
implemented by the host, and the host verifies that both the host function
argument and the guest function return value match the original input value.In terms of #4307, this includes the test case generator and the static API
oracle. I'll follow up with a dynamic API oracle in a subsequent PR.Signed-off-by: Joel Dice <joel.dice@fermyon.com>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
dicej edited PR #4537 from component-fuzz-generator
to main
:
This addresses #4307.
For the static API we generate 100 arbitrary test cases at build time, each of
which includes 0-5 parameter types, a result type, and a WAT fragment containing
an imported function and an exported function. The exported function calls the
imported function, which is implemented by the host. At runtime, the fuzz test
selects a test case at random and feeds it zero or more sets of arbitrary
parameters and results, checking that values which flow host-to-guest and
guest-to-host make the transition unchanged.The fuzz test for the dynamic API follows a similar pattern, the only difference
being that test cases are generated at runtime.Signed-off-by: Joel Dice <joel.dice@fermyon.com>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
dicej updated PR #4537 from component-fuzz-generator
to main
.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
It's ok to omit this if it's empty
alexcrichton created PR review comment:
Is this necessary for this PR? (I think this might affect
cargo fuzz build --no-default-features
as still trying to build the spec interpreter)
alexcrichton created PR review comment:
Like you mentioned in the issue this can lead to too-deep recursion in the text format. I think a better strategy might be to generate a
(type $name (record ...))
for each layer of aType
perhaps? That would avoid the*.wast
limitations while still helping to fuzz super-deep type hierarchies in Wasmtime (which I think would actually be quite useful, so I'd rescind my other suggestion of limiting the depth here)
alexcrichton created PR review comment:
Was this for correctnes or a refactoring? Fine either way of course, just curious for my own edification
alexcrichton created PR review comment:
Currently the general style of this crate is that the
oracles
here don't takeUnstructured
but rather take a list of inputs. I think that would be possible to do here? This would take aTestCase
as input as well as aVec<(...)>
for inputs/outputs of each run.
alexcrichton created PR review comment:
as a stylistic nit could this
]
go on a separate line?
alexcrichton created PR review comment:
Currently the way time allotment on oss-fuzz works is that we're given a fixed time slice to balance between all of our fuzzing executables. In that sense the more fuzzers we have the more they take time away from other fuzzers. Since these are fuzzing the same thing in spirit could these two fuzzers be folded into one? The fuzz input could be used to determine whether the dynamic or static API could be fuzzed then I think.
alexcrichton created PR review comment:
Could this perhaps be replaced with something that dynamically fills out more of the
bytes
vector when an error is encountered? Otherwise this could run into a possible issue where unrelated changes to the fuzz generator could cause this build to fail if just-the-wrong-seed is chosen.
alexcrichton created PR review comment:
I'm a little worried about the impact that this will have on build times because it means that
cargo fuzz build
will build all of Wasmtime once for the host and then once again for the "target". This is a case where from Cargo's perspective when you executecargo build --target $the_host_target
it's considered a cross-compilation and packages shared between the two are built twice.Improving this situation though is likely to not really be that easy. The "easy" thing to do is to split out the bits you need from
wasmtime-fuzzing
into their own separate crate. Splitting crates though is not always the easiest and ideally the pieces you're using to generate test cases here would ideally reuse the bits fromwasmtime-environ
as well to avoid duplicating things.Overall though I do think that this is something that should be solved before landing due to how heavyweight this dependency is when fuzzing. This'll link in
v8
, for example, to the build script which is no small feat.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej submitted PR review.
dicej created PR review comment:
Oops, I didn't see this until after I wrote https://github.com/bytecodealliance/wasmtime/pull/4537/commits/8fceba5be1f00c72582b5deaf6b490e421dfc887. Should I undo that and use this approach instead?
dicej submitted PR review.
dicej created PR review comment:
Not sure where that came from, honestly. I should be able to remove it.
dicej submitted PR review.
dicej created PR review comment:
Correctness. The original code meant that each argument would overwrite any that came before it in the list. This was the first bug the fuzz tests found.
alexcrichton created PR review comment:
Aah yes indeed I see the issue now!
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ah sorry for giving conflicting advice! I think it would be best to exercise as interesting type hierarchies as we can (including weirdly deep ones) so I think it'd be better to skip the
*.wast
-based limits. That being said if you'd prefer to get something working first I'm definitely ok having a follow-up issue for this.
dicej created PR review comment:
Ugh, https://github.com/bytecodealliance/wasmtime/commit/8fceba5be1f00c72582b5deaf6b490e421dfc887 didn't really address the issue anyway, so I'll try your approach.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
I'm not sure how to do that, given that we can't create an instance of
Val
without first getting theType
s from aFunc
from anInstance
, and the fuzzer has no way to do that.
dicej created PR review comment:
I guess I do know a way -- I used to have a
Value
enum ingenerator::component_types
which could be constructed independently and then later converted to aVal
when theType
was available. I had removed that because it was redundant in the current approach, but I could restore it.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
@alexcrichton How strongly do you feel about this? I think it would require adding a substantial amount of code.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej submitted PR review.
dicej created PR review comment:
This wasn't too painful to address. See my latest commit.
dicej updated PR #4537 from component-fuzz-generator
to main
.
alexcrichton created PR review comment:
Given recent spec clarifications it's ok for thi sto have 0 flags, so I think it's ok to remove the "non empty" part here
alexcrichton created PR review comment:
I think we'll want to allow this to have zero fields as well as
Flags
below to exercise the zero-size case.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Similar to the cases below I think that we'll want to remove this to allow testing zero-sized tuples as well.
alexcrichton created PR review comment:
Ah sorry I forgot to review this in the initial first pass. We won't want to have this implementation though because the
IntoComponentFunc
impl is only intended for host functions where the component-type of the function can be statically inferred. As you have probably surmised with the lock/etc in typechecking for these functions the "dynamic" interface here doesn't really match the interface for static functions.Instead what we'll want to do is something along the lines of the difference between
wasmtime::Func::{wrap,new}
where here you're defining thenew
equivalent. That will grow a new functioncomponent::Linker::func_new
which will take the type of the function as its argument as well as a closure. The return values of the closure are then type-checked against the original type at runtime before yielding the result back to the original component.This may be worth splitting to its own PR if you'd prefer. I'm also happy to review it here as well though.
alexcrichton created PR review comment:
Could this be updated to match the canonical definition which aligns-up the summation of the discriminant and the max variant size rather than aligning-up each individual summation? I think it probably ends up being the same thing but figure it may be best to align (heh) with the spec where we can.
alexcrichton created PR review comment:
Similar comment as to above about aligning the result of the summation instead of each individual component to match the spec (although again I don't think it affects the results)
alexcrichton created PR review comment:
One thing I did for the
fact-valid-module
fuzzer was to just specify this as ausize
as well.
alexcrichton created PR review comment:
Could the seed be printed here to assist debugging in case the build script later returns an error?
alexcrichton created PR review comment:
Oh right sorry yes I forgot that we need the component on-hand to generate the values. Yeah in that case the refactor isn't worth it and the current structure is good :+1:
alexcrichton submitted PR review.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej created PR review comment:
I was thinking a zero-sized tuple was just a
unit
and thus redundant. Should we also allow zero-sized records, variants, unions, etc.? I.e. should I remove theNonEmptyArray
type entirely?
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
We should allow zero-size records, yeah, but variants/unions/enums all require at least one case so the
NonEmptyArray
there is correct. You're right that in Rust an empty tuple will translate to a()
, but for fuzzing here I think it's best to exercise as many cases as possible.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej created PR review comment:
Would you mind sketching out an example of how your proposed API would be used? I'm specifically interested in where the user would get the type of the function to pass to
component::Linker::func_new
. Would they query theComponent
for the things it imports, find the function of interest, and get the type from that?
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Yeah that is indeed a good question. In core wasm users would do
FuncType::new(...)
and create their own function type. In the fullness of time I'd expect the same operation to be available to components where function types could be manufactured in addition to being learned from components. That's a big missing piece of the embedding API for components right now though so, yeah, getting it from a component seems the most reasonable for now.
dicej updated PR #4537 from component-fuzz-generator
to main
.
dicej created PR review comment:
@alexcrichton I've just pushed an update per your suggested API. Let me know what you think.
dicej submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Personally I think that this needs to change in the future to taking the function type as the argument. That being said this works well enough for now and it's ok to extend this later, so I think this should stay as-is.
Otherwise though eventually this will require filling out more of the
component::types
submodule in the same manner aswasmtime::types
which is able to describe imports/exports/etc. I think the implementation will be roughly the same as what you have already withArc<Types>
andFooIndex
paired together, but again it's ok to defer this to later.
alexcrichton created PR review comment:
Eventually I think this will need to be filled out when the function type is an argument to
func_new
. Additionally for now I think it would be good to go ahead and fill out to have an "extra layer" of defense against bugs.For now I think the
new_dynamic
here could take aTypeFuncIndex
and the implementation could basically test that the two function indices are the same (since everything comes from within the same component right now)
alexcrichton created PR review comment:
Given that
func_new
is somewhat rare and already fairly verbose I think it's ok to skip the variant-without-StoreContextMut
and go straight to requiring the argument duringfunc_new
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Similar to
TypedFunc::call
vsFunc::call
, is it possible to share this code with the existing entrypoint above? Given how different the parameters are represented it may be too onerous though.
alexcrichton created PR review comment:
Above the 2x2 matrix of "params direct/indirect" x "results direct/indirect" is expressed with 4 different cases but the main reason for that is to get all the fiddly bits around the types to work out. Otherwise here though this may be more amenable to a straightforward "load the params, do the call, store the result" without having to specify 4 different cases (which leads to a fair bit of duplication between everything)
alexcrichton created PR review comment:
Are these new
'static
bounds still required?
alexcrichton created PR review comment:
This looks like it's basically copying the results of
ty
, but couldty
be used as-is? The pairing ofindex: TypeFuncIndex
and&Arc<Types>
I think should be enough to type-check everything
alexcrichton created PR review comment:
Could this specify both the rhs and the lhs of the transmute? (stylistically I prefer to do that given how dangerous
transmute
is in general)
dicej created PR review comment:
We need to represent them as
component::Type
s incall_host_dynamic
in order to useType::flatten_count
,Type::next_field
,Val::lift
,Val::store
, etc. I figured since we'd eventually convert them tocomponent::Type
s anyway, might as well do it once here rather than every timecall_host_dynamic
is invoked.
dicej submitted PR review.
dicej submitted PR review.
dicej created PR review comment:
That was my first thought, but I couldn't see an elegant way to do it. With
Func::call
I was able to reusecall_raw
by using[ValRaw: MAX_FLAT_PARAMS]
as theLowerParams
generic parameter. I can't use that trick withcall_host
, though, since the space for the parameters has already been allocated by the caller.There probably is a way to refactor
call_host
to work in both contexts, but I expect it would be pretty ugly.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ok that seems reasonable, this can be iterated on later if necessary.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Ok no worries, but in that case can you duplicate some of the more interesting test cases for imports for
func_new
-defined imports as well?
dicej created PR review comment:
It will be required now that we're making
HostFunc::typecheck
more than a no-op for the dynamic case, since I'll need to convert the type of that field back to aBox<dyn Fn...>
so it can store a closure which captures theTypeFuncIndex
.
dicej submitted PR review.
dicej updated PR #4537 from component-fuzz-generator
to main
.
alexcrichton submitted PR review.
dicej updated PR #4537 from component-fuzz-generator
to main
.
alexcrichton merged PR #4537.
Last updated: Jan 24 2025 at 00:11 UTC