fitzgen edited issue #10327.
fitzgen added the fuzzing label to Issue #10327.
fitzgen added the wasm-proposal:gc label to Issue #10327.
github-actions[bot] commented on issue #10327:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "fuzzing"Thus the following users have been cc'd because of the following labels:
- fitzgen: fuzzing
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
fitzgen edited issue #10327:
GC Fuzzing
This is a sketch of a series of milestones, roughly estimated to be about one PR each, for GC fuzzing. We don't need to do exactly these milestones in exactly this order. It is okay if, in the process of implementing them, we find reason to do milestone X before Y and decide to swap their order. It is okay if one milestone becomes three PRs, or two milestones become one PR. This is just a sketch to provide some light structure. Also, these milestones represent a lot of work, and while the goal is certainly to get everything here finished during your internship, if it turns out that this is more work that can be completed in that amount of time, that is okay.
Milestones:
[X] Port the
table_opsfuzzer fromarbitrarytomutatis
<details>
- Replacing
arbitrary-based generator with amutatis-based fuzzer- Oracle should be able to (largely) remain unmodified
- Avoid adding new features or changing existing functionality as much as possible
- Start handling the scenario where:
- We originally have a valid series of fuzzer ops A,
- then a random mutation inserts, removes, or changes an op in A to produce fuzzer ops A',
- but A' is now no longer a valid sequence of fuzzer ops (for example, an initial op whose result was the input for a subsequent op was removed, for example).
- Now we have to figure out how to handle invalid test cases:
- Do we throw it away and continue to the next one?
- Do we "patch up" the test case in a post-processing pass that fixes it up such that it becomes valid?
- I'm thinking we probably want to do a fix-up pass to avoid wasting cycles on test cases we won't execute, ultimately hurting our overall fuzzing throughput, but I'm not 100% sure that is the right choice, and I am all ears for counter arguments and alternative ideas.
- As far as
mutatis-related prior art goes, the allocator fuzzer handled this stuff in the oracle: it simply avoided performing the fuzzer's allocation operations when they did not make sense (i.e. it ignores the "deallocate pointer with id X" operation if there isn't currently a live allocation with id X): https://github.com/fitzgen/deallocate-zeroed/blob/52c2b9b7305dd1a947b3342128fcf0d03143223e/crates/fuzzing/src/lib.rs#L656- We want to always generate valid Wasm, otherwise we can't even execute the subset of ops that are valid, so I think we will need to do the equivalent filtering/fixing up in the generator rather than in the oracle.
- This fix-up pass could be its own pass, mutating the fuzzer ops AST structure directly, before we generate a Wasm binary from that AST structure. Or, alternatively, it could be integrated into Wasm-binary-from-AST generation step.
- Goal: Become familiar with
mutatisand move the fuzzer towards our north star vision for it
</details>[ ] Add support for empty
(rec ...)groups containing zero or more emptystructtypes.
<details>
- Add generation support for these
(rec ...)groups with zero or more emptystructtypes- Add a fuzzer op that allocates a random one of our
structtypes. We should avoid emitting this op when we have not defined anystructtypes.- Add a function import that takes a
structrefargument to our Wasm binary skeleton- Add a fuzzer op that calls that new function
- Add a function import for each defined
structtype that takes a reference of that type of struct specifically- Add a fuzzer op that calls these new typed functions
- Add support for generating locals, globals, and tables that contain
structrefs and concretely typedstructreferences- Add new or extend existing fuzzer ops for accessing locals/globals/tables of
structrefs and concretely typedstructreferences- Define a mutation that adds an empty
structtype to an existing(rec ...)group- Define a mutation that removes a
structtype from an existing(rec ...)group- Define a mutation that moves a
structtype within an existing(rec ...)group- Define a mutation that moves a
structtype from an existing(rec ...)group to another- Define a mutation that duplicates a
(rec ...)group- Define a mutation that removes a whole
(rec ...)group- Define a mutation that merges two
(rec ...)groups- Define a mutation that splits a
(rec ...)group in two, if possible- Will need to migrate the abstract representation of the value stack from a couner of how many
externrefs are on the stack to an actual stack of types- Note: there is a whole lot here, even just exercising empty
structdefinitions, so it may make sense to split this up into a few PRs- Goal: start exercising type canonicalization, lay down initial infrastructure for
(rec ...)groups andstructtypes.
</details>[ ] Add support for exercising subtyping relationships
<details>
- Add support for on
structtype to be a subtype of anotherstructtype- Subtypes must be defined after their supertypes, so this will require we do some validation/filtering/fixups in the face of mutations that reorder types. (This restriction prevents cyclic subtyping where A subtypes B and B subtypes A.)
- Want to think hard about how we represent types, rec groups, and type references in the AST at this point and how we are architecting the program. These choices are going to have a big impact on everything we do from here on out.
- Do we just have a list of rec groups that directly reflect the order we will put them into the Wasm binary?
- Do we create arbitrary type graphs and then do a topological sort on them to ensure they are valid? If we do this approach, what happens in the future when we find a cycle that is not within the same rec group? Or do we not even have rec groups here, just types, and then generate rec groups as part of the projection out from the graph?
- Do we reference types by id and ignore references when there is not type associated with that id?
- Or do we simply reference types by their type index, as in Wasm? And then ignore invalid references?
- Goal: Continue to fill out type system features that we exercise, figure out the architecture we want for dealing with references between types.
</details>[ ] Add support for casting and testing references
<details>
- Add a fuzzer op for casts that should always succeed (from a subtype to a supertype). This should become a
ref.castinstruction in the Wasm binary.Add a fuzzer op for fallible downcasts. This should not trap and kill the whole program if the downcast fails. In the fullness of time, we will want to generate blocks that do things with the downcasted value that are only executed if the downcast succeeds, something like this:
```
block (param ...)
;; Break out of this block if we fail to cast the reference on top of the
;; stack to its subtype.
ref.br_on_cast_fail ...;; Do stuff with the subtype reference...
end;; Control joins again here and the program continues.
```And we will want to exercise both the
br_on_cast_failandbr_on_castinstructions, which will require slightly different shapes of blocks.Initially, we can simply downcast to a nullable subtype reference, where we get null if the downcast fails:
block (param (ref null $my_super)) (result (ref null $my_sub)) local.tee $my_temp ;; Test if the downcast will succeed. ref.test ... if (result (ref null $my_sub)) ;; The downcast will succeed, do a downcast-or-trap operation which we ;; know will not trap. local.get $my_temp ref.cast ... else ;; The downcast would fail, so just create a null reference instead. ref.null ... end endGoal: Exercise subtyping checks and casting logic.
</details>[ ] Add support for POD and
externreffields instructs
<details>
- Add generation and mutation support for
i8/i16/i32/i64/f32/f64/v128/externrefstruct fields. Reference fields will be trickier and can be added in another milestone.- Add fuzzer ops for
struct.getandstruct.set- Maybe: Add fuzzer ops for creating POD constants? Or should we just create zero constants on demand when necessary in the AST-to-Wasm-binary generation step? We aren't trying to exercise arithmetic in this fuzzer, that is well exercised by other fuzzers already, so the only interesting case for POD data is really when it comes from another
struct's field. Other than that interesting case, we just need a some value in order to allocate a struct containing these values.- Add mutators to add, remove, and modify a field in a defined
structtype- Goal: start exercising accessing, reading from, and writing to GC objects
</details>[ ] Add support for concretely-typed references as
structfields
<details>
- Extend the items from the previous milestone to full type references between different defined
structtypes- Also add support for abstract
structreftypes- Non-nullable type references will be tricky: if we have a cycle of non-nullable references, then those types are not currently inhabitable.
- Can detect and correct this in our fix-up pass, perhaps
- Okay to punt on this in a first PR, but we definitely would want to start handling them in a closely following PR. This is an important corner of the spec, and we perform optimizations based on non-nullability in the compiler, and one that none of our other fuzzers are exercising this right now (wasm-smith does not support non-nullable references yet).
- This phase is when we will really start seeing the impact of our architecture choices around how to represent types and references between them in the AST.
- Goal: start exercising references bet
[message truncated]
fitzgen edited issue #10327:
GC Fuzzing
This is a sketch of a series of milestones, roughly estimated to be about one PR each, for GC fuzzing. We don't need to do exactly these milestones in exactly this order. It is okay if, in the process of implementing them, we find reason to do milestone X before Y and decide to swap their order. It is okay if one milestone becomes three PRs, or two milestones become one PR. This is just a sketch to provide some light structure. Also, these milestones represent a lot of work, and while the goal is certainly to get everything here finished during your internship, if it turns out that this is more work that can be completed in that amount of time, that is okay.
Incremental Milestones
[X] Port the
table_opsfuzzer fromarbitrarytomutatis
<details>
- Replacing
arbitrary-based generator with amutatis-based fuzzer- Oracle should be able to (largely) remain unmodified
- Avoid adding new features or changing existing functionality as much as possible
- Start handling the scenario where:
- We originally have a valid series of fuzzer ops A,
- then a random mutation inserts, removes, or changes an op in A to produce fuzzer ops A',
- but A' is now no longer a valid sequence of fuzzer ops (for example, an initial op whose result was the input for a subsequent op was removed, for example).
- Now we have to figure out how to handle invalid test cases:
- Do we throw it away and continue to the next one?
- Do we "patch up" the test case in a post-processing pass that fixes it up such that it becomes valid?
- I'm thinking we probably want to do a fix-up pass to avoid wasting cycles on test cases we won't execute, ultimately hurting our overall fuzzing throughput, but I'm not 100% sure that is the right choice, and I am all ears for counter arguments and alternative ideas.
- As far as
mutatis-related prior art goes, the allocator fuzzer handled this stuff in the oracle: it simply avoided performing the fuzzer's allocation operations when they did not make sense (i.e. it ignores the "deallocate pointer with id X" operation if there isn't currently a live allocation with id X): https://github.com/fitzgen/deallocate-zeroed/blob/52c2b9b7305dd1a947b3342128fcf0d03143223e/crates/fuzzing/src/lib.rs#L656- We want to always generate valid Wasm, otherwise we can't even execute the subset of ops that are valid, so I think we will need to do the equivalent filtering/fixing up in the generator rather than in the oracle.
- This fix-up pass could be its own pass, mutating the fuzzer ops AST structure directly, before we generate a Wasm binary from that AST structure. Or, alternatively, it could be integrated into Wasm-binary-from-AST generation step.
- Goal: Become familiar with
mutatisand move the fuzzer towards our north star vision for it
</details>[ ] Add support for empty
(rec ...)groups containing zero or more emptystructtypes.
<details>
- Add generation support for these
(rec ...)groups with zero or more emptystructtypes- Add a fuzzer op that allocates a random one of our
structtypes. We should avoid emitting this op when we have not defined anystructtypes.- Add a function import that takes a
structrefargument to our Wasm binary skeleton- Add a fuzzer op that calls that new function
- Add a function import for each defined
structtype that takes a reference of that type of struct specifically- Add a fuzzer op that calls these new typed functions
- Add support for generating locals, globals, and tables that contain
structrefs and concretely typedstructreferences- Add new or extend existing fuzzer ops for accessing locals/globals/tables of
structrefs and concretely typedstructreferences- Define a mutation that adds an empty
structtype to an existing(rec ...)group- Define a mutation that removes a
structtype from an existing(rec ...)group- Define a mutation that moves a
structtype within an existing(rec ...)group- Define a mutation that moves a
structtype from an existing(rec ...)group to another- Define a mutation that duplicates a
(rec ...)group- Define a mutation that removes a whole
(rec ...)group- Define a mutation that merges two
(rec ...)groups- Define a mutation that splits a
(rec ...)group in two, if possible- Will need to migrate the abstract representation of the value stack from a couner of how many
externrefs are on the stack to an actual stack of types- Note: there is a whole lot here, even just exercising empty
structdefinitions, so it may make sense to split this up into a few PRs- Goal: start exercising type canonicalization, lay down initial infrastructure for
(rec ...)groups andstructtypes.
</details>[ ] Add support for exercising subtyping relationships
<details>
- Add support for on
structtype to be a subtype of anotherstructtype- Subtypes must be defined after their supertypes, so this will require we do some validation/filtering/fixups in the face of mutations that reorder types. (This restriction prevents cyclic subtyping where A subtypes B and B subtypes A.)
- Want to think hard about how we represent types, rec groups, and type references in the AST at this point and how we are architecting the program. These choices are going to have a big impact on everything we do from here on out.
- Do we just have a list of rec groups that directly reflect the order we will put them into the Wasm binary?
- Do we create arbitrary type graphs and then do a topological sort on them to ensure they are valid? If we do this approach, what happens in the future when we find a cycle that is not within the same rec group? Or do we not even have rec groups here, just types, and then generate rec groups as part of the projection out from the graph?
- Do we reference types by id and ignore references when there is not type associated with that id?
- Or do we simply reference types by their type index, as in Wasm? And then ignore invalid references?
- Goal: Continue to fill out type system features that we exercise, figure out the architecture we want for dealing with references between types.
</details>[ ] Add support for casting and testing references
<details>
- Add a fuzzer op for casts that should always succeed (from a subtype to a supertype). This should become a
ref.castinstruction in the Wasm binary.Add a fuzzer op for fallible downcasts. This should not trap and kill the whole program if the downcast fails. In the fullness of time, we will want to generate blocks that do things with the downcasted value that are only executed if the downcast succeeds, something like this:
```
block (param ...)
;; Break out of this block if we fail to cast the reference on top of the
;; stack to its subtype.
ref.br_on_cast_fail ...;; Do stuff with the subtype reference...
end;; Control joins again here and the program continues.
```And we will want to exercise both the
br_on_cast_failandbr_on_castinstructions, which will require slightly different shapes of blocks.Initially, we can simply downcast to a nullable subtype reference, where we get null if the downcast fails:
block (param (ref null $my_super)) (result (ref null $my_sub)) local.tee $my_temp ;; Test if the downcast will succeed. ref.test ... if (result (ref null $my_sub)) ;; The downcast will succeed, do a downcast-or-trap operation which we ;; know will not trap. local.get $my_temp ref.cast ... else ;; The downcast would fail, so just create a null reference instead. ref.null ... end endGoal: Exercise subtyping checks and casting logic.
</details>[ ] Add support for POD and
externreffields instructs
<details>
- Add generation and mutation support for
i8/i16/i32/i64/f32/f64/v128/externrefstruct fields. Reference fields will be trickier and can be added in another milestone.- Add fuzzer ops for
struct.getandstruct.set- Maybe: Add fuzzer ops for creating POD constants? Or should we just create zero constants on demand when necessary in the AST-to-Wasm-binary generation step? We aren't trying to exercise arithmetic in this fuzzer, that is well exercised by other fuzzers already, so the only interesting case for POD data is really when it comes from another
struct's field. Other than that interesting case, we just need a some value in order to allocate a struct containing these values.- Add mutators to add, remove, and modify a field in a defined
structtype- Goal: start exercising accessing, reading from, and writing to GC objects
</details>[ ] Add support for concretely-typed references as
structfields
<details>
- Extend the items from the previous milestone to full type references between different defined
structtypes- Also add support for abstract
structreftypes- Non-nullable type references will be tricky: if we have a cycle of non-nullable references, then those types are not currently inhabitable.
- Can detect and correct this in our fix-up pass, perhaps
- Okay to punt on this in a first PR, but we definitely would want to start handling them in a closely following PR. This is an important corner of the spec, and we perform optimizations based on non-nullability in the compiler, and one that none of our other fuzzers are exercising this right now (wasm-smith does not support non-nullable references yet).
- This phase is when we will really start seeing the impact of our architecture choices around how to represent types and references between them in the AST.
- Goal: start exercising
[message truncated]
fitzgen edited issue #10327:
GC Fuzzing
This is a sketch of a series of milestones, roughly estimated to be about one PR each, for GC fuzzing. We don't need to do exactly these milestones in exactly this order. It is okay if, in the process of implementing them, we find reason to do milestone X before Y and decide to swap their order. It is okay if one milestone becomes three PRs, or two milestones become one PR. This is just a sketch to provide some light structure. Also, these milestones represent a lot of work, and while the goal is certainly to get everything here finished during your internship, if it turns out that this is more work that can be completed in that amount of time, that is okay.
Incremental Milestones
[X] Port the
table_opsfuzzer fromarbitrarytomutatis
<details>
- Replacing
arbitrary-based generator with amutatis-based fuzzer- Oracle should be able to (largely) remain unmodified
- Avoid adding new features or changing existing functionality as much as possible
- Start handling the scenario where:
- We originally have a valid series of fuzzer ops A,
- then a random mutation inserts, removes, or changes an op in A to produce fuzzer ops A',
- but A' is now no longer a valid sequence of fuzzer ops (for example, an initial op whose result was the input for a subsequent op was removed, for example).
- Now we have to figure out how to handle invalid test cases:
- Do we throw it away and continue to the next one?
- Do we "patch up" the test case in a post-processing pass that fixes it up such that it becomes valid?
- I'm thinking we probably want to do a fix-up pass to avoid wasting cycles on test cases we won't execute, ultimately hurting our overall fuzzing throughput, but I'm not 100% sure that is the right choice, and I am all ears for counter arguments and alternative ideas.
- As far as
mutatis-related prior art goes, the allocator fuzzer handled this stuff in the oracle: it simply avoided performing the fuzzer's allocation operations when they did not make sense (i.e. it ignores the "deallocate pointer with id X" operation if there isn't currently a live allocation with id X): https://github.com/fitzgen/deallocate-zeroed/blob/52c2b9b7305dd1a947b3342128fcf0d03143223e/crates/fuzzing/src/lib.rs#L656- We want to always generate valid Wasm, otherwise we can't even execute the subset of ops that are valid, so I think we will need to do the equivalent filtering/fixing up in the generator rather than in the oracle.
- This fix-up pass could be its own pass, mutating the fuzzer ops AST structure directly, before we generate a Wasm binary from that AST structure. Or, alternatively, it could be integrated into Wasm-binary-from-AST generation step.
- Goal: Become familiar with
mutatisand move the fuzzer towards our north star vision for it</details>
[ ] Add support for empty
(rec ...)groups containing zero or more emptystructtypes.
<details>
- Add generation support for these
(rec ...)groups with zero or more emptystructtypes- Add a fuzzer op that allocates a random one of our
structtypes. We should avoid emitting this op when we have not defined anystructtypes.- Add a function import that takes a
structrefargument to our Wasm binary skeleton- Add a fuzzer op that calls that new function
- Add a function import for each defined
structtype that takes a reference of that type of struct specifically- Add a fuzzer op that calls these new typed functions
- Add support for generating locals, globals, and tables that contain
structrefs and concretely typedstructreferences- Add new or extend existing fuzzer ops for accessing locals/globals/tables of
structrefs and concretely typedstructreferences- Define a mutation that adds an empty
structtype to an existing(rec ...)group- Define a mutation that removes a
structtype from an existing(rec ...)group- Define a mutation that moves a
structtype within an existing(rec ...)group- Define a mutation that moves a
structtype from an existing(rec ...)group to another- Define a mutation that duplicates a
(rec ...)group- Define a mutation that removes a whole
(rec ...)group- Define a mutation that merges two
(rec ...)groups- Define a mutation that splits a
(rec ...)group in two, if possible- Will need to migrate the abstract representation of the value stack from a couner of how many
externrefs are on the stack to an actual stack of types- Note: there is a whole lot here, even just exercising empty
structdefinitions, so it may make sense to split this up into a few PRs- Goal: start exercising type canonicalization, lay down initial infrastructure for
(rec ...)groups andstructtypes.</details>
[ ] Add support for exercising subtyping relationships
<details>
- Add support for on
structtype to be a subtype of anotherstructtype- Subtypes must be defined after their supertypes, so this will require we do some validation/filtering/fixups in the face of mutations that reorder types. (This restriction prevents cyclic subtyping where A subtypes B and B subtypes A.)
- Want to think hard about how we represent types, rec groups, and type references in the AST at this point and how we are architecting the program. These choices are going to have a big impact on everything we do from here on out.
- Do we just have a list of rec groups that directly reflect the order we will put them into the Wasm binary?
- Do we create arbitrary type graphs and then do a topological sort on them to ensure they are valid? If we do this approach, what happens in the future when we find a cycle that is not within the same rec group? Or do we not even have rec groups here, just types, and then generate rec groups as part of the projection out from the graph?
- Do we reference types by id and ignore references when there is not type associated with that id?
- Or do we simply reference types by their type index, as in Wasm? And then ignore invalid references?
- Goal: Continue to fill out type system features that we exercise, figure out the architecture we want for dealing with references between types.
</details>
[ ] Add support for casting and testing references
<details>
- Add a fuzzer op for casts that should always succeed (from a subtype to a supertype). This should become a
ref.castinstruction in the Wasm binary.Add a fuzzer op for fallible downcasts. This should not trap and kill the whole program if the downcast fails. In the fullness of time, we will want to generate blocks that do things with the downcasted value that are only executed if the downcast succeeds, something like this:
```
block (param ...)
;; Break out of this block if we fail to cast the reference on top of the
;; stack to its subtype.
ref.br_on_cast_fail ...;; Do stuff with the subtype reference...
end;; Control joins again here and the program continues.
```And we will want to exercise both the
br_on_cast_failandbr_on_castinstructions, which will require slightly different shapes of blocks.Initially, we can simply downcast to a nullable subtype reference, where we get null if the downcast fails:
block (param (ref null $my_super)) (result (ref null $my_sub)) local.tee $my_temp ;; Test if the downcast will succeed. ref.test ... if (result (ref null $my_sub)) ;; The downcast will succeed, do a downcast-or-trap operation which we ;; know will not trap. local.get $my_temp ref.cast ... else ;; The downcast would fail, so just create a null reference instead. ref.null ... end endGoal: Exercise subtyping checks and casting logic.
</details>
[ ] Add support for POD and
externreffields instructs
<details>
- Add generation and mutation support for
i8/i16/i32/i64/f32/f64/v128/externrefstruct fields. Reference fields will be trickier and can be added in another milestone.- Add fuzzer ops for
struct.getandstruct.set- Maybe: Add fuzzer ops for creating POD constants? Or should we just create zero constants on demand when necessary in the AST-to-Wasm-binary generation step? We aren't trying to exercise arithmetic in this fuzzer, that is well exercised by other fuzzers already, so the only interesting case for POD data is really when it comes from another
struct's field. Other than that interesting case, we just need a some value in order to allocate a struct containing these values.- Add mutators to add, remove, and modify a field in a defined
structtype- Goal: start exercising accessing, reading from, and writing to GC objects
</details>
[ ] Add support for concretely-typed references as
structfields
<details>
- Extend the items from the previous milestone to full type references between different defined
structtypes- Also add support for abstract
structreftypes- Non-nullable type references will be tricky: if we have a cycle of non-nullable references, then those types are not currently inhabitable.
- Can detect and correct this in our fix-up pass, perhaps
- Okay to punt on this in a first PR, but we definitely would want to start handling them in a closely following PR. This is an important corner of the spec, and we perform optimizations based on non-nullability in the compiler, and one that none of our other fuzzers are exercising this right now (wasm-smith does not support non-nullable references yet).
- This phase is when we will really start seeing the impact of our architecture choices around how to represent types and references between them in the AST.
- Goal: s
[message truncated]
Last updated: Dec 13 2025 at 19:03 UTC