Stream: git-wasmtime

Topic: wasmtime / Issue #611 Initial Fuzzing Infrastructure


view this post on Zulip Wasmtime GitHub notifications bot (Apr 04 2020 at 06:21):

Hyperion101010 commented on Issue #611:

@fitzgen sir this was a gsoc2020 project idea, I worked in the application period and submitted a proposal. Given the time I had at I hand i wasn't able to get complete idea about the different vulnerabilities like ABI abstractions, Heap and Stack safety. I want to voluntarily contribute for the idea, but couldn't do the same before I clear out some doubts.
I would like to start understanding the fuzzing process more closely and contributing by writing fuzzers perhaps. During the application process I wrote mails for the project details, but I never got any reply which is completely fine given the situation we have now.
Is there any way we can do a conversation for the doubts I have, I see that there used to be a IRC channel for wasmtime one year ago, but now they migrated to Matrix which unfortunately doesn't has any such channel. If you are available on any channel of Mozilla/(other open source org) please let me know.
Good day!

view this post on Zulip Wasmtime GitHub notifications bot (Apr 04 2020 at 07:14):

bjorn3 commented on Issue #611:

https://bytecodealliance.zulipchat.com/ is the primary discussion channel.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 03 2021 at 20:43):

bjorn3 commented on Issue #611:

I think this can be closed.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 03 2021 at 20:51):

cfallin closed Issue #611:

I plan on laying out some foundational fuzzing infrastructure for Wasmtime in the next few weeks. I'd like to use this issue as a kind of meta issue to keep track of this work. I'd also appreciate feedback on the plan from anyone with experience fuzzing or domain knowledge of a particular thing we plan on fuzzing.

Goals

Strategy

Breadth not Depth

At least initially, let's build out a few different fuzzing approaches enough that they start identifying bugs, but not spend a ton of time building bespoke tools tailored for exactly the problems we have at hand.

My assumptions are that

  1. we have low-hanging fruit available, since we haven't done a ton of fuzzing for a bunch of corners yet, and
  2. different fuzzing approaches tend to uncover different sets of bugs.

Therefore, by making a bunch of different just-good-enough fuzzers, we will repeatedly discover new, unique low-hanging fruit bugs.

Additionally, this gives us a nice foundation that we can spring board off of in the future when we decide to go deeper in any particular direction.

Decouple Generators and Oracles

A generator creates test cases (usually given an RNG or a random byte stream input). An oracle determines if executing a test case uncovered a bug. In general, it is good software engineering to separate concerns, but separating these two parts specifically allows us to:

Implementation

In general, I recommend that we use libFuzzer to drive our fuzzing. It is coverage-guided, which means it can find interesting code paths more quickly than testing purely random inputs will. It also has a nice Rust interface in the form of cargo-fuzz.

Any custom generators we create should take libFuzzer-provided input bytes and then re-interpret that as a sequence of random values to drive choices inside the generator. This lets us combine the benefits of smart, structure-aware generators with those of coverage-guided fuzzing. We can implement this by implementing our custom generators in terms of the arbitrary crate's Arbitrary trait.

As far as test case reduction goes, when a generator is creating Wasm files, it should be relatively easy to use binaryen's wasm-reduce on the Wasm file, or use creduce on the WAT disassembly. We can, however, do some small things to make the process turnkey:

For generators that are creating custom in-memory data structures by implementing the Arbitrary trait, test case reduction requires we implement some custom logic. The Arbitrary trait supports defining a custom shrink method that takes &self and returns an iterator of smaller instances of Self. We can use this to create custom test case reduction for each of our custom test case generators.

Finally, any custom generator we create (and any generator we wrap that supports turning the generation of individual test case features on/off) should support swarm testing. Swarm testing is where we randomly turn on/off the generation of various test case features (such as, should a generator create Wasm test cases that use call_indirect or not?) so that we are more likely to generate pathological test cases where bugs are more likely to be found. This is relatively easy implement and should yield

Fuzzing Wasmtime's Embedding API

This is a case where, unfortunately, we can't really use existing off-the-shelf solutions.

Generators

Oracles

Wasm Execution Fuzzing

We should fuzz our execution of Wasm. Yes, Cranelift has some fuzzing in SpiderMonkey, but we should also make sure that all of our Wasmtime-specific JIT'ing machinery is well fuzzed, as well as our WASI implementation and sandboxing.

Generators

Oracles

More Stuff to Explore in the Future

Questions


Last updated: Oct 23 2024 at 20:03 UTC