Stream: git-wasmtime

Topic: wasmtime / issue #4322 Differential fuzzing with wasm-smi...


view this post on Zulip Wasmtime GitHub notifications bot (Jun 25 2022 at 16:29):

alexcrichton opened issue #4322:

Wasmtime recently had https://github.com/bytecodealliance/wasmtime/issues/4315 filed against it which discovered that there were two separate bugs in the SIMD implementation on x86_64. This discovery comes after "months of continuous oss-fuzzing" for the simd feature. I wanted to file an issue here with some investigation of why this happened because this theoretically should not happen.

Specifically here the bug was a buggy instruction lowering (two different ones). One fix (https://github.com/bytecodealliance/wasmtime/pull/4318) surfaced by corrupting an input register which I think only causes issues if the input is attempted to be reused elsewhere (e.g. a constant reused somewhere else). I don't know precisely but my impression was that this involved some register pressure, a "big" function, and constants to line up. This specific bug I could see as very difficult to discover via wasm-smith. The second bug, however, (https://github.com/bytecodealliance/wasmtime/pull/4317) was a trivial bug in the select instruction which showed up with the smallest of tests for select. The fact that wasm-smith never discovered this is alarming to me.

Digging in it appears to be a confluence of factors which makes wasm-smith basically unable to find these bugs:

I unfortunately don't know if there's really a "fix" for issues like this. We could throw a bunch more heuristics at wasm-smith but at some point we probably need a somewhat fundamental new strategy for fuzzing here to get significantly more coverage.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 26 2022 at 00:43):

cfallin commented on issue #4322:

I suspect that for individual instruction semantics, something like #3251 might be a reasonable solution. If we focus on generating test cases for individual Wasm instructions, we can drive the random code generation in the other direction: e.g. we have a v128.select, we know what the stack signature is, let's provide those inputs. Along the same lines, "unit-testing" individual instructions lets us more easily run through their input data-space with many test vectors (invocations) for one function.

Without coverage feedback libFuzzer may not know that some of those inputs are only sensitive to particular values (e.g. zero), but we could maybe bias probabilities too, if needed. (Custom entropy based on... similar bits? Closeness of some args to others? Surely the propcheck/quickcheck folks have some useful heuristics here...)

All the above won't give us coverage of multiple-instruction patterns (I think this objection was also raised in #3251) but unit-testing instruction lowerings seems like a sufficiently useful and unique use-case to me that it's worth its own fuzz target and approach...

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:38):

alexcrichton labeled issue #4322:

Wasmtime recently had https://github.com/bytecodealliance/wasmtime/issues/4315 filed against it which discovered that there were two separate bugs in the SIMD implementation on x86_64. This discovery comes after "months of continuous oss-fuzzing" for the simd feature. I wanted to file an issue here with some investigation of why this happened because this theoretically should not happen.

Specifically here the bug was a buggy instruction lowering (two different ones). One fix (https://github.com/bytecodealliance/wasmtime/pull/4318) surfaced by corrupting an input register which I think only causes issues if the input is attempted to be reused elsewhere (e.g. a constant reused somewhere else). I don't know precisely but my impression was that this involved some register pressure, a "big" function, and constants to line up. This specific bug I could see as very difficult to discover via wasm-smith. The second bug, however, (https://github.com/bytecodealliance/wasmtime/pull/4317) was a trivial bug in the select instruction which showed up with the smallest of tests for select. The fact that wasm-smith never discovered this is alarming to me.

Digging in it appears to be a confluence of factors which makes wasm-smith basically unable to find these bugs:

I unfortunately don't know if there's really a "fix" for issues like this. We could throw a bunch more heuristics at wasm-smith but at some point we probably need a somewhat fundamental new strategy for fuzzing here to get significantly more coverage.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:40):

alexcrichton commented on issue #4322:

I had forgotten about that issue! That's an excellent point though. I also filed https://github.com/bytecodealliance/wasmtime/issues/4338 as possible other wasm fuzzers we could integrate. Otherwise though other wasm-smith improvements include https://github.com/bytecodealliance/wasm-tools/issues/266.

While this may not be the most useful issue to keep open I'm tempted to leave it here as "if anyone searches the fuzzing tag in the Wasmtime repo this'll trigger them to think about improving wasm-smith"

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:50):

abrown commented on issue #4322:

Hey, I had completely forgotten about #3251 but I believe when I added allowed_instructions to wasm-smith it was motivated in part by that issue. The idea would be to configure wasm-smith to only allow could be configured to generate functions with, e.g., numeric and vector instructions (see InstructionKind for the categories allowed) and set max_instructions to, e.g., 3 (to test possible interactions with other instructions). With the small test case generated, we could then also generate instructions in the other direction @cfallin mentions, by randomizing the function arguments. What do you all think of that as a new fuzz target?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:50):

abrown edited a comment on issue #4322:

Hey, I had completely forgotten about #3251 but I believe when I added allowed_instructions to wasm-smith it was motivated in part by that issue. The idea would be to configure wasm-smith to generate functions with, e.g., numeric and vector instructions (see InstructionKind for the categories allowed) and set max_instructions to, e.g., 3 (to test possible interactions with other instructions). With the small test case generated, we could then also generate instructions in the other direction @cfallin mentions, by randomizing the function arguments. What do you all think of that as a new fuzz target?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:51):

fitzgen commented on issue #4322:

We could also add a "no control flow" mode to wasm-smith, which might end up pretty similar to the "test a single instruction" thing in practice.

In fact, @abrown added the InstructionKinds filtering and we don't really use that in our fuzz targets yet. Covers a lot of the same stuff. We could actually probably use this to avoid control instructions already.

I think we also only pass zeros in as arguments to whatever functions we generate, and we could probably do much better than that as well (at least try zero, max, and one random bit pattern).

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:51):

abrown edited a comment on issue #4322:

Hey, I had completely forgotten about #3251 but I believe when I added allowed_instructions to wasm-smith it was motivated in part by that issue. The idea would be to configure wasm-smith to generate functions with, e.g., numeric and vector instructions (see InstructionKind for the categories allowed) and set max_instructions to, e.g., 3 (to test possible interactions with other instructions). With the small test case generated, we could then also randomly generate in the value space that @cfallin mentions, by randomizing the function arguments. What do you all think of that as a new fuzz target?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:52):

fitzgen commented on issue #4322:

@abrown jinx ;)

What do you all think of that as a new fuzz target?

Yes, definitely.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:52):

alexcrichton commented on issue #4322:

One thing I think we'd also want to change about wasm-smith is the signatures of functions as well. Right now we drop all values that don't correspond to the function's type so differential fuzzing has a hard time picking these up. Ideally we want a mode where we generate the function body first and then we wrap that in a function of the appropriate type.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:54):

abrown commented on issue #4322:

Hm... how? I wasn't really aware of the limitations on function signatures.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:54):

fitzgen commented on issue #4322:

We just choose a signature, and then generate a body. Alex is suggesting we generate a body and then derive a signature from that.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:55):

fitzgen commented on issue #4322:

Makes sense for returns, not necessarily for parameters though, since generating a body needs to know what locals are available.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 20:56):

abrown commented on issue #4322:

Or the first instruction dictates what locals _must_ be available which should bubble up to the signature... This is the part that I'm having trouble seeing in wasm-smith. It doesn't seem designed for this kind of thing?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:17):

abrown commented on issue #4322:

Ok, how about this: we add SingleFunctionModule to wasm-smith which takes a Config, an Unstructured, and a Signature. The Signature could be something like:

pub struct Signature {
  params: Vec<ValTy>,
  returns: Vec<ValTy>,
}

ValTy would be some set-like struct that could be Arbitrary-generated and would cover the Wasm types (i32, f32, v128, etc.). So one could either manually set the parameters of the function (I think this has come up in the past as a need) OR one could generate this Signature with a range of types. (Or something like that... looking for feedback here).

The SingleFunctionModule would then... well, what _would_ it do? Generate instructions to use all of the arguments before we get to max_instructions? We don't have to re-local.get the arguments, right?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:34):

alexcrichton commented on issue #4322:

One idea is to perhaps:

I like the idea though of using a completely new dedicated SingleFunctionModule (or similar) generator for this. That module could also have a list of values to invoke the exported function with perhaps?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:35):

abrown commented on issue #4322:

So you mean "arbitrary list of parameters" not "arbitrary list of parameter types"?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 27 2022 at 21:39):

alexcrichton commented on issue #4322:

Oh sorry I meant types there, but I was thinking we could in actuality do:

whether that second step belongs in wasm-smith or wasmtime I dunno but either seems fine by me

view this post on Zulip Wasmtime GitHub notifications bot (Dec 02 2022 at 00:09):

alexcrichton commented on issue #4322:

The single-instruction generator from Andrew came out of this and nothing else has turned up in the meantime. Additionally Andrew did a lot of refactoring to have one differential fuzzer so I think we're in a better place now (hopefully). I'm going to subsequently close this.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 02 2022 at 00:09):

alexcrichton closed issue #4322:

Wasmtime recently had https://github.com/bytecodealliance/wasmtime/issues/4315 filed against it which discovered that there were two separate bugs in the SIMD implementation on x86_64. This discovery comes after "months of continuous oss-fuzzing" for the simd feature. I wanted to file an issue here with some investigation of why this happened because this theoretically should not happen.

Specifically here the bug was a buggy instruction lowering (two different ones). One fix (https://github.com/bytecodealliance/wasmtime/pull/4318) surfaced by corrupting an input register which I think only causes issues if the input is attempted to be reused elsewhere (e.g. a constant reused somewhere else). I don't know precisely but my impression was that this involved some register pressure, a "big" function, and constants to line up. This specific bug I could see as very difficult to discover via wasm-smith. The second bug, however, (https://github.com/bytecodealliance/wasmtime/pull/4317) was a trivial bug in the select instruction which showed up with the smallest of tests for select. The fact that wasm-smith never discovered this is alarming to me.

Digging in it appears to be a confluence of factors which makes wasm-smith basically unable to find these bugs:

I unfortunately don't know if there's really a "fix" for issues like this. We could throw a bunch more heuristics at wasm-smith but at some point we probably need a somewhat fundamental new strategy for fuzzing here to get significantly more coverage.


Last updated: Oct 23 2024 at 20:03 UTC