wasmtime / PR #8142 PCC: fuzz static (no dynamic checks) ... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #8142 PCC: fuzz static (no dynamic checks) ...

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin opened PR #8142 from cfallin:fuzz-pcc to bytecodealliance:main:

This PR updates the fuzzing configuration generation logic to turn on proof-carrying code (PCC) when randomly selected. When PCC is on, it forces the memory configuration to one that PCC can handle: a statically-sized 4GiB memory with 2GiB guard region, so no dynamic checks occur. I have fuzzed this for ~100 CPU-hours so far (on x86-64) and haven't found anything, with a local change that forces the PCC arbitrary-bool to true to concentrate fuzzing effort in just the new bit. (I've confirmed that manually introducing failures into the PCC code does properly bubble up.)

The commits start with one that updates the pcc_memory integration test to put all the various loads together in one body such that the optimizer will merge some of their address-computation logic. This was the root of failures I was seeing when last working on PCC in the fall, and that still exist for dynamic-check cases. A subsequent commit reduces the coverage of that test to only the 4GiB/2GiB static case for now so it passes; in this PR I am arguing that we should support this and push it forward first, and fix the dynamic case later (more below).

This PR includes rebases of two commits from @elliottt's #7511, and then a last change to force the memory configuration to the supported static one only when PCC is selected by fuzzing.

Why only the static case? I've made approximately three attempts to get the general dynamic case to work, be fully general, handle all edge-cases, and be scalable (not blow up quadratically): combined static+dynamic ranges in all facts, a more flexible set-of-lower/upper-bound-expressions engine (#7965), and just recently an implementation of the "order graph" I described in the last comment on that PR. All of these suffer from some combination of maddening complexity and a quadratic worst-case when N accesses merge together and we have to symbolically reason about all of them together -- to the point that I know when to cut my losses and would rather ship the static case without digging further. I do think there's maybe something more possible, but it needs more thought and time when not under pressure (and under demand by a few other projects too).

Assuming this sticks and fuzzbugs if any are manageable, I plan to eventually propose an "on but permissive by default" approach (with warnings but no failures), then eventually an "on and enforcing by default", but all of those things will come after more experience and discussions with folks. For now I just want the fuzzers to do their thing!

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin requested wasmtime-compiler-reviewers for a review on PR #8142.

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin requested abrown for a review on PR #8142.

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin requested alexcrichton for a review on PR #8142.

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin requested wasmtime-core-reviewers for a review on PR #8142.

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:13):

cfallin requested elliottt for a review on PR #8142.

Wasmtime GitHub notifications bot (Mar 14 2024 at 23:44):

github-actions[bot] commented on PR #8142:

Subscribe to Label Action

cc @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "fuzzing"

Thus the following users have been cc'd because of the following labels:

fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (Mar 15 2024 at 00:41):

jameysharp submitted PR review:

This looks sensible to me! I'm excited to start having PCC in our testing rotation.

Wasmtime GitHub notifications bot (Mar 15 2024 at 02:07):

cfallin merged PR #8142.

Wasmtime GitHub notifications bot (Mar 24 2024 at 17:21):

alexcrichton commented on PR #8142:

I have fuzzed this for ~100 CPU-hours so far (on x86-64) and haven't found anything

I wanted to query about this, are you sure this was on x86-64? I'm pretty surprised how quickly so many fuzz bugs are being found related to pcc on oss-fuzz if this much fuzzing was done before this PR. That makes me worry about the effectiveness of local fuzzing vs oss-fuzz.

Are you able to perhaps double-check to make sure that this PR was fuzzed as-is on x64?

Wasmtime GitHub notifications bot (Mar 24 2024 at 20:36):

cfallin commented on PR #8142:

I wanted to query about this, are you sure this was on x86-64?

Yes.

Are you able to perhaps double-check to make sure that this PR was fuzzed as-is on x64?

Yes, it was.

Wasmtime GitHub notifications bot (Mar 24 2024 at 21:41):

alexcrichton commented on PR #8142:

Which fuzzer were you using? Checking out the merge commit here it takes ~10 seconds locally from an empty corpus to find an error with the instantiate fuzzer for me.

Wasmtime GitHub notifications bot (Mar 24 2024 at 22:13):

cfallin commented on PR #8142:

differential; I just verified that, on an x86-64 machine (18-core/36-thread Xeon), with commit acbbf342b704d88b87312beda1c1a3b45b0ba283 (the merge commit for this PR), the command cargo +nightly fuzz run -s none differential -j32 (same as I tested before) runs for quite a while without finding anything.

so

-s none is my usual thing to get better throughput, because a compiler logic bug shouldn't require ASan to find (tell me if this is wrong?), and

differential maybe is slower because it actually runs the code? or is throwing away test cases?

As mentioned above I did verify it was hitting PCC codepaths because injected bugs were found.

I do see pretty low throughput, e.g. from early in the run
#87140: cov: 69340 ft: 55743 corp: 1144 exec/s 56 oom/timeout/crash: 0/0/0 time: 91s job: 55 dft_time: 0
or 56 executions/second. Maybe my mistake is not using instantiate; I use differential out of {my, my commandline history database}'s habit, mostly...

Wasmtime GitHub notifications bot (Mar 24 2024 at 22:22):

cfallin commented on PR #8142:

(For completeness, I ran differential for ~10 minutes before killing, on 32 threads; no fuzzbugs; reached 1.4M testcase executions)

To add to that, I too get a pretty immediate fuzzbug (after 8 seconds) when running the instantiate target; so indeed, I'll eschew differential entirely unless its comparison powers are actually needed.

Wasmtime GitHub notifications bot (Mar 25 2024 at 14:27):

alexcrichton commented on PR #8142:

Weird! I wonder if your preexisting corpus might be holding you back somehow?

I'm having weird issues running v8 locally where it causes a segfault and the reproduction case never reproduces. If I do ALLOWED_ENGINES=-v8 then locally I find a fuzz bug on the merge commit pretty quickly.

I'm not sure how to explain the fuzzing behavior here... but regardless glad oss-fuzz is running all the fuzzers still and has proven worthwhile!

I was mostly worried that we had some bad mutations or ... something ... but I dunno if I can really draw any conclusion from this...

Last updated: Apr 18 2025 at 19:03 UTC