Hi all -- I'm looking at how to strengthen the fuzzing story for the new CL backend. As part of this, I think we should be running the regalloc checker-based fuzz targets on some continuous basis. Wasmtime is already on Google's oss-fuzz but a targeted setup for regalloc with the specific dataflow-checker oracle would be much more powerful, I think.
So, what do folks think about applying to have regalloc.rs be a part of oss-fuzz? cc @fitzgen (he/him) @Benjamin Bouvier @Julian Seward
if we are already moving regalloc.rs into the same repo, I think it makes more sense to just wait for that to happen, at which point we should get the fuzzing on oss-fuzz naturally
Ah, so I think the latest story on that is that we want to keep it in a separate crate
(we will have to move the fuzz target definitions themselves into the top level fuzz crate. the actual fuzz target definition can simply be a function call to the "real" fuzz target, similar to how a bunch of our existing fuzz targets work)
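(a rough sketch of that call-through shape, in case it helps. The `regalloc_fuzzing` module and `check_allocation` are made-up stand-ins, not the real regalloc.rs API; in an actual libFuzzer target the body of `regalloc_checker_target` would sit inside `libfuzzer_sys::fuzz_target!`)

```rust
// Hypothetical stand-in for the "real" fuzz target that would live in the
// regalloc crate. The module name, function name, and signature are
// assumptions for illustration only.
pub mod regalloc_fuzzing {
    /// Pretends to decode a test case and run the checker on it.
    /// Returns false when the input is too short to decode anything.
    pub fn check_allocation(data: &[u8]) -> bool {
        if data.len() < 4 {
            return false;
        }
        // A real target would build a function from `data`, run the
        // allocator, and panic if the checker rejects the result.
        true
    }
}

// What the target in the top-level fuzz crate would reduce to: a thin
// wrapper whose only job is to forward the raw bytes.
pub fn regalloc_checker_target(data: &[u8]) {
    let _ = regalloc_fuzzing::check_allocation(data);
}
```

the point is just that the top-level crate owns the entry point while the logic stays in the separate crate.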
oh for sure, it can still be a separate crate, but are we rethinking including it in the repo?
err, sorry, s/crate/repo/
I believe the main concerns were CI time and appearance of independence/reusability
in that case it probably makes sense to copy what we do for the wasm-tools fuzz targets
They shoehorn into wasmtime's oss-fuzz allocation somehow?
Excellent, this answers many questions, thanks! (I wasn't sure whether it would be ok to piggyback on wasmtime's fuzz time within the spirit of the rules but I suppose we use it anyway so this just targets fuzzing more precisely)
as an aside, while you're here -- do we have a good differential-fuzzing story for wasmtime yet? (I've just started digging through all the configs/targets)
we have something set up but it hasn't really found anything
I'd love to do old-backend vs. new-backend and new-backend vs. native (or at least the latter)
I think it is hard to use just coverage to guide differential fuzzing, I don't know
hmm, yeah, I haven't played with this too much before either
anyway I'll keep digging -- thanks!
https://github.com/bytecodealliance/wasmtime/blob/main/crates/fuzzing/src/oracles.rs#L119
to reuse this approach with new vs old backend, we would need a way to compile both the old and new backends and be able to choose on a per-module basis which one we were using
Yes, I suppose I was envisioning something much hackier that would drive two wasmtime binaries from a shell script and compare outputs, or something like that
I think one would need something similar for wasmtime vs. native
iirc this is how csmith-based fuzzing works but I haven't looked at that in a long time...?
yeah that would probably be better than exposing backends in our config, but it also doesn't really fit into the oss-fuzz model, which is essentially "we give them a libfuzzer-using binary" and doesn't really work well with external processes
yeah, I don't think you can link csmith as a library, I think you have to spawn it. maybe worth doing a quick grep over oss-fuzz to see if anything is using csmith, and if so then we could probably spawn our own processes too
would ideally have something running in-process so that libfuzzer gets that coverage feedback
I suppose one could shell out to csmith and the compiler and then dlopen the result for the native half of the comparison
compilation happens in toplevel process so all the coverage guidance is driven at least from that
ok, I'll play with this more (and look at what other folks may be doing). thanks!
Adding regalloc to our config should be pretty easy, I'd recommend modeling after what wasm-tools is doing
for diffing the old/new backend I think that'd be pretty easy so long as they were both compiled into cranelift, we'd need some sort of Config option to select and you could diff the result of two Instances within two stores
one store with the old backend and one with the new
Hmm, actually, I think this is the best way to go -- differential execution-backend-X on different Stores; we could diff against a wasm bytecode-level interpreter at some point, too. Would certainly give more throughput than shelling out to a compiler
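(to make the in-process differential idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `Backend` trait, `Reference`, and `Narrow` are toy stand-ins for "a Store configured with backend X", not Wasmtime types, and `Narrow` has a planted bug so there is something to find)

```rust
/// Result of running one test case on one backend.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Value(u32),
    Trap(&'static str),
}

/// Hypothetical abstraction over an execution backend.
pub trait Backend {
    fn name(&self) -> &'static str;
    fn run(&self, input: &[u8]) -> Outcome;
}

/// Toy "known good" backend: sums the input bytes in 32 bits.
pub struct Reference;

/// Toy backend with a planted bug: the sum is truncated to 8 bits.
pub struct Narrow;

impl Backend for Reference {
    fn name(&self) -> &'static str {
        "reference"
    }
    fn run(&self, input: &[u8]) -> Outcome {
        if input.is_empty() {
            return Outcome::Trap("empty input");
        }
        let sum = input
            .iter()
            .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)));
        Outcome::Value(sum)
    }
}

impl Backend for Narrow {
    fn name(&self) -> &'static str {
        "narrow"
    }
    fn run(&self, input: &[u8]) -> Outcome {
        if input.is_empty() {
            return Outcome::Trap("empty input");
        }
        let sum = input.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
        Outcome::Value(u32::from(sum))
    }
}

/// Runs the same input on both backends and reports any divergence.
pub fn differential(a: &dyn Backend, b: &dyn Backend, input: &[u8]) -> Result<(), String> {
    let (ra, rb) = (a.run(input), b.run(input));
    if ra == rb {
        Ok(())
    } else {
        Err(format!(
            "{} gave {:?} but {} gave {:?} on input {:?}",
            a.name(),
            ra,
            b.name(),
            rb,
            input
        ))
    }
}
```

since both sides run in the fuzzer's own process, libFuzzer still sees coverage from both, and a fuzz target would just panic on the `Err` case.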
+1 for fuzzing regalloc, thanks! (Maybe disable the linear scan fuzz target first; while it appears to be working, it's still experimental and i'd like to play with it more and change core parts of the algorithm.)
Regarding differential fuzzing, maybe there'll be some hardship if you want to compile old-x64 vs new-x64, because of the experimental_x64 feature.
And on the same topic of using that experimental_x64 feature: has the wasmtime fuzzing been done with this feature enabled, on oss-fuzz? It could probably find a few issues, if it hasn't ever been enabled.
We don't have experimental_x64 on oss-fuzz yet, no
being a cargo feature instead of a runtime feature we'd have to build the binaries twice I believe, which would just be an edit to the oss-fuzz build script
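(a small sketch of why that is, assuming the feature simply gates which backend gets compiled in. `compiled_backend`, `BackendChoice`, and `runtime_backend` are made-up names; only the feature name `experimental_x64` comes from this thread)

```rust
/// Compile-time selection: the answer is fixed when the binary is built,
/// so covering both backends means two builds (with and without the feature).
#[allow(unexpected_cfgs)]
pub fn compiled_backend() -> &'static str {
    if cfg!(feature = "experimental_x64") {
        "new-x64"
    } else {
        "old-x64"
    }
}

/// Hypothetical runtime selection: one binary could exercise both
/// backends, which is what in-process differential fuzzing would need.
pub enum BackendChoice {
    Old,
    New,
}

pub fn runtime_backend(choice: BackendChoice) -> &'static str {
    match choice {
        BackendChoice::Old => "old-x64",
        BackendChoice::New => "new-x64",
    }
}
```

with the cargo-feature approach, a single fuzz binary only ever sees one of the two strings from `compiled_backend`.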
Last updated: Dec 23 2024 at 12:05 UTC