Stream: cranelift

Topic: regalloc-fuzzing


view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 15:50):

@Nick Fitzgerald I've set up fuzzing with libfuzzer and Arbitrary on our WIP replacement for regalloc, and I wanted to start a conversation about strategies to do this efficiently.

Modular register allocator algorithms. Contribute to bnjbvr/regalloc.rs development by creating an account on GitHub.

view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 15:56):

Right now, I've implemented a validator function that checks that the given input (constructed from random data bytes) is correct, and I run this before passing the generated input to the actually-useful test oracles. With coverage-guided fuzzing, libfuzzer seems to have found a way to create some valid inputs by luck, but they remain mostly identical: a function with a single block containing only a few instructions.

view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 15:57):

So my question is really about the best strategy i could use there: is it likely that libfuzzer may find the different allowed inputs (several blocks), or should i start to make my own generator so that i only generate valid test cases?

view this post on Zulip bjorn3 (Feb 21 2020 at 16:23):

For the Arbitrary impl on, for example, Label and Block, you may want to manually implement one that creates valid Strings even when the input is not UTF-8. For example, by reading 32 bits and converting that to a base64 string. This way the fuzzer doesn't have to "learn" what UTF-8 is.
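
A minimal sketch of that idea, not code from regalloc.rs: the helper name `label_from_bytes` is hypothetical, and hex formatting stands in for the base64 encoding mentioned above so the example needs only the standard library. Whatever bytes the fuzzer supplies, the resulting label is valid UTF-8.

```rust
/// Build an always-valid label from up to four raw fuzzer bytes.
/// Hypothetical helper; hex is used instead of base64 to stay std-only.
fn label_from_bytes(raw: &[u8]) -> String {
    // Read up to 32 bits, padding with zeros when the input is short.
    let mut buf = [0u8; 4];
    let n = raw.len().min(4);
    buf[..n].copy_from_slice(&raw[..n]);
    let id = u32::from_le_bytes(buf);
    // Formatting an integer always yields valid UTF-8.
    format!("L{:08x}", id)
}
```

In a hand-written Arbitrary impl, the four bytes would come from the fuzzer's raw input instead of a slice argument.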

view this post on Zulip bjorn3 (Feb 21 2020 at 16:24):

Another idea would be to store the name String in a side table, and use indexes everywhere instead. If you allow the side tables to be empty, you could skip them in the Arbitrary impl.

view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 16:27):

Ah, good point, this was one of the things i wanted to ask about, as well: is there a way the Arbitrary derive trait can ignore some fields and not generate these, by having the user provide a default value instead? All the string fields in the regalloc crate definitely belong to this category.

One way to emulate this would be to have a FuzzFunc data structure, which derives Arbitrary and contains only the fields we actually want to fuzz. It would also implement Into<Func> and fill in all the default values, e.g. for strings and the like.
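
A rough sketch of that wrapper pattern, under stated assumptions: the fields of `Func` and `FuzzFunc` here are invented for illustration and do not match the real regalloc.rs types, and the derive is left as a comment so the example compiles without the arbitrary crate.

```rust
/// Stand-in for the real function type, including a string field that
/// is not worth fuzzing. Field names are assumptions.
#[derive(Debug, PartialEq)]
struct Func {
    name: String,
    entry: u32,
    num_blocks: u32,
}

/// Only the fields worth fuzzing. In a real fuzz target, this is the
/// type that would carry `#[derive(Arbitrary)]`.
#[derive(Debug, Clone, Copy)]
struct FuzzFunc {
    entry: u32,
    num_blocks: u32,
}

impl From<FuzzFunc> for Func {
    fn from(f: FuzzFunc) -> Func {
        Func {
            // Default value supplied by hand instead of by the fuzzer.
            name: String::from("fuzz"),
            entry: f.entry,
            num_blocks: f.num_blocks,
        }
    }
}
```

Implementing `From<FuzzFunc> for Func` provides `Into<Func>` for free, so the fuzz target can call `.into()` on the generated value.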

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 17:50):

edit: from the mentions view, I didn't get to see the whole thread, deleting this comment and then reading backlog >.<

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 17:57):

@Benjamin Bouvier so you don't need to jump all the way to "custom generators" from here, you can explore doing targeted impl Arbitrary for X blocks by hand, for example to avoid generating irrelevant strings

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 17:59):

is the validation pass to make sure that (for example) we only use things that have been already defined? (I'm looking at things like BlockIx which at first blush looks like an index that needs to be valid)

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:01):

if so, then it makes sense to only have an Arbitrary implementation "one level up" at the thing that contains/defines all the valid entities/indices. we do this kind of thing in wasmtime here: https://github.com/bytecodealliance/wasmtime/blob/master/crates/fuzzing/src/generators/api.rs
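
To illustrate the "one level up" idea, here is a small std-only sketch. It is not taken from the wasmtime generator linked above: `Bytes` is a hand-rolled stand-in for `arbitrary::Unstructured`, and `Func`, `gen_func` and `jump_targets` are hypothetical names. The container decides how many blocks exist first, so every block index it produces is in range by construction.

```rust
/// Minimal byte cursor standing in for `arbitrary::Unstructured`.
struct Bytes<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Bytes<'a> {
    fn new(data: &'a [u8]) -> Self {
        Bytes { data, pos: 0 }
    }

    /// Next byte, or 0 once the input is exhausted.
    fn next(&mut self) -> u8 {
        let b = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        b
    }
}

/// Toy function: block `i` ends with a jump to block `jump_targets[i]`.
#[derive(Debug)]
struct Func {
    jump_targets: Vec<usize>,
}

fn gen_func(bytes: &mut Bytes) -> Func {
    // Decide how many blocks exist first (between 1 and 8)...
    let num_blocks = 1 + (bytes.next() as usize % 8);
    // ...then only ever pick targets in 0..num_blocks.
    let jump_targets = (0..num_blocks)
        .map(|_| bytes.next() as usize % num_blocks)
        .collect();
    Func { jump_targets }
}
```

Because the index type never gets its own generator, the fuzzer cannot produce a dangling block reference, and the validator no longer rejects inputs for that reason.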


view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:03):

regarding ignoring some fields: no we can't currently (and we can't ever fully ignore, we would need to either use a Default::default implementation or some other function)

I'll file an issue for this tho because I've also wanted it in the past and think it would be generally valuable

view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 18:04):

re: validation, yes, it's the same as Cranelift's verifier (checks that the IR is sane: blocks are not empty and must end with a control flow instruction, etc)
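
A minimal sketch of such a validator, assuming a toy IR: `Inst` and `validate` are hypothetical names, not the regalloc.rs API. The first two checks are the rules named above (no empty blocks, each block ends with a control flow instruction); the jump-target check is an extra assumption added for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    Add,
    Jump(usize),
    Ret,
}

impl Inst {
    fn is_control_flow(&self) -> bool {
        matches!(self, Inst::Jump(_) | Inst::Ret)
    }
}

/// Structural check over a function given as a list of blocks.
fn validate(blocks: &[Vec<Inst>]) -> Result<(), String> {
    for (i, block) in blocks.iter().enumerate() {
        match block.last() {
            None => return Err(format!("block {} is empty", i)),
            Some(last) if !last.is_control_flow() => {
                return Err(format!("block {} does not end with control flow", i));
            }
            Some(_) => {}
        }
        // Extra rule (assumption): jumps must refer to existing blocks.
        for inst in block {
            if let Inst::Jump(t) = inst {
                if *t >= blocks.len() {
                    return Err(format!("block {} jumps to missing block {}", i, t));
                }
            }
        }
    }
    Ok(())
}
```

A fuzz target would return early when `validate` fails and only hand accepted functions to the test oracles.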

view this post on Zulip Benjamin Bouvier (Feb 21 2020 at 18:05):

thanks for the pointers, i'll look into this early next week!

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:05):

no problem! happy to help, keep me in the loop :)

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:06):

hm... is there something more abstract than literal clif that the allocator can work on? like can it take in a set of constraints? might be easier to generate the constraints than actually valid clif

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:07):

if doing full clif, it may prove more fruitful to have the fuzz target take in String and seed the corpus with a bunch of valid clif files, fwiw

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 18:07):

worth trying at least

view this post on Zulip fitzgen (he/him) (Feb 21 2020 at 19:40):

Here is the issue I filed for "skipping" fields in #[derive(Arbitrary)]:

https://github.com/rust-fuzz/arbitrary/issues/33

Sometimes a field of a struct doesn't implement arbitrary and it is either impossible to do (because it is from another crate, for example) or undesired. We should support some kind of attribut...

view this post on Zulip Benjamin Bouvier (Feb 24 2020 at 18:09):

@Nick Fitzgerald is there a way to have libfuzzer/the harness record statistics on my behalf? say, if i wanted to count the number of test cases that were valid vs invalid (i.e. didn't trigger a panic, but resulted in an error during interpretation, for instance), and get an idea of how effective my fuzzing is.

view this post on Zulip fitzgen (he/him) (Feb 24 2020 at 18:12):

cargo fuzz and libFuzzer should stop once they discover a panic, so I think the answer is that all of the test cases run didn't trigger a panic

maybe I'm not following exactly what you're asking for though...

are you looking for "what % of test cases reached code location X?" where X is the branch for valid test cases?

view this post on Zulip Benjamin Bouvier (Feb 24 2020 at 18:16):

Context is I run the (now always valid) generated func in an IR interpreter, and interpreting can return errors, e.g. division by zero, infinite loops, etc. So in this case, the generated func is structurally valid, but not runnable. I'd like to get a rough estimate of the proportion of such test cases, vs test cases that can actually be interpreted, and thus can go through register allocation.

view this post on Zulip fitzgen (he/him) (Feb 24 2020 at 18:32):

I don't think there's a nice way to get this info out, unfortunately

the official libfuzzer docs recommend using clang code coverage visualization to get an idea of "how good the fuzzer is" but rustc doesn't support that right now :(

as a hacky workaround, you could try adding a panic!() to the start of the register allocation testing code path. or even just a println!("got to reg alloc") and count them with a CLI script
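
A variation on the println! idea, sketched with in-process counters instead of a CLI script. This is an assumption about how one might do it, not something cargo fuzz or libFuzzer provides; `record` and `reached_ratio` are hypothetical helpers that the fuzz target would call itself.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static REACHED_REGALLOC: AtomicUsize = AtomicUsize::new(0);
static INTERP_ERRORS: AtomicUsize = AtomicUsize::new(0);

/// Record the outcome of interpreting one generated function.
fn record(interp_result: &Result<(), String>) {
    match interp_result {
        Ok(()) => REACHED_REGALLOC.fetch_add(1, Ordering::Relaxed),
        Err(_) => INTERP_ERRORS.fetch_add(1, Ordering::Relaxed),
    };
}

/// Fraction of runs that reached register allocation, if any ran.
fn reached_ratio() -> Option<f64> {
    let ok = REACHED_REGALLOC.load(Ordering::Relaxed);
    let err = INTERP_ERRORS.load(Ordering::Relaxed);
    let total = ok + err;
    if total == 0 {
        None
    } else {
        Some(ok as f64 / total as f64)
    }
}
```

The target could print the ratio to stderr every few thousand runs. The counters live in one process, so when several fuzzing jobs run as separate processes, each reports its own numbers.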

view this post on Zulip Benjamin Bouvier (Feb 24 2020 at 18:47):

ok, thanks! had another idea: if i can eliminate most real OOMs, i can trigger fake OOMs by allocating very large vectors on paths where the test case is valid but not interpreted correctly, so they get displayed in the output of libfuzzer (when using multiple jobs). Quite hacky :slight_smile:

view this post on Zulip Benjamin Bouvier (Feb 25 2020 at 18:28):

@Nick Fitzgerald Hey, i'm debug-printing the Unstructured instance's length, and i see it's around 3 bytes in most of my test runs, which is not enough bytes to run interesting programs. Do you know how its size is computed, and if there are ways i can increase it?

view this post on Zulip Benjamin Bouvier (Feb 25 2020 at 18:29):

Generating mostly correct output is time consuming, and i get to around 10 execs/run only, so there might be some bad interactions there...

view this post on Zulip fitzgen (he/him) (Feb 25 2020 at 18:31):

@Benjamin Bouvier couple things:

view this post on Zulip Benjamin Bouvier (Feb 25 2020 at 18:34):

great, -max_len and -len_control is what i want. Thanks!

view this post on Zulip Benjamin Bouvier (Feb 25 2020 at 18:39):

Since arbitrary returns a Result, it would be pretty nice if cargo fuzz could show me a relative proportion of Err among generated inputs, to make it discoverable that there's something wrong with the size of the raw data bytes. Would it be feasible?

view this post on Zulip fitzgen (he/him) (Feb 25 2020 at 18:52):

perhaps... cargo fuzz mostly just wraps libfuzzer and provides the logic to build sanitizers and link libfuzzer

but we could probably use the Arbitrary::size_hint to auto add seed files to the corpus or to pass -max_len and -len_control flags to libfuzzer. not sure exactly how this would work, there's some design work to be done

view this post on Zulip fitzgen (he/him) (Feb 25 2020 at 20:41):

Filed https://github.com/rust-fuzz/cargo-fuzz/issues/218

@bnjbvr was reporting that starting fuzzing from scratch with a fuzz target that takes an Arbitrary impl was spending a lot of time on three bytes long inputs, where the Arbitrary implementation re...

Last updated: Dec 23 2024 at 12:05 UTC