wasmtime / PR #4773 [fuzz] Configure the `differential` t... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #4773 [fuzz] Configure the `differential` t...

Wasmtime GitHub notifications bot (Aug 24 2022 at 17:55):

abrown opened PR #4773 from fuzz-lists to main:

This change is a follow-on from #4515 to add the ability to configure
the differential fuzz target by limiting which engines and modules are
used for fuzzing. This is incredibly useful when troubleshooting, e.g.,
when an engine is more prone to failure, we can target that engine
exclusively. The effect of this configuration is visible in the
statistics now printed out from #4739.

Engines are configured using the ALLOWED_ENGINES environment variable.
We can either subtract from the set of allowed engines (e.g.,
ALLOWED_ENGINES=-v8) or build up a set of allowed engines (e.g.,
ALLOWED_ENGINES=wasmi,spec), but not both at the same time.
ALLOWED_ENGINES only configures the left-hand side engine; the
right-hand side is always Wasmtime. When omitted, ALLOWED_ENGINES
defaults to [wasmtime, wasmi, spec, v8].

The generated WebAssembly modules are configured using
ALLOWED_MODULES. This environment variables works the same as above
but the available options are: [wasm-smith, single-inst].

Wasmtime GitHub notifications bot (Aug 24 2022 at 17:55):

abrown requested jameysharp for a review on PR #4773.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

I got confused wondering what would happen if configured is empty, like if someone ran ALLOWED_ENGINES= cargo fuzz .... In that case both of these expressions are true, and you treat it as "nothing is subtracted", so all defaults are used. I think that's a good choice. Maybe worth a comment, though?

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

Based purely on the types here, I believe you should be able to borrow from the env_variable rather than returning owned strings:
pub fn parse_env_list(env_variable: &str) -> Option<Vec<&str>> {
    std::env::var(env_variable)
        .ok()
        .map(|l| l.split(",").collect())
}

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

Since the returned strings are always a subset of defaults, you could return them without copying them to the heap. Where this is used, the lifetime of the default strings is 'static, so it should be okay to stash references to them in a global.
pub fn build_allowed_env_list<'a>(env_list: Option<Vec<String>>, defaults: &[&'a str]) -> Vec<&'a str> {
There's a bit of nuance in the add_from_defaults case, because you could either choose to loop over configured and add if it's present in defaults, as you do here, or the other way around.

I suggest that in both the "add" and "subtract" cases, you should loop over defaults and check whether each item appears in configured. That lets you return the borrowed string from defaults in both cases. However, that is annoying when validating that every name in configured is present in defaults: you need the same kind of two-loop structure in both cases then too.

You can clean _that_ up by unifying both cases. The only differences between them are
- should the first character be stripped?
- do we want configured to contain the string, or not?

So you can save a range up front indicating which characters to pay attention to in configured:
let keep = if subtract_from_defaults { 1.. } else { .. };
And use &c[keep] whenever you need the stripped contents of an entry from configured. It's a little annoying to not be able to just use configured.contains() but I think configured.iter().any(|c| &c[keep] == d) isn't too terrible. Anyway I kind of like that the fact that this is technically quadratic isn't as obscured this way, even though it definitely doesn't matter here.

Finally, keep a default when that "contains" check != subtract_from_defaults. If we do want to subtract, we keep the ones that are not present, and vice versa.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

Not a big deal, but I like to have as little code inside unsafe blocks as possible. The only unsafe thing here is the assignment to the static mut variables, so I'd let-bind the results of calling build_allowed_env_list and then only assign the locals to the globals inside the unsafe.

Alternatively, you could pull in the lazy_static crate: it isn't currently used anywhere in wasmtime but it is already in our dependency tree. It just does what you're already doing, but puts a safe wrapper on it.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

This consumes some of the fuzz input even if this engine isn't selected. In fact, all of the engine initializers run and then most of them are thrown away each time. I'd suggest instead doing u.choose(allowed).unwrap() and then only initializing the engine that has the selected name.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:08):

jameysharp created PR review comment:

I believe this will get you the same result, including not consuming any of the fuzz input if there's only one configured choice.
    let wasm = match u.choose(unsafe { &ALLOWED_MODULES.as_slice() }).unwrap() {
        "wasm-smith" => build_wasm_smith_module(&mut u, &mut config)?,
        "single-inst" => build_single_inst_module(&mut u, &mut config)?,
        _ => unreachable!(),
    };
I think the only error that choose can report is that there weren't any choices provided. That can only happen here if the defaults array is empty, because if the ALLOWED_MODULES environment variable is empty then all defaults are configured. So choose failing would be a programmer error here, not user error, and I think unwrap is appropriate for that.

Since the closures are only called in one place this way, you could inline them into the match if you want. I kind of like the separation here though, so, either way is good.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:37):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:37):

abrown created PR review comment:

Hm:

error[E0515]: cannot return value referencing function parameter `l`
   --> crates/fuzzing/src/oracles/engine.rs:186:18
    |
186 |         .map(|l| l.split(",").collect())
    |                  ------------^^^^^^^^^^
    |                  |
    |                  returns a value referencing data owned by the current function
    |                  `l` is borrowed here

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:39):

jameysharp created PR review comment:

Oh! I misread this. Yeah, you're right, _this_ one probably does have to be an owned String. I think the changes to build_allowed_env_list should still work though.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:39):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:47):

abrown created PR review comment:

We need the engine initializers to run because if they can't meet the requirements of the Config (e.g., SIMD, etc.) then we don't really want to pick them. Maybe we could choose on the list of allowed strings and _then_ try to initialize, looping on that if the engine can't meet the requirements... But then we would want to take/swap_remove from the string list. That might be less clear, though.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:47):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:49):

jameysharp created PR review comment:

Ahhh, I didn't understand that. I retract this suggestion :grin:

Wasmtime GitHub notifications bot (Aug 24 2022 at 19:49):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 20:23):

abrown updated PR #4773 from fuzz-lists to main.

Wasmtime GitHub notifications bot (Aug 24 2022 at 20:26):

abrown updated PR #4773 from fuzz-lists to main.

Wasmtime GitHub notifications bot (Aug 24 2022 at 20:28):

abrown created PR review comment:

I made the suggested change but added an extra check to panic! if ALLOWED_MODULES.is_empty(). This can be empty, if the user were to write ALLOWED_MODULES=-wasm-smith,-single-inst, so I want to fail with some kind of message in that case.

Wasmtime GitHub notifications bot (Aug 24 2022 at 20:28):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 20:28):

abrown updated PR #4773 from fuzz-lists to main.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:20):

jameysharp created PR review comment:

Oh, that's true! Well, you could replace the .unwrap() in my suggestion with
.expect("unable to generate a module to fuzz against; check `ALLOWED_MODULES`")
but making the panic explicit is fine too.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:20):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:25):

abrown updated PR #4773 from fuzz-lists to main.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:27):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:27):

abrown created PR review comment:

Thanks for the explanation; it took me a second to parse but I think I implemented what you described. I went with the more verbose if (add_from_defaults && mentioned) || (subtract_from_defaults && !mentioned) in the logic because I thought that would be the most clear when we read this later...

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:29):

abrown submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:29):

abrown created PR review comment:

Actually, with the latest change. I think if an empty list gets passed then this function returns the empty list of allowed names. Should I add logic for this corner case? I think the surrounding fuzz target code will fail if nothing is allowed.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:35):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:35):

jameysharp created PR review comment:

If the input list is empty, so both add and subtract are true, then this condition:
(add_from_defaults && mentioned) || (subtract_from_defaults && !mentioned)
simplifies to mentioned || !mentioned, which is always true. So I think this is fine.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:39):

jameysharp created PR review comment:

I forgot that 1.. and .. have different types. So I'd change my suggestion here to:
let start = if subtract_from_defaults { 1 } else { 0 };
and then using &c[start..] everywhere, rather than cloning ranges. I think that'll be a little easier to read.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:39):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:45):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:45):

jameysharp created PR review comment:

Yeah, I agree, the more verbose condition is a good idea. And more broadly, I think this implementation looks great.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:51):

abrown updated PR #4773 from fuzz-lists to main.

Wasmtime GitHub notifications bot (Aug 24 2022 at 21:56):

jameysharp submitted PR review.

Wasmtime GitHub notifications bot (Aug 24 2022 at 22:49):

abrown merged PR #4773.

Last updated: Apr 18 2025 at 17:03 UTC