remlse opened PR #6208 from remlse:chaos-mode-fuel-param
to bytecodealliance:main
:
part of #4134
The fuel parameter limits the maximum number of perturbations introduced by the control plane. It can be used to binary-search for specific perturbations that triggered a bug.
cfallin submitted PR review.
remlse updated PR #6208.
remlse updated PR #6208.
remlse updated PR #6208.
remlse updated PR #6208.
remlse updated PR #6208.
remlse has marked PR #6208 as ready for review.
remlse requested jameysharp for a review on PR #6208.
remlse requested wasmtime-fuzz-reviewers for a review on PR #6208.
remlse requested wasmtime-compiler-reviewers for a review on PR #6208.
remlse requested wasmtime-default-reviewers for a review on PR #6208.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
I would describe this a little differently, because the intent is for this to be able to control other compiler mutations as well, not just those introduced by chaos-mode. "Set the fuel limit, which controls the number of mutations or optimizations that the compiler will perform before stopping. Currently applies to chaos perturbations but may apply to other optimization decisions in the future." or something like that?
cfallin created PR review comment:
This doc-comment covers two of the three cases I think -- it also returns
true
if fuel is activated and there is enough fuel, right?It might make sense to describe at a slightly higher level: "Tries to consume fuel, returning true if successful (or if fuel-limiting is disabled)."
cfallin created PR review comment:
This is unfortunately not the right way to go about this, I think: we should funnel all settings that control compilation through the settings plumbing, rather than pulling inputs from implicit/ambient environment. (The latter may seem easier for a specific setting, such as running a fuzz case on the command line, but can lead to really surprising and subtle bugs later, and also precludes other options like a tool that bisects automatically.)
I think this is a good opportunity to flesh out the text format, not just for printing but for parsing as well. If we can print a textual form of the control-plane (analogous to the
set option=value
, we could havecontrol_plane chaos=... fuel=...
or somesuch) and parse it, then the fuzzing debug flow would be: print the formatted fuzz testcase as CLIF; then iterate on it by tweaking the setting. Alternately we can add command-line options toclif-util test
.
remlse created PR review comment:
I'm assuming the subtle bugs you're referring to would be forgetting to set the fuel correctly in some case. To address this, I think it's necessary to remove the
Arbitrary
implementation of the control plane to avoid constructing one without having consulted the fuel setting first.The ad-hoc CLI argument parsing is still there, now in the TestCase arbitrary implementation. Which feels a little dirty. But I didn't see another way, I thought it was more important to ensure a control plane is created with the fuel already specified.
The text representation and parsing I'm still working on.
If the control plane fuel is part of cranelift's settings, should it be part of the text representation of a control plane as well? How should a text representation like this be interpreted:
; <other_flags> set control_plane_fuel=16 ; <target> control_plane fuel=8 data=1100 ; <function>
What if the control plane value (8) is a different one than the cranelift setting (16)?
remlse updated PR #6208.
remlse updated PR #6208.
remlse created PR review comment:
I've tried to incorporate the general information in there into the heading at the crate level documentation and linked to that from
ControlPlane::new
(which replacesset_fuel
).
cfallin created PR review comment:
I'm assuming the subtle bugs you're referring to would be forgetting to set the fuel correctly in some case. To address this, I think it's necessary to remove the Arbitrary implementation of the control plane to avoid constructing one without having consulted the fuel setting first.
Is it necessarily only one or the other? Why is discarding
Arbitrary
altogether the only other choice? For example, could we instead inject a fuel value into the control-plane object after creating one withArbitrary
?More generally: reading environment variables from inside a library is to be avoided because it circumvents the embedder's control of the library, and means that the library's behavior will be non-deterministic from the point of view of the API user. I ask for a control-plane with some bytes and compile on my system; you do the same on your system; we get two different results, because our ambient process state is different.
So the sort of bug I worry about isn't "forgot to set fuel" but "perplexing divergence in behavior that only makes sense once we check all basic assumptions and examine our shell configuration for environment variables" -- much less fun to debug!
remlse created PR review comment:
The argument parsing is still in the same file, the fuzz target
cranelift-fuzzgen
. Which library do you mean?I guess the problem I was working around is that the flags are generated in the
Arbitrary
implementation ofTestCase
, which doesn't take any arguments. And it's not possible to change the flags afterwards (?).So it seems to me the CLI argument must be parsed before or during the the construction of an arbitrary
TestCase
. Otherwise we can't set the appropriate cranelift flag. Parsing CLI args before test case generation with arbitrary doesn't work though:fuzz_target!(|data: &[u8]| { let fuel = std::env::args() .find_map(|arg| arg.strip_prefix("--fuel=").map(|s| s.to_owned())) .map(|fuel| fuel.parse().expect("fuel should be a valid integer")); let mut testcase = TestCase::generate(&mut Unstructured::new(data), fuel).unwrap(); for i in 0..testcase.ctrl_planes.len() { testcase.ctrl_planes[i].set_fuel(testcase.isa.flags().control_plane_fuel()) }
The problem here is
TestCase::generate(/* ... */).unwrap()
. As far as I can tell from the cargo fuzz book and the expansion of thefuzz_target!
macro, the body of a fuzz target is not allowed to reject an input (data: &[u8]
) because its structure is unsuitable. That should (can?) only be done by putting the structured input right as the argument (fuzz_target!(|data: TestCase|). I might be wrong about this.So that leads me to think our only option is to parse the CLI argument in some implementation of
Arbitrary
. Would it be better to make a separate struct andArbitrary
implementation just for the purpose of parsing the CLI args, to separate these concerns? Pseudo-code:struct Foo(TestCase); impl<'a> Arbitrary<'a> for Foo { fn arbitrary(u: &mut Unstructured<'a>) -> arbitrary::Result<Self> { let fuel = parse_fuel_from_cli_args(); let mut testcase = TestCase::generate(u, fuel)?; testcase.set_all_fuel(testcase.isa.flags().control_plane_fuel()); Ok(Foo(testcase)) } } fuzz_target!(|foo: Foo| { let Foo(testcase) = foo;
remlse updated PR #6208.
cfallin created PR review comment:
The argument parsing is still in the same file, the fuzz target
cranelift-fuzzgen
. Which library do you mean?I guess I meant it a little more abstractly: the implementation of
Arbitrary
is a library layer of sorts, and should depend only on its inputs.I guess the problem I was working around is that the flags are generated in the
Arbitrary
implementation ofTestCase
, which doesn't take any arguments. And it's not possible to change the flags afterwards (?).That I think is the key assumption that is guiding this into the direction I'm not sure about :-) The
TestCase
should be defined to contain only the plain-old-data inputs; we shouldn't be building the compiler context with flags inside theArbitrary
impl itself. We should be able to then build whatever context we need in the fuzzing entrypoint: the compiler context is a pure function of theTestCase
and the fuel parameter.So I'm imagining something like:
pub struct TestCase { flags: cranelift_codegen::Settings, // ... }
and then
impl TestCase { fn build_isa(&self, fuel: ...) -> isa::OwnedTargetIsa { ... } }
or even
fn build_isa(&self, user_params: &UserParams)
and build the user params however we want (e.g.UserParams::from_env()
). That makes the plumbing explicit and removes the dependency on environment variables from inside what should be a pure function :-)
Taking a step back, I think we've discussed at least two different kinds of "fuel" and three different sources of the setting, and I want to at least outline this taxonomy here to make sure we're on the same page and have a framework to think about it further.
Kinds of fuel:
- Fuel that "masks" other settings (e.g. chaos-decision values), and
- Fuel that is a separate dimension or axis that controls many different mutations in the compiler.Overall the latter is the more useful long-term, but the former is useful when controlled by the user to bisect failures, so both are interesting in some context.
Sources of fuel setting:
- User parameter (CLIF option on clif-util, environment variable read from toplevel fuzz-test entry point, ...). Useful for the bisection kind of fuel.
- Cranelift settings, or a separate control-plane setting that is passed in alongside settings. Useful for the kind of fuel that can be used to limit optimization effort (the latter), and also for bisection when driven in certain ways (CLIF file editing etc).
- Programmatic generation of control-plane state: useful for e.g. tools that try to bisect automatically.The approach we're trying to take here is squarely the first kind of fuel, for now, and with the first source. If we can fix up the plumbing a little bit as above to isolate the read-of-ambient-state in as clean a way as possible, then I think I'm happy. Thanks!
remlse updated PR #6208.
remlse updated PR #6208.
remlse updated PR #6208.
remlse created PR review comment:
This is the reason I wrapped the
isa_builder
in anRc
.to_optimized
takes a reference and returns an ownedSelf
, so theisa_builder
must either be cloned or shared.I tried to make the
isa_builder
Clone
, for that I would have to make the following alsoClone
:If making these two
Clone
is better than usingRc
, I can change that.
cfallin submitted PR review:
Thanks, I think this is reasonable enough to land. We can certainly further tweak the way that the fuel settings are plumbed later, if desired, but I don't want to hold this PR up any more than I already have, so let's merge.
cfallin created PR review comment:
... actually, now that I look at the CI failure, I think we can both eliminate the Cranelift setting (which is causing the failure because the "default settings" test is not updated) and just use
fuel
directly here. If we want a mode later where a setting can also set fuel, we can certainly add that, but this setting doesn't appear to be wired up to the control plane anywhere except here so I think it's best to just omit.
remlse created PR review comment:
You suggested to flesh out the text representation of the control plane right now, so I didn't fix the tests related to that yet. I have a commit pretty much ready for that purpose, although there will probably be a couple iterations on that as well until it fits nicely with the rest of the parsing.
I'm totally fine with keeping the PR open until that's done as well.
remlse edited PR review comment.
remlse updated PR #6208.
remlse updated PR #6208.
remlse updated PR #6208.
cfallin created PR review comment:
To be honest I think at this point I'd rather merge this logically consistent piece (without the Cranelift setting at all, and without any text parsing) and then we can review further bits as they're done -- otherwise the patch gets a bit unwieldy.
remlse updated PR #6208.
remlse updated PR #6208.
remlse created PR review comment:
Ok, I removed everything except the part in cranelift-control and the reading of the CLI argument.
cfallin submitted PR review.
cfallin merged PR #6208.
Last updated: Dec 23 2024 at 13:07 UTC