wasmtime / issue #4134 Cranelift: test the compiler with ... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #4134 Cranelift: test the compiler with ...

Wasmtime GitHub notifications bot (May 11 2022 at 00:05):

We should rigorously test the compiler with a randomized testing strategy one might call "chaos mode". The general idea is to introduce perturbations (deterministically derived from some seed to allow bug reproduction) into various decision points within the compiler, where these perturbations are either "worst-case conditions" or just randomness.

The first step is to build the "control plane", or a bit of instrumentation threaded through the compiler to allow centralization of certain decisions:

Request fuel to perform some optimization or transform, skipping if no fuel left (allows bisecting for origin of bugs);

Request a "random choice" among some options based on the seed, or based on external user input (later).

Then we thread this through the passes, lowering backend, ABI code, and possibly even regalloc2, and use it in interesting ways:

Clobber any state at the machine level that is not supposed to be defined:

Unused registers (as per regalloc2);

Volatile (caller-saved) registers in function prologues;

Put magic values in callee-saved registers around calls, and check them;

Mangle the upper bits of narrow types in wider registers (u8/16/32 in a 64-bit GPR, for example);

Insert randomness into the regalloc's input and its decisions:

Add "ghost clobbers" that randomly allocate a tmp and clobber it, or "ghost uses" that random use a value, forcing it into a register;

Split liveranges randomly, and add new (extraneous but legal) constraints randomly, causing e.g. spills or new moves;

Disable the redundant-move eliminator;

Randomize the resolution of parallel moves into sequentialized moves;

Randomize and constrain control flow:

Randomize basic-block order;

Randomize whether or not MachBackend simplifies/collapses branches;

Randomly insert branch islands even when not needed;

Randomize instruction selection:

Mark some ISLE rules as "redundant" (expressing better lowerings, but covered by other rules) and randomly skip them dynamically with a predicate;

Randomly "hide" operand subtrees by not returning the source instruction for a Value to the matcher;

Randomly apply higher or lower priority to some rules (we will need to work out how to make this dynamic -- perhaps by duplicating a rule at two priorities and selecting with a predicate).

Once we have these mutations and the control plane to drive them, we can run them in fuzzing by taking the seed from the fuzz input. Because the control plane is deterministic, bugs should reproduce as per normal. We should evaluate the effectiveness by looking at fuzz coverage and devise new perturbations as necessary.

This should produce more efficient fuzzing than the current end-to-end compilation testing because it controls degrees of freedom directly: it allows the fuzzer to mutate state internal to the compiler without finding the inverse function for the pipeline in front of that state and indirectly mutating it.

Wasmtime GitHub notifications bot (May 11 2022 at 00:05):

cfallin labeled issue #4134:

We should rigorously test the compiler with a randomized testing strategy one might call "chaos mode". The general idea is to introduce perturbations (deterministically derived from some seed to allow bug reproduction) into various decision points within the compiler, where these perturbations are either "worst-case conditions" or just randomness.

The first step is to build the "control plane", or a bit of instrumentation threaded through the compiler to allow centralization of certain decisions:

Request fuel to perform some optimization or transform, skipping if no fuel left (allows bisecting for origin of bugs);

Request a "random choice" among some options based on the seed, or based on external user input (later).

Then we thread this through the passes, lowering backend, ABI code, and possibly even regalloc2, and use it in interesting ways:

Clobber any state at the machine level that is not supposed to be defined:

Unused registers (as per regalloc2);

Volatile (caller-saved) registers in function prologues;

Put magic values in callee-saved registers around calls, and check them;

Mangle the upper bits of narrow types in wider registers (u8/16/32 in a 64-bit GPR, for example);

Insert randomness into the regalloc's input and its decisions:

Add "ghost clobbers" that randomly allocate a tmp and clobber it, or "ghost uses" that random use a value, forcing it into a register;

Split liveranges randomly, and add new (extraneous but legal) constraints randomly, causing e.g. spills or new moves;

Disable the redundant-move eliminator;

Randomize the resolution of parallel moves into sequentialized moves;

Randomize and constrain control flow:

Randomize basic-block order;

Randomize whether or not MachBackend simplifies/collapses branches;

Randomly insert branch islands even when not needed;

Randomize instruction selection:

Mark some ISLE rules as "redundant" (expressing better lowerings, but covered by other rules) and randomly skip them dynamically with a predicate;

Randomly "hide" operand subtrees by not returning the source instruction for a Value to the matcher;

Randomly apply higher or lower priority to some rules (we will need to work out how to make this dynamic -- perhaps by duplicating a rule at two priorities and selecting with a predicate).

Once we have these mutations and the control plane to drive them, we can run them in fuzzing by taking the seed from the fuzz input. Because the control plane is deterministic, bugs should reproduce as per normal. We should evaluate the effectiveness by looking at fuzz coverage and devise new perturbations as necessary.

This should produce more efficient fuzzing than the current end-to-end compilation testing because it controls degrees of freedom directly: it allows the fuzzer to mutate state internal to the compiler without finding the inverse function for the pipeline in front of that state and indirectly mutating it.

Wasmtime GitHub notifications bot (May 11 2022 at 00:05):

cfallin labeled issue #4134:

We should rigorously test the compiler with a randomized testing strategy one might call "chaos mode". The general idea is to introduce perturbations (deterministically derived from some seed to allow bug reproduction) into various decision points within the compiler, where these perturbations are either "worst-case conditions" or just randomness.

The first step is to build the "control plane", or a bit of instrumentation threaded through the compiler to allow centralization of certain decisions:

Request fuel to perform some optimization or transform, skipping if no fuel left (allows bisecting for origin of bugs);

Request a "random choice" among some options based on the seed, or based on external user input (later).

Then we thread this through the passes, lowering backend, ABI code, and possibly even regalloc2, and use it in interesting ways:

Clobber any state at the machine level that is not supposed to be defined:

Unused registers (as per regalloc2);

Volatile (caller-saved) registers in function prologues;

Put magic values in callee-saved registers around calls, and check them;

Mangle the upper bits of narrow types in wider registers (u8/16/32 in a 64-bit GPR, for example);

Insert randomness into the regalloc's input and its decisions:

Add "ghost clobbers" that randomly allocate a tmp and clobber it, or "ghost uses" that random use a value, forcing it into a register;

Split liveranges randomly, and add new (extraneous but legal) constraints randomly, causing e.g. spills or new moves;

Disable the redundant-move eliminator;

Randomize the resolution of parallel moves into sequentialized moves;

Randomize and constrain control flow:

Randomize basic-block order;

Randomize whether or not MachBackend simplifies/collapses branches;

Randomly insert branch islands even when not needed;

Randomize instruction selection:

Mark some ISLE rules as "redundant" (expressing better lowerings, but covered by other rules) and randomly skip them dynamically with a predicate;

Randomly "hide" operand subtrees by not returning the source instruction for a Value to the matcher;

Randomly apply higher or lower priority to some rules (we will need to work out how to make this dynamic -- perhaps by duplicating a rule at two priorities and selecting with a predicate).

Once we have these mutations and the control plane to drive them, we can run them in fuzzing by taking the seed from the fuzz input. Because the control plane is deterministic, bugs should reproduce as per normal. We should evaluate the effectiveness by looking at fuzz coverage and devise new perturbations as necessary.

This should produce more efficient fuzzing than the current end-to-end compilation testing because it controls degrees of freedom directly: it allows the fuzzer to mutate state internal to the compiler without finding the inverse function for the pipeline in front of that state and indirectly mutating it.

Last updated: Apr 17 2025 at 22:03 UTC