afonso360 opened PR #4894 from forward-branching
to main
:
:wave: Hey,
This PR addresses some of the issues found in #3050 regarding the performance of fuzzgen. We found that we had an outsized number of timeouts during execution of the fuzzer. This is mostly due to generating a bunch of infinite loops.
This changes our branching strategy. Instead of choosing blocks at random, we always choose a block ahead of the current block. This makes it so that we always make forward progress and don't time out very often. I still think backwards branches are important to generate, so we make those a configurable option (0.1% of branches).
Additionally it looks like not having valid target branches was also an issue. 27.3% of our invalid
Arbitrary::choice
failures were due to us trying to select a block that we didn't have. Such as for example trying to select a block that does not have params when all of them do.
We are now smarter about which terminators we allow to be inserted avoiding these issues.cc: @jameysharp
Benchmarks
<details>
<summary><h3>Baseline</h3></summary>Statistics:
#49967 NEW cov: 31776 ft: 183224 corp: 4225/5444Kb lim: 394895 exec/s: 23 rss: 966Mb L: 2106/373965 MS: 2 ShuffleBytes-CopyPart- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 18378 (36.8%) Total Runs: 62769 Successful Runs: 48229 (76.8% of Total Runs) Timed out Runs: 10440 (16.6% of Total Runs) Traps: user code: int_divz: 4100 (6.5% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs) user code: stk_ovf: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs)
Arbitrary::choice
failures:Loaded 20139 stacks First Line duplicates 5.9 - Count: 1181 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_br_table 8.3 - Count: 1679 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_jumptables 1.4 - Count: 282 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_switch 11.7 - Count: 2355 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_target_block 29.9 - Count: 6014 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 29.7 - Count: 5991 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 13.1 - Count: 2637 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
<details>
<summary><h3>This PR</h3></summary>#49979 NEW cov: 31503 ft: 178604 corp: 3520/7374Kb lim: 1048576 exec/s: 48 rss: 1054Mb L: 2702/1048576 MS: 1 CMP- DE: "\x01\x00\x00\x00\x00\x00\x00\x05"- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 21036 (42.1%) Total Runs: 724599 Successful Runs: 719276 (99.3% of Total Runs) Timed out Runs: 32 (0.0% of Total Runs) Traps: user code: stk_ovf: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: int_divz: 5291 (0.7% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs)
Loaded 14367 stacks First Line duplicates 41.3 - Count: 5936 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 37.6 - Count: 5395 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 21.1 - Count: 3036 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
This improves execs/s by 108% (23 -> 48) which is nice. It also eliminates all invalid choices regarding block terminators.
afonso360 edited PR #4894 from forward-branching
to main
:
:wave: Hey,
This PR addresses some of the issues found in #3050 regarding the performance of fuzzgen. We found that we had an outsized number of timeouts during execution of the fuzzer. This is mostly due to generating a bunch of infinite loops.
This changes our branching strategy. Instead of choosing blocks at random, we always choose a block ahead of the current block. This makes it so that we always make forward progress and don't time out very often. I still think backwards branches are important to generate, so we make those a configurable option (0.1% of branches).
Additionally it looks like not having valid target branches was also an issue. 27.3% of our invalid
Arbitrary::choice
failures were due to us trying to select a block that we didn't have. Such as for example trying to select a block that does not have params when all of them do.
We are now smarter about which terminators we allow to be inserted avoiding these issues.cc: @jameysharp
Benchmarks
<details>
<summary><h3>Baseline</h3></summary>Statistics:
#49967 NEW cov: 31776 ft: 183224 corp: 4225/5444Kb lim: 394895 exec/s: 23 rss: 966Mb L: 2106/373965 MS: 2 ShuffleBytes-CopyPart- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 18378 (36.8%) Total Runs: 62769 Successful Runs: 48229 (76.8% of Total Runs) Timed out Runs: 10440 (16.6% of Total Runs) Traps: user code: int_divz: 4100 (6.5% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs) user code: stk_ovf: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs)
Arbitrary::choice
failures:Loaded 20139 stacks First Line duplicates 5.9 - Count: 1181 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_br_table 8.3 - Count: 1679 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_jumptables 1.4 - Count: 282 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_switch 11.7 - Count: 2355 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_target_block 29.9 - Count: 6014 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 29.7 - Count: 5991 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 13.1 - Count: 2637 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
<details>
<summary><h3>This PR</h3></summary>Statistics:
#49979 NEW cov: 31503 ft: 178604 corp: 3520/7374Kb lim: 1048576 exec/s: 48 rss: 1054Mb L: 2702/1048576 MS: 1 CMP- DE: "\x01\x00\x00\x00\x00\x00\x00\x05"- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 21036 (42.1%) Total Runs: 724599 Successful Runs: 719276 (99.3% of Total Runs) Timed out Runs: 32 (0.0% of Total Runs) Traps: user code: stk_ovf: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: int_divz: 5291 (0.7% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs)
Arbitrary::choice
failures:Loaded 14367 stacks First Line duplicates 41.3 - Count: 5936 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 37.6 - Count: 5395 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 21.1 - Count: 3036 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
This improves execs/s by 108% (23 -> 48) which is nice. It also eliminates all invalid choices regarding block terminators.
afonso360 edited PR #4894 from forward-branching
to main
:
:wave: Hey,
This PR addresses some of the issues found in #3050 regarding the performance of fuzzgen. We found that we had an outsized number of timeouts during execution of the fuzzer. This is mostly due to generating a bunch of infinite loops.
This changes our branching strategy. Instead of choosing blocks at random, we always choose a block ahead of the current block. This makes it so that we always make forward progress and don't time out very often. I still think backwards branches are important to generate, so we make those a configurable option (0.1% of branches).
Additionally it looks like not having valid target branches was also an issue. 27.3% of our invalid
Arbitrary::choice
failures were due to us trying to select a block that we didn't have. Such as for example trying to select a block that does not have params when all of them do.
We are now smarter about which terminators we allow to be inserted avoiding these issues.cc: @jameysharp
Benchmarks
<details>
<summary><h3>Baseline</h3></summary>Statistics:
#49967 NEW cov: 31776 ft: 183224 corp: 4225/5444Kb lim: 394895 exec/s: 23 rss: 966Mb L: 2106/373965 MS: 2 ShuffleBytes-CopyPart- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 18378 (36.8%) Total Runs: 62769 Successful Runs: 48229 (76.8% of Total Runs) Timed out Runs: 10440 (16.6% of Total Runs) Traps: user code: int_divz: 4100 (6.5% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs) user code: stk_ovf: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs)
Arbitrary::choice
failures:Loaded 20139 stacks First Line duplicates 5.9 - Count: 1181 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_br_table 8.3 - Count: 1679 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_jumptables 1.4 - Count: 282 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_switch 11.7 - Count: 2355 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::generate_target_block 29.9 - Count: 6014 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 29.7 - Count: 5991 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 13.1 - Count: 2637 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
<details>
<summary><h3>This PR</h3></summary>Statistics:
#49979 NEW cov: 31503 ft: 178604 corp: 3520/7374Kb lim: 1048576 exec/s: 48 rss: 1054Mb L: 2702/1048576 MS: 1 CMP- DE: "\x01\x00\x00\x00\x00\x00\x00\x05"- == FuzzGen Statistics ==================== Total Inputs: 50000 Valid Inputs: 21036 (42.1%) Total Runs: 724599 Successful Runs: 719276 (99.3% of Total Runs) Timed out Runs: 32 (0.0% of Total Runs) Traps: user code: stk_ovf: 0 (0.0% of Total Runs) user code: heap_misaligned: 0 (0.0% of Total Runs) user code: bad_sig: 0 (0.0% of Total Runs) user code: table_oob: 0 (0.0% of Total Runs) user code: bad_toint: 0 (0.0% of Total Runs) user code: heap_oob: 0 (0.0% of Total Runs) user debug: 0 (0.0% of Total Runs) resumable: 0 (0.0% of Total Runs) user code: icall_null: 0 (0.0% of Total Runs) user code: int_ovf: 0 (0.0% of Total Runs) user code: int_divz: 5291 (0.7% of Total Runs) user code: unreachable: 0 (0.0% of Total Runs) user code: interrupt: 0 (0.0% of Total Runs)
Arbitrary::choice
failures:Loaded 14367 stacks First Line duplicates 41.3 - Count: 5936 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::get_variable_of_type 37.6 - Count: 5395 - Line: 0: cranelift_fuzzgen::function_generator::FunctionGenerator::stack_slot_with_size 21.1 - Count: 3036 - Line: 0: cranelift_fuzzgen::function_generator::insert_call
</details>
This improves execs/s by 108% (23 -> 48) which is nice. It also eliminates all invalid choices regarding block terminators so we reject fewer inputs as well.
jameysharp created PR review comment:
It's a minor thing, but I'd prefer to not consume any fuzz input if there aren't any backwards blocks to jump to:
let (backwards_blocks, forward_blocks) = self.resources.partition_blocks(source_block); let block_targets = if !backwards_blocks.is_empty() && self.u.ratio(1, 1000)? {
jameysharp created PR review comment:
// so generate a return. We could technically give it a chance to generate a backwards
jameysharp created PR review comment:
I think
forward_blocks
should always be non-empty here because you return early ifis_last_block
. So you could either treat thehas_forward_blocks
checks below as always-true, or you could remove the early return in theis_last_block
case above. I'd be inclined to remove theis_last_block
check, because that check represents a second place where you have to reason about which terminators are legal. I think this function would continue to work correctly without that check.Note that
Unstructured::choose
doesn't consume any of the fuzz input if there's only one choice, so we don't lose anything important by just calling it every time.
jameysharp created PR review comment:
Unlike
partition_blocks
, you'd definitely have to search to find the right slice, but I think this can still be simplified:let partition_point = self.blocks_without_params.partition_point(|b| *b <= block); &self.blocks_without_params[partition_point..]
jameysharp created PR review comment:
Seems unfortunate that
insert_switch
can't generate backward branches, but I guess that's fine.
jameysharp created PR review comment:
What do you think of something along these lines?
let terminators: Vec<BlockTerminator> = [ (insert_return, true), (insert_switch, has_forward_blocks_without_params), ... ].into_iter().filter(|(_, valid)| *valid).map(|(term, _)| term).collect();
We could go further and compute, for each kind of terminator, the first block index where that kind is not valid any more. (Something not valid anywhere in this function would be invalid at block 0, and something that's valid everywhere would become invalid sometime after the last block in the function.) Then we could reuse that list for every block instead of computing it fresh every time. If we have it sorted by block index, then we can choose from a suffix of the array, and never have to copy it.
But that's probably more work than it's worth. I like that the way you've written it means you can check whether a terminator is valid for a block by just getting the same list of targets that the
insert_*
functions would get.
jameysharp created PR review comment:
I haven't checked carefully but I think this should be equivalent. All blocks less than or equal to
block
are in&blocks[..partition_point]
and all greater thanblock
are in&blocks[partition_point..]
, and that's true even if either of those slices is empty.let partition_point = self.blocks.partition_point(|(b, _)| *b <= block);
That said... Aren't the blocks added in-order and without gaps? Like, is
self.blocks[i].0
always equal toBlock::new(i)
? Again, I haven't checked that, but if so we shouldn't need to store the block number and we should be able to find the right block just by indexing into the array.
jameysharp created PR review comment:
Not important, but you can remove a layer of indirection by calling
copied()
sooner. Also I think if you shorten it this wayrustfmt
should still be happy.let jump_tables = &builder.func.jump_tables; self.jump_tables .iter() .copied() .filter(|jt| jump_tables[*jt].iter().all(|target| *target > block))
jameysharp submitted PR review.
jameysharp submitted PR review.
afonso360 submitted PR review.
afonso360 created PR review comment:
That is much neater, thanks! I wasn't aware of
partition_point()
afonso360 created PR review comment:
Yeah, we should probably include that
afonso360 submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
I don't mind leaving this as something to come back to later though. If you want to just add a TODO comment to
insert_switch
that it would be nice to also check backward branches, that's plenty.
afonso360 created PR review comment:
Yeah, that's a great suggestion! I was able to get this to the point where we directly index into the blocks vector using the block number.
We don't necessarily need to store the block alongside the Signature, however removing it makes it so that we can't return a slice here, and the caller of this is quite messy without it and without allocating anything.
afonso360 submitted PR review.
afonso360 updated PR #4894 from forward-branching
to main
.
afonso360 created PR review comment:
I like the list thingy. Precomputing the terminator block validity is nice but I also think is a bit of work :sweat_smile:
afonso360 submitted PR review.
jameysharp submitted PR review.
jameysharp created PR review comment:
Oh, good point! Storing the block number does make things easier in the caller.
afonso360 updated PR #4894 from forward-branching
to main
.
jameysharp submitted PR review.
jameysharp created PR review comment:
On further thought, we could avoid allocating here by sorting
self.jump_tables
ingenerate_jumptables
. I'm happy to merge without it but I'm going to write down the idea in case you decide you want to do it.The sort key is
|jt| builder.func.jump_tables[*jt].iter().min()
. Ingenerate_jumptables
you could use that as the argument toself.resources.jump_tables.sort_by_cached_key(...)
. Then here you could use it inself.jump_tables.partition_point(... <= block)
.To keep the sort key computation in only one place, you could have
generate_jumptables
save a pair of(min_block, table_id)
inself.resources.jump_tables
. Then the sort is justself.resources.jump_tables.sort_unstable_by_key(|(min_block, _)| *min_block)
and this function just needsself.jump_tables.partition_point(|(min_block, _)| *min_block <= block)
.
jameysharp submitted PR review.
afonso360 updated PR #4894 from forward-branching
to main
.
afonso360 created PR review comment:
That sounds quite doable! I'm going to add a TODO comment to revisit later, I'd prefer to tackle the other fuzzgen issues for now.
afonso360 submitted PR review.
afonso360 updated PR #4894 from forward-branching
to main
.
afonso360 has marked PR #4894 as ready for review.
jameysharp merged PR #4894.
Last updated: Jan 24 2025 at 00:11 UTC