github-actions[bot] commented on issue #5998:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "fuzzing"

Thus the following users have been cc'd because of the following labels:
- fitzgen: fuzzing
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
</details>
jameysharp commented on issue #5998:
Oh, and when you write a more complete commit message, please include the phrase "Fixes #5967" in it somewhere so GitHub understands that these two issues are related.
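Any commit message with that trailer works; for instance (the summary and body here are placeholders, not the actual commit):

```
fuzzgen: one-line summary of the change

Optional longer description of what changed and why.

Fixes #5967
```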
ghostway0 commented on issue #5998:
I still don't understand why this fails on the optimized case, on the first test...
afonso360 commented on issue #5998:
> I still don't understand why this fails on the optimized case, on the first test...

So, I took a look at this, and it found a bug! :tada:

More specifically #5922. Although that bug is tagged as x64, it really is a bug in our mid end. We translate `stack_store`s into `store notrap aligned`; however, the store probably shouldn't unconditionally be marked `aligned`. The reason this only triggers in the second run is that the interpreter allows unaligned stores with `stack_store`, but if you pass it a `store aligned` and it isn't aligned, it complains.

Here are the test cases:

<details>
<summary>Original</summary>

```
test interpret
test run
set opt_level=speed_and_size
set enable_alias_analysis=false
set use_egraphs=false
set enable_simd=true
set enable_llvm_abi_extensions=true
set unwind_info=false
set machine_code_cfg_info=true
set enable_jump_tables=false
set enable_heap_access_spectre_mitigation=false
set enable_table_access_spectre_mitigation=false
set enable_incremental_compilation_cache_checks=true
target x86_64

function %a() system_v {
    ss0 = explicit_slot 51
    ss1 = explicit_slot 68
    sig0 = (f32) -> f32 system_v
    sig1 = (f64) -> f64 system_v
    sig2 = (f32) -> f32 system_v
    sig3 = (f64) -> f64 system_v
    sig4 = (f32) -> f32 system_v
    sig5 = (f64) -> f64 system_v
    fn0 = %CeilF32 sig0
    fn1 = %CeilF64 sig1
    fn2 = %FloorF32 sig2
    fn3 = %FloorF64 sig3
    fn4 = colocated %TruncF32 sig4
    fn5 = %TruncF64 sig5

block0:
    v0 = iconst.i8 0
    v1 = iconst.i16 0
    v2 = iconst.i32 0
    v3 = iconst.i64 0
    v4 = uextend.i128 v3  ; v3 = 0
    stack_store v4, ss0
    stack_store v4, ss0+16
    stack_store v4, ss0+32
    stack_store v1, ss0+48  ; v1 = 0
    stack_store v0, ss0+50  ; v0 = 0
    stack_store v4, ss1
    stack_store v4, ss1+16
    stack_store v4, ss1+32
    stack_store v4, ss1+48
    stack_store v2, ss1+64  ; v2 = 0
    return
}
```

</details>

<details>
<summary>Optimized</summary>

```
function u1:0() system_v {
    ss0 = explicit_slot 51
    ss1 = explicit_slot 68
    sig0 = (f32) -> f32 system_v
    sig1 = (f64) -> f64 system_v
    sig2 = (f32) -> f32 system_v
    sig3 = (f64) -> f64 system_v
    sig4 = (f32) -> f32 system_v
    sig5 = (f64) -> f64 system_v
    fn0 = %CeilF32 sig0
    fn1 = %CeilF64 sig1
    fn2 = %FloorF32 sig2
    fn3 = %FloorF64 sig3
    fn4 = colocated %TruncF32 sig4
    fn5 = %TruncF64 sig5

block0:
    v0 = iconst.i8 0
    v1 = iconst.i16 0
    v2 = iconst.i32 0
    v3 = iconst.i64 0
    v4 = uextend.i128 v3  ; v3 = 0
    v5 = stack_addr.i64 ss0
    store notrap aligned v4, v5
    v6 = stack_addr.i64 ss0+16
    store notrap aligned v4, v6
    v7 = stack_addr.i64 ss0+32
    store notrap aligned v4, v7
    v8 = stack_addr.i64 ss0+48
    store notrap aligned v1, v8  ; v1 = 0
    v9 = stack_addr.i64 ss0+50
    store notrap aligned v0, v9  ; v0 = 0
    v10 = stack_addr.i64 ss1
    store notrap aligned v4, v10
    v11 = stack_addr.i64 ss1+16
    store notrap aligned v4, v11
    v12 = stack_addr.i64 ss1+32
    store notrap aligned v4, v12
    v13 = stack_addr.i64 ss1+48
    store notrap aligned v4, v13
    v14 = stack_addr.i64 ss1+64
    store notrap aligned v2, v14  ; v2 = 0
    return
}
```

</details>

This crashes in the first store to `ss1`, since it expects it to be aligned to 16 bytes, but it isn't, due to the previous stack slot: `ss1` starts right after `ss0`'s 51 bytes, and 51 is not a multiple of 16. The interpreter never inserts padding between stack slots.

Edit: I've opened #6016 to address this.
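For a concrete view of why that first store traps, here is a tiny standalone sketch of the slot layout arithmetic (the zero-based, contiguous layout is an assumption that mirrors the no-padding behavior described above):

```rust
fn main() {
    // ss0 = explicit_slot 51, ss1 = explicit_slot 68, laid out back to back.
    let ss0_offset: u64 = 0; // assume the frame base itself is 16-byte aligned
    let ss0_size: u64 = 51;

    // No padding is inserted between slots, so ss1 starts where ss0 ends.
    let ss1_offset = ss0_offset + ss0_size;

    // `stack_store v4, ss1` is translated to `store notrap aligned`, which
    // promises a 16-byte-aligned address for the i128 store. But:
    assert_eq!(ss1_offset % 16, 3); // 51 % 16 == 3, so the store is unaligned
}
```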
ghostway0 commented on issue #5998:
That's cool. If there are no further comments, after #6016 merges, this can be merged as well (after pretty-printing is fixed)!

Interestingly, it is failing for me on a different test case! So there might be another bug-fix needed soon.
ghostway0 commented on issue #5998:
Should I rebase or would you squash all the commits?

Anyway, a diff-like tool would be great to use, if `PrintableTestCase` can be extended to do that.
afonso360 commented on issue #5998:
> Should I rebase or would you squash all the commits?

I think it will squash all commits once the PR gets merged.

> Anyway, a diff-like tool would be great to use, if `PrintableTestCase` can be extended to do that.

I don't think we have anything like that right now (though it would be a good addition!), but if you do something like `RUST_LOG=trace cargo run test ./output-from-fuzzer.clif`, you should see the optimized function being printed in the trace log. It's pretty much my go-to every time.
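As a rough sketch of what such a diff tool could look like, here is a minimal line diff between two printed functions, assuming the `similar` crate (`print_ir_diff` is a made-up helper, not an existing fuzzer or `PrintableTestCase` API):

```rust
// Cargo.toml: similar = "2"
use similar::{ChangeTag, TextDiff};

/// Hypothetical helper: print a +/- line diff between the original and
/// optimized CLIF, as produced by printing the two functions.
fn print_ir_diff(original: &str, optimized: &str) {
    let diff = TextDiff::from_lines(original, optimized);
    for change in diff.iter_all_changes() {
        let sign = match change.tag() {
            ChangeTag::Delete => "-",
            ChangeTag::Insert => "+",
            ChangeTag::Equal => " ",
        };
        print!("{}{}", sign, change);
    }
}

fn main() {
    // Toy inputs standing in for the printed test case above.
    print_ir_diff(
        "block0:\n    stack_store v4, ss1\n    return\n",
        "block0:\n    v10 = stack_addr.i64 ss1\n    store notrap aligned v4, v10\n    return\n",
    );
}
```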
ghostway0 commented on issue #5998:
Oh, it appears to be the same problem! `stack_store`s that are optimized to `store notrap aligned`. But when I try to fuzz with #6016 (main), it still fails.

No! Apparently I didn't pull recently enough. It works!