github-actions[bot] commented on issue #5998:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "fuzzing"Thus the following users have been cc'd because of the following labels:
- fitzgen: fuzzing
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
jameysharp commented on issue #5998:
Oh, and when you write a more complete commit message, please include the phrase "Fixes #5967" in it somewhere so GitHub understands that these two issues are related.
ghostway0 commented on issue #5998:
I still don't understand why this fails on the optimized case, on the first test...
afonso360 commented on issue #5998:
I still don't understand why this fails on the optimized case, on the first test...
So, I took a look at this, and it found a bug!
More specifically #5922. Although that bug is tagged as x64, it really is a bug in our mid end. We translate
stack_store
s intostore notrap aligned
, however it probably shouldn't unconditionally be translated asaligned
.The reason this only triggers in the second run is that the interpreter allows unaligned stores with
stack_store
, but if you pass it astore aligned
and it isn't aligned, it complains.Here are the test cases:
<details>
<summary>Original</summary>test interpret test run set opt_level=speed_and_size set enable_alias_analysis=false set use_egraphs=false set enable_simd=true set enable_llvm_abi_extensions=true set unwind_info=false set machine_code_cfg_info=true set enable_jump_tables=false set enable_heap_access_spectre_mitigation=false set enable_table_access_spectre_mitigation=false set enable_incremental_compilation_cache_checks=true target x86_64 function %a() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 stack_store v4, ss0 stack_store v4, ss0+16 stack_store v4, ss0+32 stack_store v1, ss0+48 ; v1 = 0 stack_store v0, ss0+50 ; v0 = 0 stack_store v4, ss1 stack_store v4, ss1+16 stack_store v4, ss1+32 stack_store v4, ss1+48 stack_store v2, ss1+64 ; v2 = 0 return }
</details>
<details>
<summary>Optimized</summary>
function u1:0() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 v5 = stack_addr.i64 ss0 store notrap aligned v4, v5 v6 = stack_addr.i64 ss0+16 store notrap aligned v4, v6 v7 = stack_addr.i64 ss0+32 store notrap aligned v4, v7 v8 = stack_addr.i64 ss0+48 store notrap aligned v1, v8 ; v1 = 0 v9 = stack_addr.i64 ss0+50 store notrap aligned v0, v9 ; v0 = 0 v10 = stack_addr.i64 ss1 store notrap aligned v4, v10 v11 = stack_addr.i64 ss1+16 store notrap aligned v4, v11 v12 = stack_addr.i64 ss1+32 store notrap aligned v4, v12 v13 = stack_addr.i64 ss1+48 store notrap aligned v4, v13 v14 = stack_addr.i64 ss1+64 store notrap aligned v2, v14 ; v2 = 0 return }
</details>
This crashes in the first store to
ss1
, since it expects it to be aligned to 16bytes, but it isn't due to the previous stack slot. The interpreter never inserts padding between stack slots.
afonso360 edited a comment on issue #5998:
I still don't understand why this fails on the optimized case, on the first test...
So, I took a look at this, and it found a bug! :tada:
More specifically #5922. Although that bug is tagged as x64, it really is a bug in our mid end. We translate
stack_store
s intostore notrap aligned
, however it probably shouldn't unconditionally be translated asaligned
.The reason this only triggers in the second run is that the interpreter allows unaligned stores with
stack_store
, but if you pass it astore aligned
and it isn't aligned, it complains.Here are the test cases:
<details>
<summary>Original</summary>test interpret test run set opt_level=speed_and_size set enable_alias_analysis=false set use_egraphs=false set enable_simd=true set enable_llvm_abi_extensions=true set unwind_info=false set machine_code_cfg_info=true set enable_jump_tables=false set enable_heap_access_spectre_mitigation=false set enable_table_access_spectre_mitigation=false set enable_incremental_compilation_cache_checks=true target x86_64 function %a() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 stack_store v4, ss0 stack_store v4, ss0+16 stack_store v4, ss0+32 stack_store v1, ss0+48 ; v1 = 0 stack_store v0, ss0+50 ; v0 = 0 stack_store v4, ss1 stack_store v4, ss1+16 stack_store v4, ss1+32 stack_store v4, ss1+48 stack_store v2, ss1+64 ; v2 = 0 return }
</details>
<details>
<summary>Optimized</summary>
function u1:0() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 v5 = stack_addr.i64 ss0 store notrap aligned v4, v5 v6 = stack_addr.i64 ss0+16 store notrap aligned v4, v6 v7 = stack_addr.i64 ss0+32 store notrap aligned v4, v7 v8 = stack_addr.i64 ss0+48 store notrap aligned v1, v8 ; v1 = 0 v9 = stack_addr.i64 ss0+50 store notrap aligned v0, v9 ; v0 = 0 v10 = stack_addr.i64 ss1 store notrap aligned v4, v10 v11 = stack_addr.i64 ss1+16 store notrap aligned v4, v11 v12 = stack_addr.i64 ss1+32 store notrap aligned v4, v12 v13 = stack_addr.i64 ss1+48 store notrap aligned v4, v13 v14 = stack_addr.i64 ss1+64 store notrap aligned v2, v14 ; v2 = 0 return }
</details>
This crashes in the first store to
ss1
, since it expects it to be aligned to 16bytes, but it isn't due to the previous stack slot. The interpreter never inserts padding between stack slots.
afonso360 edited a comment on issue #5998:
I still don't understand why this fails on the optimized case, on the first test...
So, I took a look at this, and it found a bug! :tada:
More specifically #5922. Although that bug is tagged as x64, it really is a bug in our mid end. We translate
stack_store
s intostore notrap aligned
, however it probably shouldn't unconditionally be translated asaligned
.The reason this only triggers in the second run is that the interpreter allows unaligned stores with
stack_store
, but if you pass it astore aligned
and it isn't aligned, it complains.Here are the test cases:
<details>
<summary>Original</summary>test interpret test run set opt_level=speed_and_size set enable_alias_analysis=false set use_egraphs=false set enable_simd=true set enable_llvm_abi_extensions=true set unwind_info=false set machine_code_cfg_info=true set enable_jump_tables=false set enable_heap_access_spectre_mitigation=false set enable_table_access_spectre_mitigation=false set enable_incremental_compilation_cache_checks=true target x86_64 function %a() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 stack_store v4, ss0 stack_store v4, ss0+16 stack_store v4, ss0+32 stack_store v1, ss0+48 ; v1 = 0 stack_store v0, ss0+50 ; v0 = 0 stack_store v4, ss1 stack_store v4, ss1+16 stack_store v4, ss1+32 stack_store v4, ss1+48 stack_store v2, ss1+64 ; v2 = 0 return }
</details>
<details>
<summary>Optimized</summary>
function u1:0() system_v { ss0 = explicit_slot 51 ss1 = explicit_slot 68 sig0 = (f32) -> f32 system_v sig1 = (f64) -> f64 system_v sig2 = (f32) -> f32 system_v sig3 = (f64) -> f64 system_v sig4 = (f32) -> f32 system_v sig5 = (f64) -> f64 system_v fn0 = %CeilF32 sig0 fn1 = %CeilF64 sig1 fn2 = %FloorF32 sig2 fn3 = %FloorF64 sig3 fn4 = colocated %TruncF32 sig4 fn5 = %TruncF64 sig5 block0: v0 = iconst.i8 0 v1 = iconst.i16 0 v2 = iconst.i32 0 v3 = iconst.i64 0 v4 = uextend.i128 v3 ; v3 = 0 v5 = stack_addr.i64 ss0 store notrap aligned v4, v5 v6 = stack_addr.i64 ss0+16 store notrap aligned v4, v6 v7 = stack_addr.i64 ss0+32 store notrap aligned v4, v7 v8 = stack_addr.i64 ss0+48 store notrap aligned v1, v8 ; v1 = 0 v9 = stack_addr.i64 ss0+50 store notrap aligned v0, v9 ; v0 = 0 v10 = stack_addr.i64 ss1 store notrap aligned v4, v10 v11 = stack_addr.i64 ss1+16 store notrap aligned v4, v11 v12 = stack_addr.i64 ss1+32 store notrap aligned v4, v12 v13 = stack_addr.i64 ss1+48 store notrap aligned v4, v13 v14 = stack_addr.i64 ss1+64 store notrap aligned v2, v14 ; v2 = 0 return }
</details>
This crashes in the first store to
ss1
, since it expects it to be aligned to 16bytes, but it isn't due to the previous stack slot. The interpreter never inserts padding between stack slots.Edit: I've opened #6016 to address this.
ghostway0 commented on issue #5998:
That's cool. If there are no further comments, after #6016 merges, this can be merged as well!
ghostway0 edited a comment on issue #5998:
That's cool. If there are no further comments, after #6016 merges, this can be merged as well (after pretty-printing is fixed)!
ghostway0 edited a comment on issue #5998:
That's cool. If there are no further comments, after #6016 merges, this can be merged as well (after pretty-printing is fixed)!
Interestingly, it is failing for me on a different test case! So there might be another bug-fix needed soon
ghostway0 commented on issue #5998:
Should I rebase or would you squash all the commits?
ghostway0 edited a comment on issue #5998:
Should I rebase or would you squash all the commits?
Anyway, a diff-like tool would be great to use, if PrintableTestCase can be extended to do that
afonso360 commented on issue #5998:
Should I rebase or would you squash all the commits?
I think it will squash all commits once the PR gets merged.
Anyway, a diff-like tool would be great to use, if PrintableTestCase can be extended to do that
I don't think we have anything like that right now (though it would be a good addition!), but if you do something like
RUST_LOG=trace cargo run test ./output-from-fuzzer.clif
, you should see the optimized function being printed in the trace log. It's pretty much my go to every time.
ghostway0 commented on issue #5998:
Oh, it appears to be the same problem!
stack_store
s that are optimized tostore notrap aligned
. But when I try to fuzz with #6016, it still fails
ghostway0 edited a comment on issue #5998:
Oh, it appears to be the same problem!
stack_store
s that are optimized tostore notrap aligned
. But when I try to fuzz with #6016 (main), it still fails
ghostway0 edited a comment on issue #5998:
Oh, it appears to be the same problem!stack_store
s that are optimized tostore notrap aligned
. But when I try to fuzz with #6016 (main), it still fails
No! Apparently I didn't pull recently enough. It works!
Last updated: Jan 24 2025 at 00:11 UTC