cfallin opened issue #4761:
https://oss-fuzz.com/testcase-detail/4896860279537664
ERROR: AddressSanitizer: SEGV on unknown address (pc 0x6250000014d5 bp 0x7fff30133db0 sp 0x7fff30133ad0 T0)
==743==The signal is caused by a READ memory access.
==743==Hint: this fault was caused by a dereference of a high value address (see register values below). Disassemble the provided pc to learn which register was used.
SCARINESS: 20 (wild-addr-read)
#0 0x6250000014d5 (<unknown module>)
#1 0x62500000175d (<unknown module>)
#0 0x5579a6d0c6c4 in cranelift_filetests::function_runner::CompiledFunction::call::ha6b9162cd2e21784 wasmtime/cranelift/filetests/src/function_runner.rs:183:9
input:
IFWoAAAAAE0AAABSSUl9LCB7eAx6ZcYNR9T0RvCdMjI0MjaWMgAAAAA7ACypAH19f319fX0AAAA7 JAABqQB9fX19fX19fX0IAAAAAAAAAQAAwyvDw8PDAADJ/zw8PEHDw8MAAGUBb/PGANpGAAAAAAAj OwAsqW9vb29vb28AfX19fX19fQAAADskAAGpAH19fX19fX19fQ7/////////CAAAAAAAAAEAAMMr w8PDwwAAyf88PDxBw8PDAABlAW/zxgDaRgAAAAAAAAAAAAAAAACTgoL/////////MAAAAAAAAAUA AAAACIKCgoKCgkEAgoKCgoJNAAAAgoKCgoKCgoKCgoKCwoKCgoKCgoKSioIEAAcAgoKCgoKCgoKC goKCgoKCGIKC//8wAAAAAAAABQAAAAAIgoKCgoKCQQCCgoKCgk0AAACCgoKCgoKCgoKCgoLCeoKC eYKCgpKKggQAAACCgoKCgoKCgoKCgoKCgoIYgoKCgoKCgoKCgoKCFACCgjOCgoKCgoL/////goKC goKCgoKCgoKCgoJBAUFB
cc @afonso360
afonso360 commented on issue #4761:
<details>
<summary> Formatted </summary>
ubuntu@instance-20220805-0848:~/git/wasmtime/fuzz$ cargo fuzz fmt cranelift-fuzzgen ./4761.in --no-default-features
Output of `std::fmt::Debug`:
;; Fuzzgen test case
test interpret
test run
set enable_llvm_abi_extensions
target aarch64
target s390x
target x86_64
function u0:1(i128, b1, b1, i128, b1, i16 uext, i8 sext, i64 sext, i32, i64 sext, i128, i128 sext, i64 uext, f32, i128 sext) -> i64 sext, f32, i16, b1, b1 sext, b1 sext, i8, i128 sext, f64 sext, i128 sext, i128, b1, i32, b1 uext, i8, i128 sext system_v {
ss0 = explicit_slot 59
ss1 = explicit_slot 0
ss2 = explicit_slot 44
ss3 = explicit_slot 40
ss4 = explicit_slot 111
ss5 = explicit_slot 111
ss6 = explicit_slot 111
ss7 = explicit_slot 111
jt0 = jump_table [block3, block2, block2, block2, block2, block2, block2, block2, block2]

block0(v0: i128, v1: b1, v2: b1, v3: i128, v4: b1, v5: i16, v6: i8, v7: i64, v8: i32, v9: i64, v10: i128, v11: i128, v12: i64, v13: f32, v14: i128):
v39 = f64const 0x1.d7d7d7d7d006fp984
v40 = iconst.i64 0xa901_0024_3b00_0000
v41 = iconst.i64 0x7d7d_7d7d_7d7d_7d00
v42 = iconcat v41, v40 ; v41 = 0x7d7d_7d7d_7d7d_7d00, v40 = 0xa901_0024_3b00_0000
v43 = iconst.i64 0xffff_ffff_ffff_0e7d
v44 = iconst.i64 2303
v45 = iconcat v44, v43 ; v44 = 2303, v43 = 0xffff_ffff_ffff_0e7d
v46 = iconst.i8 0
v47 = bconst.b1 true
v48 = iconst.i32 0xffff_ffff_c3c3_c3c3
v49 = bconst.b1 false
v50 = iconst.i8 -1
v51 = iconst.i64 0xc3c3_c341_3c3c
v52 = iconst.i128 0
v53 = iconst.i64 0
v54 = iconst.i32 0
v55 = iconst.i16 0
v56 = iconst.i8 0
stack_store v52, ss0 ; v52 = 0
stack_store v52, ss0+16 ; v52 = 0
stack_store v52, ss0+32 ; v52 = 0
stack_store v53, ss0+48 ; v53 = 0
stack_store v55, ss0+56 ; v55 = 0
stack_store v56, ss0+58 ; v56 = 0
stack_store v52, ss2 ; v52 = 0
stack_store v52, ss2+16 ; v52 = 0
stack_store v53, ss2+32 ; v53 = 0
stack_store v54, ss2+40 ; v54 = 0
stack_store v52, ss3 ; v52 = 0
stack_store v52, ss3+16 ; v52 = 0
stack_store v53, ss3+32 ; v53 = 0
stack_store v52, ss4 ; v52 = 0
stack_store v52, ss4+16 ; v52 = 0
stack_store v52, ss4+32 ; v52 = 0
stack_store v52, ss4+48 ; v52 = 0
stack_store v52, ss4+64 ; v52 = 0
stack_store v52, ss4+80 ; v52 = 0
stack_store v53, ss4+96 ; v53 = 0
stack_store v54, ss4+104 ; v54 = 0
stack_store v55, ss4+108 ; v55 = 0
stack_store v56, ss4+110 ; v56 = 0
stack_store v52, ss5 ; v52 = 0
stack_store v52, ss5+16 ; v52 = 0
stack_store v52, ss5+32 ; v52 = 0
stack_store v52, ss5+48 ; v52 = 0
stack_store v52, ss5+64 ; v52 = 0
stack_store v52, ss5+80 ; v52 = 0
stack_store v53, ss5+96 ; v53 = 0
stack_store v54, ss5+104 ; v54 = 0
stack_store v55, ss5+108 ; v55 = 0
stack_store v56, ss5+110 ; v56 = 0
stack_store v52, ss6 ; v52 = 0
stack_store v52, ss6+16 ; v52 = 0
stack_store v52, ss6+32 ; v52 = 0
stack_store v52, ss6+48 ; v52 = 0
stack_store v52, ss6+64 ; v52 = 0
stack_store v52, ss6+80 ; v52 = 0
stack_store v53, ss6+96 ; v53 = 0
stack_store v54, ss6+104 ; v54 = 0
stack_store v55, ss6+108 ; v55 = 0
stack_store v56, ss6+110 ; v56 = 0
stack_store v52, ss7 ; v52 = 0
stack_store v52, ss7+16 ; v52 = 0
stack_store v52, ss7+32 ; v52 = 0
stack_store v52, ss7+48 ; v52 = 0
stack_store v52, ss7+64 ; v52 = 0
stack_store v52, ss7+80 ; v52 = 0
stack_store v53, ss7+96 ; v53 = 0
stack_store v54, ss7+104 ; v54 = 0
stack_store v55, ss7+108 ; v55 = 0
stack_store v56, ss7+110 ; v56 = 0
v57 = iadd v6, v6
nop
stack_store v5, ss0
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
v58 = uextend.i64 v46 ; v46 = 0
v59 = udiv v51, v51 ; v51 = 0xc3c3_c341_3c3c, v51 = 0xc3c3_c341_3c3c
v60 = udiv v59, v59
nop
nop
nop
nop
nop
nop
v61 = iadd v0, v0
nop
v62 = isub v8, v8
v63 = ushr v62, v46 ; v46 = 0
nop
v64 = ushr v62, v46 ; v46 = 0
v65 = ushr v63, v57
nop
v66 = ushr v65, v46 ; v46 = 0
v67 = ushr v66, v46 ; v46 = 0
v68 = ushr v67, v46 ; v46 = 0
v69 = fcopysign v39, v39 ; v39 = 0x1.d7d7d7d7d006fp984, v39 = 0x1.d7d7d7d7d006fp984
return v58, v13, v5, v1, v1, v1, v46, v14, v69, v45, v42, v1, v68, v1, v46, v61 ; v46 = 0, v46 = 0

block1(v15: i128, v16: i128, v17: i128, v18: i128, v19: b1, v20: b1) cold:
return v70, v71, v72, v20, v20, v20, v73, v18, v74, v18, v75, v20, v76, v20, v77, v78 ; v70 = 0, v71 = 0.0, v72 = 0, v73 = 0, v74 = 0.0, v75 = 0, v76 = 0, v77 = 0, v78 = 0

block2:
v165 = bconst.b1 false
v112 -> v165
v164 -> v165
v163 = iconst.i128 0
v110 -> v163
v162 -> v163
v140 = iconst.i128 0
v126 -> v140
v78 -> v126
v139 = iconst.i8 0
v125 -> v139
v77 -> v125
v138 = iconst.i128 0
v124 -> v138
v75 -> v124
v137 = f64const 0.0
v111 -> v137
v123 -> v137
v74 -> v123
v136 = iconst.i16 0
v109 -> v136
v122 -> v136
v72 -> v122
v135 = f32const 0.0
v108 -> v135
v121 -> v135
v71 -> v121
v134 = iconst.i64 0
v107 -> v134
v120 -> v134
v70 -> v120
v133 = iconst.i32 0
v85 -> v133
v132 = iconst.i32 0
v84 -> v132
v76 -> v84
v131 = iconst.i8 0
v83 -> v131
v73 -> v83
v130 = iconst.i8 0
v82 -> v130
v129 = bconst.b1 false
v81 -> v129
v128 = iconst.i128 0
v80 -> v128
v127 = iconst.i128 0
v79 -> v127
br_icmp eq v82, v83, block1(v79, v79, v79, v80, v81, v81) ; v82 = 0, v83 = 0, v79 = 0, v79 = 0, v79 = 0, v80 = 0, v81 = false, v81 = false
jump block5(v84, v84, v84, v84, v85, v81, v81, v83) ; v84 = 0, v84 = 0, v84 = 0, v84 = 0, v85 = 0, v81 = false, v81 = false, v83 = 0

block3:
v150 = iconst.i8 0
v95 -> v150
v149 = iconst.i32 0
v94 -> v149
v148 = f64const 0.0
v93 -> v148
v147 = iconst.i128 0
v92 -> v147
v146 = iconst.i8 0
v91 -> v146
v145 = bconst.b1 false
v90 -> v145
v144 = bconst.b1 false
v89 -> v144
v143 = iconst.i16 0
v88 -> v143
v142 = f32const 0.0
v87 -> v142
v141 = iconst.i64 0
v86 -> v141
return v86, v87, v88, v89, v90, v90, v91, v92, v93, v92, v92, v90, v94, v90, v95, v92 ; v86 = 0, v87 = 0.0, v88 = 0, v89 = false, v90 = false, v90 = false, v91 = 0, v92 = 0, v93 = 0.0, v92 = 0, v92 = 0, v90 = false, v94 = 0, v90 = false, v95 = 0, v92 = 0

block4(v21: b1):
v161 = bconst.b1 false
v106 -> v161
v160 = iconst.i32 0
v105 -> v160
v159 = iconst.i128 0
v104 -> v159
v158 = f64const 0.0
v103 -> v158
v157 = iconst.i128 0
v102 -> v157
v156 = iconst.i8 0
v101 -> v156
v155 = bconst.b1 false
v100 -> v155
v154 = bconst.b1 false
v99 -> v154
v153 = iconst.i16 0
v98 -> v153
v152 = f32const 0.0
v97 -> v152
v151 = iconst.i64 0
v96 -> v151
return v96, v97, v98, v99, v100, v21, v101, v102, v103, v104, v104, v21, v105, v106, v101, v104 ; v96 = 0, v97 = 0.0, v98 = 0, v99 = false, v100 = false, v101 = 0, v102 = 0, v103 = 0.0, v104 = 0, v104 = 0, v105 = 0, v106 = false, v101 = 0, v104 = 0

block5(v22: i32, v23: i32, v24: i32, v25: i32, v26: i32, v27: b1, v28: b1, v29: i8):
return v107, v108, v109, v28, v28, v28, v29, v110, v111, v110, v110, v112, v26, v28, v29, v110 ; v107 = 0, v108 = 0.0, v109 = 0, v110 = 0, v111 = 0.0, v110 = 0, v110 = 0, v112 = false, v110 = 0

block6(v30: i64, v31: i64, v32: i8, v33: i32, v34: i32, v35: i32, v36: b1, v37: b1, v38: i128) cold:
v172 = iconst.i8 0
v119 -> v172
v171 = iconst.i32 0
v118 -> v171
v170 = f64const 0.0
v117 -> v170
v169 = iconst.i128 0
v116 -> v169
v168 = bconst.b1 false
v115 -> v168
v167 = iconst.i16 0
v114 -> v167
v166 = f32const 0.0
v113 -> v166
return v31, v113, v114, v115, v37, v37, v32, v116, v117, v116, v116, v37, v118, v37, v119, v116 ; v113 = 0.0, v114 = 0, v115 = false, v116 = 0, v117 = 0.0, v116 = 0, v116 = 0, v118 = 0, v119 = 0, v116 = 0
}
; Note: the results in the below test cases are simply a placeholder and probably will be wrong
; run: u0:1(86737344494455290561902794539232559746, false, false, 0, false, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0) == [0, 0.0, 0, false, false, false, 0, 0, 0.0, 0, 0, false, 0, false, 0, 0]
</details>
afonso360 commented on issue #4761:
<details>
<summary> Minimized this to: </summary>
```
;; Fuzzgen test case

test interpret
test run
set enable_llvm_abi_extensions
target aarch64
target s390x
target x86_64

function %a(i128, b1, b1, i128, b1, i16 uext, i8 sext, i64 sext, i32, i64 sext, i128, i128 sext, i64 uext, f32, i128 sext) -> i64 sext, f32, i16, b1, b1 sext, b1 sext, i8, i128 sext, f64 sext, i128 sext, i128, b1, i32, b1 uext, i8, i128 sext windows_fastcall {
ss0 = explicit_slot 59

block0(v0: i128, v1: b1, v2: b1, v3: i128, v4: b1, v5: i16, v6: i8, v7: i64, v8: i32, v9: i64, v10: i128, v11: i128, v12: i64, v13: f32, v14: i128):
v39 = f64const 0x1.d7d7d7d7d006fp984
v40 = iconst.i64 0xa901_0024_3b00_0000
v41 = iconst.i64 0x7d7d_7d7d_7d7d_7d00
v42 = iconcat v41, v40 ; v41 = 0x7d7d_7d7d_7d7d_7d00, v40 = 0xa901_0024_3b00_0000
v43 = iconst.i64 0xffff_ffff_ffff_0e7d
v44 = iconst.i64 2303
v45 = iconcat v44, v43 ; v44 = 2303, v43 = 0xffff_ffff_ffff_0e7d
v46 = iconst.i8 0
v47 = bconst.b1 true
v48 = iconst.i32 0xffff_ffff_c3c3_c3c3
v49 = bconst.b1 false
v50 = iconst.i8 -1
v51 = iconst.i64 0xc3c3_c341_3c3c
v52 = iconst.i128 0
v53 = iconst.i64 0
v54 = iconst.i32 0
v55 = iconst.i16 0
v56 = iconst.i8 0
stack_store v52, ss0 ; v52 = 0
stack_store v52, ss0+16 ; v52 = 0
stack_store v52, ss0+32 ; v52 = 0
stack_store v53, ss0+48 ; v53 = 0
stack_store v55, ss0+56 ; v55 = 0
stack_store v56, ss0+58 ; v56 = 0
v57 = iadd v6, v6
stack_store v5, ss0
v58 = uextend.i64 v46 ; v46 = 0
v59 = udiv v51, v51 ; v51 = 0xc3c3_c341_3c3c, v51 = 0xc3c3_c341_3c3c
v60 = udiv v59, v59
v61 = iadd v0, v0
v62 = isub v8, v8
v63 = ushr v62, v46 ; v46 = 0
v64 = ushr v62, v46 ; v46 = 0
v65 = ushr v63, v57
v66 = ushr v65, v46 ; v46 = 0
v67 = ushr v66, v46 ; v46 = 0
v68 = ushr v67, v46 ; v46 = 0
v69 = fcopysign v39, v39 ; v39 = 0x1.d7d7d7d7d006fp984, v39 = 0x1.d7d7d7d7d006fp984
return v58, v13, v5, v1, v1, v1, v46, v14, v69, v45, v42, v1, v68, v1, v46, v61 ; v46 = 0, v46 = 0
}

; run: %a(86737344494455290561902794539232559746, false, false, 0, false, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0) == [0, 0.0, 0, false, false, false, 0, 0, 0x1.d7d7d7d7d006fp984, -1140506845845240447760129, -115637640465955680012984649720956879616, false, 0, false, 0, -166807677932027882339569018353303091964]
```
</details>

And this stops failing if we remove the `stack_store.i16`, so I think it's related to that.
afonso360 edited a comment on issue #4761:
<details>
<summary> Minimized this to: </summary>

```
;; Fuzzgen test case

test interpret
test run
set enable_llvm_abi_extensions
target aarch64
target s390x
target x86_64

function %a(i128, b1, b1, i128, b1, i16 uext, i8 sext, i64 sext, i32, i64 sext, i128, i128 sext, i64 uext, f32, i128 sext) -> i64 sext, f32, i16, b1, b1 sext, b1 sext, i8, i128 sext, f64 sext, i128 sext, i128, b1, i32, b1 uext, i8, i128 sext windows_fastcall {
ss0 = explicit_slot 59

block0(v0: i128, v1: b1, v2: b1, v3: i128, v4: b1, v5: i16, v6: i8, v7: i64, v8: i32, v9: i64, v10: i128, v11: i128, v12: i64, v13: f32, v14: i128):
v39 = f64const 0x1.d7d7d7d7d006fp984
v40 = iconst.i64 0xa901_0024_3b00_0000
v41 = iconst.i64 0x7d7d_7d7d_7d7d_7d00
v42 = iconcat v41, v40 ; v41 = 0x7d7d_7d7d_7d7d_7d00, v40 = 0xa901_0024_3b00_0000
v43 = iconst.i64 0xffff_ffff_ffff_0e7d
v44 = iconst.i64 2303
v45 = iconcat v44, v43 ; v44 = 2303, v43 = 0xffff_ffff_ffff_0e7d
v46 = iconst.i8 0
v47 = bconst.b1 true
v48 = iconst.i32 0xffff_ffff_c3c3_c3c3
v49 = bconst.b1 false
v50 = iconst.i8 -1
v51 = iconst.i64 0xc3c3_c341_3c3c
v52 = iconst.i128 0
v53 = iconst.i64 0
v54 = iconst.i32 0
v55 = iconst.i16 0
v56 = iconst.i8 0
stack_store v52, ss0 ; v52 = 0
stack_store v52, ss0+16 ; v52 = 0
stack_store v52, ss0+32 ; v52 = 0
stack_store v53, ss0+48 ; v53 = 0
stack_store v55, ss0+56 ; v55 = 0
stack_store v56, ss0+58 ; v56 = 0
v57 = iadd v6, v6
stack_store v5, ss0
v58 = uextend.i64 v46 ; v46 = 0
v59 = udiv v51, v51 ; v51 = 0xc3c3_c341_3c3c, v51 = 0xc3c3_c341_3c3c
v60 = udiv v59, v59
v61 = iadd v0, v0
v62 = isub v8, v8
v63 = ushr v62, v46 ; v46 = 0
v64 = ushr v62, v46 ; v46 = 0
v65 = ushr v63, v57
v66 = ushr v65, v46 ; v46 = 0
v67 = ushr v66, v46 ; v46 = 0
v68 = ushr v67, v46 ; v46 = 0
v69 = fcopysign v39, v39 ; v39 = 0x1.d7d7d7d7d006fp984, v39 = 0x1.d7d7d7d7d006fp984
return v58, v13, v5, v1, v1, v1, v46, v14, v69, v45, v42, v1, v68, v1, v46, v61 ; v46 = 0, v46 = 0
}

; run: %a(86737344494455290561902794539232559746, false, false, 0, false, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0) == [0, 0.0, 0, false, false, false, 0, 0, 0x1.d7d7d7d7d006fp984, -1140506845845240447760129, -115637640465955680012984649720956879616, false, 0, false, 0, -166807677932027882339569018353303091964]
```
</details>
And this stops failing if we remove the `stack_store v5, ss0`, so I think it's related to that.
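
As a sanity check on the `; run:` expectations above, two of the expected results can be reproduced by hand. The sketch below models `iconcat` as taking the low half first and the high half second, and `iadd` on `i128` as wrapping addition. The helper names `iconcat` and `iadd_i128` are made up for illustration and are not Cranelift APIs.

```rust
/// Illustrative model of CLIF `iconcat lo, hi` for two 64-bit halves.
/// Assumption: the first operand is the low half, the second the high half.
fn iconcat(lo: u64, hi: u64) -> i128 {
    // Build the 128-bit pattern, then reinterpret it as a signed value.
    (((hi as u128) << 64) | (lo as u128)) as i128
}

/// Illustrative model of CLIF `iadd` on i128: two's-complement wrapping add.
fn iadd_i128(a: i128, b: i128) -> i128 {
    a.wrapping_add(b)
}
```

With `v44 = 2303` and `v43 = 0xffff_ffff_ffff_0e7d`, `iconcat(2303, 0xffff_ffff_ffff_0e7d)` gives `-1140506845845240447760129`, the expected value for `v45`. Doubling the first argument, `86737344494455290561902794539232559746`, overflows `i128` and wraps to `-166807677932027882339569018353303091964`, the expected value for `v61`.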
afonso360 commented on issue #4761:
I've tried to minimize this further, and here's what I got
<details>
<summary>Minimized CLIF</summary>
test run
set enable_llvm_abi_extensions
target x86_64
function %a(i128, b1, b1, i128, b1, i16 uext, i8 sext, i64 sext, i32, i64 sext, i128, i128 sext, i64 uext, f32, i128 sext) -> i64 sext, f32, i16, b1, b1 sext, b1 sext, i8, i128 sext, f64 sext, i128 sext, i128, b1, i32, b1 uext, i8, i128 sext windows_fastcall {
ss0 = explicit_slot 59
block0(v0: i128, v1: b1, v2: b1, v3: i128, v4: b1, v5: i16, v6: i8, v7: i64, v8: i32, v9: i64, v10: i128, v11: i128, v12: i64, v13: f32, v14: i128):
v39 = f64const 0x1.d7d7d7d7d006fp984
v43 = iconst.i64 0xffff_ffff_ffff_0e7d
v44 = iconst.i64 2303
v45 = iconcat v44, v43 ; v44 = 2303, v43 = 0xffff_ffff_ffff_0e7d
v46 = iconst.i8 0
v51 = iconst.i64 0xc3c3_c341_3c3c
v52 = iconst.i128 0
v53 = iconst.i64 0
v55 = iconst.i16 0
v56 = iconst.i8 0
stack_store v52, ss0 ; v52 = 0
stack_store v52, ss0+16 ; v52 = 0
stack_store v52, ss0+32 ; v52 = 0
stack_store v53, ss0+48 ; v53 = 0
stack_store v55, ss0+56 ; v55 = 0
stack_store v56, ss0+58 ; v56 = 0
stack_store v5, ss0
v58 = uextend.i64 v46 ; v46 = 0
v59 = udiv v51, v51 ; v51 = 0xc3c3_c341_3c3c, v51 = 0xc3c3_c341_3c3c
v61 = iadd v0, v0
v63 = ushr v8, v46 ; v46 = 0
v69 = fcopysign v39, v39 ; v39 = 0x1.d7d7d7d7d006fp984, v39 = 0x1.d7d7d7d7d006fp984
return v58, v13, v5, v1, v1, v1, v46, v14, v69, v45, v0, v1, v63, v1, v46, v61 ; v46 = 0, v46 = 0
}
; run: %a(86737344494455290561902794539232559746, false, false, 0, false, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0) == [0, 0.0, 0, false, false, false, 0, 0, 0x1.d7d7d7d7d006fp984, -1140506845845240447760129, -115637640465955680012984649720956879616, false, 0, false, 0, -166807677932027882339569018353303091964]
</details>
Removing any of those instructions causes this to stop crashing.
<details>
<summary>Disassembly</summary>
Disassembly of 712 bytes:
0: 55 push rbp
1: 48 89 e5 mov rbp, rsp
4: 48 81 ec f0 00 00 00 sub rsp, 0xf0
b: 48 89 9c 24 a0 00 00 00 mov qword ptr [rsp + 0xa0], rbx
13: 48 89 b4 24 a8 00 00 00 mov qword ptr [rsp + 0xa8], rsi
1b: 48 89 bc 24 b0 00 00 00 mov qword ptr [rsp + 0xb0], rdi
23: 4c 89 a4 24 b8 00 00 00 mov qword ptr [rsp + 0xb8], r12
2b: 4c 89 ac 24 c0 00 00 00 mov qword ptr [rsp + 0xc0], r13
33: 4c 89 b4 24 c8 00 00 00 mov qword ptr [rsp + 0xc8], r14
3b: 4c 89 bc 24 d0 00 00 00 mov qword ptr [rsp + 0xd0], r15
43: f3 44 0f 7f b4 24 e0 00 00 00 movdqu xmmword ptr [rsp + 0xe0], xmm14
4d: 48 89 4c 24 40 mov qword ptr [rsp + 0x40], rcx
52: 48 89 54 24 48 mov qword ptr [rsp + 0x48], rdx
57: 4c 89 44 24 50 mov qword ptr [rsp + 0x50], r8
5c: 48 8b 45 30 mov rax, qword ptr [rbp + 0x30]
60: 48 8b 75 38 mov rsi, qword ptr [rbp + 0x38]
64: 4c 8b 4d 40 mov r9, qword ptr [rbp + 0x40]
68: 48 8b 55 48 mov rdx, qword ptr [rbp + 0x48]
6c: 4c 8b 55 50 mov r10, qword ptr [rbp + 0x50]
70: 48 8b 5d 58 mov rbx, qword ptr [rbp + 0x58]
74: 4c 8b 5d 60 mov r11, qword ptr [rbp + 0x60]
78: 4c 89 9c 24 88 00 00 00 mov qword ptr [rsp + 0x88], r11
80: 4c 8b 65 68 mov r12, qword ptr [rbp + 0x68]
84: 4c 8b 6d 70 mov r13, qword ptr [rbp + 0x70]
88: 4c 8b 75 78 mov r14, qword ptr [rbp + 0x78]
8c: 4c 8b bd 80 00 00 00 mov r15, qword ptr [rbp + 0x80]
93: 48 8b bd 88 00 00 00 mov rdi, qword ptr [rbp + 0x88]
9a: 4c 8b 9d 90 00 00 00 mov r11, qword ptr [rbp + 0x90]
a1: f3 0f 10 85 98 00 00 00 movss xmm0, dword ptr [rbp + 0x98]
a9: 48 8b bd a0 00 00 00 mov rdi, qword ptr [rbp + 0xa0]
b0: 48 89 7c 24 70 mov qword ptr [rsp + 0x70], rdi
b5: 48 8b 8d a8 00 00 00 mov rcx, qword ptr [rbp + 0xa8]
bc: 48 89 4c 24 78 mov qword ptr [rsp + 0x78], rcx
c1: 4c 8b 9d b0 00 00 00 mov r11, qword ptr [rbp + 0xb0]
c8: b8 ff 08 00 00 mov eax, 0x8ff
cd: 48 89 84 24 98 00 00 00 mov qword ptr [rsp + 0x98], rax
d5: 48 c7 c0 7d 0e ff ff mov rax, 0xffffffffffff0e7d
dc: 49 89 c2 mov r10, rax
df: 48 8d 34 24 lea rsi, [rsp]
e3: 4d 31 c9 xor r9, r9
e6: 48 31 ff xor rdi, rdi
e9: 4c 89 0e mov qword ptr [rsi], r9
ec: 48 89 7e 08 mov qword ptr [rsi + 8], rdi
f0: 48 8d 74 24 10 lea rsi, [rsp + 0x10]
f5: 48 31 c0 xor rax, rax
f8: 48 31 ff xor rdi, rdi
fb: 48 89 06 mov qword ptr [rsi], rax
fe: 48 89 7e 08 mov qword ptr [rsi + 8], rdi
102: 48 8d 44 24 20 lea rax, [rsp + 0x20]
107: 48 31 ff xor rdi, rdi
10a: 4d 31 c0 xor r8, r8
10d: 48 89 38 mov qword ptr [rax], rdi
110: 4c 89 40 08 mov qword ptr [rax + 8], r8
114: 4c 8d 44 24 30 lea r8, [rsp + 0x30]
119: 4d 31 c9 xor r9, r9
11c: 4d 89 08 mov qword ptr [r8], r9
11f: 4c 8d 44 24 38 lea r8, [rsp + 0x38]
124: 45 31 c9 xor r9d, r9d
127: 66 45 89 08 mov word ptr [r8], r9w
12b: 4c 8d 4c 24 3a lea r9, [rsp + 0x3a]
130: 31 f6 xor esi, esi
132: 41 88 31 mov byte ptr [r9], sil
135: 48 8d 34 24 lea rsi, [rsp]
139: 66 89 16 mov word ptr [rsi], dx
13c: 48 89 54 24 68 mov qword ptr [rsp + 0x68], rdx
141: 48 0f b6 05 77 01 00 00 movzx rax, byte ptr [rip + 0x177]
149: 49 89 c0 mov r8, rax
14c: 48 b8 3c 3c 41 c3 c3 c3 00 00 movabs rax, 0xc3c3c3413c3c
156: 48 be 3c 3c 41 c3 c3 c3 00 00 movabs rsi, 0xc3c3c3413c3c
160: ba 00 00 00 00 mov edx, 0
165: 48 f7 f6 div rsi
168: 48 8b 4c 24 40 mov rcx, qword ptr [rsp + 0x40]
16d: 49 89 cc mov r12, rcx
170: 49 01 cc add r12, rcx
173: 4d 89 e1 mov r9, r12
176: 48 8b 5c 24 48 mov rbx, qword ptr [rsp + 0x48]
17b: 48 89 da mov rdx, rbx
17e: 48 11 da adc rdx, rbx
181: 48 89 94 24 90 00 00 00 mov qword ptr [rsp + 0x90], rdx
189: 48 8b 8c 24 88 00 00 00 mov rcx, qword ptr [rsp + 0x88]
191: c1 e9 00 shr ecx, 0
194: 48 be 00 00 00 00 00 00 00 80 movabs rsi, 0x8000000000000000
19e: 66 4c 0f 6e f6 movq xmm14, rsi
1a3: 66 41 0f 6f ce movdqa xmm1, xmm14
1a8: 66 0f 55 0d 08 01 00 00 andnpd xmm1, xmmword ptr [rip + 0x108]
1b0: 66 44 0f 54 35 ff 00 00 00 andpd xmm14, xmmword ptr [rip + 0xff]
1b9: 66 41 0f 56 ce orpd xmm1, xmm14
1be: 4c 89 c0 mov rax, r8
1c1: 48 8b 54 24 68 mov rdx, qword ptr [rsp + 0x68]
1c6: 48 8b 7c 24 50 mov rdi, qword ptr [rsp + 0x50]
1cb: 48 89 7c 24 58 mov qword ptr [rsp + 0x58], rdi
1d0: 48 89 7c 24 60 mov qword ptr [rsp + 0x60], rdi
1d5: 48 89 bc 24 80 00 00 00 mov qword ptr [rsp + 0x80], rdi
1dd: 45 31 ed xor r13d, r13d
1e0: 4c 8b 7c 24 70 mov r15, qword ptr [rsp + 0x70]
1e5: 4c 8b 64 24 78 mov r12, qword ptr [rsp + 0x78]
1ea: 4d 89 d6 mov r14, r10
1ed: 48 8b 74 24 40 mov rsi, qword ptr [rsp + 0x40]
1f2: 45 31 c0 xor r8d, r8d
1f5: 4c 8b 54 24 58 mov r10, qword ptr [rsp + 0x58]
1fa: 45 88 13 mov byte ptr [r11], r10b
1fd: 4c 8b 54 24 60 mov r10, qword ptr [rsp + 0x60]
202: 45 88 53 08 mov byte ptr [r11 + 8], r10b
206: 4c 8b 94 24 80 00 00 00 mov r10, qword ptr [rsp + 0x80]
20e: 45 88 53 10 mov byte ptr [r11 + 0x10], r10b
212: 45 88 6b 18 mov byte ptr [r11 + 0x18], r13b
216: 4d 89 7b 20 mov qword ptr [r11 + 0x20], r15
21a: 4d 89 63 28 mov qword ptr [r11 + 0x28], r12
21e: f2 41 0f 11 4b 30 movsd qword ptr [r11 + 0x30], xmm1
224: 4c 8b 94 24 98 00 00 00 mov r10, qword ptr [rsp + 0x98]
22c: 4d 89 53 38 mov qword ptr [r11 + 0x38], r10
230: 4d 89 73 40 mov qword ptr [r11 + 0x40], r14
234: 49 89 73 48 mov qword ptr [r11 + 0x48], rsi
238: 49 89 5b 50 mov qword ptr [r11 + 0x50], rbx
23c: 41 88 7b 58 mov byte ptr [r11 + 0x58], dil
240: 41 89 4b 60 mov dword ptr [r11 + 0x60], ecx
244: 48 8b 74 24 50 mov rsi, qword ptr [rsp + 0x50]
249: 41 88 73 68 mov byte ptr [r11 + 0x68], sil
24d: 45 88 43 70 mov byte ptr [r11 + 0x70], r8b
251: 4d 89 4b 78 mov qword ptr [r11 + 0x78], r9
255: 4c 8b 84 24 90 00 00 00 mov r8, qword ptr [rsp + [message truncated]
</details>
afonso360 deleted a comment on issue #4761:
bjorn3 commented on issue #4761:
> The address is 8 aligned (does it need to be 16?).

Yes, vector instructions that do a load or store as part of their operand need it to be 16-byte aligned. Only the unaligned variants of vector loads and stores (like `movupd`) allow arbitrary alignment.
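
To make the alignment point concrete: a RIP-relative operand is addressed from the end of its own instruction, so the offset of the constant inside the function can be worked out from the disassembly. The sketch below plugs in the offsets, encoded lengths, and displacements of the `andnpd` and `andpd` instructions from the disassembly earlier in the thread. `rip_relative_target` and `is_aligned` are hypothetical helper names, and the result is an offset from the start of the function, so it only reflects the absolute address under the assumption that the function itself is placed at a 16-byte-aligned address.

```rust
/// Returns true if `addr` is a multiple of `align`.
/// `align` is assumed to be a power of two.
fn is_aligned(addr: u64, align: u64) -> bool {
    debug_assert!(align.is_power_of_two());
    (addr & (align - 1)) == 0
}

/// Offset referenced by a RIP-relative operand.
/// RIP points at the *next* instruction, so the target is
/// instruction offset + instruction length + signed displacement.
fn rip_relative_target(insn_offset: u64, insn_len: u64, disp: i32) -> u64 {
    (insn_offset + insn_len).wrapping_add(disp as i64 as u64)
}

/// Offsets of the memory operands of the two instructions in question:
/// `andnpd` at 0x1a8 (8 bytes, disp 0x108) and `andpd` at 0x1b0 (9 bytes, disp 0xff).
fn vector_operand_offsets() -> [u64; 2] {
    [
        rip_relative_target(0x1a8, 8, 0x108),
        rip_relative_target(0x1b0, 9, 0xff),
    ]
}
```

Both operands resolve to offset `0x2b8`, which is a multiple of 8 but not of 16, matching the "8 aligned" observation quoted above.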
afonso360 commented on issue #4761:
I've tried to minimize this further, and here's what I got
<details>
<summary>Minimized CLIF</summary>
test run
set enable_llvm_abi_extensions
target x86_64
function %a(i128, b1, b1, i128, b1, i16 uext, i8 sext, i64 sext, i32, i64 sext, i128, i128 sext, i64 uext, f32, i128 sext) -> i64 sext, f32, i16, b1, b1 sext, b1 sext, i8, i128 sext, f64 sext, i128 sext, i128, b1, i32, b1 uext, i8, i128 sext windows_fastcall {
ss0 = explicit_slot 59
block0(v0: i128, v1: b1, v2: b1, v3: i128, v4: b1, v5: i16, v6: i8, v7: i64, v8: i32, v9: i64, v10: i128, v11: i128, v12: i64, v13: f32, v14: i128):
v39 = f64const 0x1.d7d7d7d7d006fp984
v43 = iconst.i64 0xffff_ffff_ffff_0e7d
v44 = iconst.i64 2303
v45 = iconcat v44, v43 ; v44 = 2303, v43 = 0xffff_ffff_ffff_0e7d
v46 = iconst.i8 0
v51 = iconst.i64 0xc3c3_c341_3c3c
v52 = iconst.i128 0
v53 = iconst.i64 0
v55 = iconst.i16 0
v56 = iconst.i8 0
stack_store v52, ss0 ; v52 = 0
stack_store v52, ss0+16 ; v52 = 0
stack_store v52, ss0+32 ; v52 = 0
stack_store v53, ss0+48 ; v53 = 0
stack_store v55, ss0+56 ; v55 = 0
stack_store v56, ss0+58 ; v56 = 0
stack_store v5, ss0
v58 = uextend.i64 v46 ; v46 = 0
v59 = udiv v51, v51 ; v51 = 0xc3c3_c341_3c3c, v51 = 0xc3c3_c341_3c3c
v61 = iadd v0, v0
v63 = ushr v8, v46 ; v46 = 0
v69 = fcopysign v39, v39 ; v39 = 0x1.d7d7d7d7d006fp984, v39 = 0x1.d7d7d7d7d006fp984
return v58, v13, v5, v1, v1, v1, v46, v14, v69, v45, v0, v1, v63, v1, v46, v61 ; v46 = 0, v46 = 0
}
; run: %a(86737344494455290561902794539232559746, false, false, 0, false, 0, 0, 0, 0, 0, 0, 0, 0, 0.0, 0) == [0, 0.0, 0, false, false, false, 0, 0, 0x1.d7d7d7d7d006fp984, -1140506845845240447760129, -115637640465955680012984649720956879616, false, 0, false, 0, -166807677932027882339569018353303091964]
</details>
Removing any of those instructions causes this to stop crashing.
<details>
<summary>Disassembly</summary>Disassembly of 712 bytes: 0: 55 push rbp 1: 48 89 e5 mov rbp, rsp 4: 48 81 ec f0 00 00 00 sub rsp, 0xf0 b: 48 89 9c 24 a0 00 00 00 mov qword ptr [rsp + 0xa0], rbx 13: 48 89 b4 24 a8 00 00 00 mov qword ptr [rsp + 0xa8], rsi 1b: 48 89 bc 24 b0 00 00 00 mov qword ptr [rsp + 0xb0], rdi 23: 4c 89 a4 24 b8 00 00 00 mov qword ptr [rsp + 0xb8], r12 2b: 4c 89 ac 24 c0 00 00 00 mov qword ptr [rsp + 0xc0], r13 33: 4c 89 b4 24 c8 00 00 00 mov qword ptr [rsp + 0xc8], r14 3b: 4c 89 bc 24 d0 00 00 00 mov qword ptr [rsp + 0xd0], r15 43: f3 44 0f 7f b4 24 e0 00 00 00 movdqu xmmword ptr [rsp + 0xe0], xmm14 4d: 48 89 4c 24 40 mov qword ptr [rsp + 0x40], rcx 52: 48 89 54 24 48 mov qword ptr [rsp + 0x48], rdx 57: 4c 89 44 24 50 mov qword ptr [rsp + 0x50], r8 5c: 48 8b 45 30 mov rax, qword ptr [rbp + 0x30] 60: 48 8b 75 38 mov rsi, qword ptr [rbp + 0x38] 64: 4c 8b 4d 40 mov r9, qword ptr [rbp + 0x40] 68: 48 8b 55 48 mov rdx, qword ptr [rbp + 0x48] 6c: 4c 8b 55 50 mov r10, qword ptr [rbp + 0x50] 70: 48 8b 5d 58 mov rbx, qword ptr [rbp + 0x58] 74: 4c 8b 5d 60 mov r11, qword ptr [rbp + 0x60] 78: 4c 89 9c 24 88 00 00 00 mov qword ptr [rsp + 0x88], r11 80: 4c 8b 65 68 mov r12, qword ptr [rbp + 0x68] 84: 4c 8b 6d 70 mov r13, qword ptr [rbp + 0x70] 88: 4c 8b 75 78 mov r14, qword ptr [rbp + 0x78] 8c: 4c 8b bd 80 00 00 00 mov r15, qword ptr [rbp + 0x80] 93: 48 8b bd 88 00 00 00 mov rdi, qword ptr [rbp + 0x88] 9a: 4c 8b 9d 90 00 00 00 mov r11, qword ptr [rbp + 0x90] a1: f3 0f 10 85 98 00 00 00 movss xmm0, dword ptr [rbp + 0x98] a9: 48 8b bd a0 00 00 00 mov rdi, qword ptr [rbp + 0xa0] b0: 48 89 7c 24 70 mov qword ptr [rsp + 0x70], rdi b5: 48 8b 8d a8 00 00 00 mov rcx, qword ptr [rbp + 0xa8] bc: 48 89 4c 24 78 mov qword ptr [rsp + 0x78], rcx c1: 4c 8b 9d b0 00 00 00 mov r11, qword ptr [rbp + 0xb0] c8: b8 ff 08 00 00 mov eax, 0x8ff cd: 48 89 84 24 98 00 00 00 mov qword ptr [rsp + 0x98], rax d5: 48 c7 c0 7d 0e ff ff mov rax, 0xffffffffffff0e7d dc: 49 89 c2 mov r10, rax 
df: 48 8d 34 24 lea rsi, [rsp] e3: 4d 31 c9 xor r9, r9 e6: 48 31 ff xor rdi, rdi e9: 4c 89 0e mov qword ptr [rsi], r9 ec: 48 89 7e 08 mov qword ptr [rsi + 8], rdi f0: 48 8d 74 24 10 lea rsi, [rsp + 0x10] f5: 48 31 c0 xor rax, rax f8: 48 31 ff xor rdi, rdi fb: 48 89 06 mov qword ptr [rsi], rax fe: 48 89 7e 08 mov qword ptr [rsi + 8], rdi 102: 48 8d 44 24 20 lea rax, [rsp + 0x20] 107: 48 31 ff xor rdi, rdi 10a: 4d 31 c0 xor r8, r8 10d: 48 89 38 mov qword ptr [rax], rdi 110: 4c 89 40 08 mov qword ptr [rax + 8], r8 114: 4c 8d 44 24 30 lea r8, [rsp + 0x30] 119: 4d 31 c9 xor r9, r9 11c: 4d 89 08 mov qword ptr [r8], r9 11f: 4c 8d 44 24 38 lea r8, [rsp + 0x38] 124: 45 31 c9 xor r9d, r9d 127: 66 45 89 08 mov word ptr [r8], r9w 12b: 4c 8d 4c 24 3a lea r9, [rsp + 0x3a] 130: 31 f6 xor esi, esi 132: 41 88 31 mov byte ptr [r9], sil 135: 48 8d 34 24 lea rsi, [rsp] 139: 66 89 16 mov word ptr [rsi], dx 13c: 48 89 54 24 68 mov qword ptr [rsp + 0x68], rdx 141: 48 0f b6 05 77 01 00 00 movzx rax, byte ptr [rip + 0x177] 149: 49 89 c0 mov r8, rax 14c: 48 b8 3c 3c 41 c3 c3 c3 00 00 movabs rax, 0xc3c3c3413c3c 156: 48 be 3c 3c 41 c3 c3 c3 00 00 movabs rsi, 0xc3c3c3413c3c 160: ba 00 00 00 00 mov edx, 0 165: 48 f7 f6 div rsi 168: 48 8b 4c 24 40 mov rcx, qword ptr [rsp + 0x40] 16d: 49 89 cc mov r12, rcx 170: 49 01 cc add r12, rcx 173: 4d 89 e1 mov r9, r12 176: 48 8b 5c 24 48 mov rbx, qword ptr [rsp + 0x48] 17b: 48 89 da mov rdx, rbx 17e: 48 11 da adc rdx, rbx 181: 48 89 94 24 90 00 00 00 mov qword ptr [rsp + 0x90], rdx 189: 48 8b 8c 24 88 00 00 00 mov rcx, qword ptr [rsp + 0x88] 191: c1 e9 00 shr ecx, 0 194: 48 be 00 00 00 00 00 00 00 80 movabs rsi, 0x8000000000000000 19e: 66 4c 0f 6e f6 movq xmm14, rsi 1a3: 66 41 0f 6f ce movdqa xmm1, xmm14 1a8: 66 0f 55 0d 08 01 00 00 andnpd xmm1, xmmword ptr [rip + 0x108] 1b0: 66 44 0f 54 35 ff 00 00 00 andpd xmm14, xmmword ptr [rip + 0xff] 1b9: 66 41 0f 56 ce orpd xmm1, xmm14 1be: 4c 89 c0 mov rax, r8 1c1: 48 8b 54 24 68 mov rdx, qword ptr [rsp + 0x68] 
1c6: 48 8b 7c 24 50 mov rdi, qword ptr [rsp + 0x50] 1cb: 48 89 7c 24 58 mov qword ptr [rsp + 0x58], rdi 1d0: 48 89 7c 24 60 mov qword ptr [rsp + 0x60], rdi 1d5: 48 89 bc 24 80 00 00 00 mov qword ptr [rsp + 0x80], rdi 1dd: 45 31 ed xor r13d, r13d 1e0: 4c 8b 7c 24 70 mov r15, qword ptr [rsp + 0x70] 1e5: 4c 8b 64 24 78 mov r12, qword ptr [rsp + 0x78] 1ea: 4d 89 d6 mov r14, r10 1ed: 48 8b 74 24 40 mov rsi, qword ptr [rsp + 0x40] 1f2: 45 31 c0 xor r8d, r8d 1f5: 4c 8b 54 24 58 mov r10, qword ptr [rsp + 0x58] 1fa: 45 88 13 mov byte ptr [r11], r10b 1fd: 4c 8b 54 24 60 mov r10, qword ptr [rsp + 0x60] 202: 45 88 53 08 mov byte ptr [r11 + 8], r10b 206: 4c 8b 94 24 80 00 00 00 mov r10, qword ptr [rsp + 0x80] 20e: 45 88 53 10 mov byte ptr [r11 + 0x10], r10b 212: 45 88 6b 18 mov byte ptr [r11 + 0x18], r13b 216: 4d 89 7b 20 mov qword ptr [r11 + 0x20], r15 21a: 4d 89 63 28 mov qword ptr [r11 + 0x28], r12 21e: f2 41 0f 11 4b 30 movsd qword ptr [r11 + 0x30], xmm1 224: 4c 8b 94 24 98 00 00 00 mov r10, qword ptr [rsp + 0x98] 22c: 4d 89 53 38 mov qword ptr [r11 + 0x38], r10 230: 4d 89 73 40 mov qword ptr [r11 + 0x40], r14 234: 49 89 73 48 mov qword ptr [r11 + 0x48], rsi 238: 49 89 5b 50 mov qword ptr [r11 + 0x50], rbx 23c: 41 88 7b 58 mov byte ptr [r11 + 0x58], dil 240: 41 89 4b 60 mov dword ptr [r11 + 0x60], ecx 244: 48 8b 74 24 50 mov rsi, qword ptr [rsp + 0x50] 249: 41 88 73 68 mov byte ptr [r11 + 0x68], sil 24d: 45 88 43 70 mov byte ptr [r11 + 0x70], r8b 251: 4d 89 4b 78 mov qword ptr [r11 + 0x78], r9 255: 4c 8b 84 24 90 00 00 00 mov r8, qword ptr [rsp + [message truncated]
afonso360 commented on issue #4761:
The address is 8 aligned (does it need to be 16?).
Yes, vector instructions that do a load or store as part of their operand need it to be 16 byte aligned. Only the unaligned variants of vector loads and stores (like movupd) allow arbitrary alignment.
Yep, that does seem to fix it
afonso360 edited a comment on issue #4761:
The address is 8 aligned (does it need to be 16?).
Yes, vector instructions that do a load or store as part of their operand need it to be 16 byte aligned. Only the unaligned variants of vector loads and stores (like movupd) allow arbitrary alignment.
Yep, that does seem to fix it, thanks!
jameysharp commented on issue #4761:
@cfallin @abrown @elliottt: Is there a more targeted way to fix this than reserving 16 bytes for every entry in the constant pool, or should we just merge #4789?
This is cool: cranelift-fuzzgen found a codegen bug that involves interaction between distant parts of the backend, and requires an unlucky stack layout. That's exactly the kind of thing I'd hope for from it. And good work on minimizing the failing input, @afonso360!
elliottt commented on issue #4761:
I've managed to minimize the test down a bit:
test run
set enable_llvm_abi_extensions
target x86_64

function %a() -> f64 {
    ss0 = explicit_slot 59

block0:
    v0 = f64const 0x1.d7d7d7d7d006fp984
    v1 = fcopysign v0, v0
    return v1
}
; run: %a() == 0x0.0
elliottt edited a comment on issue #4761:
I've managed to minimize the test down a bit. This will segfault for any non-zero `explicit_slot` value, but fail the test assertion otherwise.

test run
set enable_llvm_abi_extensions
target x86_64

function %a() -> f64 {
    ss0 = explicit_slot 59

block0:
    v0 = f64const 0x1.d7d7d7d7d006fp984
    v1 = fcopysign v0, v0
    return v1
}
; run: %a() == 0x0.0
jameysharp commented on issue #4761:
After digging into this, we've determined that the problem here is not the constant alignment, exactly. The root cause is that we're trying to load values from memory which are not 16 bytes long, using instructions that load 16 bytes.
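To make the size mismatch concrete, here is a minimal Rust sketch. It assumes a toy "constant pool" that holds nothing but one `f64`; `read_at` is a hypothetical helper, not a Cranelift function. An 8-byte read of the constant stays in bounds, while a 16-byte read of the same constant has to take 8 bytes that do not belong to it.

```rust
/// Return `width` bytes starting at `offset`, or `None` if the read would run
/// past the end of `pool`. Hypothetical helper for illustration only.
fn read_at(pool: &[u8], offset: usize, width: usize) -> Option<&[u8]> {
    let end = offset.checked_add(width)?;
    pool.get(offset..end)
}

fn main() {
    // Toy pool containing a single 8-byte f64 constant and nothing else.
    let pool = 1.5f64.to_le_bytes();
    assert_eq!(pool.len(), 8);

    // A load sized to the value succeeds.
    assert!(read_at(&pool, 0, 8).is_some());

    // A 16-byte load of the same constant needs 8 more bytes than exist.
    assert!(read_at(&pool, 0, 16).is_none());
}
```

In real machine code there is no bounds check: the extra 8 bytes are whatever follows the constant in memory, and an aligned-only instruction additionally faults if the address is not a multiple of 16.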
This specific bug was introduced this week in cee4b209f346ea279490268fe434dc52d0e0680c, while migrating `fcopysign` to ISLE.
The old lowering always did an 8-byte (or 4-byte, for F32) load to put the operands into general-purpose registers first. Then it moved from the GPRs to XMM registers, and never used the memory-operand variants of these SIMD instructions.
The natural translation of that pattern to ISLE has revealed a footgun around implicit conversions to xmm-or-mem operands. It's only safe to use the memory operand variant of these instructions for types which really are 16 bytes in size.
@elliottt is working on fixing that implicit conversion, which should resolve this issue and avoid all future problems of this particular type.
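For readers unfamiliar with the footgun, a size-guarded conversion might look roughly like the Rust sketch below. Everything in it (`Ty`, `ValueLoc`, `Operand`, `to_xmm_mem`) is a hypothetical model invented for this note. It is not Cranelift's ISLE code and not necessarily how the actual fix is written. It only captures the stated rule that the memory-operand form is safe solely for types that really are 16 bytes in size.

```rust
/// Hypothetical value types, named for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    F32,
    F64,
    F64X2,
}

impl Ty {
    /// Size of the type in bytes.
    fn bytes(self) -> usize {
        match self {
            Ty::F32 => 4,
            Ty::F64 => 8,
            Ty::F64X2 => 16,
        }
    }
}

/// Where the operand's value currently lives (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueLoc {
    InRegister,
    LoadableFromMemory,
}

/// Which operand form the instruction ends up using (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operand {
    Xmm,
    Mem,
}

/// Use the memory form only for full 16-byte types; anything narrower is
/// loaded into a register first with a correctly sized load.
fn to_xmm_mem(ty: Ty, loc: ValueLoc) -> Operand {
    match loc {
        ValueLoc::LoadableFromMemory if ty.bytes() == 16 => Operand::Mem,
        _ => Operand::Xmm,
    }
}

fn main() {
    // An f64 constant must not be folded into a 16-byte memory operand.
    assert_eq!(to_xmm_mem(Ty::F64, ValueLoc::LoadableFromMemory), Operand::Xmm);
    assert_eq!(to_xmm_mem(Ty::F32, ValueLoc::LoadableFromMemory), Operand::Xmm);
    // A genuine 16-byte vector can use the memory form.
    assert_eq!(to_xmm_mem(Ty::F64X2, ValueLoc::LoadableFromMemory), Operand::Mem);
    // Values already in registers stay in registers.
    assert_eq!(to_xmm_mem(Ty::F64X2, ValueLoc::InRegister), Operand::Xmm);
}
```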
elliottt closed issue #4761.
elliottt edited a comment on issue #4761:
I've managed to minimize the test down a bit. This will segfault for any `explicit_slot` value greater than 0, but fail the test assertion otherwise.

test run
set enable_llvm_abi_extensions
target x86_64

function %a() -> f64 {
    ss0 = explicit_slot 59

block0:
    v0 = f64const 0x1.d7d7d7d7d006fp984
    v1 = fcopysign v0, v0
    return v1
}
; run: %a() == 0x0.0
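A short Rust sketch of why the `== 0x0.0` assertion cannot pass even without the segfault: copying a value's sign onto itself returns the value unchanged. `copysign_bits` is a hypothetical helper that mirrors the mask-based sequence (`andnpd` / `andpd` / `orpd` with `0x8000000000000000`) visible in the disassembly earlier in the thread. The bit pattern used for `v0` is my own conversion of `0x1.d7d7d7d7d006fp984` and should be treated as an assumption.

```rust
/// Mask selecting only the sign bit of an f64.
const SIGN_MASK: u64 = 0x8000_0000_0000_0000;

/// Magnitude of `x` combined with the sign of `y`, computed with bit masks.
/// Hypothetical helper for illustration.
fn copysign_bits(x: f64, y: f64) -> f64 {
    f64::from_bits((x.to_bits() & !SIGN_MASK) | (y.to_bits() & SIGN_MASK))
}

fn main() {
    // Assumed bit pattern for 0x1.d7d7d7d7d006fp984 (exponent 984 + 1023 = 0x7D7).
    let v0 = f64::from_bits(0x7D7D_7D7D_7D7D_006F);

    // fcopysign v0, v0 yields v0 itself...
    let v1 = copysign_bits(v0, v0);
    assert_eq!(v1.to_bits(), v0.to_bits());
    assert_eq!(v1, v0.copysign(v0));

    // ...so comparing the result against zero fails.
    assert!(v1 != 0.0);
}
```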
Last updated: Dec 23 2024 at 12:05 UTC