T0b1-iOS opened issue #5645:
I have stumbled upon a rather nasty 'bug' this week.
When compiling IR like this:function u0:0(i64) -> i64 system_v { sig2 = (i64, i8) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
you will get assembly that looks like this:
pushq %rbp unwind PushFrameRegs { offset_upward_to_caller_sp: 16 } movq %rsp, %rbp unwind DefineNewFrame { offset_upward_to_caller_sp: 16, offset_downward_to_clobbers: 0 } block0: cmpq $0, %rdi setnz %sil load_ext_name userextname0+0, %r8 call *%r8 movq %rbp, %rsp popq %rbp ret
Note that the
icmp_imm
gets compiled to acmp
followed by asetcc
instruction usingsil
as a parameter.
This means that the upper bytes ofrsi
are undefined. Now whenfn2
is a C function that looks like this:void fun(uint64_t x, bool a) { do_smth(a ? x : 0); }
and you compile this with clang the assembly looks like this:
xor eax, eax test esi, esi cmove rdi, rax jmp do_smth(unsigned long)@PLT
while with gcc you get
xor eax, eax test sil, sil cmove rdi, rax jmp do_smth(unsigned long)
Note that gcc emits a
test sil, sil
instruction while clang emitstest esi, esi
meaning that it relies on the 8bit argument getting zero-extended to at least 32 bit and therefore you will get random behavior depending on whether the upper bytes ofrsi
were zero or not. (Compiler Explorer link)
According to the people at LLVM, whether arguments should be zero-extended or not is unclear in the ABI and therefore they do not consider it a bug until this issue has been resolved on the ABI level since the major compilers do zero-extend the arguments.Now my question is, do you consider this a bug in cranelift or at least an issue that should be addressed on the codegen level inside cranelift? (I have fixed this by manually checking if an argument is I8/I16 when emitting a function call and then passed that through a
uextend
/inarrow
combo)
Maybe this could be done when theenable_llvm_abi_extensions
flag is set?
And then the question would be where to fix it: When passing a variable to an external function or making sure that when instructions likesetcc
are emitted the upper parts of the destination register are cleared beforehand?
T0b1-iOS labeled issue #5645:
I have stumbled upon a rather nasty 'bug' this week.
When compiling IR like this:function u0:0(i64) -> i64 system_v { sig2 = (i64, i8) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
you will get assembly that looks like this:
pushq %rbp unwind PushFrameRegs { offset_upward_to_caller_sp: 16 } movq %rsp, %rbp unwind DefineNewFrame { offset_upward_to_caller_sp: 16, offset_downward_to_clobbers: 0 } block0: cmpq $0, %rdi setnz %sil load_ext_name userextname0+0, %r8 call *%r8 movq %rbp, %rsp popq %rbp ret
Note that the
icmp_imm
gets compiled to acmp
followed by asetcc
instruction usingsil
as a parameter.
This means that the upper bytes ofrsi
are undefined. Now whenfn2
is a C function that looks like this:void fun(uint64_t x, bool a) { do_smth(a ? x : 0); }
and you compile this with clang the assembly looks like this:
xor eax, eax test esi, esi cmove rdi, rax jmp do_smth(unsigned long)@PLT
while with gcc you get
xor eax, eax test sil, sil cmove rdi, rax jmp do_smth(unsigned long)
Note that gcc emits a
test sil, sil
instruction while clang emitstest esi, esi
meaning that it relies on the 8bit argument getting zero-extended to at least 32 bit and therefore you will get random behavior depending on whether the upper bytes ofrsi
were zero or not. (Compiler Explorer link)
According to the people at LLVM, whether arguments should be zero-extended or not is unclear in the ABI and therefore they do not consider it a bug until this issue has been resolved on the ABI level since the major compilers do zero-extend the arguments.Now my question is, do you consider this a bug in cranelift or at least an issue that should be addressed on the codegen level inside cranelift? (I have fixed this by manually checking if an argument is I8/I16 when emitting a function call and then passed that through a
uextend
/inarrow
combo)
Maybe this could be done when theenable_llvm_abi_extensions
flag is set?
And then the question would be where to fix it: When passing a variable to an external function or making sure that when instructions likesetcc
are emitted the upper parts of the destination register are cleared beforehand?
T0b1-iOS labeled issue #5645:
I have stumbled upon a rather nasty 'bug' this week.
When compiling IR like this:function u0:0(i64) -> i64 system_v { sig2 = (i64, i8) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
you will get assembly that looks like this:
pushq %rbp unwind PushFrameRegs { offset_upward_to_caller_sp: 16 } movq %rsp, %rbp unwind DefineNewFrame { offset_upward_to_caller_sp: 16, offset_downward_to_clobbers: 0 } block0: cmpq $0, %rdi setnz %sil load_ext_name userextname0+0, %r8 call *%r8 movq %rbp, %rsp popq %rbp ret
Note that the
icmp_imm
gets compiled to acmp
followed by asetcc
instruction usingsil
as a parameter.
This means that the upper bytes ofrsi
are undefined. Now whenfn2
is a C function that looks like this:void fun(uint64_t x, bool a) { do_smth(a ? x : 0); }
and you compile this with clang the assembly looks like this:
xor eax, eax test esi, esi cmove rdi, rax jmp do_smth(unsigned long)@PLT
while with gcc you get
xor eax, eax test sil, sil cmove rdi, rax jmp do_smth(unsigned long)
Note that gcc emits a
test sil, sil
instruction while clang emitstest esi, esi
meaning that it relies on the 8bit argument getting zero-extended to at least 32 bit and therefore you will get random behavior depending on whether the upper bytes ofrsi
were zero or not. (Compiler Explorer link)
According to the people at LLVM, whether arguments should be zero-extended or not is unclear in the ABI and therefore they do not consider it a bug until this issue has been resolved on the ABI level since the major compilers do zero-extend the arguments.Now my question is, do you consider this a bug in cranelift or at least an issue that should be addressed on the codegen level inside cranelift? (I have fixed this by manually checking if an argument is I8/I16 when emitting a function call and then passed that through a
uextend
/inarrow
combo)
Maybe this could be done when theenable_llvm_abi_extensions
flag is set?
And then the question would be where to fix it: When passing a variable to an external function or making sure that when instructions likesetcc
are emitted the upper parts of the destination register are cleared beforehand?
bjorn3 commented on issue #5645:
You will need to set the
zext
flag in the function signature (andsext
in case of signed ints). Cranelift provides all the building blocks for correctly handling the abi, but it is the responsibility of the frontend to correctly use those building blocks, just like in LLVM. See for example https://github.com/rust-lang/rust/blob/7919ef0ec5776c72dace7fec1c68551a617505ad/compiler/rustc_target/src/abi/call/x86_64.rs#L232 for where rustc specifies that these flags need to be set.
cfallin commented on issue #5645:
It seems the crux of the matter is how we interpret SysV ABI handling of narrow values; if according to the LLVM bug you linked, zext up to 32 bits is "de-facto the ABI already", then we could implement that, perhaps under a flag. But until/unless we decide to do that, @bjorn3's suggestion is the best: when translating C signatures to Cranelift, add
zext
toi8
ori16
values.
T0b1-iOS commented on issue #5645:
I'm not sure where to set it though since setting it on the parameter (that's what the ArgumentExtension doc suggests) like below compiles to the same assembly.
function u0:0(i64) -> i64 system_v { sig2 = (i64, i8 uext) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
T0b1-iOS edited a comment on issue #5645:
I'm not sure where to set it though since setting it on the parameter (that's what the ArgumentExtension doc suggests) like below compiles to the same assembly. (maybe it's not parsed by cranelift-reader there? Need to check.
function u0:0(i64) -> i64 system_v { sig2 = (i64, i8 uext) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
T0b1-iOS edited a comment on issue #5645:
I'm not sure where to set it though since setting it on the parameter (that's what the ArgumentExtension doc suggests) like below compiles to the same assembly. (maybe it's not parsed by cranelift-reader there? Need to check.)
function u0:0(i64) -> i64 system_v { sig2 = (i64, i8 uext) -> i64 system_v fn2 = u0:112 sig2 block0(v0: i64): v1 = icmp_imm ne v0, 0 v2 = call fn2(v0, v1) return v2 }
cfallin commented on issue #5645:
Ah, that is indeed a bug if
uext
doesn't actually work! We'll look into this.
T0b1-iOS commented on issue #5645:
I mean, the comment for ArgumentExtensions mentions
This attribute specifies how an argument or return value should be extended if the platform and ABI require it
and since the SystemV document is 'ambigous' on this issue it's technically not required so no extension needs to be done :P
The culprit seems to be
get_ext_mode
inX64ABIMachineSpec
:fn get_ext_mode( _call_conv: isa::CallConv, _specified: ir::ArgumentExtension, ) -> ir::ArgumentExtension { ir::ArgumentExtension::None }
It always ignores the specified extension. So make it honor the specified extension if the calling conv is SysV? Or only when the LLVM ABI Extensions are used...
bjorn3 commented on issue #5645:
if according to the LLVM bug you linked, zext up to 32 bits is "de-facto the ABI already", then we could implement that, perhaps under a flag.
As I understand it the de-facto ABI is to zero or sign extend depending on if the type is unsigned or signed. As clif ir doesn't distinguish between the two, explicit zext and sext are the only way to know if it should be zero or sign extended.
Last updated: Nov 22 2024 at 16:03 UTC