abrown opened Issue #2562:
In discussions in #2489, I found that one of the issues I was seeing was related to load coalescing in the new x64 backend--i.e., the load should have been coalesced into a subsequent instruction but was not. The x64 backend does correctly coalesce the following `load + scalar_to_vector` to a single `movss 0(%rdi), %xmm0`:

    function %load32_zero(i64) -> i32x4 {
    block0(v0: i64):
        v1 = load.i32 v0
        v2 = scalar_to_vector.i32x4 v1
        return v2
    }

But when I add a block, the load is no longer coalesced. As @cfallin explained re: the code here, Cranelift ensures (using colors) that effectful instructions cannot be coalesced across block boundaries (which conservatively increment the color). But in the following example the `load + scalar_to_vector` is not coalesced across a boundary--it seems to me that this _should_ coalesce:

    function %load32_zero(i64) -> i32x4 {
    block0(v0: i64):
        v1 = load.i32 v0
        v2 = scalar_to_vector.i32x4 v1
        jump block1

    block1:
        return v2
    }
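
To make the coloring rule described above concrete, here is a minimal toy model in Rust. It is a sketch under stated assumptions, not Cranelift's actual lowering code: the names `Inst`, `assign_colors`, and `can_sink_load` are hypothetical, only the load is treated as effectful, and the color is assumed to be bumped once at every block boundary and once after every effectful instruction. Under those assumptions a load may be merged into a consumer only when the color leaving the load equals the color entering the consumer.

```rust
// Toy model of the "color" idea; NOT Cranelift's real implementation.
// Every name below is hypothetical and exists only for illustration.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Inst {
    Load,           // assumed effectful (reads memory, may trap)
    ScalarToVector, // assumed pure
    Jump,           // block terminator, treated as pure in this toy
    Return,         // block terminator, treated as pure in this toy
}

impl Inst {
    fn is_effectful(self) -> bool {
        matches!(self, Inst::Load)
    }
}

/// For each instruction, compute its (entry_color, exit_color).
/// The color is bumped at every block boundary (the conservative step)
/// and after every effectful instruction.
pub fn assign_colors(blocks: &[Vec<Inst>]) -> Vec<Vec<(u32, u32)>> {
    let mut color = 0u32;
    let mut out = Vec::with_capacity(blocks.len());
    for block in blocks {
        color += 1; // conservative bump at the block boundary
        let mut block_colors = Vec::with_capacity(block.len());
        for &inst in block {
            let entry = color;
            if inst.is_effectful() {
                color += 1;
            }
            block_colors.push((entry, color));
        }
        out.push(block_colors);
    }
    out
}

/// A load at `load` may be merged into the instruction at `consumer`
/// only if nothing changed the color in between. Positions are
/// (block index, instruction index).
pub fn can_sink_load(
    colors: &[Vec<(u32, u32)>],
    load: (usize, usize),
    consumer: (usize, usize),
) -> bool {
    let load_exit = colors[load.0][load.1].1;
    let consumer_entry = colors[consumer.0][consumer.1].0;
    load_exit == consumer_entry
}
```

In this toy model, both examples above give the `load` and the `scalar_to_vector` matching colors, because the two instructions sit next to each other in `block0`; only a consumer placed in `block1` would see a different color. That matches the expectation stated in the issue that the second example should still coalesce, though it says nothing about why the real backend did not.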
abrown labeled Issue #2562.
abrown labeled Issue #2562.
abrown commented on Issue #2562:
cc: @bnjbvr, @julian-seward1
cfallin closed Issue #2562.
Last updated: Jan 24 2025 at 00:11 UTC