Stream: git-wasmtime

Topic: wasmtime / Issue #2562 Cranelift: failure to coalesce loads


Wasmtime GitHub notifications bot (Jan 09 2021 at 00:20):

abrown opened Issue #2562:

In discussions in #2489, I found that one of the issues I was seeing was related to load coalescing in the new x64 backend--i.e., the load should have been coalesced into a subsequent instruction but was not. The x64 backend does correctly coalesce the following load + scalar_to_vector to a single movss 0(%rdi), %xmm0:

function %load32_zero(i64) -> i32x4 {
block0(v0: i64):
    v1 = load.i32 v0
    v2 = scalar_to_vector.i32x4 v1
    return v2
}

But when I add a block, the load is no longer coalesced. As @cfallin explained regarding the code here, Cranelift uses colors to ensure that effectful instructions are not coalesced across block boundaries (each boundary conservatively increments the color). But in the following example the load and scalar_to_vector sit in the same block, so coalescing them would not cross a boundary; it seems to me that this _should_ coalesce:

function %load32_zero(i64) -> i32x4 {
block0(v0: i64):
    v1 = load.i32 v0
    v2 = scalar_to_vector.i32x4 v1
    jump block1
block1:
    return v2
}
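
To make the color rule above concrete, here is a minimal sketch in Rust. It is a simplified model of the rule as described in this issue, not Cranelift's actual lowering code: the `Op`, `Colors`, `assign_colors`, and `can_sink_load` names are hypothetical, and the model assumes only loads and stores are effectful. The color is bumped at every block boundary and after every effectful instruction, and a load may be folded into a consumer only when the load's exit color equals the consumer's entry color.

```rust
// Hypothetical, simplified instruction set used only for this illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Op {
    Load,
    Store,
    ScalarToVector,
    Jump,
    Return,
}

impl Op {
    // Assumption for this sketch: only loads and stores count as effectful.
    fn has_side_effect(self) -> bool {
        matches!(self, Op::Load | Op::Store)
    }
}

// Color before and after one instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Colors {
    entry: u32,
    exit: u32,
}

// Walk the blocks in order. The color is incremented at every block
// boundary (the conservative step) and after every effectful instruction.
fn assign_colors(blocks: &[Vec<Op>]) -> Vec<Vec<Colors>> {
    let mut color = 0u32;
    let mut out = Vec::with_capacity(blocks.len());
    for block in blocks {
        color += 1; // block boundary
        let mut block_colors = Vec::with_capacity(block.len());
        for &op in block {
            let entry = color;
            if op.has_side_effect() {
                color += 1;
            }
            block_colors.push(Colors { entry, exit: color });
        }
        out.push(block_colors);
    }
    out
}

// A load may be folded into its consumer only if nothing changed the color
// in between. Positions are (block index, instruction index).
fn can_sink_load(
    colors: &[Vec<Colors>],
    load: (usize, usize),
    consumer: (usize, usize),
) -> bool {
    colors[load.0][load.1].exit == colors[consumer.0][consumer.1].entry
}
```

Under this model, the two-block example above still allows the merge, because the load and scalar_to_vector share a block and nothing effectful sits between them. The merge is rejected only when a store intervenes or the consumer is in a later block.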

Wasmtime GitHub notifications bot (Jan 09 2021 at 00:20):

abrown labeled Issue #2562.

Wasmtime GitHub notifications bot (Jan 09 2021 at 00:20):

abrown commented on Issue #2562:

cc: @bnjbvr, @julian-seward1

Wasmtime GitHub notifications bot (Jan 11 2021 at 20:06):

cfallin closed Issue #2562.


Last updated: Jan 24 2025 at 00:11 UTC