playXE opened Issue #1598:
Add Volatile flag to MemFlags.
Benefit
When an optimization pass knows whether a load or store is volatile, many more optimizations become possible. One of them is memory-to-register promotion. I tried to implement a mem2reg pass here: cranelift-mem2reg. The pass works, but in some cases it fails, e.g. on this simple C code:
```c
// compiled with rcc
typedef struct Foo {
    int x;
    int y;
} Foo;

int main() {
    Foo foo;
    foo.y = 42;
}
```

Message:

```
definition error: Compilation error: Verifier errors
note: while compiling function u0:0() -> i32 system_v {
    ss0 = explicit_slot 8

block0:
    v1 = iconst.i64 4
    v2 = iadd.i64 v0, v1
    v3 = iconst.i32 42
    store v3, v2
    v4 = iconst.i32 0
    return v4
}
```

If loads and stores can be marked as volatile, mem2reg will know to leave the marked accesses alone and will not promote `struct Foo` into registers when it must not.
A volatile flag can also help SRoA, where it is possible to replace a stack allocation with a single 32- or 64-bit register.

Implementation
Just add an additional enum case in `codegen/ir/memflags.rs` and add three simple functions to `MemFlags`: `volatile() -> Self`, `is_volatile() -> bool`, and `set_volatile(&mut self)`.
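A rough sketch of how that API could look (illustrative only: the internal representation and bit position below are invented, not Cranelift's actual `MemFlags` layout):

```rust
/// Illustrative sketch only; Cranelift's real `MemFlags` lives in
/// codegen/ir/memflags.rs and its bit layout differs.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct MemFlags {
    bits: u8,
}

/// Hypothetical bit reserved for the volatile flag.
const BIT_VOLATILE: u8 = 0b0000_0100;

impl MemFlags {
    /// Flags with only `volatile` set.
    pub fn volatile() -> Self {
        let mut flags = Self::default();
        flags.set_volatile();
        flags
    }

    /// Test whether the `volatile` flag is set.
    pub fn is_volatile(self) -> bool {
        self.bits & BIT_VOLATILE != 0
    }

    /// Mark this access as volatile.
    pub fn set_volatile(&mut self) {
        self.bits |= BIT_VOLATILE;
    }
}
```

A pass like mem2reg would then simply skip any access whose flags report `is_volatile()`.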
Alternatives
I do not see any alternatives.
playXE edited Issue #1598:
UPD: This might be useful for this issue: Support volatile store/loads
bjorn3 commented on Issue #1598:
FYI I implemented something very similar to your mem2reg in cg_clif: https://github.com/bjorn3/rustc_codegen_cranelift/blob/169140506366d5aa96967a71b6af98563a9c2988/src/optimize/stack2reg.rs
> UPD: This might be useful for this issue: Support volatile store/loads
Implementing that issue would almost definitely be done by adding a volatile flag to `MemFlags`, as this issue suggests.
playXE commented on Issue #1598:
@bjorn3 your implementation looks much better than mine, but the issue still remains: it is impossible to do a mem2reg pass safely without volatile loads and stores :(
bjorn3 commented on Issue #1598:
I have been thinking about whether we should make `MemFlags` some opaque number, just like `SourceLoc`, and then pass a `MemoryModel` to the compile function that uses the number to determine what kinds of optimizations are allowed around `load` and `store`. That `MemoryModel` is then responsible for aliasing, volatility, atomic orderings, ...

Another similar way would be to pre-compute all memory dependencies and store that in the function, like @sunfishcode suggested in https://youtu.be/9OIA7DTFQWU beginning at 36:40.
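To make the shape of that idea concrete, here is a purely hypothetical sketch; neither the trait nor these names exist in Cranelift, and the opaque tag type is invented for illustration:

```rust
/// Hypothetical: an opaque tag carried by each load/store, analogous
/// to how `SourceLoc` is just an opaque number.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct MemTag(u32);

/// Hypothetical embedder-provided memory model. The optimizer would
/// consult it instead of interpreting flag bits itself, so a C model,
/// a Java model, or stacked borrows could be swapped in freely.
pub trait MemoryModel {
    /// May an access tagged `a` be reordered past an access tagged `b`?
    fn may_reorder(&self, a: MemTag, b: MemTag) -> bool;

    /// May an access tagged `tag` be deleted if its result is unused
    /// (false for anything volatile-like)?
    fn may_eliminate(&self, tag: MemTag) -> bool;
}
```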
tschneidereit labeled Issue #1598:
Add Volatile flag to MemFlags.
tschneidereit labeled Issue #1598:
Add Volatile flag to MemFlags.
github-actions[bot] commented on Issue #1598:
Subscribe to Label Action
cc @bnjbvr
This issue or pull request has been labeled: "cranelift"

Thus the following users have been cc'd because of the following labels:

- bnjbvr: cranelift

To subscribe or unsubscribe from this label, edit the `.github/subscribe-to-label.json` configuration file.
sunfishcode commented on Issue #1598:
Giving frontends a way to provide explicit memory dependencies, and using that to implement volatile semantics, would be cool. If anyone's interested in designing a system to do this, it'd be good to start with a design document so we can look at how it would impact compile times in fast-compilation use cases, and the other tradeoffs involved.
We may want to do something simpler for now that just addresses volatile, and I have a possible alternative. Traditional volatile flags on regular memory accesses turn out to be very error-prone. The recommended solution is to transform volatile into a call or a separate opcode, and optionally inline the call after all optimizations. It's nice how that systematically eliminates a common class of volatile bugs; however, it does add complexity in the backend. So instead, how about the following design:
Add two new instructions to the IR:

- `y = volatile_unary(x)` - observe the value of `x`, do side effects, and return a new value.
- `volatile_rmw(p)` - observe the value of `*p`, do side effects, and then clobber `*p`.

At a very late stage, `volatile_unary` would be translated as if it were an identity operation, and `volatile_rmw` would be translated as if it were a no-op. With those, frontends could then translate a volatile load like this:

```
volatile_rmw(p)
x0 = load(p)
x = volatile_unary(x0)
```

and a volatile store like this:

```
x0 = volatile_unary(x)
store(p, x0)
volatile_rmw(p)
```

That is, we'd use plain load and store instructions, but boxed in with volatile operations, both in control flow and data flow, to prevent any code motion or redundancy elimination. It'll get chatty, especially in the volatile-atomic case, but volatile is rare, so it seems like an ok tradeoff. This should let most of the compiler automatically do the right thing, at low cost in overall complexity. And, `volatile_rmw` and `volatile_unary` would be handy for crafting compiler unit tests besides :-).
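As a toy sketch of that late-stage translation, under the assumption of the two instructions described above (none of the types or names here are real Cranelift API):

```rust
/// Toy IR for illustration; not Cranelift's data structures.
enum Inst {
    /// y = volatile_unary(x): at the late stage, becomes an identity.
    VolatileUnary { result: u32, arg: u32 },
    /// volatile_rmw(p): at the late stage, becomes a no-op.
    VolatileRmw { ptr: u32 },
    /// Stand-in for every other instruction.
    Other,
}

/// Late-stage lowering: drop the volatile markers, remembering which
/// result values must be renamed to their arguments afterwards.
fn lower_volatile(body: &mut Vec<Inst>) -> Vec<(u32, u32)> {
    let mut renames = Vec::new();
    body.retain(|inst| match inst {
        Inst::VolatileUnary { result, arg } => {
            renames.push((*result, *arg)); // identity: result := arg
            false
        }
        Inst::VolatileRmw { .. } => false, // no-op: just delete
        Inst::Other => true,
    });
    renames // a real pass would rewrite all uses accordingly
}
```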
jyn514 commented on Issue #1598:
> The recommended solution is to transform volatile into a call or separate opcode, and optionally inline the call after all optimizations.
Sort of like a memory fence? It marks this load/store as unable to be removed? That sounds reasonable to me.
I can't quite run csmith yet, but as soon as I do I'm happy to test out the combination of mem2reg and volatile to make sure there's no misoptimization.
> make `MemFlags` some opaque number just like `SourceLoc` and then pass a `MemoryModel` to the compile function
I don't know that I like the idea of more indirection. If Cranelift uses opaque IDs for these flags, it makes the intended purpose much harder to find: it's in an entirely separate part of the documentation. I also don't see the benefit over normal `MemFlags`.
bjorn3 commented on Issue #1598:
> I also don't see the benefit over normal `MemFlags`.
The benefit is that the `MemoryModel` can assign arbitrary meanings to it, like "this load can only alias with these other loads", to prevent a stacked-borrows violation. My proposal is that all properties of loads and stores are handled by the `MemoryModel`, so that you can easily switch between a C memory model, a Java memory model, stacked borrows, and any other memory model without changing Cranelift itself.

Also note that I didn't mean to assign a property like `volatile` to a specific number, but instead to let the `MemoryModel` store a map between the number and any relevant properties.
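For illustration, that side-table idea might look like this (hypothetical names again; `AccessProps` and `ExampleMemoryModel` are invented for the sketch):

```rust
use std::collections::HashMap;

/// Hypothetical properties the model might track per opaque number.
#[derive(Clone, Copy, Default)]
struct AccessProps {
    volatile: bool,
    alias_class: Option<u32>, // e.g. a stacked-borrows tag
}

/// Hypothetical model: the opaque number on each load/store is just a
/// key into this map; the meaning lives entirely outside Cranelift.
#[derive(Default)]
struct ExampleMemoryModel {
    props: HashMap<u32, AccessProps>,
}

impl ExampleMemoryModel {
    fn props_of(&self, flags: u32) -> AccessProps {
        self.props.get(&flags).copied().unwrap_or_default()
    }
}
```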