10 messages were moved here from #cranelift > Silly questions by Akshanabha Chakraborty.
/// Flags for AtomicCas instruction
///
/// Each of these flags introduces a limited form of undefined behavior. The flags each enable
/// certain optimizations that need to make additional assumptions. Generally, the semantics of a
/// program do not change when a flag is removed, but adding a flag can change them.
///
/// In addition, the flags determine the endianness of the memory access. By default,
/// any memory access uses the native endianness determined by the target ISA. This can
/// be overridden for individual accesses by explicitly specifying little- or big-endian
/// semantics via the flags.
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
#[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
pub struct AtomicCasMemFlags {
    // Initialized to all zeros to have all flags have their default value.
    // This is interpreted through various methods below. Currently the bits of
    // this are defined as:
    //
    // * 0/1/2/3/4/5/6/7 - trap code
    // * 8/9/10 - atomic ordering
    // * 11 - aligned flag
    // * 12 - little endian
    // * 13 - big endian
    // * 14/15 - alias region
    // * 16 (overflow) - checked bit (to be seen)
    //
    // Current properties upheld are:
    //
    // * only one of little/big endian is set
    // * only one alias region can be set - once set it cannot be changed
    bits: u32,
}
I have been able to squeeze it down to 17 bits. Are there any bits that can be shaved off from here?
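To make the 17-bit count concrete, here is a minimal sketch of how the layout in the comment above could be read and written with shifts and masks. The type `PackedFlags`, the constant names, and the helper methods are all hypothetical; they only mirror the bit positions listed in the comment and are not the actual Cranelift implementation.

```rust
// Hypothetical constants mirroring the bit layout listed in the struct comment.
const TRAP_CODE_SHIFT: u32 = 0;
const TRAP_CODE_MASK: u32 = 0xff; // bits 0..=7
const ORDERING_SHIFT: u32 = 8;
const ORDERING_MASK: u32 = 0b111; // bits 8..=10
const ALIGNED_BIT: u32 = 1 << 11;
const LITTLE_ENDIAN_BIT: u32 = 1 << 12;
const BIG_ENDIAN_BIT: u32 = 1 << 13;
const ALIAS_REGION_SHIFT: u32 = 14;
const ALIAS_REGION_MASK: u32 = 0b11; // bits 14..=15
const CHECKED_BIT: u32 = 1 << 16; // the one bit that spills past 16

/// Hypothetical stand-in for the packed flags word (not the real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedFlags {
    bits: u32,
}

impl PackedFlags {
    /// All zeros means every field has its default value.
    fn new() -> Self {
        Self { bits: 0 }
    }

    /// Read a multi-bit field.
    fn field(self, shift: u32, mask: u32) -> u32 {
        (self.bits >> shift) & mask
    }

    /// Return a copy with a multi-bit field overwritten.
    fn with_field(self, shift: u32, mask: u32, value: u32) -> Self {
        debug_assert!(value <= mask, "value does not fit in the field");
        Self {
            bits: (self.bits & !(mask << shift)) | ((value & mask) << shift),
        }
    }

    fn trap_code(self) -> u32 {
        self.field(TRAP_CODE_SHIFT, TRAP_CODE_MASK)
    }

    fn ordering(self) -> u32 {
        self.field(ORDERING_SHIFT, ORDERING_MASK)
    }

    fn alias_region(self) -> u32 {
        self.field(ALIAS_REGION_SHIFT, ALIAS_REGION_MASK)
    }

    fn checked(self) -> bool {
        self.bits & CHECKED_BIT != 0
    }

    fn with_checked(self) -> Self {
        Self { bits: self.bits | CHECKED_BIT }
    }

    /// Upholds the "only one of little/big endian" property by clearing the other bit.
    fn with_little_endian(self) -> Self {
        Self { bits: (self.bits & !BIG_ENDIAN_BIT) | LITTLE_ENDIAN_BIT }
    }
}

/// Number of bits needed to hold the highest flag in the layout.
fn bits_used() -> u32 {
    32 - CHECKED_BIT.leading_zeros()
}
```

With this layout `bits_used()` comes out to 17, which is the one-bit overshoot of a 16-bit field that the question is about.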
Do we need to merge MemFlags in as well? I'd really prefer not to lose that distinction if we can help it. We've already got the bits in AtomicRmwOp to represent the atomic ordering, which was the original goal, right?
atomic_rmw: Builder::new("AtomicRmw")
    .imm(&imm.memflags)
    .imm(&imm.atomic_rmw_data)
    .value()
    .value()
    .build(),
atomic_cas: Builder::new("AtomicCas")
    .imm(&imm.atomic_cas_memflags)
    .value()
    .value()
    .value()
    .typevar_operand(2)
    .build(),
As I found out, AtomicCas is saturated just like AtomicRmwOp because of the 3 Values it takes, so we have to merge MemFlags and AtomicOrdering as well (for Cas only). I only found this out after finishing the Rmw operation merger.
Still, I think a separate MemFlags type for AtomicCas is the best option, since 2 bits (readonly, can_move) are always set to 0 there and hence can be reused. We are exactly one bit short of packing it all in there.
@Chris Fallin I think we can somehow express it by enumerating the combinations: ordering (5 states) * alias region (3 states) * endianness (3 states) * aligned (2 states) * checked (2 states) = 180 states. That should be representable in 8 bits, which means we can pivot away from a one-bit-per-flag binary layout to a single enumerated value.
I have a proof of concept here: https://play.rust-lang.org/?version=stable&mode=release&edition=2024&gist=fe6c47600459c6f568f8e4138fb2ff44
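For readers who do not open the playground link, the counting argument can be illustrated with a mixed-radix encoding. This is a separate minimal sketch, not the linked proof of concept: the names `RADICES`, `pack`, and `unpack` are hypothetical, and the only thing taken from the message is the list of field sizes (5, 3, 3, 2, 2).

```rust
/// Number of states per field, in the order quoted in the message:
/// 5 * 3 * 3 * 2 * 2 = 180 combinations, which fits in a u8 (max 255).
const RADICES: [u8; 5] = [5, 3, 3, 2, 2];

/// Pack one value per field into a single byte.
/// Returns None if any value is out of range for its field.
fn pack(fields: [u8; 5]) -> Option<u8> {
    let mut acc: u32 = 0;
    for (&value, &radix) in fields.iter().zip(RADICES.iter()) {
        if value >= radix {
            return None;
        }
        // Shift the accumulator by this field's radix, then add the digit.
        acc = acc * u32::from(radix) + u32::from(value);
    }
    u8::try_from(acc).ok()
}

/// Recover the per-field values from a packed byte.
/// Returns None for bytes that do not correspond to any combination.
fn unpack(mut packed: u8) -> Option<[u8; 5]> {
    let total: u32 = RADICES.iter().map(|&r| u32::from(r)).product();
    if u32::from(packed) >= total {
        return None;
    }
    let mut fields = [0u8; 5];
    // Peel digits off in reverse order, last field first.
    for i in (0..RADICES.len()).rev() {
        fields[i] = packed % RADICES[i];
        packed /= RADICES[i];
    }
    Some(fields)
}
```

The trade-off is that no field can be read with a single mask any more: every access costs a division and a remainder, and changing any radix renumbers all existing encodings.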
I think that if we have to do that, then we're getting a little too complex for maintainability, especially for a feature that is a nice-to-have at this point but not essential for performance (and unlikely to have any perf impact in practice).
We definitely cannot steal more bits from MemFlags; we have plans to use more bits there in a way that could actually improve performance with more alias regions (i.e., it's actually load-bearing for Wasmtime plans).
When we do get to the point of doing code-motion to a degree that atomic orderings matter for codegen quality, I think probably the way to go for AtomicCas is to externalize the info somehow -- we can't box it (InstructionData is Copy) but we could define a new entity type or something. I don't think now is the time to build that, though.
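As a rough illustration of what "externalize the info" could look like, here is a minimal sketch of a side table addressed by a small `Copy` handle, so the instruction itself only carries an index. Every name here (`AtomicCasInfoRef`, `AtomicCasInfo`, `AtomicCasInfoPool`) and the field layout are hypothetical; nothing like this is confirmed to exist in Cranelift, and a real version would presumably build on the project's own entity machinery rather than a bare `Vec`.

```rust
/// Hypothetical handle stored in the instruction; it is Copy, so the
/// instruction data can stay Copy too.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct AtomicCasInfoRef(u32);

/// Hypothetical out-of-line payload: whatever does not fit in the instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AtomicCasInfo {
    memflags_bits: u16,
    ordering: u8,
}

/// Hypothetical per-function side table that owns the payloads.
#[derive(Default)]
struct AtomicCasInfoPool {
    entries: Vec<AtomicCasInfo>,
}

impl AtomicCasInfoPool {
    /// Store a payload and hand back the index that refers to it.
    fn push(&mut self, info: AtomicCasInfo) -> AtomicCasInfoRef {
        let index = u32::try_from(self.entries.len()).expect("too many entries");
        self.entries.push(info);
        AtomicCasInfoRef(index)
    }

    /// Look up the payload for a handle, if the handle is valid for this pool.
    fn get(&self, r: AtomicCasInfoRef) -> Option<&AtomicCasInfo> {
        self.entries.get(r.0 as usize)
    }
}
```

The cost of this design is an extra indirection on every lookup and the need to keep the pool alongside the function, which matches the point that it is not worth building yet.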
So I guess what I'm coming to is: this exploration has been valuable (thank you!) but maybe it's not the right time to do it
Exactly, you are right. I shall still record my findings for others.
Last updated: Mar 23 2026 at 16:19 UTC