wasmtime / issue #5423 aarch64: Use unsigned constants wh... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #5423 aarch64: Use unsigned constants wh...

Wasmtime GitHub notifications bot (Dec 13 2022 at 02:17):

The current codegen suppresses some useful Rust compiler diagnostics by emitting expressions like 18446744073709551615i128 as u64. Is that constant being silently truncated to 64 bits? If it were written as just 18446744073709551615 and used in a context where its type is inferred to be u64, then Rust would tell us if the constant was nonsense. For anyone reading the generated code, emitting 0xffffffffffffffff instead would help answer that question. But we don't generally expect people to do that, so getting the Rust compiler to check for us is better.

I can confirm we don't have any constants that are getting silently truncated right now. As you guessed, I didn't implement the extra type-casting in my new codegen, and even then Rust didn't complain about the magnitude of constants in current backends. It wouldn't be hard to emit the same kind of casts that ISLE currently uses, but I think it's worth having a compile-time check for this kind of mistake.

This PR is because in addition to checking for too-big constants, Rust also rejects negation on unsigned constants. I don't think that's so useful to check at compile time, but it doesn't look like it's a feature that's been used much at all in backends, so I figured it wouldn't be a big deal to lose it.

I did initially write this patch with 0xFFFF_FFFF_FFFF_FFFF and 0xFFFF_FF80 and so on. I liked that writing u64_sub(0, 128) instead makes explicit that the result is supposed to be a u64. But there are other ways to fix this.

signed_min and signed_max are used exactly once each, like (imm $I32 (ImmExtend.Sign) (signed_max out_ty)). How would you feel about pushing the call to imm inside signed_min, so the sign-extension becomes part of constructing the immediate?
(rule (signed_min $I8) (imm $I8 (ImmExtend.Sign) 0x80))
(rule (signed_min $I16) (imm $I16 (ImmExtend.Sign) 0x8000))
Similarly, the immediate constructed from -1 could be something like:
(imm $I8 (ImmExtend.Sign) 0xFF)

Wasmtime GitHub notifications bot (Dec 13 2022 at 03:28):

cfallin commented on issue #5423:

That all makes sense, thanks -- I agree that finding a way to get compile-time checks on constant magnitudes would be ideal. re: this:

This PR is because in addition to checking for too-big constants, Rust also rejects negation on unsigned constants. I don't think that's so useful to check at compile time, but it doesn't look like it's a feature that's been used much at all in backends, so I figured it wouldn't be a big deal to lose it.

I actually do think that allowing negative constants is important, and papering over their lack by defining certain external constructors is not ideal -- do we do that each time we need another negative value? The fact that we do use negative constants in a few places now is evidence to me that they're useful, and they may come up in other situations later as well...

I wonder whether another approach might not be better: why not give ISLE types a notion of bit-width, and check for out-of-bounds literals at the ISLE level? That seems strictly better in another way: it lifts the check to an explicitly-defined one in our DSL, rather than a side-effect of our target language. So something like
(type u64 (primitive u64 integer 64))
then when lowering a ConstInt to sema (at which point we know the type), we can check that (i) the expected type is actually an integer primitive, and not some other primitive; and (ii) it's within range. Then we can decide what the acceptable range is: e.g., we could say that we accept -2^(bits-1) .. 2^bits, or the union of the range of signed and unsigned values of the given width. Thoughts?

Wasmtime GitHub notifications bot (Dec 13 2022 at 03:41):

cfallin commented on issue #5423:

e.g., we could say that we accept -2^(bits-1) .. 2^bits, or the union of the range of signed and unsigned values of the given width.

although now that I've written this out, I'm starting to dislike it more; the asymmetry just feels weird and wrong.

Alternative: lift a bit more of the integer type system into ISLE. Namely
(type u64 (primitive u64 integer 64 unsigned))
(type i64 (primitive i64 integer 64 signed))
such that in a u64 context we accept literals in 0..2^64 and in an i64 context we accept -2^63..2^63. Then we also add a builtin
(imm $I64 (u64 -1:i64))  ;; sketching some syntax, please bikeshed something better
where u64 used in term position is a cast (compiles to as u64, but also works with typechecking at ISLE level) and :i64 on a constant sets the type of the constant.

This is all a bit uncertain but I feel there's a better system for constants, and for integer types, buried in here somewhere...

Wasmtime GitHub notifications bot (Dec 13 2022 at 03:42):

cfallin edited a comment on issue #5423:

e.g., we could say that we accept -2^(bits-1) .. 2^bits, or the union of the range of signed and unsigned values of the given width.

although now that I've written this out, I'm starting to dislike it more; the asymmetry just feels weird and wrong.

Alternative: lift a bit more of the integer type system into ISLE. Namely
(type u64 (primitive u64 integer 64 unsigned))
(type i64 (primitive i64 integer 64 signed))
such that in a u64 context we accept literals in 0..2^64 and in an i64 context we accept -2^63..2^63. Then we also add some syntax
(imm $I64 (u64 -1:i64))  ;; sketching some syntax, please bikeshed something better
where u64 used in term position is a cast (compiles to as u64, but also works with typechecking at ISLE level) and :i64 on a constant sets the type of the constant.

This is all a bit uncertain but I feel there's a better system for constants, and for integer types, buried in here somewhere...

Wasmtime GitHub notifications bot (Dec 13 2022 at 21:25):

jameysharp commented on issue #5423:

I think it'd be great to give ISLE a little more knowledge about primitive numeric types. My codegen prototype is currently checking whether the name of an integer constant's type starts with "i" to decide how to treat negative constants, which certainly isn't ideal.

But also, I don't want to build that out right now. What do you think of the alternative patch I've just pushed here?

Last updated: Apr 18 2025 at 01:31 UTC