fitzgen opened PR #13193 from fitzgen:endian-newtypes to bytecodealliance:main:
Split out from https://github.com/bytecodealliance/wasmtime/pull/13107 while debugging s390x issues.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
fitzgen requested alexcrichton for a review on PR #13193.
fitzgen requested wasmtime-compiler-reviewers for a review on PR #13193.
fitzgen requested wasmtime-core-reviewers for a review on PR #13193.
cfallin submitted PR review:
Thanks! The overall goal seems pretty reasonable here -- some thoughts below.
cfallin created PR review comment:
Thanks for the newtype-correctness efforts here!
I'm finding the naming/conceptual definition a little confusing, to be honest, because (in my head) the integer types themselves have an abstract domain (
0..2^width) and their byte representations have endianness. This aligns with e.g. the builtinuNN::to_le_bytes()-- the LE representation is a sequence of bytes, not auNN.So to that end, what does it mean when we have a
Le(1234)? Does that mean the "actual value" is1234but the value is stored in memory as either1234(little-endian host) orbswap(1234)(bit-endian host)? Or vice-versa, the value is stored in memory however the host does and we interpret it as LE ("1234-native-endian, interpreted as LE")?I think I might have an easier time if we rename things a bit -- or failing that (since these types are very pervasive), document better. Suggestions:
ExplicitLittleEndian(n)andHostOrdered(n)- store
[u8; N]in the newtypes and make these traits wrappers around{from,to}_{le,ne}_bytes- something else I can't think of right now
I kind of favor the second, assuming that LLVM can see through it and not pessimize -- Godbolt example to demonstrate that I think it should: link
cfallin created PR review comment:
FWIW the
is_littlename was a bit confusing for me at first becauseis_little = falsesounds (naively) like "big endian"; it's not, it's "native endian", which could be actually big or little depending on host.Maybe
is_explicit_little?
cfallin created PR review comment:
asyncness change unrelated to endianness?
cfallin created PR review comment:
These seem like good fixes but possibly unrelated to endianness?
cfallin created PR review comment:
Lots of commented-out lines in testsuite lines here and below -- revert?
cfallin created PR review comment:
Unrelated to endianness?
cfallin created PR review comment:
(After writing this comment, see more general thoughts above on the confusion)
cfallin edited PR review comment.
cfallin edited PR review comment.
cfallin edited PR review comment.
github-actions[bot] added the label wasmtime:api on PR #13193.
github-actions[bot] added the label wasmtime:ref-types on PR #13193.
github-actions[bot] commented on PR #13193:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"Thus the following users have been cc'd because of the following labels:
- fitzgen: wasmtime:ref-types
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
:memo: alexcrichton submitted PR review:
While I don't doubt that we could do better internally abuot endianness and conversions I'm personally relatively skeptical of the approach taken here. I share @cfallin's concerns as well with code like
let exnref = Le::from_ne(exnref);I find pretty confusing. Additionally I'm not sure I understand theValRawchanges, for example, because it should be the case that by definition everything inValRawis little-endian and the accessors shouldn't need endianness annotated on them.Was this able to help track down the issue you're running into? If so could you expand on that a bit? I'm curious if that can help calibrate my thinking about things here.
fitzgen commented on PR #13193:
Additionally I'm not sure I understand the
ValRawchanges, for example, because it should be the case that by definition everything inValRawis little-endian and the accessors shouldn't need endianness annotated on them.Yes, and this doesn't change any of that, it just adds
pub(crate)fn foo_le(&self) -> Le<u32>getters toValRawto get the value wrapped in theLenewtype, formalizing the invariant thatValType` things are stored as little-endian in the type system.Was this able to help track down the issue you're running into? If so could you expand on that a bit? I'm curious if that can help calibrate my thinking about things here.
The goal was to formalize in the type system where we have guaranteed-little-endian-regardless-of-native-endian integers and where we have native-endian integers and make it a compilation error to use the wrong endian in the wrong place. This limits the scope of where endianness bugs can pop up: only in JIT code or in the initial construction of an
Le/Ne(where you are effectively promising the given raw integer is in the claimed endianness).This helped me find a couple bugs inside the copying collector branch.
:memo: fitzgen submitted PR review.
:speech_balloon: fitzgen created PR review comment:
store
[u8; N]in the newtypes and make these traits wrappers around{from,to}_{le,ne}_bytesThis sounds fine to me
:memo: fitzgen submitted PR review.
:speech_balloon: fitzgen created PR review comment:
Whoops, couple things from the copying collector branch snuck in here, sorry.
fitzgen commented on PR #13193:
I will make
ValRawinternally have[u8; 4]for GC refs as well.
fitzgen edited a comment on PR #13193:
Additionally I'm not sure I understand the
ValRawchanges, for example, because it should be the case that by definition everything inValRawis little-endian and the accessors shouldn't need endianness annotated on them.Yes, and this doesn't change any of that, it just adds
pub(crate) fn foo_le(&self) -> Le<u32>getters toValRawto get the value wrapped in theLenewtype, formalizing the invariant thatValTypethings are stored as little-endian in the type system.Was this able to help track down the issue you're running into? If so could you expand on that a bit? I'm curious if that can help calibrate my thinking about things here.
The goal was to formalize in the type system where we have guaranteed-little-endian-regardless-of-native-endian integers and where we have native-endian integers and make it a compilation error to use the wrong endian in the wrong place. This limits the scope of where endianness bugs can pop up: only in JIT code or in the initial construction of an
Le/Ne(where you are effectively promising the given raw integer is in the claimed endianness).This helped me find a couple bugs inside the copying collector branch.
alexcrichton commented on PR #13193:
Personally event a small adjustment on this patch is not something I'd like to see landed, so I'd like to game this out more first and better understand what's happening here. For example I don't think we should use
[u8; N]throughout Wasmtime as that seems like it's highly likely to fall of LLVM's happy path and become pretty heavily deoptimized in some cases. I also don't personally see much value inNesince the default, in theory, is "native endian everywhere unless otherwise marked". Personally I also don't see much value in pervasively usingLethroughout the runtime as parameters/arguments/etc. I only think that this would make sense within type definitions to clarify "at rest this is little-endian unconditionally". As params/results/etc this seems like pure noise to me.The goal was to formalize in the type system where we have guaranteed-little-endian-regardless-of-native-endian integers
I think this is a reasonable goal, but I'm not sure why we would need these wrappers outside of type definitions to achieve this. I also think it's reasonable to have everywhere be native-endian-unless-documented-otherwise where today documentation is
// fooand tomorrow it could beLe<T>which would be nicer to have.
alexcrichton edited a comment on PR #13193:
Personally even a small adjustment on this patch is not something I'd like to see landed, so I'd like to game this out more first and better understand what's happening here. For example I don't think we should use
[u8; N]throughout Wasmtime as that seems like it's highly likely to fall of LLVM's happy path and become pretty heavily deoptimized in some cases. I also don't personally see much value inNesince the default, in theory, is "native endian everywhere unless otherwise marked". Personally I also don't see much value in pervasively usingLethroughout the runtime as parameters/arguments/etc. I only think that this would make sense within type definitions to clarify "at rest this is little-endian unconditionally". As params/results/etc this seems like pure noise to me.The goal was to formalize in the type system where we have guaranteed-little-endian-regardless-of-native-endian integers
I think this is a reasonable goal, but I'm not sure why we would need these wrappers outside of type definitions to achieve this. I also think it's reasonable to have everywhere be native-endian-unless-documented-otherwise where today documentation is
// fooand tomorrow it could beLe<T>which would be nicer to have.
Last updated: May 03 2026 at 22:13 UTC