uweigand opened PR #2488 from endian-memory-v4
to main
:
WebAssembly memory operations are by definition little-endian even on
big-endian target platforms. However, other memory accesses will require
native target endianness (e.g. to access parts of the VMContext that is
also accessed by VM native code). This means on big-endian targets,
the code generator will have to handle both little- and big-endian
memory accesses. However, there is currently no way to encode that
distinction into the Cranelift IR that describes memory accesses.This patch provides such a way by adding an explicit endianness marker
to each instance of MemFlags. Since each Cranelift IR instruction that
describes memory accesses already has an instance of MemFlags attached,
that will now automatically provide endianness information.Beyond the core mechanics of adding the endianness marker to MemFlags,
this patch also updates all creators of MemFlags instances. In order
to ensure each MemFlags has the correct endianness marker, it is now
mandatory to pass such a marker whenever constructing a MemFlags.This in turn requires all places constructing a MemFlags (except for
the WebAssembly translator) to pass in the native target endianness.
This can be determined from the TargetIsa where that is readily
available; front-end code wil retrieve the target endianness from a new
member of the TargetFrontendConfig structure filled by the back end.In addition, the GlobalValueData::Load structure was changed to hold
a full MemFlags instead of just a "readonly" flag, which not only
simplifies related code a bit but now allows passing through of
endianness information as well.Finally, IR parsing and debug printing routines were updated to
handle the new endianness marker; this in turn required updating
Cranelift "filetests" test cases that contain memory instructions.This patch addresses issue #2124.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
uweigand updated PR #2488 from endian-memory-v4
to main
:
WebAssembly memory operations are by definition little-endian even on
big-endian target platforms. However, other memory accesses will require
native target endianness (e.g. to access parts of the VMContext that is
also accessed by VM native code). This means on big-endian targets,
the code generator will have to handle both little- and big-endian
memory accesses. However, there is currently no way to encode that
distinction into the Cranelift IR that describes memory accesses.This patch provides such a way by adding an explicit endianness marker
to each instance of MemFlags. Since each Cranelift IR instruction that
describes memory accesses already has an instance of MemFlags attached,
that will now automatically provide endianness information.Beyond the core mechanics of adding the endianness marker to MemFlags,
this patch also updates all creators of MemFlags instances. In order
to ensure each MemFlags has the correct endianness marker, it is now
mandatory to pass such a marker whenever constructing a MemFlags.This in turn requires all places constructing a MemFlags (except for
the WebAssembly translator) to pass in the native target endianness.
This can be determined from the TargetIsa where that is readily
available; front-end code wil retrieve the target endianness from a new
member of the TargetFrontendConfig structure filled by the back end.In addition, the GlobalValueData::Load structure was changed to hold
a full MemFlags instead of just a "readonly" flag, which not only
simplifies related code a bit but now allows passing through of
endianness information as well.Finally, IR parsing and debug printing routines were updated to
handle the new endianness marker; this in turn required updating
Cranelift "filetests" test cases that contain memory instructions.This patch addresses issue #2124.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
bjorn3 created PR Review Comment:
I think using the native endian by default would be nicer. Except for WASM you almost always want to use the native endian.
bjorn3 submitted PR Review.
uweigand submitted PR Review.
uweigand created PR Review Comment:
But where would you get that? It depends on the target.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
An empty
MemFlags
would mean native endian, but you could add a flag to override it to either little endian or big endian. A little endian target will then look if the big endian flag exists and use little endian otherwise. A big endian target will look if the little endian flag exists and use big endian otherwise. You could add a method toMemFlags
that accepts the native endian and computes the real endian of the memory instruction to avoid duplication of this logic.
uweigand submitted PR Review.
uweigand created PR Review Comment:
Also, the point of requiring every MemFlags to carry explicit endianness was to make it impossible to accidentally "lose" endianness e.g. during IR transformations. This is of course a decision that can be made either way: is it more important to try and prevent this type of accidental bug, at the cost of added verbosity, or the other way round?
uweigand updated PR #2488 from endian-memory-v4
to main
:
WebAssembly memory operations are by definition little-endian even on
big-endian target platforms. However, other memory accesses will require
native target endianness (e.g. to access parts of the VMContext that is
also accessed by VM native code). This means on big-endian targets,
the code generator will have to handle both little- and big-endian
memory accesses. However, there is currently no way to encode that
distinction into the Cranelift IR that describes memory accesses.This patch provides such a way by adding an explicit endianness marker
to each instance of MemFlags. Since each Cranelift IR instruction that
describes memory accesses already has an instance of MemFlags attached,
that will now automatically provide endianness information.Beyond the core mechanics of adding the endianness marker to MemFlags,
this patch also updates all creators of MemFlags instances. In order
to ensure each MemFlags has the correct endianness marker, it is now
mandatory to pass such a marker whenever constructing a MemFlags.This in turn requires all places constructing a MemFlags (except for
the WebAssembly translator) to pass in the native target endianness.
This can be determined from the TargetIsa where that is readily
available; front-end code wil retrieve the target endianness from a new
member of the TargetFrontendConfig structure filled by the back end.In addition, the GlobalValueData::Load structure was changed to hold
a full MemFlags instead of just a "readonly" flag, which not only
simplifies related code a bit but now allows passing through of
endianness information as well.Finally, IR parsing and debug printing routines were updated to
handle the new endianness marker; this in turn required updating
Cranelift "filetests" test cases that contain memory instructions.This patch addresses issue #2124.
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin submitted PR Review.
cfallin created PR Review Comment:
I think this is a case where we want to prefer explicitness: otherwise it's too easy to introduce assumptions that break only on big-endian systems. Hopefully we eventually have such a system on CI, but it's still better to lift the safety to the API/type level.
cfallin submitted PR Review.
cfallin submitted PR Review.
cfallin created PR Review Comment:
If we have doc-comments here, I think we should start with what the flag means then what the consequences (undefined behavior if you violate the implication of the flag) are. So something like:
- notrap: Program promises that the access will not trap; if it does, the result is undefined (usually a fatal signal). Causes backends to omit trap metadata for resulting access. Set by
MemFlags::trusted()
, and in a virtual-machine / sandboxing context, usually used for "trusted" loads to VM data structures, i.e., addresses not controlled by untrusted guest code.- aligned: Program promises that the access is aligned to the natural alignment for the element type. If not, result is undefined (may result in traps on some machines).
- readonly: Program promises that the access is to readonly data. If used for a store, or if the underlying memory content is changed, then results are undefined. In practice, allows the compiler to assume that one loaded value can be reused and assumed to be up-to-date.
- bigendian: (your comment)
cfallin created PR Review Comment:
Outdated doc-comment? In principle, any of the
MemFlags
bits could apply to a global value load, not justreadonly
, I think.
cfallin created PR Review Comment:
Also, to add to above: it might be nice to add some rationale here; surely some readers will be confused by this, as it feels a bit odd coming from e.g. LLVM. We should state that this arises from the goal to have a completely machine-independent IR, whose meaning doesn't change across targets; in contrast, other IRs often are generated with target information already in mind. The client that generates IR may wish to consult target info to optimally choose "native" endian for cases where it doesn't matter, but it's better to be explicit about this in all cases.
cfallin edited PR Review Comment.
cfallin closed without merge PR #2488.
Last updated: Dec 23 2024 at 12:05 UTC