wasmtime / issue #3329 cranelift: Fix big-endian regressi... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #3329 cranelift: Fix big-endian regressi...

Wasmtime GitHub notifications bot (Sep 11 2021 at 18:01):

This makes these functions behave differently depending on the endianness. Should they instead get an endianness arg. I presume this will break interpreting clif for a little endian target on an s390x system.

Wasmtime GitHub notifications bot (Sep 11 2021 at 18:01):

bjorn3 edited a comment on issue #3329:

This makes these functions behave differently depending on the endianness. Should they instead get an endianness arg? I presume this will break interpreting clif for a little endian target on an s390x system.

Wasmtime GitHub notifications bot (Sep 11 2021 at 20:34):

uweigand commented on issue #3329:

This makes these functions behave differently depending on the endianness. Should they instead get an endianness arg? I presume this will break interpreting clif for a little endian target on an s390x system.

Interesting. I had only looked at the use from the filetest function runner via write_value_to/read_value_from. Those clearly need to use the current host's native byte order. For the interpreter, given that this is emulating the cranelift IR load/store operations, it should really respect the explicit byte order encoded in the memflags of those operations, if any, and fall back to the default byte order of the target ISA otherwise. This would indeed argue for an endianness arg.

On the other hand, I'm not sure whether the interpreter will work correctly in cross-endian mode even then, I suspect there may be other problems hidden. Is the interpreter currently being tested by and CI test cases? I haven't noticed any failures ...

Wasmtime GitHub notifications bot (Sep 11 2021 at 20:34):

uweigand edited a comment on issue #3329:

This makes these functions behave differently depending on the endianness. Should they instead get an endianness arg? I presume this will break interpreting clif for a little endian target on an s390x system.

Interesting. I had only looked at the use from the filetest function runner via write_value_to/read_value_from. Those clearly need to use the current host's native byte order. For the interpreter, given that this is emulating the cranelift IR load/store operations, it should really respect the explicit byte order encoded in the memflags of those operations, if any, and fall back to the default byte order of the target ISA otherwise. This would indeed argue for an endianness arg.

On the other hand, I'm not sure whether the interpreter will work correctly in cross-endian mode even then, I suspect there may be other problems hidden. Is the interpreter currently being tested by any CI test cases? I haven't noticed any failures ...

Wasmtime GitHub notifications bot (Sep 11 2021 at 20:44):

uweigand commented on issue #3329:

Looking at the interpreter code a bit more, it currently doesn't even seem to have any notion of what the target ISA is? This may require more significant changes to get correct.

I'd prefer to get this patch in for now to get back the old behavior of write_value_to/read_value_from and fix the currently existing regression in filetest. Getting the interpreter to work correctly in cross-endian mode can then be done separately.

Wasmtime GitHub notifications bot (Sep 11 2021 at 20:44):

bjorn3 commented on issue #3329:

Is the interpreter currently being tested by any CI test cases? I haven't noticed any failures ...

Yes, but I don't think there are any byte order tests for the interpreter.

Wasmtime GitHub notifications bot (Sep 11 2021 at 21:05):

afonso360 commented on issue #3329:

No, the interpreter currently only does native endianness, and I wouldn't be surprised if we assume little endian somewhere in there. We've only recently added support for memory loads/stores in #3187 and #3302 but they don't deal with endianness yet.

Since the interpreter is being brought up as a fuzzing oracle, adding a cross-endian mode would be nice, but I agree with @uweigand that we should do that as a separate PR. I've added this to our tracking list for fuzzing in #3050.

Wasmtime GitHub notifications bot (Sep 13 2021 at 17:23):

cfallin commented on issue #3329:

@afonso360 here's a slightly wacky idea that might help us ensure interpreter correctness: could we augment the interpreter's heap state to track a "native endian" sort of taint on certain address ranges, and only allow those addresses to be loaded with the equivalent native-endian load?

My thought that started this is: we want the interpreter to be completely deterministic, and generally we want CLIF to be completely deterministic. We allowed for "native endian" loads and stores in order to allow frontends to generate efficient CLIF without knowing the endianness of the target. However, it is undefined behavior (at the CLIF level) -- or at least it should be -- to store a native-endian value then actually observe its endianness, i.e., use an explicit-endianness load or load individual bytes or partial words.

So the thought is: let's enforce that in the interpreter. This would provide additional fuzzing coverage for any endianness bugs as well, which seems important to me as long as we care about big-endian platforms (and we do!). Thoughts?

Wasmtime GitHub notifications bot (Sep 13 2021 at 18:44):

bjorn3 commented on issue #3329:

However, it is undefined behavior (at the CLIF level) -- or at least it should be -- to store a native-endian value then actually observe its endianness, i.e., use an explicit-endianness load or load individual bytes or partial words.

I don't agree with that. IMO using the default endianness should mean the native endian of the target, not an undefined endianness. This is how I interpreted it in cg_clif. I don't ever use explicit endianness markers as I always need the native endianness of the target.

Wasmtime GitHub notifications bot (Sep 13 2021 at 19:52):

cfallin commented on issue #3329:

To clarify, the above did not mean that the store would use "an undefined endianness". Rather, I was proposing that we declare that, as a principle, CLIF whose result is endian-dependent should be disallowed (or more precisely, specific-endian loads/stores should not interact with native-endian loads/stores). One could see this as following from the principle that CLIF is platform-independent and its meaning is fixed and deterministic, regardless of the underlying platform; if we don't have that property, then building an interpreter suddenly becomes more difficult. (The open question is what happens when native-endian loads/stores of different widths interact, in the interpreter; @afonso360 thoughts?)

If cg_clif only uses native-endian loads/stores, then I think it satisfies the above constraint trivially (native-endian loads/stores will never interact with specific-endian loads/stores if the latter are never used).

Wasmtime GitHub notifications bot (Sep 13 2021 at 20:26):

afonso360 commented on issue #3329:

Rather, I was proposing that we declare that, as a principle, CLIF whose result is endian-dependent should be disallowed (or more precisely, specific-endian loads/stores should not interact with native-endian loads/stores).

I'm not sure we can get away that easily, even with native only loads/stores we can easily determine the target endianness with loads and stores of different sizes (i.e. store.i16 0x00FF and load.i8 to the same address to should give us different results depending on endianness).

This means that pretty much every non trivial CLIF file out there is undefined now that we added the big endian possibility.
The only way I can see us getting the "CLIF is platform-independent" property back, is to default to one endianness, which also feels like we have second class citizens, which isn't great :oh_no:

One could see this as following from the principle that CLIF is platform-independent and its meaning is fixed and deterministic, regardless of the underlying platform; if we don't have that property, then building an interpreter suddenly becomes more difficult. (The open question is what happens when native-endian loads/stores of different widths interact, in the interpreter; @afonso360 thoughts?)

My inital plan was to add endianness to the interpreter. When loading or storing to an address, if the endianness of the interpreter is different from the host we bswap the memory before/after the operation. (also taking into consideration whats requested by MemFlags)

I don't know if this fully solves the issue (this was what I first though about when thinking about adding this), but I like this over the alternative of just doing whatever the native arch does, since it means we can emulate BE targets.

Wasmtime GitHub notifications bot (Sep 13 2021 at 20:28):

afonso360 edited a comment on issue #3329:

Rather, I was proposing that we declare that, as a principle, CLIF whose result is endian-dependent should be disallowed (or more precisely, specific-endian loads/stores should not interact with native-endian loads/stores).

I'm not sure we can get away that easily, even with native only loads/stores we can easily determine the target endianness with loads and stores of different sizes (i.e. store.i16 0x00FF and load.i8 to the same address to should give us different results depending on endianness).

This means that pretty much every non trivial CLIF file out there is undefined now that we added the big endian possibility.
The only way I can see us getting the "CLIF is platform-independent" property back, is to default to one endianness, which also feels like we have second class citizens, which isn't great :oh_no:

One could see this as following from the principle that CLIF is platform-independent and its meaning is fixed and deterministic, regardless of the underlying platform; if we don't have that property, then building an interpreter suddenly becomes more difficult. (The open question is what happens when native-endian loads/stores of different widths interact, in the interpreter; @afonso360 thoughts?)

My inital plan was to add endianness to the interpreter. When loading or storing to an address, if the endianness of the interpreter is different from the host we bswap the memory before/after the operation. (also taking into consideration whats requested by MemFlags)

I don't know if this fully solves the issue (this was what I first though about when thinking about adding this), but I like this over the alternative of just doing whatever the native arch does, since it means we can emulate BE targets.

It shouldn't be too hard to implement if this is all we have to do, all memory accesses go through the same interface, so we only really need to update one place.

Wasmtime GitHub notifications bot (Sep 17 2021 at 21:48):

cfallin commented on issue #3329:

Hmm, this is a bit of a tricky design problem...

At a high level, I think the "CLIF is platform independent" property is really valuable. It just seems like a continuing source of endianness bugs and confusion if it's possible to write a CLIF file whose results according to the interpreter are different when running on two different systems. That's the sort of base invariant that we want to think really hard about before compromising, because otherwise the implications percolate upward to other issues.

The visibility of endianness can arise in two ways:

Native stores/loads of different widths (e.g., store a u32 and load the first u8), and

Native loads/stores interacting with explicit-endianness loads/stores

To define a single expected interpreter result, we have to either disallow both of the above (native loads/stores exist but only ever interact with other native loads/stores of the same width), or remove native loads/stores.

I think we've sort of backed ourselves into a corner here. By including native-endian loads/stores, essentially we're making CLIF parametric on endianness: its behavior can depend on the endianness of the platform it's compiled on. The goal of this is to enable a sort of "late binding", where CLIF producers can generate code without needing to know the target's endianness. The alternatives are this either: (i) disallow any CLIF that observes endianness, which I don't think is possible as long as we want to be a general target, or (ii) bind the endianness earlier, i.e. as part of the builder, with some target knowledge. Then a "native" load/store takes its endianness at build time, and all CLIF has explicit endianness.

Possibly relevant is that early binding of all sorts of platform properties is the norm in other compilers, AFAIK; e.g. LLVM has the data layout that specifies, among other things, endianness, and is part of the IR.

I know that that's a reversal from the endianness discussion we had last year but we haven't really been forced to define the single, platform-independent expected result of CLIF until we had a platform-independent interpreter. Now that we do, the convenience we gained with the "native" ops as shorthand for the platform endianness may be worth less than having well-defined behavior across platforms.

Thoughts?

Wasmtime GitHub notifications bot (Sep 17 2021 at 21:54):

cfallin commented on issue #3329:

But also, I think we should be mindful of the fact that s390x has a regression right now that this PR fixes, and all of my thoughts above are out-of-scope for this PR, so I think I'm inclined to merge this now and fix the regression, and then open a new issue to discuss further!

Last updated: Apr 18 2025 at 11:03 UTC