wasmtime / issue #5881 cranelift-interpreter: Implement `... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #5881 cranelift-interpreter: Implement `...

Wasmtime GitHub notifications bot (Feb 25 2023 at 12:01):

afonso360 opened issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

Wasmtime GitHub notifications bot (Feb 25 2023 at 12:01):

afonso360 labeled issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

Wasmtime GitHub notifications bot (Feb 25 2023 at 12:01):

afonso360 labeled issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

Wasmtime GitHub notifications bot (Feb 25 2023 at 12:01):

afonso360 labeled issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

Wasmtime GitHub notifications bot (Feb 25 2023 at 12:03):

afonso360 edited issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

If anyone needs help working on this feel free to reach out!

Wasmtime GitHub notifications bot (Feb 26 2023 at 15:43):

jan-justin commented on issue #5881:

Hey,

I am having a look into this and kindly wanted some clarification on what was intended by:

This is probably as simple as passing the MemFlags from the original load...

My confusion is regarding where to get the MemFlags from, as not all instruction implementations that end up calling State::checked_{load,store} will have the relevant flags retrievable from InstructionData::memflags.

I do not see any other obvious way to obtain the MemFlags. Did I miss something?

Wasmtime GitHub notifications bot (Feb 26 2023 at 15:52):

afonso360 commented on issue #5881:

I had a look, I think we don't have MemFlags in Stack{Load,Store} right? I think in those cases we should be able to use MemFlags::trusted(), which creates a native endian access.

Similarly it looks like we emit a checked_load when loading GlobalValues, we should also be able to use MemFlags::trusted() for those.

Is there any other? We might not always be able to use MemFlags::trusted(), If the access may trap or may be unaligned we should use MemFlags::new().

Wasmtime GitHub notifications bot (Feb 26 2023 at 15:52):

afonso360 edited a comment on issue #5881:

I had a look, I think we don't have MemFlags in stack_{load,store} right? I think in those cases we should be able to use MemFlags::trusted(), which creates a native endian access.

Similarly it looks like we emit a checked_load when loading GlobalValues, we should also be able to use MemFlags::trusted() for those.

Is there any other? We might not always be able to use MemFlags::trusted(), If the access may trap or may be unaligned we should use MemFlags::new().

Wasmtime GitHub notifications bot (Feb 26 2023 at 15:54):

afonso360 edited a comment on issue #5881:

I had a look, I think we don't have MemFlags in stack_{load,store} right? I think in those cases we should be able to use MemFlags::trusted(), which creates a native endian access. (This is what we do in the legalizer)

Similarly it looks like we emit a checked_load when loading GlobalValues, we should also be able to use MemFlags::trusted() for those. (This is also what we do in the legalizer)

Is there any other? We might not always be able to use MemFlags::trusted(), If the access may trap or may be unaligned we should use MemFlags::new().

Wasmtime GitHub notifications bot (Feb 26 2023 at 18:14):

jan-justin commented on issue #5881:

Thanks! That helps. I will take that under advisement.

You are correct regarding the MemFlags for stack_{load,store}.

I saw the checked_load for the loading of GlobalValues, but I was going to have the flags passed in. Good thing you spotted it first!

So then the only question that remains for me is whether or not some inconsistency may arise from relying on the default endianness in the above cases? What happens when say the flag is explicitly set for big endian when doing a load, but the default endianness is little endian and used for say atomic_cas which now relies on default endianness? Can this even happen, or is it a non-issue?

Wasmtime GitHub notifications bot (Feb 26 2023 at 18:31):

afonso360 commented on issue #5881:

That... Is a big can of worms. We've been discussing how to make CLIF endianness independent for a while in #3369.

There is more to this than that thread, I don't quite remember the full scope of the discussion, But we've talked about it a lot. I think last time we did reach some sort of conclusion about how to fix it, but I can't remember. (cc: @cfallin)

I think a good middle term right now would be to have a "native" endianness property in the interpreter that we can change, and defaults to the host's endianness. I think that correctly represents the current way of dealing with things and should be fairly easy to remove once we no longer need it.

Wasmtime GitHub notifications bot (Feb 27 2023 at 10:33):

jan-justin commented on issue #5881:

I see, that is indeed a big can of worms.

In a feeble attempt to stuff those worms back into the can, what can be done for this particular issue? The steps you initially described no longer seem relevant, or am I mistaken? Is the middle ground solution, as you described it, something that is relatively straightforward for me to pursue?

Wasmtime GitHub notifications bot (Feb 27 2023 at 11:22):

afonso360 commented on issue #5881:

The steps you initially described no longer seem relevant, or am I mistaken? Is the middle ground solution, as you described it, something that is relatively straightforward for me to pursue?

I think they still are. Here's how we can model this in the interpreter.

The interpreter itself now has a notion of "native" endianness. Probably some property in the Interpreter struct. It can differ from the host endianness but by default they should match.

When doing memory accesses in State::checked_{load,store} we can first query the memflags. If they have an explicit endianness we use that, if they rely on the native endianness we use the one in the interpreter struct. (we can query this using MemFlags::endianness)

Now, as you've discovered we still have two places in CLIF where we rely on native endianness. GlobalValue loads, and stack_{loads,stores}. The only thing we really need is some MemFlags that we can pass into State::checked_{load,store} and those should be MemFlags::trusted()

I think this should be fairly straight forward, and pretty much matches up with the original plan, we were only missing the default "native endianness" in the interpreter.

By the way, in hindsight this probably shouldn't have been marked as E-Easy :sweat_smile: , sorry about that! Also feel free to reach out on zulip if you want, I might give you a faster answer there.

Wasmtime GitHub notifications bot (Feb 27 2023 at 11:23):

afonso360 edited a comment on issue #5881:

The steps you initially described no longer seem relevant, or am I mistaken? Is the middle ground solution, as you described it, something that is relatively straightforward for me to pursue?

I think they still are. Here's how we can model this in the interpreter.

The interpreter itself now has a notion of "native" endianness. Probably some property in the Interpreter struct. It can differ from the host endianness but by default they should match.

When doing memory accesses in State::checked_{load,store} we can first query the memflags. If they have an explicit endianness we use that, if they rely on the native endianness we use the one in the interpreter struct. (we can query this using MemFlags::endianness)

Now, as you've discovered we still have two places in CLIF where we rely on native endianness. GlobalValue loads, and stack_{loads,stores}. The only thing we really need is some MemFlags that we can pass into State::checked_{load,store} and those should be MemFlags::trusted()

I think this should be fairly straight forward, and pretty much matches up with the original plan, we were only missing the default "native endianness" in the interpreter.

By the way, in hindsight this probably shouldn't have been marked as E-Easy :sweat_smile: , sorry about that! Also feel free to reach out on zulip if you want, I might give you a faster answer there.

Wasmtime GitHub notifications bot (Feb 27 2023 at 11:35):

afonso360 edited a comment on issue #5881:

The steps you initially described no longer seem relevant, or am I mistaken? Is the middle ground solution, as you described it, something that is relatively straightforward for me to pursue?

I think they still are. Here's how we can model this in the interpreter.

We should add some sort of "native" endianness to the interpreter. Probably some property in the Interpreter struct. It can differ from the host endianness but by default they should match.

When doing memory accesses in State::checked_{load,store} we can first query the memflags. If they have an explicit endianness we use that, if they rely on the native endianness we use the one in the interpreter struct. (we can query this using MemFlags::endianness)

Now, as you've discovered we still have two places in CLIF where we rely on native endianness. GlobalValue loads, and stack_{loads,stores}. The only thing we really need is some MemFlags that we can pass into State::checked_{load,store} and those should be MemFlags::trusted()

I think this should be fairly straight forward, and pretty much matches up with the original plan, we were only missing the default "native endianness" in the interpreter.

By the way, in hindsight this probably shouldn't have been marked as E-Easy :sweat_smile: , sorry about that! Also feel free to reach out on zulip if you want, I might give you a faster answer there.

Wasmtime GitHub notifications bot (Feb 27 2023 at 11:36):

afonso360 edited a comment on issue #5881:

The steps you initially described no longer seem relevant, or am I mistaken? Is the middle ground solution, as you described it, something that is relatively straightforward for me to pursue?

I think they still are. Here's how we can model this in the interpreter.

We should add some sort of "native" endianness to the interpreter. Probably some property in the Interpreter struct. It can differ from the host endianness but by default they should match.

This is only used as a last resort, when we are performing some operation, and the CLIF behaviour was unclear about the endianness. Such as when MemFlags don't specify an endianness.

When doing memory accesses in State::checked_{load,store} we can first query the memflags. If they have an explicit endianness we use that, if they rely on the native endianness we use the one in the interpreter struct. (we can query this using MemFlags::endianness)

Now, as you've discovered we still have two places in CLIF where we rely on native endianness. GlobalValue loads, and stack_{loads,stores}. The only thing we really need is some MemFlags that we can pass into State::checked_{load,store} and those should be MemFlags::trusted()

I think this should be fairly straight forward, and pretty much matches up with the original plan, we were only missing the default "native endianness" in the interpreter.

By the way, in hindsight this probably shouldn't have been marked as E-Easy :sweat_smile: , sorry about that! Also feel free to reach out on zulip if you want, I might give you a faster answer there.

Wasmtime GitHub notifications bot (Feb 27 2023 at 12:07):

jan-justin commented on issue #5881:

Thanks for the clarification, I really do appreciate it!

That sounds like a plan to me and I will have a look at it.

It's not a problem :smiley: . I will definitely reach out to you on zulip at some point, however, I figured my questions were better placed here for the sake of anyone else who could potentially take over, if it came to that.

Wasmtime GitHub notifications bot (Mar 02 2023 at 12:31):

afonso360 closed issue #5881:

:wave: Hey,

Feature

Currently the interpreter always does native endianness memory loads and stores. This is the default and its the most common option. However they can also be tagged little and big. In those cases we should implement the correct memory accesses.

Benefit

This makes our interpreter slightly more complete! And paves the way for us to get better fuzzing support on s390x since we can start issuing these memflags.

Implementation

We need to update our State::checked_{load,store} implementation to be endianness aware. This is probably as simple as passing the MemFlags from the original load, that contain that information and a few other bits that will probably be useful in the future.

checked_load just calls DataValue::read_from_slice, which always works on the native endianness. We probably need some form of Datavalue::read_from_slice_ne/Datavalue::read_from_slice_le or something along those lines?

We have a bunch of tests for endianness (Thanks s390x!)

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-big.clif

https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/filetests/filetests/runtests/atomic-rmw-subword-little.clif

It might be worth Ctrl-F'ing the runtests folder for little or big and checking if the interpreter already supports those operations

If anyone needs help working on this feel free to reach out!

Last updated: Apr 17 2025 at 16:03 UTC