wasmtime / PR #2874 Support IBM z/Architecture · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #2874 Support IBM z/Architecture

Wasmtime GitHub notifications bot (May 03 2021 at 17:11):

uweigand opened PR #2874 from s390x-backend to main:

This adds support for the IBM z/Architecture (s390x-ibm-linux).

The status of the s390x backend in its current form is:

Wasmtime is fully functional and passes all tests on s390x.

All back-end features supported, with the exception of SIMD.

There is still a lot of potential for performance improvements.

Currently the only supported processor type is z15.

Wasmtime GitHub notifications bot (May 03 2021 at 17:14):

cfallin assigned PR #2874 to cfallin.

Wasmtime GitHub notifications bot (May 03 2021 at 18:32):

uweigand updated PR #2874 (assigned to cfallin) from s390x-backend to main.

Wasmtime GitHub notifications bot (May 03 2021 at 18:35):

uweigand updated PR #2874 (assigned to cfallin) from s390x-backend to main.

Wasmtime GitHub notifications bot (May 03 2021 at 19:17):

uweigand updated PR #2874 (assigned to cfallin) from s390x-backend to main.

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

It might be good to have a note here about the fact that no FP is used; this is slightly different than e.g. what the ASCII-art doc comment in abi_impl.rs describes for vanilla/common ABI details. Specifically, documenting how the SP is decremented once, including the maximum-needed outgoing args space, would be helpful, I think.

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

AArch64-ism left in the comments here (but I think the same thing is true of this platform, namely that no addressing modes that modify registers are used?).

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

Same comment here as above -- assert our assumption re: register range.

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

Can we assert that this is only used in a way that is consistent with the regalloc metadata we generate (if I understand correctly, that would be rt2 == rt + 1 or similar)?

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

It might be useful to factor out helpers for bit-manipulation like this (this is a sign-extend IIUC, so maybe sign_extend_to_u64?); I know we've been far from consistent about this in the rest of the backends, but there is no better time to start :-)

Same comment to the masking below, too (mask_to_bits or similar).

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

Should this be Load64SExt8? (I'm not sure if a 1-to-64-bit extension is common -- perhaps this arises in Bextend? -- but this caught my eye, and deserves a comment if correct to note why it differs)

Wasmtime GitHub notifications bot (May 07 2021 at 18:04):

cfallin created PR Review Comment:

This and the below load-op fusion cases are I think similar to this bug we saw on x64: #2576.

The fundamental issue was that we were lowering (and thus matching on the load and sinking it) a compare multiple times, whenever its flags output was used, because we do not handle flags the same way we handle other registers (and it looks like this backend is the same). The input-matching and sinking support in the core lowering logic assumes that one will only match and then sink once; in certain cases the repeats interact poorly with instruction ordering and cause undefined values to be used.

We really should come up with an impossible-to-do-by-construction way of encoding this restriction in the LowerCtx API, but until then, if this is also the case here (again, I'm pattern-matching on the bug but am not certain everything is the same in this backend!), the simplest fix is just to avoid load-op merging with compares.

Wasmtime GitHub notifications bot (May 07 2021 at 20:35):

bjorn3 submitted PR Review.

Wasmtime GitHub notifications bot (May 07 2021 at 20:35):

bjorn3 created PR Review Comment:

I would assume that Load64SExt8 is wrong here too. The loaded bool is likely stored as a zero extended byte, not sign extended byte and as such it would need to be a load + arithmetic shift left of 63 bits I think.

Wasmtime GitHub notifications bot (May 10 2021 at 14:01):

uweigand updated PR #2874 (assigned to cfallin) from s390x-backend to main.

Wasmtime GitHub notifications bot (May 10 2021 at 14:02):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:02):

uweigand created PR Review Comment:

OK, added comment.

Wasmtime GitHub notifications bot (May 10 2021 at 14:03):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:03):

uweigand created PR Review Comment:

This comment is obsolete anyway (copied from a old version) - the current mapper design doesn't even distinguish between a pre- and post-map any more. I've removed the comment.

Wasmtime GitHub notifications bot (May 10 2021 at 14:04):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:04):

uweigand created PR Review Comment:

No, it will actually be used with larger ranges. I've now just implemented the correct semantics for all cases.

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand created PR Review Comment:

Here, we cannot implement the correct semantics, because the remapper will not map one contiguous range to another contiguous range. But here this really doesn't matter -- this code will never get called since these instructions are never present before register allocation. (The only user of these instructions is the prologue/epilogue code.)

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand created PR Review Comment:

This is also moot. There actually are no 1-bit memory access instructions in cranelift, so this can never happen. I've removed all the 1-bit cases here.

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand created PR Review Comment:

OK, done.

Wasmtime GitHub notifications bot (May 10 2021 at 14:06):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:10):

uweigand created PR Review Comment:

Good point. It actually depends on the caller -- if lower_icmp_to_flags is used to lower a plain Icmp, then it is actually correct to sink memory loads (just like for any other instruction where we're sinking memory accesses). However, lower_icmp_to_flags is also used to combine longer instruction sequences, and in those cases, sinking memory loads is only legal if all intermediate instructions only have a single user (there is currently no obvious way to check for that condition in the back-end, unfortunately). I've now added a may_sink_memory flag to the routine, which is set by the caller (somewhat conservatively at the moment, this can be improved once we have support for better checks).

Wasmtime GitHub notifications bot (May 10 2021 at 14:10):

uweigand submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 14:10):

uweigand edited PR Review Comment.

Wasmtime GitHub notifications bot (May 10 2021 at 20:46):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 20:46):

cfallin created PR Review Comment:

Right, I agree -- the issue is precisely when icmp's result is matched more than once as an operand of another instruction. The fix looks good -- thanks!

Wasmtime GitHub notifications bot (May 10 2021 at 20:49):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 20:49):

cfallin created PR Review Comment:

Ah, yes, this makes sense -- thanks.

Wasmtime GitHub notifications bot (May 10 2021 at 20:50):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 20:50):

cfallin created PR Review Comment:

Perfect, this is very helpful -- thank you!

Wasmtime GitHub notifications bot (May 10 2021 at 20:52):

cfallin submitted PR Review.

Wasmtime GitHub notifications bot (May 10 2021 at 20:53):

cfallin merged PR #2874.

Last updated: Apr 17 2025 at 00:13 UTC