Stream: git-wasmtime

Topic: wasmtime / PR #5976 aarch64: Specialize constant vector s...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 03:24):

alexcrichton opened PR #5976 from aarch64-constant-shift to main:

This commit adds special lowering rules for
vector-shifts-by-constant-amounts to use dedicated instructions which cuts down on the codegen here quite a bit for constant values.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 04:06):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 04:06):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 04:06):

cfallin created PR review comment:

Orthogonal to your changes here but I'm surprised we haven't moved from inline constants we branch around to the VCodeConstant infrastructure; I guess I am confusing this with the x64 backend, where we do have that. Using the end-of-func constant pool here as well might be a nice improvement for a followup PR, especially if the data lands in the middle of an inner loop and pushes the body into another icache line or fetch cycle...

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 15:15):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 15:15):

alexcrichton created PR review comment:

Agreed! The fallback for shuffle does an inline load like this as well, and I'm curious enough to hopefully try to give this a stab in the near future

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 15:32):

alexcrichton updated PR #5976 from aarch64-constant-shift to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 19:02):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 19:02):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 19:02):

cfallin created PR review comment:

I might be missing something obvious but is there any reason we can't match this left-shift-by-0 (when masked) in the same way and pass the value straight through?

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 19:54):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 19:54):

alexcrichton created PR review comment:

This seems to me like something ideally left to an egraph-style optimization which removes the shift before codegen, and I ideally wanted to do the same thing for the other lowerings but it was just the restrictions on arm64 instruction incoded which prevented ignoring the shift-right-by-zero case. That being said I may as well add it to be consistent with the other rules!

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:02):

alexcrichton updated PR #5976 from aarch64-constant-shift to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:02):

alexcrichton has enabled auto merge for PR #5976.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:10):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:10):

cfallin created PR review comment:

Ah, yes, that's true -- it's a reasonable algebraic simplification that we should include in the mid-end as well (might increase reach of other opts etc).

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:17):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:17):

jameysharp created PR review comment:

I'm confused. We do have egraph rules for removing shifts and rotates by 0. Are they not firing here? Are these filetests running with optimization disabled, maybe?

If you need to match on shifts by zero to avoid generating illegal instructions in a particular backend, that's important since we don't necessarily run the egraph pass. But I would be inclined not to bother special-casing anything in the backend that isn't necessary for correctness if it can be done in the egraph pass.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:24):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:24):

alexcrichton created PR review comment:

The shifts here are by the full bit width, e.g. a shift by 8 for i8x16, which I don't think is accounted for in the egraph rules (but I just checked to see the checks for 0-shifts, so this would "just" be an extension of that rule). I don't know whether filetests run with or without optimizations, but I would guess without.

Avoiding the 0-case is required for the shift-right cases for correctness since it looks like right-shifts with zero-width can't be encoded as such an instruction (I think this has to do with how right-shift is a negative left-shift on AArch64 or something like that, so maybe this is a too-strict assertion somewhere, not sure, I didn't dig deeply enough to go that far).

I went ahead and added the case here as requested for shift-lefts even though it isn't strictly necessary as required by the backend. In some sense though it does sort of feel like if someone reads that rule it depends on the time of day as to whether they would judge whether or not the rule should be included, so in that sense I don't know whether to leave it in or not as it's not obviously correct nor obviously incorrect.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:30):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 10 2023 at 20:30):

cfallin created PR review comment:

I agree with @jameysharp in general that we shouldn't pull too many opts into lowering, because of cost in non-optimizing mode and because they can't interact with the main rewriting fixpoint when we're already here. I don't feel too strongly about this case; I guess I would note that it's a nice symmetry with the right-shift case here (which is why it popped out at me originally). On the other hand, we don't special-case x + 0 in the backends, nor should we. So... I'll happily hand the baton to @jameysharp for final judgment here and only request that we extend the mid-end rule to understand masking and cover this case.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 16:09):

jameysharp submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 16:09):

jameysharp created PR review comment:

Well, if I have the baton…

  1. I think our general principle should be that algebraic simplifications should not be in backend rules. We should save our backend complexity budget for special cases where we can select a better instruction sequence due to target-specific features.
  2. I don't feel strongly enough about this principle to block merging this PR. I'd lean toward removing the special case for left-shifts but you're welcome to merge as-is if you want to on the basis of @cfallin's r+.

We'll have to figure out how best to handle the wrapping semantics of shifts in the mid-end rules. We should review any existing mid-end rules that match on constant shift amounts to make sure they're handling this correctly, too.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 19:33):

alexcrichton updated PR #5976 from aarch64-constant-shift to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 19:33):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 19:33):

alexcrichton created PR review comment:

Works for me! I've removed the left-shift case and rebased as well :+1:

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 19:33):

alexcrichton has enabled auto merge for PR #5976.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 21:59):

alexcrichton updated PR #5976 from aarch64-constant-shift to main.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 22:00):

alexcrichton has enabled auto merge for PR #5976.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2023 at 23:16):

alexcrichton merged PR #5976.


Last updated: Jan 24 2025 at 00:11 UTC