Stream: git-wasmtime

Topic: wasmtime / PR #10260 x64: rework new assembler rule prior...


view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:06):

abrown requested cfallin for a review on PR #10260.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:06):

abrown requested wasmtime-compiler-reviewers for a review on PR #10260.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:06):

abrown opened PR #10260 from abrown:assembler-priorities to bytecodealliance:main:

See commit messages for details.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:19):

cfallin submitted PR review:

Thanks! Some thoughts on ordering below but otherwise LGTM.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:19):

cfallin created PR review comment:

It may be a personal preference, but I find the rule-sets easier to read / double-check when they are sorted descending by priority, rather than organized by some other column (type, here). Would you mind regrouping that way so we can see the overall prioritization?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:35):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:35):

abrown created PR review comment:

I agree that we should do descending priority (3 ... 2 ... 1) but this type-based reorganization is @alexcrichton's doing. @alexcrichton, thoughts?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:37):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:37):

abrown created PR review comment:

(another question is what should the priorities be: shouldn't GPR-matching be highest priority since we expect that to be picked most often? Etc.)

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:37):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:37):

abrown created PR review comment:

Here's reordered by @alexcrichton's priorities and I don't think the grouping is intuitive:

(rule 3 (x64_add $I8  src1 (is_mem src2))     (x64_addb_rm src1 src2))
(rule 3 (x64_add $I16 src1 (is_mem src2))     (x64_addw_rm src1 src2))
(rule 3 (x64_add $I32 src1 (is_gpr_mem src2)) (x64_addl_rm src1 src2))
(rule 3 (x64_add $I64 src1 (is_gpr_mem src2)) (x64_addq_rm src1 src2))
(rule 2 (x64_add $I8  src1 (is_gpr src2))     (x64_addl_rm src1 src2))
(rule 2 (x64_add $I16 src1 (is_gpr src2))     (x64_addl_rm src1 src2))
(rule 2 (x64_add $I32 src1 (is_simm8 src2))   (x64_addl_mi_sxb src1 src2))
(rule 2 (x64_add $I64 src1 (is_simm8 src2))   (x64_addq_mi_sxb src1 src2))
(rule 1 (x64_add $I8  src1 (is_imm8 src2))    (x64_addb_mi src1 src2))
(rule 1 (x64_add $I16 src1 (is_imm16 src2))   (x64_addw_mi src1 src2))
(rule 1 (x64_add $I32 src1 (is_imm32 src2))   (x64_addl_mi src1 src2))
(rule 1 (x64_add $I64 src1 (is_simm32 src2))  (x64_addq_mi_sxl src1 src2))

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:47):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 19:47):

cfallin created PR review comment:

Hmm, yeah, that's pretty subtle too.

I guess I'm curious about the motivation to "compress" priorities and make more use of non-overlap: if we know for sure the order in which we want rules to match, I personally don't see a huge issue with linearizing priorities (i.e. encoding a more total order than strictly required to resolve overlaps), if it makes the code clearer.

Alternately if rules are grouped into "totally ordered subgroups", closer to what this PR had, it's not the worst thing if it's fully regular and if the groups are visually separated. I think I was partly reacting to the reverse ordering (lower prio rules first) and the bunched-up view that makes it harder to see the overall shape.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 22:31):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 22:31):

alexcrichton created PR review comment:

Personally I prefer logical grouping by "thing" rather than priority per-se, an example is isub in pulley. IIRC I've written rules like this on x64 as well, but I may also be misremembering.

As for the original motivation: personally I think we should try to avoid "just keep adding one to the priority until it compiles" because (a) it feels bad and (b) I'm led to believe the generated code is worse to be a sequence of matches rather than "one big match".

In the abstract there shouldn't be priorities in these rules at all, everything is disjoint. It should be one giant match and we should be able to do (GprMemImm.Mem addr) or something like that. That doesn't work because GprMemImm is a newtype wrapper around RegMemImm which can't be pattern-matched in ISLE.

In my mind I'm trying to nudge in the direction of removing the priorities here by reducing the number of priorities. Here I agree though that having an ascending/descending list of priorities helps see things. For example @abrown with the list you pasted it seems like we can clean this up a bit:

(rule 3 (x64_add $I8  src1 (is_mem src2))     (x64_addb_rm src1 src2))
(rule 3 (x64_add $I16 src1 (is_mem src2))     (x64_addw_rm src1 src2))
(rule 3 (x64_add $I32 src1 (is_gpr_mem src2)) (x64_addl_rm src1 src2))
(rule 3 (x64_add $I64 src1 (is_gpr_mem src2)) (x64_addq_rm src1 src2))
(rule 2 (x64_add $I8  src1 (is_gpr src2))     (x64_addl_rm src1 src2))
(rule 2 (x64_add $I16 src1 (is_gpr src2))     (x64_addl_rm src1 src2))
;; imm things

to instead be

(rule 3 (x64_add $I8  src1 (is_gpr_mem src2)) (x64_addb_rm src1 src2))
(rule 3 (x64_add $I16 src1 (is_gpr_mem src2)) (x64_addw_rm src1 src2))
(rule 3 (x64_add $I32 src1 (is_gpr_mem src2)) (x64_addl_rm src1 src2))
(rule 3 (x64_add $I64 src1 (is_gpr_mem src2)) (x64_addq_rm src1 src2))

;; imm things

I think that should have no loss of generality? (and we could probably remove the is_mem constructor since uses of that probably want to gbe is_gpr_mem instead).


So altogether, how about this:

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 22:43):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 20 2025 at 22:43):

cfallin created PR review comment:

That sounds good to me; all my opinions are weakly-held, I mostly just want to be able to easily decipher how the rules layer, and that seems like an improvement.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 18:28):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 18:28):

abrown created PR review comment:

Nah, @alexcrichton, we can't just use is_gpr_mem everywhere. For those smaller widths, we are actually emitting two different instructions based whether is_mem or is_gpr fires.

I'm not sure I understand how to decipher the last paragraph there and I would say I don't really care too much either way except that we need to actually get the priorities correct and clear or we have to fix things again like in e6a97ad. What is weird here is that some priorities don't really matter but others do, like the imm extractors you point out.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 18:29):

abrown edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 19:08):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 19:08):

alexcrichton created PR review comment:

Oh d'oh that's right, I keep forgetting about the subtelties here...

Ah yeah I also don't mean to say priorities don't matter, because you're right they do for immediates. How about:

;; special case smaller immediates for a smaller instruction encoding
(rule 3 (x64_add $I32 src1 (is_simm8 src2))   (x64_addl_mi_sxb src1 src2))
(rule 3 (x64_add $I64 src1 (is_simm8 src2))   (x64_addq_mi_sxb src1 src2))

;; matching immediates to the corresponding instruction
(rule 2 (x64_add $I8  src1 (is_imm8 src2))    (x64_addb_mi src1 src2))
(rule 2 (x64_add $I16 src1 (is_imm16 src2))   (x64_addw_mi src1 src2))
(rule 2 (x64_add $I32 src1 (is_imm32 src2))   (x64_addl_mi src1 src2))
(rule 2 (x64_add $I64 src1 (is_simm32 src2))  (x64_addq_mi_sxl src1 src2))

;; extend 8/16-bit register-to-register addition to avoid false dependencies
;; on previous instructions
(rule 1 (x64_add $I8  src1 (is_gpr src2))     (x64_addl_rm src1 src2))
(rule 1 (x64_add $I16 src1 (is_gpr src2))     (x64_addl_rm src1 src2))

;; matching the size of the operand to the size of the instruction
(rule 0 (x64_add $I8  src1 (is_mem src2))     (x64_addb_rm src1 src2))
(rule 0 (x64_add $I16 src1 (is_mem src2))     (x64_addw_rm src1 src2))
(rule 0 (x64_add $I32 src1 (is_gpr_mem src2)) (x64_addl_rm src1 src2))
(rule 0 (x64_add $I64 src1 (is_gpr_mem src2)) (x64_addq_rm src1 src2))

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 19:09):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 21 2025 at 19:09):

alexcrichton created PR review comment:

Technically the 0-priority rules for 8/16-bit addition could also be is_gpr_mem I think? The 1-priority rules are just an optimization

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 01:06):

abrown updated PR #10260.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 01:37):

abrown updated PR #10260.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 01:38):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 01:38):

abrown created PR review comment:

:+1: I'll remove is_mem for now... might need it again in the future.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 01:38):

abrown has enabled auto merge for PR #10260.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 22 2025 at 02:15):

abrown merged PR #10260.


Last updated: Feb 28 2025 at 03:10 UTC