abrown opened PR #1456 from fix-1421
to master
:
As described in #1421, REX prefixes should only be inferred for 64-bit mode instructions; 32-bit mode instructions can't have a REX prefix. Also, this adds the ability for some additional SIMD instructions to use
infer_rex
.
<!--Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
abrown submitted PR Review.
abrown created PR Review Comment:
I've felt for a while that there surely must be a better way than creating custom functions for every combination of
sib x offset x rex x inregs x outregs
: what if we generated this code instead by addingcheck_if_needs_rex
,check_if_needs_sib_or_offset
, etc., on theEncodingRecipeBuilder
. Then during meta generation we could create code like that insize_plus_maybe_sib_inreg1_plus_rex_prefix_for_inreg0_inreg1
.@julian-seward1, what are your thoughts on this in relation to a new backend? I would imagine that the recipe part of all this is not necessary, but is the "generation of REX-inferring code" still something that would be useful?
abrown requested julian-seward1 for a review on PR #1456.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
For the new x64 back end, REX generation is pretty much a non-problem. There are two functions that emit all "standard form" instructions. That is, those which have the structure OTHER_PREFIXES,REX,OPCODES,MODRM,SIB,IMM. There is one for the reg-reg case,
emit_REX_OPCODES_MODRM_encG_encE
and one for the reg-mem case,emit_REX_OPCODES_MODRM_SIB_IMM_encG_memE
.https://github.com/cfallin/wasmtime/blob/arm64/cranelift/codegen/src/isa/x64/inst.rs#L1656
https://github.com/cfallin/wasmtime/blob/arm64/cranelift/codegen/src/isa/x64/inst.rs#L1787
For both functions, you pass the opcodes, the 'G' register field, and the 'E' field, which is either another register, or it's a memory address (an amode), and the functions will figure out the REX prefix on their own, and then skip it if it is redundant. You can give flags for a bit of special-case handling -- "retain teh REX prefix even if it is redundant", "set REX.W to zero (to indicate a 32-bit operation)", but that's about it.
I'm sure this will cover the vast majority of instructions we need to generate, up to and including SSE4.2. It won't do AVX or anything after that since it has no way to generate VEX or EVEX prefixes; but when the day comes, we can cook up similar routines.
Using the two routines, the top level emit-an-x64-instruction routine becomes simple; basically it's just a bunch of suitably parameterised calls to the above functions.
https://github.com/cfallin/wasmtime/blob/arm64/cranelift/codegen/src/isa/x64/inst.rs#L1787
What's implemented in the file is tested and believed to be correct. Starting at around line 2500 there are no less than 400 encoding tests.
There is no attempt made here to support the 32-bit x86 architecture. If indeed we do ever support that, I would argue in favour of treating it as a totally separate, unrelated architecture. The alternative is to complicate and slow down the x64 code, which sounds like a poor tradeoff. This probably sounds like a pretty radical proposal which runs against existing practice. This separate-target strategy worked well for the Valgrind x86 back ends. Not so well for the front ends, because there was a lot of duplication; but they are much bigger and more complex than the back ends.
julian-seward1 edited PR Review Comment.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Does this perhaps need to test some third register number to see if it has bit 3 set (== is_extended_reg, iiuc) ? REX has 4 bits and it seems like you're only checking 3 here.
abrown submitted PR Review.
abrown created PR Review Comment:
The way I understand this is that each of these functions are specialized to the operands of the recipe (or instruction class, if you prefer). This function, for example, is only supposed to be used on recipes with
.operands_in(vec![fpr, gpr])
in which thatgpr
could be used with a SIB byte.
abrown submitted PR Review.
abrown created PR Review Comment:
So we wouldn't need to check for the third operand (REX.X?) in this case.
julian-seward1 submitted PR Review.
julian-seward1 created PR Review Comment:
Well, so long as you're sure that no third register can participate.
julian-seward1 edited PR Review Comment.
abrown submitted PR Review.
abrown created PR Review Comment:
:grinning_face_with_smiling_eyes: I'm only as sure as the filetests confirm. But, yes, this is how I understand the recipes/size calculator functions to work currently.
julian-seward1 submitted PR Review.
julian-seward1 submitted PR Review.
julian-seward1 submitted PR Review.
iximeow submitted PR Review.
iximeow created PR Review Comment:
The recipes that might call this encoding function (here) takes an fpr and a gpr as inputs, so the second argument is always just a register. No memory operand with a third register here!
iximeow edited PR Review Comment.
iximeow submitted PR Review.
abrown merged PR #1456.
Last updated: Jan 24 2025 at 00:11 UTC