jlb6740 commented on Issue #1665:
@julian-seward1
Hi .. this current snapshot isn't much different than you saw earlier but it has been rebased against the latest merge which which forced a few updates to my beginnings.I am testing the example:
RUST_BACKTRACE=1 RUST_LOG=debug cargo run -- wasm --target x86_64 --set use_new_backend=1 -dvTD ben_simple.wat
github-actions[bot] commented on Issue #1665:
Subscribe to Label Action
cc @bnjbvr
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:meta"Thus the following users have been cc'd because of the following labels:
- bnjbvr: cranelift
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
julian-seward1 commented on Issue #1665:
I should maybe add .. basically good, it looks like it's moving along the right lines.
julian-seward1 commented on Issue #1665:
@jlb6740 I realised just now that you won't be able to use the
emit_REX_OPCODES_MODRM_SIB_IMM_*G_*E
andemit_REX_OPCODES_MODRM_*G_*E
family of functions to do emission of XMM insns without some (minor) modifications.Specifically, these are currently only set up to produce a legacy prefix of (none) or 0x66. But in your case, you now need to be able to generate the legacy prefixes
(none) or
66
orF2
orF3
(working from memory here). I suggest the following:
delete
const F_PREFIX_66
(it's a kludge)create a new enum
enum LegacyPrefix = { PfxNone, Pfx66, PfxF2, PfxF3 }
add a new argument
prefix: LegacyPrefix
toemit_REX_OPCODES_*
, in betweensink
andopcodes
at the place where the 66 byte would previously have been emitted (
sink.put1(0x66)
), instead inspectprefix
and emit the relevant bytechange all the places that specify
F_PREFIX_66
to passPfx66
instead. There aren't many.Note that the Intel docs are a bit confusing about this. In particular, the REX byte is silently omitted from the description. For example, for
CVTPI2PD
, the encoding is shown in the docs as66 0F 2A /r
, but what that actually means is66 <RexByte if necessary> 0F 2A /r
.0F
is the first opcode byte and2A
is the second.66
isn't an opcode byte but rather a prefix. The fact that the Rex byte -- if needed -- goes between them explains why we couldn't just pass0x660F2A
as theopcodes
arg toemit_REX_OPCODES_*
here. Instead we would have to pass0x0F2A
foropcodes
andPfx66
for the proposed newprefix
argument.
jlb6740 commented on Issue #1665:
Hi @julian-seward1 this is the 3rd update ... overwrote what was there yesterday. Is the proper way to push an insns similar to this in emit_test.rs:
insns.push((
Inst::mov64_m_r(Addr::imm_reg(0, rax), w_rdi),
"488B38",
"movq 0(%rax), %rdi",
));
Note, I modified the expected encoding here and from "488B38" to "488B37" and all tests still passed when running "cargo test -- --nocapture". I take it I did not run the test correctly or could this indicate some issue in testing?
julian-seward1 commented on Issue #1665:
@jlb6740 thanks for the respin! I'll have a look later today.
That test should fail if you changed the expected output, so I suspect it isn't getting run. Maybe try
cargo test --all
? Or(cd cranelift && cargo test --all)
? I definitely had them running and failing before.
jlb6740 commented on Issue #1665:
I guess RM_R_Op should then be changed to INT_RM_R_Op
Errr, this is tricky, for reasons to do with copy coalescing in regalloc. At least for now, I'd prefer to have an Xmm-to-Xmm move instruction that operates only on the entire register. We can come back and refine this later. So you'd drop the Scalar bit here, giving SSE_Mov_R_R. And correspondingly emit whatever Intel recommends for Xmm-to-Xmm moves .. is it movapd reg, reg perhaps?
@julian-seward1 There are a couple of comments particularly around naming that I partially did or held off completely but you may prefer still need to be done. For example, I think the Intel manual describes cpuid features flags SSE, AVX, AVX512, etc that I think make good prefixes (we use SSE in this pull request to form SSE_RM_R_Op), but there is no corresponding INT cpuid requirement for the current RM_R_Op instructions.
Last updated: Jan 24 2025 at 00:11 UTC