jlb6740 commented on Issue #1665:
@julian-seward1
Hi .. this current snapshot isn't much different than you saw earlier but it has been rebased against the latest merge which which forced a few updates to my beginnings.I am testing the example:
RUST_BACKTRACE=1 RUST_LOG=debug cargo run -- wasm --target x86_64 --set use_new_backend=1 -dvTD ben_simple.wat
github-actions[bot] commented on Issue #1665:
Subscribe to Label Action
cc @bnjbvr
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:meta"Thus the following users have been cc'd because of the following labels:
- bnjbvr: cranelift
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
julian-seward1 commented on Issue #1665:
I should maybe add .. basically good, it looks like it's moving along the right lines.
julian-seward1 commented on Issue #1665:
@jlb6740 I realised just now that you won't be able to use the
emit_REX_OPCODES_MODRM_SIB_IMM_*G_*Eandemit_REX_OPCODES_MODRM_*G_*Efamily of functions to do emission of XMM insns without some (minor) modifications.Specifically, these are currently only set up to produce a legacy prefix of (none) or 0x66. But in your case, you now need to be able to generate the legacy prefixes
(none) or
66orF2orF3(working from memory here). I suggest the following:
delete
const F_PREFIX_66(it's a kludge)create a new enum
enum LegacyPrefix = { PfxNone, Pfx66, PfxF2, PfxF3 }add a new argument
prefix: LegacyPrefixtoemit_REX_OPCODES_*, in betweensinkandopcodesat the place where the 66 byte would previously have been emitted (
sink.put1(0x66)), instead inspectprefixand emit the relevant bytechange all the places that specify
F_PREFIX_66to passPfx66instead. There aren't many.Note that the Intel docs are a bit confusing about this. In particular, the REX byte is silently omitted from the description. For example, for
CVTPI2PD, the encoding is shown in the docs as66 0F 2A /r, but what that actually means is66 <RexByte if necessary> 0F 2A /r.0Fis the first opcode byte and2Ais the second.66isn't an opcode byte but rather a prefix. The fact that the Rex byte -- if needed -- goes between them explains why we couldn't just pass0x660F2Aas theopcodesarg toemit_REX_OPCODES_*here. Instead we would have to pass0x0F2AforopcodesandPfx66for the proposed newprefixargument.
jlb6740 commented on Issue #1665:
Hi @julian-seward1 this is the 3rd update ... overwrote what was there yesterday. Is the proper way to push an insns similar to this in emit_test.rs:
insns.push((
Inst::mov64_m_r(Addr::imm_reg(0, rax), w_rdi),
"488B38",
"movq 0(%rax), %rdi",
));
Note, I modified the expected encoding here and from "488B38" to "488B37" and all tests still passed when running "cargo test -- --nocapture". I take it I did not run the test correctly or could this indicate some issue in testing?
julian-seward1 commented on Issue #1665:
@jlb6740 thanks for the respin! I'll have a look later today.
That test should fail if you changed the expected output, so I suspect it isn't getting run. Maybe try
cargo test --all? Or(cd cranelift && cargo test --all)? I definitely had them running and failing before.
jlb6740 commented on Issue #1665:
I guess RM_R_Op should then be changed to INT_RM_R_Op
Errr, this is tricky, for reasons to do with copy coalescing in regalloc. At least for now, I'd prefer to have an Xmm-to-Xmm move instruction that operates only on the entire register. We can come back and refine this later. So you'd drop the Scalar bit here, giving SSE_Mov_R_R. And correspondingly emit whatever Intel recommends for Xmm-to-Xmm moves .. is it movapd reg, reg perhaps?
@julian-seward1 There are a couple of comments particularly around naming that I partially did or held off completely but you may prefer still need to be done. For example, I think the Intel manual describes cpuid features flags SSE, AVX, AVX512, etc that I think make good prefixes (we use SSE in this pull request to form SSE_RM_R_Op), but there is no corresponding INT cpuid requirement for the current RM_R_Op instructions.
Last updated: Dec 13 2025 at 19:03 UTC