Stream: git-cranelift

Topic: cranelift / Issue #1379 x86: Consistently permit REX pref...


view this post on Zulip GitHub (Feb 07 2020 at 00:09):

iximeow opened Issue #1379:

It looks like our SIMD encodings waver between permitting and not permitting the REX prefix: compare uses of enc_32_64 and enc_both in x86/encoding.rs.

As a result, vector operations using r8-r15 as parts of a memory operand, or xmm8-xmm15 as a register operand, get rejected when searching for encodings and, I believe, result in Cranelift not using those xmm registers entirely in some circumstances (we're missing a rex-friendly MOVUPS encoding so I don't think we can encode a spill of xmm8-xmm15).

I ran across this trying to emit movups xmm7, [r11] when restoring SIMD registers in Windows prologues. This currently results in a constraint violation panic when compiling. I believe this is because I specified the location after inserting the instruction, so Cranelift picked the FPR8/GPR8-constrained encoding, and I just gave it an impossible operand, whereas regalloc would have been unable to use r11 as an operand and forced shuffling until a fitting register were available.

I'm pretty sure we can make the blanket change from enc_32_64 to enc_both for the region I linked above. However, I don't know the maybe_isap variants, and I figure it'd be better to confirm this before changing stuff around.

cc @abrown and @bnjbvr as people who've recently touched these things, I think?

view this post on Zulip GitHub (Feb 07 2020 at 00:23):

abrown commented on Issue #1379:

For some history, I opened #1090 to track this and then tried something in #1149 that I later abandoned. I think you're right about enc_32_64 -> enc_both but I have this nagging feeling that some of the instructions might use a REX bit to distinguish between different instructions (e.g. an 64x2 vs a 32x4)--but I can't recall which have this behavior and which not. Then again, I could be wrong about that but it bears more investigation. I can take a look next week sometime unless you want to do it now?

view this post on Zulip GitHub (Feb 07 2020 at 00:35):

iximeow commented on Issue #1379:

Oh! I tried looking for open issues and missed that one. I'd have bumped that instead if I saw it.

You're right - I don't think any lane choices are made by REX prefix bits but at least cvtsi2sd does change operands:

f2 0f 2a c0:    cvtsi2sd xmm0, eax
f2 4f 0f 2a c0: cvtsi2sd xmm0, r8

I entirely support being careful here :) I just wanted to note the issue lest I forget to mention, if you can take a look next week that's way ahead of when I can.

view this post on Zulip GitHub (Feb 28 2020 at 23:28):

alexcrichton transferred Issue #1379:

It looks like our SIMD encodings waver between permitting and not permitting the REX prefix: compare uses of enc_32_64 and enc_both in x86/encoding.rs.

As a result, vector operations using r8-r15 as parts of a memory operand, or xmm8-xmm15 as a register operand, get rejected when searching for encodings and, I believe, result in Cranelift not using those xmm registers entirely in some circumstances (we're missing a rex-friendly MOVUPS encoding so I don't think we can encode a spill of xmm8-xmm15).

I ran across this trying to emit movups xmm7, [r11] when restoring SIMD registers in Windows prologues. This currently results in a constraint violation panic when compiling. I believe this is because I specified the location after inserting the instruction, so Cranelift picked the FPR8/GPR8-constrained encoding, and I just gave it an impossible operand, whereas regalloc would have been unable to use r11 as an operand and forced shuffling until a fitting register were available.

I'm pretty sure we can make the blanket change from enc_32_64 to enc_both for the region I linked above. However, I don't know the maybe_isap variants, and I figure it'd be better to confirm this before changing stuff around.

cc @abrown and @bnjbvr as people who've recently touched these things, I think?


Last updated: Jan 24 2025 at 00:11 UTC