hrydgard opened Issue #1366:
We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.
The following function
callee_saved_gprs
in the codegen is supposed to return a list of the callee-saved registers:However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.
callee_saved_gprs
only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?
hrydgard labeled Issue #1366:
We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.
The following function
callee_saved_gprs
in the codegen is supposed to return a list of the callee-saved registers:However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.
callee_saved_gprs
only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?
hrydgard edited Issue #1366:
We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.
The following function
callee_saved_gprs
in the codegen is supposed to return a list of the callee-saved registers:However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.
callee_saved_gprs
only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?
(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).
hrydgard edited Issue #1366:
We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.
The following function
callee_saved_gprs
in the codegen is supposed to return a list of the callee-saved registers:However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers (with gprs in its name, that kind makes sense). Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.
callee_saved_gprs
only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?
(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).
sstangl commented on Issue #1366:
Windows does indeed require restoring those registers on exit if changed.
IonMonkey implementation for reference: https://searchfox.org/mozilla-central/rev/cbd110d718bc89a499d3f996af24532abbf6f8ea/js/src/jit/x64/Trampoline-x64.cpp#72
hrydgard commented on Issue #1366:
We've done more investigation, and manually saving and restoring the registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.
I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).
Either way, this clearly needs to be addressed!
hrydgard edited a comment on Issue #1366:
We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.
I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).
Either way, this clearly needs to be addressed!
hrydgard edited a comment on Issue #1366:
We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.
I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).
Either way, this clearly needs to be addressed!
For reference, here's our terrifying workaround (call store_xmm before calling into cranelift, restore_xmm afterwards):
// XMM6-15 pub fn store_xmm(temp: &mut [f32; 10 * 4]) { unsafe { asm!(r#" vmovdqu %xmm6, 0($0) vmovdqu %xmm7, 16($0) vmovdqu %xmm8, 32($0) vmovdqu %xmm9, 48($0) vmovdqu %xmm10, 64($0) vmovdqu %xmm11, 80($0) vmovdqu %xmm12, 96($0) vmovdqu %xmm13, 112($0) vmovdqu %xmm14, 128($0) vmovdqu %xmm15, 144($0) "# : : "r"(temp) ); } } pub fn restore_xmm(temp: &[f32; 10 * 4]) { unsafe { asm!(r#" vmovdqu 0($0), %xmm6 vmovdqu 16($0), %xmm7 vmovdqu 32($0), %xmm8 vmovdqu 48($0), %xmm9 vmovdqu 64($0), %xmm10 vmovdqu 80($0), %xmm11 vmovdqu 96($0), %xmm12 vmovdqu 112($0), %xmm13 vmovdqu 128($0), %xmm14 vmovdqu 144($0), %xmm15 "# : : "r"(temp) ); } }
hrydgard edited a comment on Issue #1366:
We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.
I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).
Either way, this clearly needs to be addressed!
For reference, here's our (nightly-only) terrifying workaround (call store_xmm before calling into cranelift, restore_xmm afterwards):
// XMM6-15 pub fn store_xmm(temp: &mut [f32; 10 * 4]) { unsafe { asm!(r#" vmovdqu %xmm6, 0($0) vmovdqu %xmm7, 16($0) vmovdqu %xmm8, 32($0) vmovdqu %xmm9, 48($0) vmovdqu %xmm10, 64($0) vmovdqu %xmm11, 80($0) vmovdqu %xmm12, 96($0) vmovdqu %xmm13, 112($0) vmovdqu %xmm14, 128($0) vmovdqu %xmm15, 144($0) "# : : "r"(temp) ); } } pub fn restore_xmm(temp: &[f32; 10 * 4]) { unsafe { asm!(r#" vmovdqu 0($0), %xmm6 vmovdqu 16($0), %xmm7 vmovdqu 32($0), %xmm8 vmovdqu 48($0), %xmm9 vmovdqu 64($0), %xmm10 vmovdqu 80($0), %xmm11 vmovdqu 96($0), %xmm12 vmovdqu 112($0), %xmm13 vmovdqu 128($0), %xmm14 vmovdqu 144($0), %xmm15 "# : : "r"(temp) ); } }
sstangl commented on Issue #1366:
We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.
Well yes, that's logically equivalent to fixing Cranelift's Microsoft x64 calling convention implementation. Except in the case of Cranelift ever calling out to native code, in which case that boundary would also need to know to save/restore registers.
The workaround isn't terrifying -- that's pretty much what Cranelift needs to do internally, except that Cranelift can know what XMM regs were actually allocated and therefore can skip saving/restoring unused volatile registers.
hrydgard commented on Issue #1366:
Right, though I suspect that the calling-out-to-native-code scenario is not actually an issue, at least not a correctness one, because Cranelift internally doesn't seem to be aware of the concept of callee-save FP/SIMD registers, so they would simply already be making sure that they're all "dead" when calling out.
hrydgard edited a comment on Issue #1366:
Right, though I suspect that the calling-out-to-native-code scenario is not actually an issue, at least not a correctness one, because Cranelift internally doesn't seem to be aware of the concept of callee-saved FP/SIMD registers, so they would simply already be making sure that they're all "dead" when calling out.
hrydgard commented on Issue #1366:
So is anyone from Cranelift going to look at this, or should I just mash together a hacky PR that saves the registers on entrance and restores them on exit, like we did in our code? That way Cranelift can keep pretending that the Windows ABI does not have callee-save FP registers...
hrydgard edited a comment on Issue #1366:
So is anyone from Cranelift going to look at this, or should I just mash together a hacky PR that saves the registers on entrance and restores them on exit, like we did in our surrounding code? That way Cranelift can keep pretending that the Windows ABI does not have callee-save FP registers...
sstangl commented on Issue #1366:
Although I suppose we'd take a hacky-but-correct patch, the real patch shouldn't be much more work.
You'd just need to make an analogue of the
callee_saved_gprs_used()
in fastcall_prologue_epilogue(). Probably get that function to return both GPRs and FPRs used, so that you don't walk the CFG twice.
insert_common_epilogue()
would need to learn about FPRs also.
hrydgard commented on Issue #1366:
Oh, you're actually part of the project, I misunderstood from your tone before that you were just another passersby :) I mean, this is a critical bug that can cause really tricky secondary bugs and could possibly (although not super likely) be a CVE-worthy security issue, I initially kind of expected you guys to pretty much take this and run with it.
Anyway, I don't have the time to go deep here, so a patch from us would likely just be what I said before, just save and restore all the registers on entry/exit. I agree that it's probably not that much work to fix it properly, but I don't know this code base at all - I've written JIT compilers before, things are not always as easy as they first seem...
iximeow commented on Issue #1366:
Having read the MSDN x86_64 calling convention page more than a few times I didn't remember it mentioning XMM registers being nonvolatile before... it turns out even in January of last year they weren't mentioned: so says the page via archive.org. It got added somewhere between then and October. Not a huge fan of this getting a few words added towards the end of one page!!
Regardless, I'm putting together a change to preserve xmm6-xmm15.
alexcrichton transferred Issue #1366:
We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.
The following function
callee_saved_gprs
in the codegen is supposed to return a list of the callee-saved registers:However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers (with gprs in its name, that kind makes sense). Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:
The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.
callee_saved_gprs
only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?
(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).
Last updated: Jan 24 2025 at 00:11 UTC