Stream: git-cranelift

Topic: cranelift / Issue #1366 Windows (Fastcall) calling conven...


view this post on Zulip GitHub (Jan 27 2020 at 15:04):

hrydgard opened Issue #1366:

We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.

The following function callee_saved_gprs in the codegen is supposed to return a list of the callee-saved registers:

https://github.com/bytecodealliance/cranelift/blob/23e9bdb2d99fb8f554793cb87f1578cc543c355f/cranelift-codegen/src/isa/x86/abi.rs#L370

However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.

callee_saved_gprs only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.

I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?

view this post on Zulip GitHub (Jan 27 2020 at 15:04):

hrydgard labeled Issue #1366:

We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.

The following function callee_saved_gprs in the codegen is supposed to return a list of the callee-saved registers:

https://github.com/bytecodealliance/cranelift/blob/23e9bdb2d99fb8f554793cb87f1578cc543c355f/cranelift-codegen/src/isa/x86/abi.rs#L370

However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.

callee_saved_gprs only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.

I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?

view this post on Zulip GitHub (Jan 27 2020 at 15:19):

hrydgard edited Issue #1366:

We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.

The following function callee_saved_gprs in the codegen is supposed to return a list of the callee-saved registers:

https://github.com/bytecodealliance/cranelift/blob/23e9bdb2d99fb8f554793cb87f1578cc543c355f/cranelift-codegen/src/isa/x86/abi.rs#L370

However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers. Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.

callee_saved_gprs only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.

I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?

(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).

view this post on Zulip GitHub (Jan 27 2020 at 15:53):

hrydgard edited Issue #1366:

We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.

The following function callee_saved_gprs in the codegen is supposed to return a list of the callee-saved registers:

https://github.com/bytecodealliance/cranelift/blob/23e9bdb2d99fb8f554793cb87f1578cc543c355f/cranelift-codegen/src/isa/x86/abi.rs#L370

However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers (with gprs in its name, that kind makes sense). Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.

callee_saved_gprs only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.

I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?

(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).

view this post on Zulip GitHub (Jan 28 2020 at 04:20):

sstangl commented on Issue #1366:

Windows does indeed require restoring those registers on exit if changed.

IonMonkey implementation for reference: https://searchfox.org/mozilla-central/rev/cbd110d718bc89a499d3f996af24532abbf6f8ea/js/src/jit/x64/Trampoline-x64.cpp#72

view this post on Zulip GitHub (Jan 28 2020 at 08:37):

hrydgard commented on Issue #1366:

We've done more investigation, and manually saving and restoring the registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.

I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).

Either way, this clearly needs to be addressed!

view this post on Zulip GitHub (Jan 28 2020 at 08:38):

hrydgard edited a comment on Issue #1366:

We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.

I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).

Either way, this clearly needs to be addressed!

view this post on Zulip GitHub (Jan 28 2020 at 08:43):

hrydgard edited a comment on Issue #1366:

We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.

I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).

Either way, this clearly needs to be addressed!

For reference, here's our terrifying workaround (call store_xmm before calling into cranelift, restore_xmm afterwards):

// XMM6-15
pub fn store_xmm(temp: &mut [f32; 10 * 4]) {
    unsafe {
        asm!(r#"
            vmovdqu %xmm6, 0($0)
            vmovdqu %xmm7, 16($0)
            vmovdqu %xmm8, 32($0)
            vmovdqu %xmm9, 48($0)
            vmovdqu %xmm10, 64($0)
            vmovdqu %xmm11, 80($0)
            vmovdqu %xmm12, 96($0)
            vmovdqu %xmm13, 112($0)
            vmovdqu %xmm14, 128($0)
            vmovdqu %xmm15, 144($0)
            "#
            : : "r"(temp) );
    }
}

pub fn restore_xmm(temp: &[f32; 10 * 4]) {
    unsafe {
        asm!(r#"
            vmovdqu 0($0), %xmm6
            vmovdqu 16($0), %xmm7
            vmovdqu 32($0), %xmm8
            vmovdqu 48($0), %xmm9
            vmovdqu 64($0), %xmm10
            vmovdqu 80($0), %xmm11
            vmovdqu 96($0), %xmm12
            vmovdqu 112($0), %xmm13
            vmovdqu 128($0), %xmm14
            vmovdqu 144($0), %xmm15
            "#
            : : "r"(temp) );
    }
}

view this post on Zulip GitHub (Jan 28 2020 at 08:50):

hrydgard edited a comment on Issue #1366:

We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.

I don't know if this is actually a real security issue, since values are leaking from VM to host, and the host is in control of the VM's memory anyway. I guess it's possible that values could leak from host to VM if there are bugs in the register allocator, since nothing seems to wipe the XMM registers when entering the VM (though I haven't looked closely for that).

Either way, this clearly needs to be addressed!

For reference, here's our (nightly-only) terrifying workaround (call store_xmm before calling into cranelift, restore_xmm afterwards):

// XMM6-15
pub fn store_xmm(temp: &mut [f32; 10 * 4]) {
    unsafe {
        asm!(r#"
            vmovdqu %xmm6, 0($0)
            vmovdqu %xmm7, 16($0)
            vmovdqu %xmm8, 32($0)
            vmovdqu %xmm9, 48($0)
            vmovdqu %xmm10, 64($0)
            vmovdqu %xmm11, 80($0)
            vmovdqu %xmm12, 96($0)
            vmovdqu %xmm13, 112($0)
            vmovdqu %xmm14, 128($0)
            vmovdqu %xmm15, 144($0)
            "#
            : : "r"(temp) );
    }
}

pub fn restore_xmm(temp: &[f32; 10 * 4]) {
    unsafe {
        asm!(r#"
            vmovdqu 0($0), %xmm6
            vmovdqu 16($0), %xmm7
            vmovdqu 32($0), %xmm8
            vmovdqu 48($0), %xmm9
            vmovdqu 64($0), %xmm10
            vmovdqu 80($0), %xmm11
            vmovdqu 96($0), %xmm12
            vmovdqu 112($0), %xmm13
            vmovdqu 128($0), %xmm14
            vmovdqu 144($0), %xmm15
            "#
            : : "r"(temp) );
    }
}

view this post on Zulip GitHub (Jan 28 2020 at 15:11):

sstangl commented on Issue #1366:

We've done more investigation, and manually saving and restoring the XMM registers around calls into Cranelift using some inline assembly fixes our variable corruption issues.

Well yes, that's logically equivalent to fixing Cranelift's Microsoft x64 calling convention implementation. Except in the case of Cranelift ever calling out to native code, in which case that boundary would also need to know to save/restore registers.

The workaround isn't terrifying -- that's pretty much what Cranelift needs to do internally, except that Cranelift can know what XMM regs were actually allocated and therefore can skip saving/restoring unused volatile registers.

view this post on Zulip GitHub (Jan 28 2020 at 15:22):

hrydgard commented on Issue #1366:

Right, though I suspect that the calling-out-to-native-code scenario is not actually an issue, at least not a correctness one, because Cranelift internally doesn't seem to be aware of the concept of callee-save FP/SIMD registers, so they would simply already be making sure that they're all "dead" when calling out.

view this post on Zulip GitHub (Jan 28 2020 at 15:22):

hrydgard edited a comment on Issue #1366:

Right, though I suspect that the calling-out-to-native-code scenario is not actually an issue, at least not a correctness one, because Cranelift internally doesn't seem to be aware of the concept of callee-saved FP/SIMD registers, so they would simply already be making sure that they're all "dead" when calling out.

view this post on Zulip GitHub (Feb 05 2020 at 15:22):

hrydgard commented on Issue #1366:

So is anyone from Cranelift going to look at this, or should I just mash together a hacky PR that saves the registers on entrance and restores them on exit, like we did in our code? That way Cranelift can keep pretending that the Windows ABI does not have callee-save FP registers...

view this post on Zulip GitHub (Feb 05 2020 at 15:22):

hrydgard edited a comment on Issue #1366:

So is anyone from Cranelift going to look at this, or should I just mash together a hacky PR that saves the registers on entrance and restores them on exit, like we did in our surrounding code? That way Cranelift can keep pretending that the Windows ABI does not have callee-save FP registers...

view this post on Zulip GitHub (Feb 05 2020 at 15:59):

sstangl commented on Issue #1366:

Although I suppose we'd take a hacky-but-correct patch, the real patch shouldn't be much more work.

You'd just need to make an analogue of the callee_saved_gprs_used() in fastcall_prologue_epilogue(). Probably get that function to return both GPRs and FPRs used, so that you don't walk the CFG twice.

insert_common_epilogue() would need to learn about FPRs also.

view this post on Zulip GitHub (Feb 05 2020 at 22:39):

hrydgard commented on Issue #1366:

Oh, you're actually part of the project, I misunderstood from your tone before that you were just another passersby :) I mean, this is a critical bug that can cause really tricky secondary bugs and could possibly (although not super likely) be a CVE-worthy security issue, I initially kind of expected you guys to pretty much take this and run with it.

Anyway, I don't have the time to go deep here, so a patch from us would likely just be what I said before, just save and restore all the registers on entry/exit. I agree that it's probably not that much work to fix it properly, but I don't know this code base at all - I've written JIT compilers before, things are not always as easy as they first seem...

view this post on Zulip GitHub (Feb 06 2020 at 00:22):

iximeow commented on Issue #1366:

Having read the MSDN x86_64 calling convention page more than a few times I didn't remember it mentioning XMM registers being nonvolatile before... it turns out even in January of last year they weren't mentioned: so says the page via archive.org. It got added somewhere between then and October. Not a huge fan of this getting a few words added towards the end of one page!!

Regardless, I'm putting together a change to preserve xmm6-xmm15.

view this post on Zulip GitHub (Feb 28 2020 at 23:28):

alexcrichton transferred Issue #1366:

We're having issues, only on Windows, with floating point variables becoming corrupt across calls into Cranelift from Wasmer.

The following function callee_saved_gprs in the codegen is supposed to return a list of the callee-saved registers:

https://github.com/bytecodealliance/cranelift/blob/23e9bdb2d99fb8f554793cb87f1578cc543c355f/cranelift-codegen/src/isa/x86/abi.rs#L370

However, it, nor its callers, seem to concern themselves with saving the floating point callee-saved registers (with gprs in its name, that kind makes sense). Quoth https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019#callercallee-saved-registers:

The registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-15 are considered nonvolatile and must be saved and restored by a function that uses them.

callee_saved_gprs only lists RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, ignoring XMM6-15.

I believe this may be the cause of our troubles, it definitely looks wrong. What do you think?

(If the reserved stack spaced for this would be prohibitive and code is generated assuming no callee saved registers, things could still work internally as they do now, but you'd still need to save XMM6-15 on entering Cranelift and restore them when exiting Cranelift).


Last updated: Dec 23 2024 at 13:07 UTC