iximeow commented on Issue #1216:
The CI failure is due to tripping this check that we wouldn't misreport unwind information in cases where we have callee-save floating-point registers. I tried reproducing this locally by building the module in
multi_func_value/func.wast
byclif-util wasm funct_module.wast --target x86_64-windows -D
, but for some reason that seems to build fine, using a significantly smaller amount of stack space? I ranclif-util
from Linux, but the only thing that should be a factor here is the target specification. I'm not sure why it doesn't reproduce this way. Is there a flag I should set that influences lowering to clif as well?Anyway, I printed the faulting prologue in CI to double-check that the panic is correctly hit, and indeed a function in the module uses well over the 240-byte limit I've noted above:
x86_push.i64 v105 offset 2: copy_special %rsp -> %rbp offset 5: x86_push.i64 v106 offset 7: x86_push.i64 v107 offset 9: x86_push.i64 v108 offset 11: x86_push.i64 v109 offset 13: x86_push.i64 v110 offset 15: x86_push.i64 v111 offset 17: adjust_sp_down_imm 336 offset 24: v112 = stack_addr.i64 ss18 offset 32: store.f64 notrap aligned v113, v112 offset 37: store.f64 notrap aligned v114, v112+16 offset 43: store.f64 notrap aligned v115, v112+32Since this function uses callee-save floats, we miscompile it on Windows today. Should I disable this test on Windows? (not sure if there's a mechanism for this in spectests, honestly) It would be fixed in the same PR that adjusts stack layout to not be constrained to 240 byte frames when using callee-save xmm registers.
iximeow edited a comment on Issue #1216:
The CI failure is due to tripping this check that we wouldn't misreport unwind information in cases where we have callee-save floating-point registers. I tried reproducing this locally by building the module in
multi_func_value/func.wast
byclif-util wasm funct_module.wast --target x86_64-windows -D
, but for some reason that seems to build fine, using a significantly smaller amount of stack space? I ranclif-util
from Linux, but the only thing that should be a factor here is the target specification. I'm not sure why it doesn't reproduce this way. Is there a flag I should set that influences lowering to clif as well?Anyway, I printed the faulting prologue in CI to double-check that the panic is correctly hit, and indeed a function in the module uses well over the 240-byte limit I've noted above:
x86_push.i64 v105 offset 2: copy_special %rsp -> %rbp offset 5: x86_push.i64 v106 offset 7: x86_push.i64 v107 offset 9: x86_push.i64 v108 offset 11: x86_push.i64 v109 offset 13: x86_push.i64 v110 offset 15: x86_push.i64 v111 offset 17: adjust_sp_down_imm 336 offset 24: v112 = stack_addr.i64 ss18 offset 32: store.f64 notrap aligned v113, v112 offset 37: store.f64 notrap aligned v114, v112+16 offset 43: store.f64 notrap aligned v115, v112+32Since this function uses callee-save floats, we miscompile it on Windows today. Should I disable this test on Windows? (not sure if there's a mechanism for this in spectests, honestly) It would be fixed in the same PR that adjusts stack layout to not be constrained to 240 byte frames when using callee-save xmm registers.
Edit: this panic is only reached if you want a fastcall function with all of:
- a >240 byte stack frame
- callee-save floats
- unwind information
This PR fixes callee-save float preservation in fastcall functions generally. Asking for unwind information causes the panic. I'm not sure how easily Windows targets can get the former without the latter.
github-actions[bot] commented on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "c", "r", "a", "n", "e", "l", "i", "f", "t"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
github-actions[bot] commented on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "c", "r", "a", "n", "e", "l", "i", "f", "t"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
peterhuene deleted a comment on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "c", "r", "a", "n", "e", "l", "i", "f", "t"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
peterhuene deleted a comment on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "c", "r", "a", "n", "e", "l", "i", "f", "t"
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
iximeow commented on Issue #1216:
Nudging this again as I think this is as good as this fix can get without getting into the weeds on stack layout details.
I have three(ish) issues to open assuming a :+1: :
- [ ] Use of an
F64
type instead of the more accurateF64x2
for callee-save FPRs, which _works_ but just isn't fully expressive of intent. See here- [ ] Fix how Cranelift lays out Fastcall stacks to avoid the 240-byte stack frame limit when using unwind information and callee-save XMM registers. See here for conversation and here for a test I've had to disable as a consequence.
- [ ] Might not be worth a tracking issue, but there are two TODO's contingent on an encoding existing that would result in changes here after this lands.
github-actions[bot] commented on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "cranelift"
<details> <summary>Users Subscribed to "cranelift"</summary>
- @bnjbvr
</details>
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
iximeow commented on Issue #1216:
Pardon the rebase, just want to get this on a more recent
master
so as to pick up a fix for nightly builds failing in CI.
bjorn3 commented on Issue #1216:
@iximeow I think you forgot to push :)
iximeow commented on Issue #1216:
@bjorn3 even worse, I'd pushed to the wrong remote. Thanks for the heads up!
peterhuene commented on Issue #1216:
It looks like there's at least a few more tests to disable due to the frame size limitation this adds to functions using Windows x64 calling convention.
iximeow commented on Issue #1216:
it passes :tada:
GitHub appears to believe that
3be2d1c
doesn't exist, so I pushed an empty commit (ea9f77a
) to fool CI into running after yesterday's availability issues. After ignoring more tests, it's green again 🥴
peterhuene commented on Issue #1216:
@iximeow I'll review this soon and if everyone signs off, I think we should merge this before my unwind refactoring changes. I can easily update that PR to integrate this PR's changes to save the FPRs in the windows unwind info.
bjorn3 commented on Issue #1216:
We probably want @bjorn3 or another core cranelift maintainer to sign off as well.
I am actually not a maintainer. :)
peterhuene commented on Issue #1216:
:scream: my apologies, although your review is obviously always appreciated :) I'm not actually a core cranelift maintainer either, so I'm more comfortable when others sign off on bits I haven't been directly involved in.
bjorn3 commented on Issue #1216:
:scream: my apologies, although your review is obviously always appreciated :)
You are not the first to think I am a maintainer of a project. :) I often review PR's on projects I am (slightly) interested in. And sometimes the PR author assumes I am a maintainer.
peterhuene commented on Issue #1216:
I'll ping @bnjbvr for the core maintainer sign-off, then.
iximeow edited a comment on Issue #1216:
Nudging this again as I think this is as good as this fix can get without getting into the weeds on stack layout details.
I have three(ish) issues to open assuming a :+1: :
- [ ] Use of an
F64
type instead of the more accurateF64x2
for callee-save FPRs, which _works_ but just isn't fully expressive of intent. See here- [x] Fix how Cranelift lays out Fastcall stacks to avoid the 240-byte stack frame limit when using unwind information and callee-save XMM registers. See here for conversation and here for a test I've had to disable as a consequence.
- [ ] Might not be worth a tracking issue, but there are two TODO's contingent on an encoding existing that would result in changes here after this lands.
github-actions[bot] commented on Issue #1216:
Subscribe to Label Action
This issue or pull request has been labeled: "wasmtime:api"
<details> <summary>Users Subscribed to "wasmtime:api"</summary>
- @peterhuene
</details>
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
iximeow commented on Issue #1216:
I had to force push to rebase and fix a merge conflict that crept in. Unwind-producing functions are
Result
-y, which of course resulted in some Wasmtime changes as well. Given the wider scope, I'd like one last +1 before clicking the tempting green button :)
iximeow edited a comment on Issue #1216:
Nudging this again as I think this is as good as this fix can get without getting into the weeds on stack layout details.
I have three(ish) issues to open assuming a :+1: :
- [x] Use of an
F64
type instead of the more accurateF64x2
for callee-save FPRs, which _works_ but just isn't fully expressive of intent. See here- [x] Fix how Cranelift lays out Fastcall stacks to avoid the 240-byte stack frame limit when using unwind information and callee-save XMM registers. See here for conversation and here for a test I've had to disable as a consequence.
- [ ] Might not be worth a tracking issue, but there are two TODO's contingent on an encoding existing that would result in changes here after this lands.
iximeow edited a comment on Issue #1216:
Nudging this again as I think this is as good as this fix can get without getting into the weeds on stack layout details.
I have three(ish) issues to open assuming a :+1: :
- [x] Use of an
F64
type instead of the more accurateF64x2
for callee-save FPRs, which _works_ but just isn't fully expressive of intent. See here- [x] Fix how Cranelift lays out Fastcall stacks to avoid the 240-byte stack frame limit when using unwind information and callee-save XMM registers. See here for conversation and here for a test I've had to disable as a consequence.
- [x] Might not be worth a tracking issue, but there are two TODO's contingent on an encoding existing that would result in changes here after this lands.
peterhuene commented on Issue #1216:
:tada: thanks for fixing this @iximeow! :tada: I realize it was a slog of a PR.
Now I get to merge these changes in to my refactoring PR and we'll see about prioritizing the fixing of the stack layout of the register saves :smile:
hrydgard commented on Issue #1216:
Awesome, thanks for getting this done!
Last updated: Nov 22 2024 at 16:03 UTC