jeffcharles requested elliottt for a review on PR #7590.
jeffcharles requested wasmtime-compiler-reviewers for a review on PR #7590.
jeffcharles opened PR #7590 from jeffcharles:winch-br-if-stack-cleanup
to bytecodealliance:main
:
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
I discovered that executing the following module compiled with Winch causes a segmentation fault:(module (func (export "") call 1 call 1 br_if 0 drop ) (func (;1;) (result i32) i32.const 1 ) )
The cause appears to be that the code that would increment
rsp
to balance the machine stack is jumped over when thebr_if
path is taken.What appears to work is, if the frame's
sp_offset
is different than the currentsp_offset
, we invert the existing jump and execute instructions to balance the stack before jumping to the frame's label, and reset thesp_offset
before continuing with the fallthrough code (falling through in terms of thebr_if
, not the machine code jump). In the case where there is no difference betweensp_offset
s, we can use the existingjne
machine code emission since we don't need to run any stack balancing instructions.In terms of machine instructions, we'd be changing
sub rsp, 4 test <reg> <reg> jne end add rsp, 4 end: add rsp, 8 pop rbp ret
to
sub rsp, 4 test <reg> <reg> jeq fallthrough add rsp, 4 jmp end fallthrough: add rsp, 4 end: add rsp, 8 pop rbp ret
jeffcharles submitted PR review.
jeffcharles created PR review comment:
The old file test instructions resulted in an imbalanced stack, these new instructions are balanced. I suspect visiting the
unreachable
instruction may have put Winch into a state where it did not validate thesp_offset
at the end was equal to the locals size.Old:
;; 0: 55 push rbp ;; 1: 4889e5 mov rbp, rsp ;; 4: 4883ec10 sub rsp, 0x10 (offset += 16, 16) ;; 8: 897c240c mov dword ptr [rsp + 0xc], edi ;; c: 4c893424 mov qword ptr [rsp], r14 ;; 10: 448b5c240c mov r11d, dword ptr [rsp + 0xc] ;; 15: 4883ec04 sub rsp, 4 (offset += 4, 20) ;; 19: 44891c24 mov dword ptr [rsp], r11d ;; 1d: 4883ec0c sub rsp, 0xc (offset += 12, 32) ;; 21: bf01000000 mov edi, 1 ;; 26: e800000000 call 0x2b ;; 2b: 4883c40c add rsp, 0xc (offset -= 12, 20) ;; 2f: 4883ec04 sub rsp, 4 (offset += 4, 24) ;; 33: 890424 mov dword ptr [rsp], eax ;; 36: 4883ec08 sub rsp, 8 (offset += 8, 32) ;; 3a: bf01000000 mov edi, 1 ;; 3f: e800000000 call 0x44 ;; 44: 4883c408 add rsp, 8 (offset -= 8, 24) ;; 48: 4883ec04 sub rsp, 4 (offset += 4, 28) ;; 4c: 890424 mov dword ptr [rsp], eax ;; 4f: 8b0c24 mov ecx, dword ptr [rsp] ;; 52: 4883c404 add rsp, 4 (offset -= 4, 24) ;; 56: 8b0424 mov eax, dword ptr [rsp] ;; 59: 4883c404 add rsp, 4 (offset -= 4, 20) ;; 5d: 85c9 test ecx, ecx ;; 5f: 0f8502000000 jne 0x67 ;; 65: 0f0b ud2 ;; 67: 4883c410 add rsp, 0x10 (offset -= 16, 4) ;; 6b: 5d pop rbp ;; 6c: c3 ret
New:
;; 0: 55 push rbp ;; 1: 4889e5 mov rbp, rsp ;; 4: 4883ec10 sub rsp, 0x10 (offset += 16, 16) ;; 8: 897c240c mov dword ptr [rsp + 0xc], edi ;; c: 4c893424 mov qword ptr [rsp], r14 ;; 10: 448b5c240c mov r11d, dword ptr [rsp + 0xc] ;; 15: 4883ec04 sub rsp, 4 (offset += 4, 20) ;; 19: 44891c24 mov dword ptr [rsp], r11d ;; 1d: 4883ec0c sub rsp, 0xc (offset += 12, 32) ;; 21: bf01000000 mov edi, 1 ;; 26: e800000000 call 0x2b ;; 2b: 4883c40c add rsp, 0xc (offset -= 12, 20) ;; 2f: 4883ec04 sub rsp, 4 (offset += 4, 24) ;; 33: 890424 mov dword ptr [rsp], eax ;; 36: 4883ec08 sub rsp, 8 (offset += 8, 32) ;; 3a: bf01000000 mov edi, 1 ;; 3f: e800000000 call 0x44 ;; 44: 4883c408 add rsp, 8 (offset -= 8, 24) ;; 48: 4883ec04 sub rsp, 4 (offset += 4, 28) ;; 4c: 890424 mov dword ptr [rsp], eax ;; 4f: 8b0c24 mov ecx, dword ptr [rsp] ;; 52: 4883c404 add rsp, 4 (offset -= 4, 24) ;; 56: 8b0424 mov eax, dword ptr [rsp] ;; 59: 4883c404 add rsp, 4 (offset -= 4, 20) ;; 5d: 85c9 test ecx, ecx ;; 5f: 0f8409000000 je 0x6e ;; 65: 4883c404 add rsp, 4 (offset -= 4, 16) ;; 69: e902000000 jmp 0x70 ;; 6e: 0f0b ud2 ;; 70: 4883c410 add rsp, 0x10 (offset -= 16, 0) ;; 74: 5d pop rbp ;; 75: c3 ret
jeffcharles updated PR #7590.
saulecabrera submitted PR review.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Minor note here: instead of checking for equality, can we check for
current_sp_offset > frame_sp_offset
? We have checks everywhere else that guarantee that the current sp should never be less than the target SP; doing so will also align with howensure_sp_for_jump
is built and it also conveys a clearer message that the stack will be manipulated.
saulecabrera created PR review comment:
It seems that we can avoid some repetition here, as only the label and the
CmpKind
are different for the branch instruction:let (label, cmp, needs_cleanup) = if current_sp_offset > frame_sp_offset { (masm.get_label(), IntCmpKind::Eq, true) } else { (•frame.label(), IntCmpKind::Ne, false) }; self.masm.branch(...); if needs_cleanup { .. }
saulecabrera edited PR review comment.
jeffcharles submitted PR review.
jeffcharles created PR review comment:
Should
SPOffset
implementOrd
? I can call.as_u32()
on both sides, but I _think_ it would be reasonable to be able to compare them directly.
jeffcharles updated PR #7590.
jeffcharles requested saulecabrera for a review on PR #7590.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Yeah, I think it's fine for it to implement
Ord
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
We can do it in a follow-up PR too.
jeffcharles updated PR #7590.
saulecabrera submitted PR review.
saulecabrera merged PR #7590.
Last updated: Jan 24 2025 at 00:11 UTC