MarinPostma opened PR #9781 from MarinPostma:winch-aarch64-rem
to bytecodealliance:main
:
- pulley: Remove hardcoded instruction sizes in interpreter (#9769)
- Update
arbitrary
to 1.4.1 (#9550)- Update ADOPTERS.md (#9773)
- Add a guide for adding instructions to Pulley (#9765)
- pulley: Implement more of loads/stores (#9775)
- implement copy/clone for RemKind
- implement remainder operation for aarch64
- add tests
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
MarinPostma updated PR #9781.
MarinPostma submitted PR review.
MarinPostma created PR review comment:
@saulecabrera I'm not sure about that, but if we are performing an unsigned remainder operation, then do we really care about sign extending? Maybe it's not necessary for division either? Is there a scenario where we can't just blindly consider the
w
register as ax
register?
MarinPostma edited PR #9781:
Implement remainder operation for
aarch64
in winch.one more step in the #8321 series
MarinPostma has marked PR #9781 as ready for review.
MarinPostma requested abrown for a review on PR #9781.
MarinPostma requested wasmtime-compiler-reviewers for a review on PR #9781.
MarinPostma requested alexcrichton for a review on PR #9781.
MarinPostma requested wasmtime-core-reviewers for a review on PR #9781.
MarinPostma submitted PR review.
MarinPostma created PR review comment:
The same reasonning applies to div, I think. We could shave a few instructions in the case of unsigned division too.
abrown requested saulecabrera for a review on PR #9781.
github-actions[bot] commented on PR #9781:
Subscribe to Label Action
cc @saulecabrera
<details>
This issue or pull request has been labeled: "winch"Thus the following users have been cc'd because of the following labels:
- saulecabrera: winch
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
saulecabrera submitted PR review:
Looking good. Left some comments inline.
As a general comment, given that the same invariant as division apply here, could we modify the existing
div_rrr
so that it can be re-used fromrem_rrr
?
saulecabrera created PR review comment:
Also, the same invariants as signed division apply here, right? i.e., we must still check for overflow.
saulecabrera created PR review comment:
After https://github.com/bytecodealliance/wasmtime/pull/9762 was merged, I noticed that we were unconditionally emitting a sign-extension here. Instead we need to request the proper extension depending on the division kind. I opened https://github.com/bytecodealliance/wasmtime/pull/9784 to fix this. We should apply the same fix here.
saulecabrera created PR review comment:
Can we use
self.trapz
here?
saulecabrera created PR review comment:
In practice, nothing is stopping us from passing a
w
register here, however given that we're dealing with the 64-bit instruction variant, it will take into account the full 64-bits not just the lower 32. In case that the high 32 bits are not zeroed out, this could lead to unexpected behavior. So we need to extend in both cases.
MarinPostma submitted PR review.
MarinPostma created PR review comment:
Actually I was expecting it to be the case, but as it turn out, it doesn't seem necessary; the following snippet
(module (func (param i32 i32) (result i32) local.get 0 local.get 1 i32.rem_u ) )
is compiled by cranelift to:
0000000000000000 <wasm[0]::function[0]>: 0: d503235f pacibz 4: a9bf7bfd stp x29, x30, [sp, #-16]! 8: 910003fd mov x29, sp c: 2a0403e3 mov w3, w4 10: 2a0503e5 mov w5, w5 14: b40000c5 cbz x5, 0x2c <wasm[0]::function[0]+0x2c> 18: 9ac50868 udiv x8, x3, x5 1c: 9b058d02 msub x2, x8, x5, x3 20: a8c17bfd ldp x29, x30, [sp], #16 24: d50323df autibz 28: d65f03c0 ret 2c: 0000c11f udf #49439
(with
rem_s
:0: d503235f pacibz 4: a9bf7bfd stp x29, x30, [sp, #-16]! 8: 910003fd mov x29, sp c: 52b00004 mov w4, #-2147483648 10: 93407c81 sxtw x1, w4 14: 92800003 mov x3, #-1 18: 9ac30c25 sdiv x5, x1, x3 1c: 9b0384a2 msub x2, x5, x3, x1 20: a8c17bfd ldp x29, x30, [sp], #16 24: d50323df autibz 28: d65f03c0 ret
)
At first I thought this was a bug, but it seems to be consistent across platform:
(module (func (export "_start") (result i32) (i32.const 0x80000000) (i32.const -1) i32.rem_s ) )
this yields
0
on both aarch64 and x86.
MarinPostma commented on PR #9781:
I was my initial intention to factorize the check, but as it turns out, we don't need to check for overflow for
rem
. I'm working on supporting i32 division, so we can drop the extension in a subsequent PR too.
MarinPostma updated PR #9781.
MarinPostma updated PR #9781.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Ah right, going over the Wasm spec, there's no explicit mention of overflow checks in this case.
saulecabrera submitted PR review.
saulecabrera merged PR #9781.
Last updated: Dec 23 2024 at 12:05 UTC