Stream: git-wasmtime

Topic: wasmtime / PR #9781 Winch: Implement rem for aarch64


view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:17):

MarinPostma opened PR #9781 from MarinPostma:winch-aarch64-rem to bytecodealliance:main:

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:17):

MarinPostma updated PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:23):

MarinPostma submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:23):

MarinPostma created PR review comment:

@saulecabrera I'm not sure about that, but if we are performing an unsigned remainder operation, then do we really care about sign extending? Maybe it's not necessary for division either? Is there a scenario where we can't just blindly consider the w register as a x register?

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:25):

MarinPostma edited PR #9781:

Implement remainder operation for aarch64 in winch.

one more step in the #8321 series

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:26):

MarinPostma has marked PR #9781 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:26):

MarinPostma requested abrown for a review on PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:26):

MarinPostma requested wasmtime-compiler-reviewers for a review on PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:26):

MarinPostma requested alexcrichton for a review on PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:26):

MarinPostma requested wasmtime-core-reviewers for a review on PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:27):

MarinPostma submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 10 2024 at 23:27):

MarinPostma created PR review comment:

The same reasonning applies to div, I think. We could shave a few instructions in the case of unsigned division too.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 00:58):

abrown requested saulecabrera for a review on PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 02:15):

github-actions[bot] commented on PR #9781:

Subscribe to Label Action

cc @saulecabrera

<details>
This issue or pull request has been labeled: "winch"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:16):

saulecabrera submitted PR review:

Looking good. Left some comments inline.

As a general comment, given that the same invariant as division apply here, could we modify the existing div_rrr so that it can be re-used from rem_rrr?

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:16):

saulecabrera created PR review comment:

Also, the same invariants as signed division apply here, right? i.e., we must still check for overflow.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:16):

saulecabrera created PR review comment:

After https://github.com/bytecodealliance/wasmtime/pull/9762 was merged, I noticed that we were unconditionally emitting a sign-extension here. Instead we need to request the proper extension depending on the division kind. I opened https://github.com/bytecodealliance/wasmtime/pull/9784 to fix this. We should apply the same fix here.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:16):

saulecabrera created PR review comment:

Can we use self.trapz here?

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:16):

saulecabrera created PR review comment:

In practice, nothing is stopping us from passing a w register here, however given that we're dealing with the 64-bit instruction variant, it will take into account the full 64-bits not just the lower 32. In case that the high 32 bits are not zeroed out, this could lead to unexpected behavior. So we need to extend in both cases.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:37):

MarinPostma submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:37):

MarinPostma created PR review comment:

Actually I was expecting it to be the case, but as it turn out, it doesn't seem necessary; the following snippet

 (module
     (func (param i32 i32) (result i32)
          local.get 0
           local.get 1
           i32.rem_u
     )
 )

is compiled by cranelift to:

0000000000000000 <wasm[0]::function[0]>:
       0: d503235f      pacibz
       4: a9bf7bfd      stp     x29, x30, [sp, #-16]!
       8: 910003fd      mov     x29, sp
       c: 2a0403e3      mov     w3, w4
      10: 2a0503e5      mov     w5, w5
      14: b40000c5      cbz     x5, 0x2c <wasm[0]::function[0]+0x2c>
      18: 9ac50868      udiv    x8, x3, x5
      1c: 9b058d02      msub    x2, x8, x5, x3
      20: a8c17bfd      ldp     x29, x30, [sp], #16
      24: d50323df      autibz
      28: d65f03c0      ret
      2c: 0000c11f      udf     #49439

(with rem_s:

       0: d503235f      pacibz
       4: a9bf7bfd      stp     x29, x30, [sp, #-16]!
       8: 910003fd      mov     x29, sp
       c: 52b00004      mov     w4, #-2147483648
      10: 93407c81      sxtw    x1, w4
      14: 92800003      mov     x3, #-1
      18: 9ac30c25      sdiv    x5, x1, x3
      1c: 9b0384a2      msub    x2, x5, x3, x1
      20: a8c17bfd      ldp     x29, x30, [sp], #16
      24: d50323df      autibz
      28: d65f03c0      ret

)

At first I thought this was a bug, but it seems to be consistent across platform:

(module
    (func (export "_start") (result i32)
    (i32.const 0x80000000)
    (i32.const -1)
        i32.rem_s
    )
)

this yields 0 on both aarch64 and x86.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 16:44):

MarinPostma commented on PR #9781:

I was my initial intention to factorize the check, but as it turns out, we don't need to check for overflow for rem. I'm working on supporting i32 division, so we can drop the extension in a subsequent PR too.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 17:08):

MarinPostma updated PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 17:18):

MarinPostma updated PR #9781.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 17:20):

saulecabrera submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 17:20):

saulecabrera created PR review comment:

Ah right, going over the Wasm spec, there's no explicit mention of overflow checks in this case.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 18:48):

saulecabrera submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 11 2024 at 19:05):

saulecabrera merged PR #9781.


Last updated: Dec 23 2024 at 12:05 UTC