wasmtime / PR #5118 cranelift: Mask high bits on `bmask` ... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #5118 cranelift: Mask high bits on `bmask` ...

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:14):

afonso360 opened PR #5118 from fix-bmask to main:

:wave: Hey,

The fuzzer pointed this out when I added bmask, it turns out we are not correctly masking off high bits in our bmask implementations. This is not triggered by our current tests since we always either use a iconst or pass the value in via a register argument, which never has "dirty" high bits.

The changes on AArch64 are fairly minimal and they only mostly impact i8 and i16.

However for RISC-V masking off 32 bits ends up generating a lot of code. I haven't checked but I think on RISC-V this also impacts brz and brnz since they also use normalize_value.

cc: @elliottt

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:15):

afonso360 created PR review comment:

I'm not too familiar with RISCV, is there a better way of masking off the top 32bits? This ends up generating a lot of code, which is unfortunate.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:15):

afonso360 submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:15):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:16):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:25):

afonso360 updated PR #5118 from fix-bmask to main.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:27):

afonso360 submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:27):

afonso360 created PR review comment:

For reference, this generates the following assembly:
;   auipc t2,0
;   ld t2,12(t2)
;   j 12
;   .8byte 0xffffffff
;   and a4,a0,t2
The only other solution I can think of right now would be to shift left 32 and then shift right 32, which would only be 2 instructions.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:27):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:44):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 12:48):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 14:18):

afonso360 updated PR #5118 from fix-bmask to main.

Wasmtime GitHub notifications bot (Oct 25 2022 at 14:20):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 25 2022 at 16:41):

elliottt submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 16:41):

elliottt created PR review comment:

Playing around with clang's output on godbolt shows that it does this conversion with addiw rd, rs, 0 (or the sext.w pseudo instruction). That will perform a 32-bit add of 0 to rs, and then sign-extend the value to 64-bits. As we do appear to be using srliw for a 32-bit right shift, I think this should work?

Wasmtime GitHub notifications bot (Oct 25 2022 at 17:03):

afonso360 submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 17:03):

afonso360 created PR review comment:

For normalize_value I think that is not the correct answer since it always guarantees that the top 32 bits will be clear, which may not always be the case for sext.w. However I think we can use that for bmask/brz/brnz and we get back to a 1 instruction solution which is nice!

Wasmtime GitHub notifications bot (Oct 25 2022 at 17:15):

elliottt submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 17:15):

elliottt created PR review comment:

normalize_value might be overly conservative as it is now. Switching to addiw rd, rs, 0 for the 32-bit case didn't produce any test failures for me locally, and I suspect that we don't even need that instruction.

Wasmtime GitHub notifications bot (Oct 25 2022 at 18:49):

afonso360 updated PR #5118 from fix-bmask to main.

Wasmtime GitHub notifications bot (Oct 25 2022 at 19:21):

afonso360 created PR review comment:

I've updated it to use addiw and changed the meaning of normalize_value so that it fits the new implementation. Thanks!

I suspect that we don't even need that instruction.
That would be even better, if you have any ideas about this let me know!

Wasmtime GitHub notifications bot (Oct 25 2022 at 19:21):

afonso360 submitted PR review.

Wasmtime GitHub notifications bot (Oct 25 2022 at 20:02):

afonso360 edited PR review comment.

Wasmtime GitHub notifications bot (Oct 26 2022 at 18:10):

elliottt submitted PR review.

Wasmtime GitHub notifications bot (Oct 26 2022 at 18:10):

elliottt submitted PR review.

Wasmtime GitHub notifications bot (Oct 26 2022 at 18:10):

elliottt created PR review comment:

It would probably be worth making this rule priority 0, and the one that handles the <= 16 case higher than the others; we'll test the arguments left-to-right, and fits_in_64 will be true for the 16-bit and smaller cases.

Wasmtime GitHub notifications bot (Oct 26 2022 at 18:10):

elliottt created PR review comment:

Thanks for updating this comment!

Wasmtime GitHub notifications bot (Oct 27 2022 at 11:42):

afonso360 updated PR #5118 from fix-bmask to main.

Wasmtime GitHub notifications bot (Oct 27 2022 at 11:45):

afonso360 updated PR #5118 from fix-bmask to main.

Wasmtime GitHub notifications bot (Oct 27 2022 at 16:45):

elliottt merged PR #5118.

Last updated: Apr 18 2025 at 10:03 UTC