Stream: git-wasmtime

Topic: wasmtime / PR #9088 pulley: pack `dst`, `src1` and `src2`...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:03):

Kmeakin opened PR #9088 from Kmeakin:km/pulley-packed-binary-operands to bytecodealliance:main:

Pack the dst, src1 and src2 register operands of binary operators into 2 bytes. Solves the TODO at the top of lib.rs

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:03):

Kmeakin requested wasmtime-default-reviewers for a review on PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:03):

Kmeakin requested cfallin for a review on PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:22):

cfallin requested fitzgen for a review on PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:22):

cfallin commented on PR #9088:

Reassigned @fitzgen to review, I'm not sure how this intersects with your other plans to add instructions with large offsets etc; happy to review this myself if you'd like otherwise.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:44):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:45):

Kmeakin edited PR #9088:

Pack the dst, src1 and src2 register operands of binary operators into 2 bytes. Solves the TODO at the top of lib.rs.

This required moving the special registers into their own register class, and adding smov and spush/spop to interact with SP/FP.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 21:49):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen submitted PR review:

Thanks! BinaryOperands is exactly what I was imagining!

I'd prefer holding off on merging this until I can land the Cranelift backend for Pulley though, just so that I don't have to chase instruction set updates, and we can start evolving the backend and the instruction set together and keep them in sync.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen submitted PR review:

Thanks! BinaryOperands is exactly what I was imagining!

I'd prefer holding off on merging this until I can land the Cranelift backend for Pulley though, just so that I don't have to chase instruction set updates, and we can start evolving the backend and the instruction set together and keep them in sync.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen created PR review comment:

I was indeed thinking that the special registers would only be accessed from special instructions, but I think we can do even more here by bundling the whole function prologue and epilogue up into instructions that just take the size to decrement the stack frame by:

enter_frame 16
...
exit_frame 16
ret

where

enter_frame N

==>

push lr
push fp
fp = sp
sp = sp - N

and the opposite for exit_frame N.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen created PR review comment:

Why all these changes? Seems unrelated?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen created PR review comment:

Could we split this out into its own PR?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 07 2024 at 22:59):

fitzgen created PR review comment:

We shouldn't need this, or any other general instructions for the special registers. They should only be manipulated by instructions that are doing other stuff as well, eg call will set lr. But these registers aren't allocated by our register allocator and are reserved for their special purpose, so we shouldn't need to have general instructions for them.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2024 at 02:04):

github-actions[bot] commented on PR #9088:

Subscribe to Label Action

cc @fitzgen

<details>
This issue or pull request has been labeled: "pulley"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:05):

Kmeakin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:05):

Kmeakin created PR review comment:

Encodes the invariant that registers are always in the range 0..32, so indexing into the registers array can never fail. The other option would be to use the #[rustc_layout_scalar_valid_range_end(31)] attribute, but that is nightly-only.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:24):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:24):

fitzgen created PR review comment:

Are you saying we get better codegen with enums here? That's neat, I wouldn't have expected the LLVM IR that rustc emits to change.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:40):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:44):

Kmeakin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:44):

Kmeakin created PR review comment:

https://godbolt.org/z/jPrq5bc45

rustc emits calls to assume for types that have niches, and starting from LLVM 19 function parameters can have range metadata attached directly

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:48):

Kmeakin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:48):

Kmeakin created PR review comment:

I'm a bit confused about the purpose of the N to enter_frame/exit_frame. Each push/pop will itself automatically adjust the sp by size_of<*u8>(). So in the case of no other registers being stored on the stack, the argument will be zero. In the case where other registers need to be saved on the stack, xpush{32,64} and xpop{32,64} will also adjust the sp by 4/8 bytes. So in either case the argument is not needed? I don't think there is any scenario where you would need to adjust the stack pointer without also reading/writing from/to the stack

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 16:49):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 17:00):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 17:10):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 17:13):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 17:13):

fitzgen created PR review comment:

I'm a bit confused about the purpose of the N to enter_frame/exit_frame.

To allocate the stack frame for this function, including all of its regalloc spills and all that.

That said, the way that Cranelift is organized and the abstraction boundaries it has might make doing the full function prologue in a single instruction difficult. We might have to settle for two instructions:

  1. a macro instruction for push lr; push fp; fp = sp (and the opposite for exiting the function, maybe just fold this into ret)
  2. an instruction for allocating and deallocating stack space: sp = sp +/- imm32

See gen_prologue_frame_setup, gen_epilogue_frame_restore, and gen_sp_reg_adjust in ../pulley_shared/abi.rs in https://github.com/bytecodealliance/wasmtime/pull/9089 (or the equivalent methods in, e.g., ../aarch64/abi.rs). I'm thinking that each of these should produce a single macro instruction because they otherwise always generate the exact same sequence of instructions modulo immediate constant values.

The benefit of a single macro instruction, compared to a sequence of composable and more general instructions, is performance. The expensive part of an interpreter is roundtripping through the opcode loop, by using macro instructions we are amortizing that cost.

Does that make sense?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 18:05):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 19:35):

Kmeakin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 19:35):

Kmeakin created PR review comment:

Decided the extra argument was not needed after all:
https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Pulley.20registers/near/459533622

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 21:33):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2024 at 21:37):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2024 at 22:12):

fitzgen commented on PR #9088:

Heads up: the Pulley backend for Cranelift landed in #9089 and I think this can be rebased and landed now

view this post on Zulip Wasmtime GitHub notifications bot (Aug 16 2024 at 22:31):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 13:07):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 14:25):

Kmeakin requested abrown for a review on PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 14:25):

Kmeakin requested wasmtime-compiler-reviewers for a review on PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 14:25):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 14:27):

Kmeakin edited PR #9088:

Pack the dst, src1 and src2 register operands of binary operators into 2 bytes. Solves the TODO at the top of lib.rs.

~This required moving the special registers into their own register class, and adding smov and spush/spop to interact with SP/FP.~
I held off on splitting special registers into their own SReg class because I'm not sure how to integrate it with ISLE/Regalloc

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 14:47):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 15:05):

Kmeakin updated PR #9088.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 20:10):

fitzgen submitted PR review:

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2024 at 20:27):

fitzgen merged PR #9088.


Last updated: Oct 23 2024 at 20:03 UTC