Stream: git-wasmtime

Topic: wasmtime / PR #6531 wip - add i32.popcnt and i64.popcnt t...


view this post on Zulip Wasmtime GitHub notifications bot (Jun 06 2023 at 20:50):

itsrainy opened PR #6531 from bytecodealliance:rainy/winch-popcnt to bytecodealliance:main.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 07 2023 at 16:38):

fitzgen requested saulecabrera for a review on PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 07 2023 at 17:38):

saulecabrera submitted PR review:

Thanks for putting this together! The current changes look good to me. I have one thought/comment though: are you planning on providing a fallback implementation for popcnt if the has_popcnt flag is not enabled, similar to what Cranelift provides? If it's too burdensome to do as part of this PR a TODO is also totally fine I think, but we might want to consider updating the fuzzing configuration so that it always enables the has_popcnt flag to avoid failures when fuzzing winch. Even though we only enable the fuzzer locally, it's still useful for verifying our changes.

Also, you might also want to add <I32|I64>Popcnt here https://github.com/bytecodealliance/wasmtime/blob/main/fuzz/fuzz_targets/differential.rs#L335

view this post on Zulip Wasmtime GitHub notifications bot (Jun 12 2023 at 23:06):

itsrainy updated PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 14:04):

saulecabrera created PR review comment:

If I'm not wrong, in the case of a fallback we'd need extra temporary register because the scratch is clobbered by all the calls to load_constant right?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 14:04):

saulecabrera submitted PR review:

I think that it'd be nice to save a couple of instructions if possible, so I'm leaning towards the second fallback. This is also how SpiderMonkey handles this case https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/MacroAssembler-x86-shared-inl.h#110

I left a couple of extra comments, but feel free to ignore if those are things that you were already thinking about!

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 14:04):

saulecabrera submitted PR review:

I think that it'd be nice to save a couple of instructions if possible, so I'm leaning towards the second fallback. This is also how SpiderMonkey handles this case https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/MacroAssembler-x86-shared-inl.h#110

I left a couple of extra comments, but feel free to ignore if those are things that you were already thinking about!

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 14:04):

saulecabrera created PR review comment:

Could we replace the emit calls in this function with the emit functions in the assembler? That has the added benefit that it already handles constant loading based on the operand size.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 14:04):

saulecabrera created PR review comment:

If that's the case, perhaps we can pass in a mutable reference to the CodeGenContext to MacroAssembler::popcnt and do the dispatching at that level and if we need to emit the fallback, we can request an extra temporary register via CodeGenContext::any_gpr? Similar to to the implementation of clz https://github.com/bytecodealliance/wasmtime/blob/main/winch/codegen/src/isa/x64/masm.rs#L343

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 15:25):

itsrainy created PR review comment:

scratch is clobbered by all the calls to load_constant right?

Oh this is good to know! I didn't know that. I am still learning the relationships between the different reg types and how their different abstraction layers are used, so if it seems like I'm doing something weird, please let me know :)

I'll take the approach in your second comment!

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 15:36):

itsrainy created PR review comment:

I think so. So, for instance I could replace this:

self.emit(Inst::AluRmiR {
    size: size.into(),
    op: AluRmiROpcode::Sub,
    src1: reg.into(),
    src2: masked1.into(),
    dst: reg.into(),
});

with something like this:

self.sub_rr(reg, masked, size);

right? I noticed some of the instructions don't have corresponding functions on the assembler (shiftr, and), should I add functions for those?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 15:56):

saulecabrera created PR review comment:

Yeah exactly.

For shift we have shift_ir and shift_rr in the assembler and for and we have and_rr and and_ir , I think that should cover all the cases for the lowering here. But if it doesn't feel free to make any additions to the assembler!

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 21:39):

itsrainy updated PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 21:42):

itsrainy updated PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 22:11):

itsrainy updated PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 22:18):

itsrainy has marked PR #6531 as ready for review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 22:18):

itsrainy requested jameysharp for a review on PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 22:18):

itsrainy requested wasmtime-fuzz-reviewers for a review on PR #6531.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 13 2023 at 22:18):

itsrainy requested wasmtime-compiler-reviewers for a review on PR #6531.


Last updated: Jan 24 2025 at 00:11 UTC