itsrainy opened PR #6531 from bytecodealliance:rainy/winch-popcnt to bytecodealliance:main.
fitzgen requested saulecabrera for a review on PR #6531.
saulecabrera submitted PR review:
Thanks for putting this together! The current changes look good to me. I have one thought/comment though: are you planning on providing a fallback implementation for
popcntif thehas_popcntflag is not enabled, similar to what Cranelift provides? If it's too burdensome to do as part of this PR a TODO is also totally fine I think, but we might want to consider updating the fuzzing configuration so that it always enables thehas_popcntflag to avoid failures when fuzzing winch. Even though we only enable the fuzzer locally, it's still useful for verifying our changes.Also, you might also want to add
<I32|I64>Popcnthere https://github.com/bytecodealliance/wasmtime/blob/main/fuzz/fuzz_targets/differential.rs#L335
itsrainy updated PR #6531.
saulecabrera created PR review comment:
If I'm not wrong, in the case of a fallback we'd need extra temporary register because the scratch is clobbered by all the calls to
load_constantright?
saulecabrera submitted PR review:
I think that it'd be nice to save a couple of instructions if possible, so I'm leaning towards the second fallback. This is also how SpiderMonkey handles this case https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/MacroAssembler-x86-shared-inl.h#110
I left a couple of extra comments, but feel free to ignore if those are things that you were already thinking about!
saulecabrera submitted PR review:
I think that it'd be nice to save a couple of instructions if possible, so I'm leaning towards the second fallback. This is also how SpiderMonkey handles this case https://searchfox.org/mozilla-central/source/js/src/jit/x86-shared/MacroAssembler-x86-shared-inl.h#110
I left a couple of extra comments, but feel free to ignore if those are things that you were already thinking about!
saulecabrera created PR review comment:
Could we replace the
emitcalls in this function with the emit functions in the assembler? That has the added benefit that it already handles constant loading based on the operand size.
saulecabrera created PR review comment:
If that's the case, perhaps we can pass in a mutable reference to the
CodeGenContexttoMacroAssembler::popcntand do the dispatching at that level and if we need to emit the fallback, we can request an extra temporary register viaCodeGenContext::any_gpr? Similar to to the implementation ofclzhttps://github.com/bytecodealliance/wasmtime/blob/main/winch/codegen/src/isa/x64/masm.rs#L343
itsrainy created PR review comment:
scratch is clobbered by all the calls to load_constant right?
Oh this is good to know! I didn't know that. I am still learning the relationships between the different reg types and how their different abstraction layers are used, so if it seems like I'm doing something weird, please let me know :)
I'll take the approach in your second comment!
itsrainy created PR review comment:
I think so. So, for instance I could replace this:
self.emit(Inst::AluRmiR { size: size.into(), op: AluRmiROpcode::Sub, src1: reg.into(), src2: masked1.into(), dst: reg.into(), });with something like this:
self.sub_rr(reg, masked, size);right? I noticed some of the instructions don't have corresponding functions on the assembler (
shiftr,and), should I add functions for those?
saulecabrera created PR review comment:
Yeah exactly.
For
shiftwe haveshift_irandshift_rrin the assembler and forandwe haveand_rrandand_ir, I think that should cover all the cases for the lowering here. But if it doesn't feel free to make any additions to the assembler!
itsrainy updated PR #6531.
itsrainy updated PR #6531.
itsrainy updated PR #6531.
itsrainy has marked PR #6531 as ready for review.
itsrainy requested jameysharp for a review on PR #6531.
itsrainy requested wasmtime-fuzz-reviewers for a review on PR #6531.
itsrainy requested wasmtime-compiler-reviewers for a review on PR #6531.
Last updated: Dec 06 2025 at 06:05 UTC