jeffcharles opened PR #10028 from jeffcharles:winch-simd-splat
to bytecodealliance:main
:
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
Part of https://github.com/bytecodealliance/wasmtime/issues/8093. Implements the following instructions on x64 with AVX2 support:
i8x16.splat
i16x8.splat
i32x4.splat
i64x2.splat
f32x4.splat
f64x2.splat
jeffcharles requested wasmtime-compiler-reviewers for a review on PR #10028.
jeffcharles requested abrown for a review on PR #10028.
jeffcharles requested wasmtime-core-reviewers for a review on PR #10028.
jeffcharles requested fitzgen for a review on PR #10028.
github-actions[bot] commented on PR #10028:
Subscribe to Label Action
cc @saulecabrera
<details>
This issue or pull request has been labeled: "winch"Thus the following users have been cc'd because of the following labels:
- saulecabrera: winch
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
saulecabrera requested saulecabrera for a review on PR #10028.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Instead of making this specific to x64, can we instead call this
lane_size
or something along those lines and move the implementation tomasm.rs
? I don't think we'd only need this for x64.
saulecabrera created PR review comment:
If I'm not wrong, this is the exact same comment/mask value used in
load_splat
. Could we extract this as a helper?
saulecabrera created PR review comment:
Taking a look at the callsites, and trying to think to future backends, I'm tempted to suggest adding a single entry here,
splat
, defined asfn splat(&mut self, context: &mut CodeGenContext, kind: SplatKind) { ... }
That's potentially the most flexible approach, and that would potentially make the definition more uniform across backends: as far as I can tell, the only reason why we have a special
splat_int
is because of the requirements in x64.For that suggestion to work though, we'd probably need to extend
SplatKind
to encode the source type, (e.g.,SplatKind::I8x16
) -- I don't think this changes anything fundamental, since we'd still be able to derive thelane_size
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Could you return a
slice
here instead? The constant pool will copy the bytes into aVec<u8>
, but returning a slice will at least prevent heap allocations in case this gets used for anything else that's not constant pool related.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
To be more specific, using
as_slice()
instead ofto_vec()
, similar to the other uses ofadd_constant
.
jeffcharles updated PR #10028.
jeffcharles updated PR #10028.
jeffcharles submitted PR review.
jeffcharles created PR review comment:
If I use
as_slice()
and return a&[u8]
, I get this message from the compiler:cannot return value referencing temporary value
. Likely because the array that's created byto_le_bytes()
gets dropped when the method returns. The arrays are also of different lengths between the variants. We could maybe use some unsafe code to create a reference to the underlying bytes. Otherwise I'm not sure how to avoid the heap allocation.
jeffcharles submitted PR review.
jeffcharles created PR review comment:
The one thing with changing
SplatKind
to encode anI
orF
is thev128.load*_splat
instructions aren't necessarily operating on integers or floats. It's operating on whatever state was stored in the memory. That said, it should still be fine to have the visitor specify aSplatKind::I32x4
forv128.load32_splat
even if it's operating on 4 floats, it'll still have the same result.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
To be explicit, we could rename the current
SplatKind
toSplatLoadKind
and introduce the types inSplatKind
. That would tighten the implementation and keep the semantics as close as possible to the Wasm semantics.
saulecabrera edited PR review comment.
saulecabrera edited PR review comment.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Given that this is not used for anything that the constant pool, I don't think we need to introduce any unsafe code. Perhaps adding a comment stating that this method heap allocates and it's intended to be used mostly with the constant pool?
jeffcharles updated PR #10028.
jeffcharles updated PR #10028.
jeffcharles updated PR #10028.
jeffcharles updated PR #10028.
saulecabrera submitted PR review.
saulecabrera merged PR #10028.
Last updated: Jan 24 2025 at 00:11 UTC