Stream: git-wasmtime

Topic: wasmtime / PR #3848 Migrate clz, ctz, popcnt, bitrev, is_...


view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 03:08):

cfallin edited PR #3848 from isle-bitops to main:

Along the way this introduces a few new utility bits to the ISLE bindings, e.g. some constructors to do integer arithmetic on constants (I'm surprised we got this far without it!). The bit-twiddling algorithms are, at least to my eye, a bit more legible in this form, but I'm happy to add more comments here if desired since we're making a pass over everything anyway.

Also updated a few stray places I had missed earlier where we can use implicit conversions (the def_inst removals).

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

It seems weird that these extractors (and above) are in any way related to Type... logically confusing, I can see the pragmatic part of it.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

:+1:

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

The CMOV here uses BSR's flag-setting behavior to work correctly. Just to check: we don't need to use ConsumesFlags/ProducesFlags here because these instructions are back-to-back and emitted in order? If that is the case, is a comment to that effect useful here?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

The comments below are pretty helpful; any high-level comments here that explain the general gist of how this is counting bits?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

;; Helper for emitting immediates with an `i64` value. Note that

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

(Also, if this is allowed, then shouldn't the new CmoveOr and XmmCmoveOr sequences have been written in ISLE?)

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 16:54):

abrown created PR review comment:

;; For SSE4.2 we use Mula's algorithm (https://arxiv.org/pdf/1611.07612.pdf)

?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:12):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:12):

fitzgen created PR review comment:

When I introduced the first of these extractors, I did it this way so that you read them near the type constraints because they are most often related to those, e.g. whether to use AVX512 is related to whether we are lowering a vector type.

/me shrugs

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:44):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:44):

cfallin created PR review comment:

Hmm, yeah, I realize now that this is exactly the same circumstance that led me to suggest CmoveOr previously; the output of the bsr is one of the inputs to the cmove. I'm thinking now maybe we can add a new variant to ConsumesFlags for this; I'll play with it a bit, and clean up by removing CmoveOr at the same time if I can work something out. Sorry for the inconsistency and churn!

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:45):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:45):

cfallin created PR review comment:

I had copied this comment from lower.rs but it seems that the instructions used only require SSE 4.2 (right?) so I can update accordingly.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:48):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 17:48):

cfallin created PR review comment:

Yup, the advantage is that it lets us use these in a context that's already common. Though I wonder now if it could make sense to have an extractor result value (the input) that is something like a unit type: in other words, the extractor takes nothing as input, but can be used in any context. I'll think more about it; it probably muddies the waters a bit wrt clean/simple semantics; but it could work...

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:37):

cfallin updated PR #3848 from isle-bitops to main.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:38):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:38):

cfallin created PR review comment:

OK, so I've reworked things so that CmoveOr/XmmCmoveOr no longer exist. Thanks for noticing this! The i128 case needs a new ConsumesFlags variant that contains four consumer instructions, but this seems reasonable-ish to me.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:38):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:38):

cfallin created PR review comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:39):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 19:39):

cfallin created PR review comment:

Added comments describing how this popcount algorithm works -- it was pretty fun to work out :-)

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 20:37):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 20:37):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 20:37):

fitzgen created PR review comment:

Does it make sense to duplicate (imm $i64 0x7777777777777777) at each use site, rather than reusing the named register, to reduce register pressure?

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 20:47):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 20:47):

cfallin created PR review comment:

I thought about this a bit, but I think it's very likely better to load it once and reuse it: the instruction to load an arbitrary 64-bit value (movabs) is pretty long, the immediate can't fit in the normal simm32 so we need a register in any case, and the live-range across 7 instructions is not really any worse than across one (any other value that wants to be live across this sequence will conflict with the register anyway, whether loaded once or loaded three times).

view this post on Zulip Wasmtime GitHub notifications bot (Feb 25 2022 at 23:53):

cfallin updated PR #3848 from isle-bitops to main.

view this post on Zulip Wasmtime GitHub notifications bot (Feb 28 2022 at 17:45):

fitzgen merged PR #3848.


Last updated: Jan 24 2025 at 00:11 UTC