Stream: git-wasmtime

Topic: wasmtime / PR #10126 refactor: unify how bits are accesse...


view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:04):

abrown requested cfallin for a review on PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:04):

abrown opened PR #10126 from abrown:unify-entity-access to bytecodealliance:main:

While using MachLabel, a cranelift-entity-created type, I noticed that there were three ways to access the contained bits: .get(), .as_u32(), and .as_bits(). All performed essentially the same function and it was unclear which to use.

This change removes MachLabel::get(), replacing it with as_u32(). It also replaces all uses of from_bits() and as_bits() with from_u32() and as_u32(). Why? I would have preferred the "bits" naming since it seems more clear ("just unwrap this thing") and it could avoid a large rename if the type were changed in the future, I realized that there are vastly more uses of the "u32" naming that already exist--it's just easier.

While this refactoring _should_ result in no functional change, you may notice a couple of failing tests related to a pre-existing check on from_u32 that did not exist on from_bits. For some reason, from_u32 asserted that we would never pick u32::MAX for an entity value; unfortunately, some parsing code, decode_narrow_field, does just this. Why did we have such an assertion in the first place? Is it still needed? Should decode_narrow_field do something else?

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:04):

abrown requested wasmtime-compiler-reviewers for a review on PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:04):

abrown requested alexcrichton for a review on PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:04):

abrown requested wasmtime-core-reviewers for a review on PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:11):

cfallin commented on PR #10126:

While this refactoring should result in no functional change, you may notice a couple of failing tests related to a pre-existing check on from_u32 that did not exist on from_bits. For some reason, from_u32 asserted that we would never pick u32::MAX for an entity value; unfortunately, some parsing code, decode_narrow_field, does just this. Why did we have such an assertion in the first place? Is it still needed? Should decode_narrow_field do something else?

So I think you've run into the reason for the two separate pairs of methods (and we should document this better!): the as_u32/from_u32 methods are meant to convert to and from the contained index, while the as_bits/from_bits are meant to convert to and from the underlying encoding.

It so happens that these two representations coincide for all but one of the possible values. The divergence comes from what the entity index is encoding: it is conceptually an Option<Index>, with an in-band sentinel value that's used for none/invalid/PackedOption encoding/Default/etc. This is why the assert is present in the u32 variants: it's meant to encode and decode only the "present" subset of the value space. Arguably we should have an assert in as_u32 that prevents the sentinel value from leaking out as well; I'm not sure why one wasn't present.

This is why the ValueDataPacked code is operating using the "bits" API: it is re-encoding the underlying bits into a larger packed struct, rather than operating on the surface/semantic level.

I think MachLabel::get() is an artifact of an earlier representation and I'm happy to see it go away in favor of as_u32.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:18):

abrown commented on PR #10126:

Yeah, as I look at it a bit more that makes sense. Something needs to change for as_bits and from_bits though because they seem to have been used elsewhere where we may have indeed wanted to avoid the sentinel...

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:29):

cfallin commented on PR #10126:

One way to test that: perhaps temporarily change the encoding they return and accept by, say, flipping some bits -- fn as_bits(&self) -> u32 { self.0 ^ 42 } and likewise in from_bits. This should be a totally legal encoding change if we've kept to the right methods for low-level bit porting vs. access to the index. If we see test failures with that we should switch as needed to .as_u32().

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:36):

abrown updated PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:59):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 19:59):

cfallin created PR review comment:

Can we write something about this being an opaque encoding instead? (Currently it sounds like the underlying representation being mostly equal is a documented feature)

Something like: "The raw bit encoding is opaque and has no guaranteed correspondence to the entity's index. It encodes the entire state of this index value: either a valid index or an invalid-index sentinel. The value returned by this method should only be passed to from_bits."

and likewise below "should only be given bits from as_bits"

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 21:17):

alexcrichton requested cfallin for a review on PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 21:18):

alexcrichton commented on PR #10126:

(transferring my review to @cfallin who I think has this covered)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 21:58):

abrown updated PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 21:58):

abrown has enabled auto merge for PR #10126.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 27 2025 at 22:36):

abrown merged PR #10126.


Last updated: Feb 28 2025 at 03:10 UTC