Stream: git-wasmtime

Topic: wasmtime / issue #10431 Deduplicate encoded stack map dat...


Wasmtime GitHub notifications bot (Mar 20 2025 at 15:16):

fitzgen opened issue #10431:

Something to consider for the future: if we frequently have multiple sequential entries for different PCs that share the same stack slots, e.g.

    ...
    0x1dc: offset of [8, 12]
    0x124: offset of [8, 12, 24] copy 1
    0x142: offset of [8, 12, 24] copy 2
    0x15a: offset of [8, 12, 24] copy 3
    0x166: offset of [8]
    ...

then it may make sense for each entry in the index to store non-overlapping PC ranges, rather than exact PCs, and we could effectively dedupe the index entries and the stack map data. That is, the previous example would become

    ...
    0x1dc..0x1dd: offset of [8, 12]
    0x124..0x15b: offset of [8, 12, 24] (only copy)
    0x166..0x167: offset of [8]
    ...

The downsides are that

  1. We would need to change Cranelift to actually emit empty stack maps for safepoints without any live GC refs. Otherwise, if we have (pc=0x1234, [8]); (pc=0x1238, []); (pc=0x123b, [8]) and we don't see that middle entry in this builder, then we risk using [8] as our stack map at pc 0x1238, which extends a dead GC ref's lifetime at best and gives the collector uninitialized data at worst.
  2. Relatedly, we lose our ability to catch bugs where the return address PC we are tracing isn't an exact match for a stack map entry.

These are actually pretty scary, so maybe we don't want to do this, even if it would let us make these binary search indices much smaller.
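To make the range idea and its first downside concrete, here is a minimal sketch of coalescing exact-PC entries into non-overlapping PC ranges. It is not Wasmtime's or Cranelift's actual code: the names `coalesce` and `lookup`, and the representation of an entry as a `(pc, slots)` pair, are assumptions made for illustration, and the input is assumed to be sorted by PC.

```rust
use std::ops::Range;

/// Merge consecutive entries that share the same stack slots into one
/// half-open PC range. `entries` is assumed to be sorted by PC.
/// (Hypothetical helper, not part of Wasmtime.)
pub fn coalesce(entries: &[(u32, Vec<u32>)]) -> Vec<(Range<u32>, Vec<u32>)> {
    let mut out: Vec<(Range<u32>, Vec<u32>)> = Vec::new();
    for (pc, slots) in entries {
        if let Some((range, prev)) = out.last_mut() {
            if *prev == *slots {
                // Same slots as the previous entry: extend its range to cover this PC.
                range.end = *pc + 1;
                continue;
            }
        }
        // Different slots (or first entry): start a new single-PC range.
        out.push((*pc..*pc + 1, slots.clone()));
    }
    out
}

/// Binary-search the coalesced ranges for the one containing `pc`.
/// (Hypothetical helper, not part of Wasmtime.)
pub fn lookup(ranges: &[(Range<u32>, Vec<u32>)], pc: u32) -> Option<&[u32]> {
    // Index of the first range whose end is greater than `pc`.
    let i = ranges.partition_point(|(r, _)| r.end <= pc);
    let (r, slots) = ranges.get(i)?;
    if r.contains(&pc) {
        Some(slots.as_slice())
    } else {
        None
    }
}
```

With the three entries from downside 1, the explicit empty entry at pc 0x1238 keeps the two [8] entries apart. If that entry is never emitted, the two [8] entries merge into 0x1234..0x123c, and a lookup at 0x1238 (or at any other PC in that range, exact match or not) returns [8].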


All that said, we can actually already dedupe the stack map _data_ if we want to, and have multiple index entries point to the same stack map data (even if they aren't contiguous!) with the encoding scheme already in use in this PR. We just need to hash-cons the stack map data in this builder, caching a map from stack map data to its encoded offset. This doesn't have any of the downsides from above. Seems like it would be a pure win.
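A minimal sketch of that hash-consing approach follows. The builder type, its field names, and the toy encoding (a length word followed by the slot offsets) are assumptions for illustration only and do not reflect the actual encoding in the PR. Entries are assumed to be pushed in increasing PC order so that the index can be binary searched.

```rust
use std::collections::HashMap;

/// Hypothetical builder; names and encoding are illustrative, not Wasmtime's.
#[derive(Default)]
pub struct StackMapBuilder {
    /// One exact-PC entry per safepoint: (pc, offset into `data`).
    index: Vec<(u32, u32)>,
    /// Encoded stack map data: each record is a length followed by its slots.
    data: Vec<u32>,
    /// Hash-consing cache: slot list -> offset of its single encoded copy.
    cache: HashMap<Vec<u32>, u32>,
}

impl StackMapBuilder {
    /// Record a stack map for `pc`, reusing previously encoded data when an
    /// identical slot list has been seen before. Returns the data offset.
    pub fn push(&mut self, pc: u32, slots: &[u32]) -> u32 {
        if let Some(&offset) = self.cache.get(slots) {
            // Cache hit: only a new index entry is needed.
            self.index.push((pc, offset));
            return offset;
        }
        // Cache miss: encode the data once and remember where it lives.
        let offset = self.data.len() as u32;
        self.data.push(slots.len() as u32);
        self.data.extend_from_slice(slots);
        self.cache.insert(slots.to_vec(), offset);
        self.index.push((pc, offset));
        offset
    }

    /// Exact-PC lookup, so a PC without an entry is still reported as missing.
    pub fn lookup(&self, pc: u32) -> Option<&[u32]> {
        let i = self.index.binary_search_by_key(&pc, |&(p, _)| p).ok()?;
        let offset = self.index[i].1 as usize;
        let len = self.data[offset] as usize;
        Some(&self.data[offset + 1..offset + 1 + len])
    }

    /// Number of words of encoded stack map data.
    pub fn encoded_len(&self) -> usize {
        self.data.len()
    }
}
```

Because the index still stores exact PCs, this sketch keeps one index entry per safepoint and only shares the data: pushing [8] at 0x1234, [] at 0x1238, and [8] at 0x123b encodes [8] once even though the two entries are not adjacent, while a lookup at a PC with no entry still returns nothing.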

_Originally posted by @fitzgen in https://github.com/bytecodealliance/wasmtime/pull/10404#discussion_r1999443711_

Wasmtime GitHub notifications bot (Mar 20 2025 at 21:12):

alexcrichton added the wasmtime label to Issue #10431.

Wasmtime GitHub notifications bot (Mar 20 2025 at 21:12):

alexcrichton added the wasm-proposal:gc label to Issue #10431.

Wasmtime GitHub notifications bot (Mar 20 2025 at 21:12):

alexcrichton added the wasmtime:code-size label to Issue #10431.


Last updated: Apr 16 2025 at 22:03 UTC