jameysharp opened PR #8171 from jameysharp:table-traps
to bytecodealliance:main
:
Currently, every access to a table element does a bounds-check with a conditional branch to a block that explicitly traps.
Instead, when SPECTRE mitigations are enabled, let's change the address computation to return a null pointer for out-of-bounds accesses, and then allow the subsequent load or store to trap.
This is less code in that case since we can reuse instructions we needed anyway.
In addition, when the table has constant size and the element index is a constant and mid-end optimization is enabled, this allows the bounds-check to be constant folded away. Later, #8139 will let us optimize away the select_spectre_guard instruction in this case too.
Once we also implement #8160,
tests/disas/typed-funcrefs.wat
should be almost as fast as native indirect function calls.
jameysharp requested fitzgen for a review on PR #8171.
jameysharp requested wasmtime-compiler-reviewers for a review on PR #8171.
jameysharp requested wasmtime-core-reviewers for a review on PR #8171.
alexcrichton submitted PR review:
Nice :+1:
In addition to the question I put below, could you update
prepare_table_addr
to thread out some information "the pointer is known to not be null" or something like that? For example if spectre mitigations are disabled it's known that the load is in-bounds and therefore none of the trap metadata would be required.Or perhaps
prepare_table_addr
could return aMemFlags
which has the trap code pre-configured?
alexcrichton submitted PR review:
Nice :+1:
In addition to the question I put below, could you update
prepare_table_addr
to thread out some information "the pointer is known to not be null" or something like that? For example if spectre mitigations are disabled it's known that the load is in-bounds and therefore none of the trap metadata would be required.Or perhaps
prepare_table_addr
could return aMemFlags
which has the trap code pre-configured?
alexcrichton created PR review comment:
AFAIK spectre mitigations are typically a flag due to their performance impact, but in this case because the spectre-related code is straight-line and actually involves less branches, is this actually the more performant path? Would it be worth, for example, unconditionally using
select_spectre_guard
?
jameysharp submitted PR review.
jameysharp created PR review comment:
I actually totally confused myself with the same question at first! But I ran it by @cfallin who confirmed the following: A bounds-check branch is almost always predicted correctly, and that means that if we don't care about Spectre, an out-of-order CPU can make more progress by speculating the load and continuing on to dependent operations past that, rather than stalling on the bounds-check computation.
So although I haven't measured it, there should generally be a performance advantage to sticking with the
trapnz
approach, for anyone who ain't afraid of no Spectre.
jameysharp commented on PR #8171:
Returning the memflags is an interesting option! Let me think about that a bit.
jameysharp updated PR #8171.
jameysharp updated PR #8171.
jameysharp commented on PR #8171:
Yeah, returning
MemFlags
fromprepare_table_addr
tidied things up nicely. Thanks for the suggestion, Alex!
jameysharp has enabled auto merge for PR #8171.
jameysharp merged PR #8171.
Last updated: Jan 24 2025 at 00:11 UTC