Stream: git-wasmtime

Topic: wasmtime / PR #4827 components: Improve heuristic for spl...

Wasmtime GitHub notifications bot (Aug 30 2022 at 21:51):

alexcrichton opened PR #4827 from limit-trampoline-size to main:

This commit is a (second?) attempt at improving the generation of
adapter modules to avoid excessively large functions for fuzz-generated
inputs.

The first iteration of adapters simply translated an entire type
inline per-function. This proved problematic, however, since the size of
the adapter function was on the order of the fully expanded size of the
type, which can be exponential even when the type's definition is linear
in size.
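
To make the exponential blow-up concrete, here is a minimal sketch. The `TypeDef` table, `inline_size`, and `doubling_types` are hypothetical names for illustration and are not Wasmtime's actual data structures. Each tuple definition refers twice to the previous one, so the definitions grow linearly while the fully inlined tree roughly doubles at every level.

```rust
/// Hypothetical, simplified type table: each entry is either a primitive
/// or a tuple whose fields refer to earlier entries by index.
enum TypeDef {
    Primitive,
    Tuple(Vec<usize>),
}

/// Counts the nodes visited when `ty` is translated fully inline, that is,
/// with every field reference expanded in place.
fn inline_size(types: &[TypeDef], ty: usize) -> u64 {
    match &types[ty] {
        TypeDef::Primitive => 1,
        TypeDef::Tuple(fields) => {
            1 + fields.iter().map(|&f| inline_size(types, f)).sum::<u64>()
        }
    }
}

/// Builds `depth + 1` definitions where each tuple refers twice to the
/// definition before it.
fn doubling_types(depth: usize) -> Vec<TypeDef> {
    let mut types = vec![TypeDef::Primitive];
    for i in 0..depth {
        types.push(TypeDef::Tuple(vec![i, i]));
    }
    types
}
```

With this construction, `depth` extra definitions give an inlined size of `2^(depth + 1) - 1` nodes, which is the kind of growth a fuzzer can trigger with a small input.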

The second iteration of adapters performed a split where memory-based
types would always be translated with individual functions. The theory
here was that once a type was memory-based it was large enough to not
warrant inline translation in the original function and a separate
outlined function could be shared and otherwise used to deduplicate
portions of the original giant function. This again proved problematic,
however, since the splitting heuristic was quite naive and didn't take
into account large stack-based types.

The third iteration, in this commit, replaces the previous system with a
similar but slightly more general one. Each adapter function now has a
concept of fuel which is decremented each time a layer of a type is
translated. When fuel runs out, further translations are deferred to
outlined functions. The fuel counter should hopefully provide a sort of
reasonable upper bound on the size of a function, and the outlined
functions should ideally be callable from multiple places and therefore
deduplicate what would otherwise be a massive function.
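
The fuel idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `TypeDef`, `Generator`, `helper_for`, and the string "instructions" are all hypothetical, and whether the real implementation shares helpers per type in this way is assumed here, not confirmed by the description above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified type table (not Wasmtime's real representation).
enum TypeDef {
    Primitive,
    Tuple(Vec<usize>),
}

/// Hypothetical generator state.
struct Generator<'a> {
    types: &'a [TypeDef],
    initial_fuel: u32,
    helpers: HashMap<usize, usize>, // type index -> helper function index
    functions: Vec<Vec<String>>,    // pseudo-instructions per function
    worklist: Vec<(usize, usize)>,  // (function index, type index) to emit
}

impl<'a> Generator<'a> {
    fn new(types: &'a [TypeDef], initial_fuel: u32) -> Self {
        // With zero fuel a helper could never make progress on its own type.
        assert!(initial_fuel > 0);
        Generator {
            types,
            initial_fuel,
            helpers: HashMap::new(),
            functions: Vec::new(),
            worklist: Vec::new(),
        }
    }

    /// Emits function 0 for `root` plus every helper it transitively needs.
    fn generate(mut self, root: usize) -> Vec<Vec<String>> {
        self.functions.push(Vec::new());
        self.worklist.push((0, root));
        while let Some((func, ty)) = self.worklist.pop() {
            // Every function body starts with a fresh fuel budget.
            let mut fuel = self.initial_fuel;
            self.translate(func, ty, &mut fuel);
        }
        self.functions
    }

    fn translate(&mut self, func: usize, ty: usize, fuel: &mut u32) {
        if *fuel == 0 {
            // Out of fuel: defer this type to a shared outlined helper.
            let helper = self.helper_for(ty);
            self.functions[func].push(format!("call helper{}", helper));
            return;
        }
        *fuel -= 1; // one layer of the type is translated inline
        let types = self.types;
        match &types[ty] {
            TypeDef::Primitive => self.functions[func].push(format!("copy type{}", ty)),
            TypeDef::Tuple(fields) => {
                for &field in fields {
                    self.translate(func, field, fuel);
                }
            }
        }
    }

    /// Returns the helper for `ty`, creating and scheduling it on first use.
    fn helper_for(&mut self, ty: usize) -> usize {
        if let Some(&idx) = self.helpers.get(&ty) {
            return idx;
        }
        let idx = self.functions.len();
        self.functions.push(Vec::new());
        self.helpers.insert(ty, idx);
        self.worklist.push((idx, ty));
        idx
    }
}
```

In this sketch the number of emitted functions is at most one per type plus the root, and each body is limited by the fuel budget and the field counts of the layers it translates, which is the "linear in the type section" shape the description is aiming for.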

This final iteration is another attempt at guaranteeing that an adapter
module is linear in size with respect to the input type section of the
original module. Additionally, this iteration handles stack-based and
memory-based translations uniformly, which means that stack-based
translations can't go wild in their function size and memory-based
translations may benefit slightly from having at least a little bit of
inlining internally.

The immediate impact of this is that the component_api fuzzer seems to
be running at a faster rate than before. Otherwise #4825 is sufficient
to invalidate preexisting fuzz-bugs and this PR is hopefully the final
nail in the coffin to prevent further timeouts for small inputs cropping
up.

Closes #4816

Wasmtime GitHub notifications bot (Aug 30 2022 at 21:52):

alexcrichton requested fitzgen for a review on PR #4827.

Wasmtime GitHub notifications bot (Aug 31 2022 at 17:02):

fitzgen submitted PR review.

Wasmtime GitHub notifications bot (Aug 31 2022 at 17:02):

fitzgen created PR review comment:

Not for this PR or even the immediate future, but it might be interesting to look at which types actually appear multiple times in the parameter/result type trees and use that as a heuristic for splitting things out, when you know the helper will be reused.
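
One possible reading of this suggestion, as a rough sketch only: count how many times each type is referenced from the definitions reachable from the parameter/result roots, and treat composite types referenced more than once as outlining candidates. `TypeDef` and `reuse_candidates` are hypothetical names, and nothing here reflects an existing Wasmtime API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified type table (not Wasmtime's real representation).
enum TypeDef {
    Primitive,
    Tuple(Vec<usize>),
}

/// Returns the composite types referenced more than once from the
/// definitions reachable from `roots`, sorted by type index.
fn reuse_candidates(types: &[TypeDef], roots: &[usize]) -> Vec<usize> {
    let mut refs: HashMap<usize, u32> = HashMap::new();
    let mut visited = HashSet::new();
    let mut stack = Vec::new();

    // Each root counts as one reference to its type.
    for &root in roots {
        *refs.entry(root).or_insert(0) += 1;
        stack.push(root);
    }

    // Visit each reachable definition once, so the walk stays linear in
    // the size of the definitions even when the expanded tree is huge.
    while let Some(ty) = stack.pop() {
        if !visited.insert(ty) {
            continue;
        }
        if let TypeDef::Tuple(fields) = &types[ty] {
            for &field in fields {
                *refs.entry(field).or_insert(0) += 1;
                stack.push(field);
            }
        }
    }

    // Primitives are skipped: a helper for a single copy saves nothing.
    let mut out: Vec<usize> = refs
        .into_iter()
        .filter(|&(ty, n)| n > 1 && matches!(types[ty], TypeDef::Tuple(_)))
        .map(|(ty, _)| ty)
        .collect();
    out.sort_unstable();
    out
}
```

A type referenced only once under a parent that is itself reused is not reported, since outlining the parent already shares it.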

Wasmtime GitHub notifications bot (Aug 31 2022 at 17:09):

alexcrichton merged PR #4827.

Wasmtime GitHub notifications bot (Aug 31 2022 at 17:10):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Aug 31 2022 at 17:10):

alexcrichton created PR review comment:

Agreed! At some point I'm wondering if this is basically recreating all of LLVM's own inlining heuristics but in a worse way.

Last updated: Apr 18 2025 at 01:31 UTC