cfallin opened PR #6804 from cfallin:quadratic-islands
to bytecodealliance:main
:
Work-in-progress solution to #6798
cfallin updated PR #6804.
cfallin updated PR #6804.
cfallin edited PR #6804:
Currently,
MachBuffer
is used to resolve forward and backward references to labels (basic-block entry points and other branch targets) within function bodies as well as calls to functions within a whole module. It implements a single-pass algorithm that tracks "pending fixups" -- references to some other place -- and fills in the offsets when known, also handling upgrading ranges by emitting veneers on architectures that need that behavior.In #6798 it was reported that on aarch64, module emission is slow for a certain large module. This module ends up forcing many islands of veneers but has branches whose ranges cross those islands and need not be extended with a veneer yet. This case exposes quadratic behavior: the island-emission logic puts the fixup back in a list and replaces the fixup list with that leftover-fixup list after processing all fixups.
This PR instead does the following:
- When an island is emitted, all unresolved forward references get a veneer, even if their current label kind would still be in range. This prevents fixups from hanging around arbitrarily long when they have longer range kinds (e.g., aarch64's
Branch26
) while lots of islands for shorter-range references (e.g.Branch19
) are emitted.- When a label fixup reaches its "longest-range" kind (
PCRel32
for branches, for example), we put it in a separate list that we don't process again until the very last "last-chance island" after all emission.These two rules together will bound the amount of processing to
O(|fixups| * |label kinds|)
rather thanO(|fixups|^2)
.The first new rule theoretically causes more veneers to be emitted, but in practice is not too likely to make a difference, I think. Unfortunately my aarch64 laptop is unavailable at the moment; if someone on that platform could benchmark the impact that would be quite appreciated!
Fixes #6798.
Last updated: Jan 24 2025 at 00:11 UTC