cranelift / Issue #1246 Develop a replacement for the gra... · git-cranelift

Stream: git-cranelift

Topic: cranelift / Issue #1246 Develop a replacement for the gra...

GitHub (Feb 28 2020 at 23:27):

alexcrichton transferred Issue #1246:

Cranelift's existing register allocator has proven to be a limiting factor for
achievable code quality. Typically it limits the performance of compiled
WebAssembly inputs to around 80% of that achieved when compiling WebAssembly
via SpiderMonkey's Ion pipeline instead.

In particular, the allocator generates excessive spilling. This is because it
doesn't implement live range splitting. Investigations have shown that while
it would be possible to implement splitting, this would make the allocator run
significantly more slowly. But the allocator already dominates Cranelift's
compilation time, and is already uncompetitively slow. So this isn't a
promising route.

The current plan is to implement two new allocators:

(1) A traditional linear-scan register allocator, in the style of Poletto and
Sarkar's paper [A], but not tune it much, at least at first.

(2) At the same time, implement a backtracking, top-down allocator similar to
that used by SpiderMonkey's Ion pipeline, and in LLVM 3.0's optimising
mode, as summarised at [B]. Experience with Ion indicates that this
allocator provides good quality allocation whilst being significantly
faster than graph colouring schemes. Hence it seems like a good choice.

The purpose of doing (1) is threefold:

One area of uncertainty/schedule risk with implementing a new allocator is
figuring out how to integrate it into CL's existing IR and
machine-instruction infrastructure. Starting off by implementing a simple
core algorithm frees up time to explore the integration aspects, which are
arguably the more difficult problem.

It may be useful for some users of CL to have the choice of a faster
allocator (hence, overall compilation) at the expense of poorer quality
code.

It gives a baseline for speed comparisons, both compilation speed and speed
of generated code.

Whether, and to what extent, it is necessary to translate out of SSA prior to
doing register allocation is currently under investigation.

In order to avoid any interruption in service for existing CL users, the
existing register allocator will be kept alive until such time as a new
implementation can do better, both in code quality and compile time.

[A] http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf
[B] http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html

Last updated: Apr 15 2025 at 00:13 UTC