Stream: cranelift

Topic: register allocator questions


Shravan Narayan (Jul 01 2021 at 21:21):

Hi all,
A few of us at UC San Diego (working with Prof Deian Stefan) were looking at some of the security characteristics of the register allocator used in cranelift/wasmtime.
Any chance anyone here could help us with the question below? Any help is much appreciated :smiley:

Does cranelift make register spill/restore decisions (e.g., saving of caller-saved registers) based on
(1) only the current function being code-genned, OR
(2) would it also consider whether a callee function never uses certain registers? That is, if the current function foo invokes a function bar, and bar never uses a particular register, would this be considered as part of the register allocation for foo?

Additionally, has this behavior changed within the last year?

Thanks in advance

fyi @Evan Johnson

Chris Fallin (Jul 01 2021 at 21:28):

Hi @Shravan Narayan, happy to answer questions on regalloc issues! (FYI there is a new regalloc in the works, called "regalloc2", but not in use yet; however my answers below are applicable to both)

All regalloc decisions in Cranelift (and, in fact, all compilation decisions) are function-local: there is no "global" register allocation. This is a key design property that lets the compiler parallelize across function compilations, and also keeps analyses tractable.

(To see how the functionality you suggest in (2) could lead to complications, consider a corecursive cycle: now we have a fixpoint problem where we need to propagate registers used across all call-edges. Also, the callgraph is sometimes not known precisely, and can only be conservatively approximated, both because of indirect calls and because of external linkage.)
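
To make the fixpoint problem concrete, here is a minimal sketch of what a whole-program "registers clobbered" analysis might look like. It is not Cranelift code: the `Func` summary, the bitmask encoding, and `ALL_REGS` are hypothetical names chosen for illustration. Each function's clobber set is the union of its own writes and its callees' clobber sets, iterated until nothing changes; any function with an indirect or external call has to be treated as clobbering everything.

```rust
/// Hypothetical per-function summary (not a Cranelift type).
struct Func {
    /// Bitmask of registers the function body writes directly.
    writes: u32,
    /// Indices of functions it calls directly.
    callees: Vec<usize>,
    /// True if it makes an indirect or external call whose target is unknown.
    has_unknown_call: bool,
}

/// Assumed register file of 16 registers, one bit each.
const ALL_REGS: u32 = 0xFFFF;

/// Iterate to a fixpoint: clobbers(f) = writes(f) | union of clobbers(callee).
/// Terminates because the sets only grow and are bounded by ALL_REGS.
fn transitive_clobbers(funcs: &[Func]) -> Vec<u32> {
    let mut clobbers: Vec<u32> = funcs
        .iter()
        .map(|f| if f.has_unknown_call { ALL_REGS } else { f.writes })
        .collect();

    let mut changed = true;
    while changed {
        changed = false;
        for (i, f) in funcs.iter().enumerate() {
            let mut acc = clobbers[i];
            for &c in &f.callees {
                acc |= clobbers[c];
            }
            if acc != clobbers[i] {
                clobbers[i] = acc;
                changed = true;
            }
        }
    }
    clobbers
}
```

With two corecursive functions that each write one register, both end up with the union of the two sets, and the result for every function depends on having seen every other function first, which is what rules out compiling functions independently.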

Specifically, regalloc interacts with calls by means of the ABI code: the ABI implementation specifies which registers are "clobbered" by a call, and this is known according to the ABI spec for a given architecture, which is just a convention. We uphold the ABI on "both sides": on the caller side, we assume all maybe-clobbered regs are clobbered, and on the callee side, we ensure that we only clobber (at most) maybe-clobbered regs, and no more.
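
The "both sides" contract can be sketched as follows. This is an illustrative model only, not Cranelift's ABI implementation: the `Reg` enum, the function names, and the small register subset are assumptions, with the caller-saved set modeled on the System V x86-64 convention. The point is that both computations consult only the fixed convention and the function being compiled, never the body of the function on the other side of the call.

```rust
use std::collections::BTreeSet;

/// Hypothetical subset of x86-64 registers, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Reg { Rax, Rbx, Rcx, Rdx, Rsi, Rdi, R12 }

/// Registers any call may clobber (caller-saved in a System V-like ABI).
/// This set comes from the convention, not from inspecting the callee.
fn call_clobbers() -> BTreeSet<Reg> {
    [Reg::Rax, Reg::Rcx, Reg::Rdx, Reg::Rsi, Reg::Rdi]
        .iter()
        .copied()
        .collect()
}

/// Caller side: assume every maybe-clobbered register is clobbered, so any
/// value live across the call in such a register must be saved or moved.
fn must_save_across_call(live_across: &BTreeSet<Reg>) -> BTreeSet<Reg> {
    live_across.intersection(&call_clobbers()).copied().collect()
}

/// Callee side: any register the function writes that is NOT in the
/// maybe-clobbered set must be saved and restored by the function itself.
fn must_save_in_prologue(written: &BTreeSet<Reg>) -> BTreeSet<Reg> {
    written.difference(&call_clobbers()).copied().collect()
}
```

For example, a caller holding live values in `Rdx` and `Rbx` across a call only needs to preserve `Rdx` itself, because the callee is obliged to preserve `Rbx` whether or not it actually touches it.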

Chris Fallin (Jul 01 2021 at 21:28):

Re: "has this behavior changed", no, this is how it has always worked afaik

Chris Fallin (Jul 01 2021 at 21:29):

happy to answer further questions!

Shravan Narayan (Jul 01 2021 at 21:35):

Thanks a bunch @Chris Fallin! This info definitely helps.

"new regalloc in the works" --- Ah, the upcoming regalloc was going to be my follow-up question :smiley: On a similar note, has the current register allocator in cranelift been the same for the last year/year and a half, or were other register allocators used before the current one?

For some context: We are currently tracking down some interesting reg alloc patterns which look like they are safe from this "global allocation viewpoint" but could potentially be worth investigating, assuming we get similar patterns in other contexts where they may not be safe. Given that the allocation decisions are function-local, this does seem worth exploring more on our side.

Will look into it a bit more and follow up over the next few days/weeks.

Chris Fallin (Jul 01 2021 at 21:45):

Shravan Narayan said:

> Thanks a bunch Chris Fallin! This info definitely helps.
>
> "new regalloc in the works" --- Ah, the upcoming regalloc was going to be my follow-up question :smiley: On a similar note, has the current register allocator in cranelift been the same for the last year/year and a half, or were other register allocators used before the current one?

The answer is a bit subtle: it depends on which Cranelift configuration you're talking about. We started developing a new compiler backend infrastructure in Jan 2020-ish, and this has always used the regalloc.rs allocator. The new framework first came into use with the aarch64 backend, merged in Apr 2020; then we developed a new x86-64 backend in the same framework, which has existed since summer 2020 but was made the default in Cranelift and wasmtime in Mar 2021. Prior to then, if you used default settings in wasmtime, you would have gotten the "old" allocator on x86-64.

> For some context: We are currently tracking down some interesting reg alloc patterns which look like they are safe from this "global allocation viewpoint" but could potentially be worth investigating, assuming we get similar patterns in other contexts where they may not be safe. Given that the allocation decisions are function-local, this does seem worth exploring more on our side.
>
> Will look into it a bit more and follow up over the next few days/weeks.

That's surprising to me; do you have a repro case or code snippets to share?

Shravan Narayan (Jul 01 2021 at 21:57):

@Chris Fallin thanks for the info! We are looking at x86-64. From the description, it sounds like the reg alloc would have changed in Mar 2021.
"repro" --- yup, we are working on it. Will follow up as soon as we have something concrete. Before we open a bugzilla thread, we are trying to confirm whether the behavior we see in some cases is actually a security concern and whether it can be reproduced/narrowed down. So far we've seen the pattern just once, in the middle of a huge function.

Chris Fallin (Jul 01 2021 at 21:58):

s/bugzilla/github issue/ please :-) But yes, if it's a confirmed bug, I'm very curious

Chris Fallin (Jul 01 2021 at 21:58):

What symptoms are you seeing? Incorrect execution, or just bad register usage according to some other tool/checker/oracle?

Evan Johnson (Jul 01 2021 at 22:05):

This issue originally surfaced when we were testing a regalloc checker extension to veriwasm. We then manually looked at the disassembly and saw that this was happening. The exact instance of this that we saw does not appear to be exploitable, but we're looking into whether it can surface in an exploitable way.

Chris Fallin (Jul 02 2021 at 23:25):

Just to follow up on this: I got a test case from @Evan Johnson and looked at it. It looks like the register (edx) in the particular case was actually defined, implicitly, by a preceding imul. So afaik Cranelift's output here is correct
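
As background on why an imul can define edx without naming it: the one-operand form of x86 `imul` multiplies eax by its operand and writes the 64-bit signed product to the edx:eax pair. The thread does not show the actual instruction, so assuming that form was involved, a minimal model of its semantics (the function name is hypothetical) is:

```rust
/// Models one-operand `imul r/m32`: signed 32x32 -> 64-bit multiply.
/// Returns (edx, eax), i.e. the high and low halves of the product.
/// edx is written even though it never appears as an explicit operand.
fn imul32_one_operand(eax: u32, src: u32) -> (u32, u32) {
    // Sign-extend both inputs to 64 bits, as the instruction does.
    let product = (eax as i32 as i64) * (src as i32 as i64);
    let bits = product as u64;
    ((bits >> 32) as u32, bits as u32)
}
```

A checker that only tracks explicit operands would see a later read of edx as a use of an undefined register, whereas under this model `imul32_one_operand(0x4000_0000, 4)` yields edx = 1 and eax = 0.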


Last updated: Oct 23 2024 at 20:03 UTC