merging arm64 backend · cranelift

Hi all -- I've put together a patch stack to start the review process for the new backend and the ARM64 machine-specific use of it. See https://github.com/cfallin/wasmtime/tree/arm64-merge for the 11 commits. This is a rollup of the ~4 months of work on the arm64 side-branch (excluding the incomplete x64 backend). It's ~17KLoC of code.

cfallin/wasmtime

Standalone JIT-style runtime for WebAssembly, using Cranelift - cfallin/wasmtime

Dan Gohman (Apr 09 2020 at 21:37):

My thought is that one PR makes sense. This will save you a lot of work over juggling a long chain of dependent PRs. It'll be a big review, but we can focus on making sure we understand the main shape of things rather than probing into every detail, and then, in return for us having saved you all that juggling work, you can be responsive to people probing into the details and asking questions after it lands :-)

Dan Gohman (Apr 09 2020 at 21:37):

Chris Fallin (Apr 09 2020 at 21:38):

Chris Fallin (Apr 09 2020 at 21:42):

Add new `MachInst` backend and ARM64 support. by cfallin · Pull Request #1494 · bytecodealliance/wasmtime

This PR contributes a new backend framework built around MachInsts (machine instructions) contained in VCode (virtual-registerized code) containers. Furthermore, it contains a working ARM64 backend...

Julian Seward (Apr 10 2020 at 05:42):

@Chris Fallin , excellent work. I am of course happy to review stuff.
Somewhat related question: what is now the situation regarding further commits to your branch? I ask because as I work to speed up the register allocator, I am beginning to see inefficiencies on the CL side (in the new code) that also need to be fixed. I could make patches for those and put them off to one side until after reviewing/merging is done, if that would be helpful.

Chris Fallin (Apr 10 2020 at 05:48):

@Julian Seward great! I think that Dan is going to review at a high level (as he indicated in the #cranelift channel) but please feel free to go over anything in more detail

Chris Fallin (Apr 10 2020 at 05:48):

Regarding the branch -- I don't want to block ongoing work, so I think we can keep pushing to arm64 until the PR lands; and then we can rebase whatever has happened in the meantime and review as additional PRs

Chris Fallin (Apr 10 2020 at 05:49):

(The alternative is to stack up a bunch of dependent PRs which is especially messy as reviews change things -- don't really want to go there)

Benjamin Bouvier (Apr 10 2020 at 09:05):

Benjamin Bouvier (Apr 10 2020 at 09:08):

Co-authored-by: Julian Seward <jseward@acm.org>
Co-authored-by: Joey Gouly <joey.gouly@arm.com>
Co-authored-by: Benjamin Bouvier <public@benj.me>

Then Github is able to pattern-match this and mark commits as being co-authored, which is nice to better attribute credit.

Dan Gohman (Apr 10 2020 at 18:53):

@Chris Fallin I'm just realizing that this design of parameterizing on the instruction means that it looks like the entire mach backend gets monomorphized for every target, is that right?

Chris Fallin (Apr 10 2020 at 18:53):

Dan Gohman (Apr 10 2020 at 18:54):

That does work out well for the JIT case where just one target is active, of course, so it's not necessarily a bad idea.

Chris Fallin (Apr 10 2020 at 18:55):

We figured that in the common case, the compiler would be built with support for one target, and in the uncommon case, the code that gets monomorphized is ideally not a large portion of the total code (most of the bulk is in the machine backend and in the target-indep passes, it seems)

Dan Gohman (Apr 10 2020 at 18:56):

Yeah, and in any configuration linking in multiple targets (rustc?), there's going to be a lot of target-specific code anyway.

Dan Gohman (Apr 10 2020 at 18:58):

This may make things harder if we pursue the crazy idea I mentioned earlier of replacing clif IR with vcode IR entirely, because more code would be monomorphized then, but it may still be manageable.

Chris Fallin (Apr 10 2020 at 19:05):

the VCode methods at least would be monomorphized twice, but e.g. the lowering wouldn't be (I imagine we'd conceptually have fn lower<FromInst: VCodeInst, ToInst: VCodeInst>(...) and just one (from, to) pair in the common case)

Stream: cranelift

Topic: merging arm64 backend

Chris Fallin (Apr 09 2020 at 21:29):

Dan Gohman (Apr 09 2020 at 21:37):

Dan Gohman (Apr 09 2020 at 21:37):

Chris Fallin (Apr 09 2020 at 21:38):

Chris Fallin (Apr 09 2020 at 21:42):

Julian Seward (Apr 10 2020 at 05:42):

Chris Fallin (Apr 10 2020 at 05:48):

Chris Fallin (Apr 10 2020 at 05:48):

Chris Fallin (Apr 10 2020 at 05:49):

Benjamin Bouvier (Apr 10 2020 at 09:05):

Benjamin Bouvier (Apr 10 2020 at 09:08):

Dan Gohman (Apr 10 2020 at 18:53):

Chris Fallin (Apr 10 2020 at 18:53):

Dan Gohman (Apr 10 2020 at 18:54):

Chris Fallin (Apr 10 2020 at 18:55):

Dan Gohman (Apr 10 2020 at 18:56):

Dan Gohman (Apr 10 2020 at 18:58):

Chris Fallin (Apr 10 2020 at 19:05):

Chris Fallin (Apr 10 2020 at 19:06):

Benjamin Bouvier (Apr 16 2020 at 17:04):