Indirect branches? · general · Zulip Chat Archive

but is it planned to have support for indirect branches and branch target patching support in the future?

bjorn3 (Jan 22 2021 at 16:16):

For tail calls there is an issue, but for indirect branches there isn't yet as far as I know. Branch target patching may be hard to implement due to interaction with regalloc I think.

Adel Prokurov (Jan 22 2021 at 18:37):

Is there any support for tail call eliminations? Are there any plans about it? · Issue #1065 · bytecodealliance/wasmtime

To write a functional language compiler using this IR, tail call eliminations would be desirable. Are there any plans to support this? I couldn't find any details in the docs.

fitzgen (he/him) (Jan 22 2021 at 19:22):

In general, we intend to support all of Wasm, so standards track proposals will be implemented eventually (at least by the time they are fully accepted) but there is the question of priority and (again, in general) we are focusing on proposals that move us towards the nanoprocess model

Announcing the Bytecode Alliance: Building a secure by default, composable future for WebAssembly

Lin Clark introduces the Bytecode Alliance, and uses Code Cartoon illustrations to share their vision of a WebAssembly ecosystem that is secure by default, fixing cracks in today’s software foundations. ...

Chris Fallin (Jan 22 2021 at 19:27):

+1 on fitzgen's note re: prioritization, though I'll add that if someone wants to look at adding tail calls at the CLIF level, for use by other Cranelift frontends before the Wasm story comes together, I'm happy to point at the relevant ABI code and details, and review PRs

Adel Prokurov (Jan 22 2021 at 19:31):

I guess I could try to work on tail calls on CLIF level. I've worked on my own tiny IR with tail calls support and it worked very well

Chris Fallin (Jan 22 2021 at 19:35):

bytecodealliance/wasmtime

Standalone JIT-style runtime for WebAssembly, using Cranelift - bytecodealliance/wasmtime

Chris Fallin (Jan 22 2021 at 19:36):

we'd need new tail-call instructions, I think, and then lower them into some combination of the return epilogue (stackframe cleanup) and call code; I haven't thought through too many details yet

Chris Fallin (Jan 22 2021 at 19:36):

Adel Prokurov (Jan 22 2021 at 19:44):

I have question that is where to look for adding new instruction? I think tail call could be just treated as return_ except with jmp instruction instead of ret most of the time.

Chris Fallin (Jan 22 2021 at 19:45):

Chris Fallin (Jan 22 2021 at 19:46):

that's more or less the lowering, yes, though there are a lot of details wrt stack frames :-) And actually one thing I didn't mention above is that for call signatures that imply on-stack args or return values, things get a bit messy if we want to maintain constant stack usage with tail-call recursion

Adel Prokurov (Jan 22 2021 at 19:47):

And if I understand correctly if we will treat tail call as return then function we're calling should have same return types as current function. I'm also not sure wha to do when functions returns more than two parameters or some parameters is passed through stack? For example windows abi can return only one value through register while SystemV allows returning two values in two registers

Chris Fallin (Jan 22 2021 at 19:47):

yes, so the return-area size will match, so we can just reuse the space that the caller allocated

Chris Fallin (Jan 22 2021 at 19:48):

the args are a bit trickier though: the tail-call's on-stack args space needs to be the same size or smaller than ours

Chris Fallin (Jan 22 2021 at 19:48):

if not, then we are forced to make the tail call with an SP below its location when we were called, which voids the constant stack usage guarantee otherwise usually expected of tail call implementations

Chris Fallin (Jan 22 2021 at 19:49):

Chris Fallin (Jan 22 2021 at 19:57):

Hmm -- so gcc seems to just not do TCO when the stack arg space of the callee is larger than the caller's: https://godbolt.org/z/nbGTve (func2 and func3 both tail-call func1; func3 has a large enough stack arg space to reuse while func2 does not)

Compiler Explorer - C (x86-64 gcc (trunk))

__attribute__((noinline)) int func1(int a, int b, int c, int d, int e, int f, int g, int h) { return a + b + c + d + e + f + g + h; } int func2(int a, int b, int c, int d, int e, int f, int g) { return func1(a, b, c, d, e, f, g + 1, g + 1); } int func3(int a, int b, int c, int d, int e, int f, int g, int h) { return func1(a, b, c, d, e, f, g + 1, h + 1); }

Chris Fallin (Jan 22 2021 at 19:57):

Adel Prokurov (Jan 22 2021 at 19:57):

Chris Fallin (Jan 22 2021 at 19:58):

for C/C++ compilers this is a "nice-to-have" optimization, so no problem if we fail, but yeah, for cases where the IR semantics imply we must do TCO, it's an open question

Adel Prokurov (Jan 22 2021 at 20:04):

Adel Prokurov (Jan 22 2021 at 20:15):

What if preallocate stack space for arguments if needed on caller side? This way callee does not have to do any stack manipulation and function that callee tail calls should use this memory?

Adel Prokurov (Jan 22 2021 at 20:16):

But this seems to limit how we can call functions. Indirect tail calls is quite impossible this way I believe

Chris Fallin (Jan 22 2021 at 20:19):

That's a bit tricky to require, since if a standard ABI (like System V) is specified, calls may come from code that is outside our control

Adel Prokurov (Jan 22 2021 at 20:20):

Another way is to do tail calls platform dependent and make tail_call return true or false if it emits or not tail call. There should also be way to pass these parameters on heap but I'm entirely not sure how this works

Chris Fallin (Jan 22 2021 at 20:21):

I just took a look at the Wasm proposal (to look ahead a bit to what we will eventually need to satisfy) and I didn't see any requirements wrt argument type or number of arguments of caller and callee. I remember something about an ABI change needed in SpiderMonkey to support fully general tail calls. So if we will need to do something specific for Wasm anyway, where TCO is required by the newly proposed instructions, I think the best thing to do in this case (with a standard ABI like SysV) is to just fall back silently to a non-tail call, as gcc and clang do

Chris Fallin (Jan 22 2021 at 20:22):

Perhaps we could add an attribute on the instruction that requires it to TCO or emit a compile error, but that makes the CLIF validity dependent on ABI details / stack layout, which I am not too fond of

Chris Fallin (Jan 22 2021 at 20:23):

Adel Prokurov (Jan 22 2021 at 20:25):

Yeah I don't like that too. I would like to use Cranelift in my tracing jit compiler and tail calls is the only way to implement side traces effectively.

Chris Fallin (Jan 22 2021 at 20:30):

if you control the signatures (as you generate the traces) a workaround would be to just ensure that you always have the same number of args, I guess (I dunno how you're mapping traces to functions -- args are live registers/values?)

Chris Fallin (Jan 22 2021 at 20:30):

anyway, yeah, I would say let's start by assuming silent fallback, and we can change if we have a better idea later :-)

Adel Prokurov (Jan 22 2021 at 20:34):

Yeah I guess this should work. We also could return boolean from tail_call function which will say if tail call is sucessful or not.

Adel Prokurov (Jan 22 2021 at 20:35):

Adel Prokurov (Jan 23 2021 at 00:44):

  thread 'main' panicked at 'Inst return and recipe Op1tail_call must have the same format!', cranelift/codegen/meta/src/cdsl/encodings.rs:154:9
  stack backtrace:

bytecodealliance/wasmtime

Standalone JIT-style runtime for WebAssembly, using Cranelift - bytecodealliance/wasmtime

bjorn3 (Jan 23 2021 at 06:46):

It is complaining that the arguments to tail_call and the required return values of functions can't be the same. You are implementing it for the old x86 backend that will likely be removed somewhere this year. The new backend can be found at codegen/src/isa/x64. This backend uses a completely different framework.

Mario Carneiro (Apr 14 2021 at 09:38):

By the way, there is another option for tail calls that wasn't discussed above: you can generate a shim for the current function to call one with a bigger stack frame, so as to get enough space to fit a tail call. This usually isn't worth the effort, but it is important if you have a mutual tail call situation. You can also optimize any calls that you have visibility into to skip the shim, but anything where you have to adhere to the ABI has to call the shim. (I had to implement a compiler with TCO for a class and this is what I did to ensure that mutual tail calls always work.)

Stream: general

Topic: Indirect branches?

Adel Prokurov (Jan 22 2021 at 13:07):

bjorn3 (Jan 22 2021 at 13:07):

Adel Prokurov (Jan 22 2021 at 14:24):

bjorn3 (Jan 22 2021 at 16:16):

Adel Prokurov (Jan 22 2021 at 18:37):

fitzgen (he/him) (Jan 22 2021 at 19:22):

Chris Fallin (Jan 22 2021 at 19:27):

Adel Prokurov (Jan 22 2021 at 19:31):

Chris Fallin (Jan 22 2021 at 19:35):

Chris Fallin (Jan 22 2021 at 19:36):

Chris Fallin (Jan 22 2021 at 19:36):

Adel Prokurov (Jan 22 2021 at 19:44):

Chris Fallin (Jan 22 2021 at 19:45):

Chris Fallin (Jan 22 2021 at 19:46):

Adel Prokurov (Jan 22 2021 at 19:47):

Chris Fallin (Jan 22 2021 at 19:47):

Chris Fallin (Jan 22 2021 at 19:48):

Chris Fallin (Jan 22 2021 at 19:48):

Chris Fallin (Jan 22 2021 at 19:49):

Chris Fallin (Jan 22 2021 at 19:57):

Chris Fallin (Jan 22 2021 at 19:57):

Adel Prokurov (Jan 22 2021 at 19:57):

Chris Fallin (Jan 22 2021 at 19:58):

Adel Prokurov (Jan 22 2021 at 20:04):

Adel Prokurov (Jan 22 2021 at 20:15):

Adel Prokurov (Jan 22 2021 at 20:16):

Chris Fallin (Jan 22 2021 at 20:19):

Adel Prokurov (Jan 22 2021 at 20:20):

Chris Fallin (Jan 22 2021 at 20:21):

Chris Fallin (Jan 22 2021 at 20:22):

Chris Fallin (Jan 22 2021 at 20:23):

Adel Prokurov (Jan 22 2021 at 20:25):

Chris Fallin (Jan 22 2021 at 20:30):

Chris Fallin (Jan 22 2021 at 20:30):

Adel Prokurov (Jan 22 2021 at 20:34):

Adel Prokurov (Jan 22 2021 at 20:35):

Adel Prokurov (Jan 23 2021 at 00:44):

bjorn3 (Jan 23 2021 at 06:46):

Mario Carneiro (Apr 14 2021 at 09:38):