Stream: cranelift

Topic: exception support


bjorn3 (May 16 2022 at 16:50):

I've been trying to implement exception support over at https://github.com/bjorn3/wasmtime/tree/eh_cleanup. I'm not sure how to handle callee-saved registers, though. AFAIK they are trashed on the cleanup path, but for good performance they shouldn't need to be restored on a regular return. Caller-saved registers are also somewhat annoying, but should be doable. The basic design I'm trying to implement is:

function %foo(i32) system_v {
    ; eh_personality rust_eh_personality

    ; list of alternative targets for the invoke outside of the control of Cranelift.
    jt0 = jump_table [block2]

    sig0 = () system_v
    sig1 = (i64) system_v

    fn0 = %bar sig0
    fn1 = %_Unwind_Resume sig1

block0(v0: i32):
    invoke fn0, block1, jt0

block1:
    return

; All registers specified as callee-saved by the base ABI are restored, as well as scratch registers
; %rdi,%rsi,%rdx,%rcx(see below). Except for those exceptions, scratch (or caller-saved) registers
; are not preserved, and their contents are undefined on transfer.

; up to four args passed in %rdi, %rsi, %rdx, %rcx
; panic_unwind passes the exception_object pointer as the first arg and no additional arguments
block2(v1: i64) eh_landing_pad: ; eh_landing_pad is the "block entry abi"
    ; do cleanup work. all values accessible at the invoke are also accessible here
    tail_call fn1(v1)
}

This should work fine for DWARF unwinding, while also being flexible enough to enable non-exception use cases, such as on-stack replacement, in the future.


bjorn3 (May 16 2022 at 16:50):

cc https://github.com/bytecodealliance/wasmtime/issues/1677, https://github.com/bytecodealliance/wasmtime/issues/2049, https://github.com/bytecodealliance/wasmtime/issues/3427

https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html


bjorn3 (May 16 2022 at 17:12):

@Chris Fallin This was my question during the cranelift bi-weekly.

Chris Fallin (May 16 2022 at 17:18):

Ah, I think this should "just work" in the way you're hoping for, if I understand correctly. Basically the regalloc implicitly reloads caller-saved values when needed, but this is a consequence of spill/reload mechanisms and the regalloc-metadata model of the call instruction (clobbers all caller-saves) rather than some explicitly-coded behavior

Chris Fallin (May 16 2022 at 17:18):

so if values in caller-saves are not needed on the landingpad path then they simply won't be reloaded
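A toy illustration of the point above, not Cranelift or regalloc code: if the call is modeled as clobbering every caller-saved register, the values that need a reload on a given successor path are just the ones that were in caller-saved registers and are also used on that path. The `Value` alias and `reloads_needed` function are hypothetical names made up for this sketch.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an SSA value number.
type Value = u32;

/// Values the allocator would have to reload at the start of one successor
/// path, assuming the call clobbered every caller-saved register.
///
/// `in_caller_saved`: values held in caller-saved registers before the call.
/// `used_on_path`: values actually read somewhere along this successor path.
fn reloads_needed(
    in_caller_saved: &BTreeSet<Value>,
    used_on_path: &BTreeSet<Value>,
) -> BTreeSet<Value> {
    // Only values that are both clobbered and still needed get reloaded.
    in_caller_saved.intersection(used_on_path).copied().collect()
}
```

In this model a landing pad that uses none of the clobbered values gets an empty reload set, which matches the "they simply won't be reloaded" remark.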

Chris Fallin (May 16 2022 at 17:19):

the "some additional registers are also preserved" part is interesting, and right now we don't have a way of representing "conditional clobbers" like this; in principle it might be possible to model but it's a lot more complex, so I'd prefer not to if it's not clearly needed for adequate performance

bjorn3 (May 16 2022 at 17:21):

Forgot about that comment that callee-saved registers are restored. I wrote that file months ago.

bjorn3 (May 16 2022 at 17:22):

Chris Fallin said:

the "some additional registers are also preserved" part is interesting, and right now we don't have a way of representing "conditional clobbers" like this; in principle it might be possible to model but it's a lot more complex, so I'd prefer not to if it's not clearly needed for adequate performance

If you are referring to %rdi, %rsi, %rdx and %rcx, they are set by the personality function as a kind of extra arguments. I modeled them as block parameters.

bjorn3 (May 16 2022 at 17:23):

In any case, I think it would work once I implement the right exception tables to restore callee-saved registers, thanks!

Chris Fallin (May 16 2022 at 17:24):

ah ok cool -- so if you need access to those then the right way to do it is I think to put a pseudoinstruction at the top of the unwind path that defs parameters with fixed-reg constraints (e.g. def v123 fixed in %rdi); then it will pick up the values automatically
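A minimal sketch of the "def with fixed-reg constraint" idea, as a self-contained toy model rather than the real regalloc API. The types `PReg`, `VReg`, `FixedDef` and the function `landing_pad_defs` are all hypothetical; the register order follows the %rdi, %rsi, %rdx, %rcx list from the design comment earlier in the thread.

```rust
/// Hypothetical physical registers (x86-64 names from the discussion).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PReg {
    Rdi,
    Rsi,
    Rdx,
    Rcx,
}

/// Hypothetical virtual register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct VReg(u32);

/// An operand that defines `vreg` and pins it to `preg` at this instruction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FixedDef {
    vreg: VReg,
    preg: PReg,
}

/// Operands for a pseudoinstruction at the top of the unwind path: the i-th
/// landing pad block parameter is defined in the i-th register that the
/// personality function sets. Returns `None` for more than four parameters.
fn landing_pad_defs(params: &[VReg]) -> Option<Vec<FixedDef>> {
    const ARG_REGS: [PReg; 4] = [PReg::Rdi, PReg::Rsi, PReg::Rdx, PReg::Rcx];
    if params.len() > ARG_REGS.len() {
        return None;
    }
    Some(
        params
            .iter()
            .zip(ARG_REGS.iter())
            .map(|(&vreg, &preg)| FixedDef { vreg, preg })
            .collect(),
    )
}
```

With a single parameter such as `VReg(123)`, the sketch yields one def pinned to `PReg::Rdi`, mirroring the "def v123 fixed in %rdi" example.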

bjorn3 (May 16 2022 at 17:25):

That was basically what I wanted to do.

bjorn3 (May 16 2022 at 18:34):

@Chris Fallin Is there a way to introduce an extra block during lowering to MachInsts? Basically I need to turn invoke into a sequence of: load inputs, call function, jump to a temp block; then, in the temp block, store the return values and jump to whichever block is the destination of the invoke. I believe the call needs to be a terminator at the MachInst level and not just at the CLIF IR level, as it can return to more than one place.

Chris Fallin (May 16 2022 at 18:39):

@bjorn3 not currently, no; the invariant is that lowering does not introduce additional control flow that is visible to the register allocator

Chris Fallin (May 16 2022 at 18:39):

though various sequences do emit local branches within a single pseudoinstruction; that's fine as long as it's single-in, single-out (e.g. trap-if)

Chris Fallin (May 16 2022 at 18:40):

it's somewhat surprising to me that we would need additional control-flow expansion during lowering (i.e. that the invoke with landingpad edge is not enough to capture the control flow) but I'm happy to think more about this if that's really the case

bjorn3 (May 16 2022 at 18:41):

The problem is that invoke needs to lower to return-value stores after the call, but the landingpad edge is right after the call instruction.

Chris Fallin (May 16 2022 at 18:42):

OK, perhaps something could be done in the landingpad itself then? I'm not entirely sure, and I don't have cycles to think deeply about this right now, but there should be a way to work around the invariant; we really really do not want to introduce the complexity of patching in new blocks here if we can help it

bjorn3 (May 16 2022 at 18:43):

Basically I need:

abi.emit_stack_pre_adjust(ctx);
assert_eq!(inputs.len(), abi.num_args());
for i in abi.get_copy_to_arg_order() {
    let input = inputs[i];
    let arg_regs = put_input_in_regs(ctx, input);
    abi.emit_copy_regs_to_arg(ctx, i, arg_regs);
}
abi.emit_invoke(ctx, temp_block, alternatives);
// Switch to temp_block
for (i, output) in outputs.iter().enumerate() {
    let retval_regs = get_output_reg(ctx, *output);
    abi.emit_copy_retval_to_regs(ctx, i, retval_regs);
}
abi.emit_stack_post_adjust(ctx);

bjorn3 (May 16 2022 at 18:43):

OK, perhaps something could be done in the landingpad itself then?

No, those stores are for when the landingpad is not hit.

Chris Fallin (May 16 2022 at 18:44):

OK, we'll need to find another way then; not possible to introduce a block in that context

bjorn3 (May 16 2022 at 18:47):

By the way, how do I tell the rest of the compiler which successors invoke has?

bjorn3 (May 16 2022 at 18:47):

Which function needs an extra branch?

bjorn3 (May 16 2022 at 18:53):

Found it: analyze_branch.
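For illustration only, here is a toy model of what "telling the compiler about invoke's successors" amounts to. This is not the real `analyze_branch` signature or Cranelift's data structures; `Block`, `Terminator` and `successors` are hypothetical names. The idea is that an invoke contributes its normal-return block plus every alternative target from its jump table.

```rust
/// Hypothetical block reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Block(u32);

/// Hypothetical terminators, reduced to what the example in this thread uses.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Terminator {
    Return,
    Jump(Block),
    /// `invoke fn, normal, jt`: returns to `normal`, or unwinds to one of
    /// the `alternatives` listed in the jump table.
    Invoke { normal: Block, alternatives: Vec<Block> },
}

/// All blocks control may reach directly from this terminator.
fn successors(term: &Terminator) -> Vec<Block> {
    match term {
        Terminator::Return => Vec::new(),
        Terminator::Jump(dest) => vec![*dest],
        Terminator::Invoke { normal, alternatives } => {
            // Normal return edge first, then every landing pad edge.
            let mut out = vec![*normal];
            out.extend(alternatives.iter().copied());
            out
        }
    }
}
```

For the `invoke fn0, block1, jt0` example at the top of the thread, with `jt0 = jump_table [block2]`, this model reports block1 and block2 as successors.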

bjorn3 (May 19 2022 at 10:18):

@Chris Fallin Would it be possible to require that the successor block for a normal return of the invoke instruction has a single predecessor and then during lowering of the invoke instruction add the return value store instructions to the successor block? Or is it not possible to partially fill a block from the start either?

Chris Fallin (May 19 2022 at 14:58):

@bjorn3 no, I don't think so, it breaks a number of assumptions for an instruction in one block to put lowered instructions in another.

Can we put a pseudoinstruction in the successor that does these stores? Both at the CLIF level, and then when lowered as well; at the MachInst level the call (or invoke specifically) writes whatever RealRegs, and the instruction at the head of the successor reads them, and regalloc will do the right thing reserving those liveranges

bjorn3 (May 19 2022 at 15:01):

Those stores must not be performed for the case where it unwinds and the respective output values likely shouldn't be considered defined either.

Chris Fallin (May 19 2022 at 15:07):

OK, but a "conditional definition" is not possible with current regalloc, and in general that's really difficult and error-prone; my suggestion above will work properly, I think

Chris Fallin (May 19 2022 at 15:07):

if the stores occur as part of the successor inst, then this makes at least those conditional

bjorn3 (May 19 2022 at 15:08):

I guess I will try to put the store instructions in the invoke machinst pseudoinstruction after the call and before the jump.
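A rough sketch of that last plan, as a toy model rather than actual MachInst lowering: the invoke pseudoinstruction expands to argument copies, the call, return-value copies, and a jump to the normal successor. Since the unwind edge leaves at the call, everything emitted after the call only runs on a normal return. The `Step` enum and both functions are hypothetical names invented for this illustration.

```rust
/// Hypothetical steps inside one expanded invoke pseudoinstruction.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Step {
    CopyArg { index: usize },
    Call { callee: String },
    CopyRetval { index: usize },
    Jump { target: String },
}

/// Expansion order: argument copies, call, return-value copies, jump.
fn lower_invoke(callee: &str, num_args: usize, num_rets: usize, normal: &str) -> Vec<Step> {
    let mut steps = Vec::new();
    for index in 0..num_args {
        steps.push(Step::CopyArg { index });
    }
    steps.push(Step::Call { callee: callee.to_string() });
    // These copies sit after the call, so the unwind path never executes them.
    for index in 0..num_rets {
        steps.push(Step::CopyRetval { index });
    }
    steps.push(Step::Jump { target: normal.to_string() });
    steps
}

/// The steps that have run when control leaves via the unwind edge: everything
/// up to and including the call, nothing after it.
fn steps_before_unwind_edge(steps: &[Step]) -> &[Step] {
    match steps.iter().position(|s| matches!(s, Step::Call { .. })) {
        Some(i) => &steps[..=i],
        None => steps,
    }
}
```

In this model the return-value copies never appear in `steps_before_unwind_edge`, which is the property bjorn3 asks for: the stores are skipped when the call unwinds. Whether the register allocator can treat the outputs as undefined on the landing pad path is a separate question that the sketch does not address.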


Last updated: Jan 24 2025 at 00:11 UTC