Stream: git-wasmtime

Topic: wasmtime / issue #3427 Implementation strategy for the Ex...


view this post on Zulip Wasmtime GitHub notifications bot (Oct 09 2021 at 16:45):

whitequark opened issue #3427:

I've been interested in having the Exception Handling proposal supported in Wasmtime, so I looked into possible ways to implement it. There's been some prior discussion in issues #2049 and #1677, but those issues focus on details that I think are less important than a high-level strategy.

As far as I see, the core difficulty in implementing this feature is that Wasmtime currently has only one non-local control flow construct, traps, and the only way to catch traps is the scoped wasmtime_setjmp construct implemented in C. There is no particularly good way to use this construct to implement the Wasm exception handling opcodes; it is a C function that manages resources opaque to the rest of the runtime. You could translate try blocks to wasmtime_setjmp calls by splitting each Wasm function that uses exception handling into many Cranelift functions, but this is a very complex transformation that would interfere with optimizations on the fast path, and I expect that no one wants that.

So, a different non-local control flow construct is necessary for exception handling. I see two options here:

1. Reusing the existing OS-specific stack unwinding mechanism (SEH on Windows, DWARF elsewhere), and
2. Implementing a new Cranelift-specific stack unwinding mechanism.

Option (1) requires a large amount of platform-specific work. Wasmtime does already emit DWARF and SEH tables to be able to capture backtraces with Wasm frames, but a lot more work is necessary to extend that functionality to cover exception handling.

In this case, exceptions and traps would use disjoint mechanisms, which naturally aligns with the semantics specified in the proposal.

Option (2) makes it possible to use a mostly platform-independent mechanism. It doesn't make a lot of sense to implement a new zero-cost exception handling strategy (you're better off using DWARF/SEH in that case), and the other major approach is SjLj. For example, Wasmtime could maintain a linked list of registered exception handlers in VM context, and a function that has try blocks would append an entry to this list in the prologue, containing the frame pointer and the address of the basic block that dispatches an in-flight exception for a particular try body. (This is a bit similar to 32-bit SEH.) Then, on any control flow into or out of a try body, the address recorded in the entry would be updated. The throw instruction would capture the exception parameters and set SP and IP to the ones in the head of the list, while Cranelift would have to make sure that any SSA values live in the dispatch block are allocated to stack slots.

This option could actually eliminate the dependency on C setjmp/longjmp functions and unify unwinding due to exceptions and traps. However, while it requires less platform-specific work, it is more costly at runtime, and maybe not a good fit for Wasmtime in the long run.
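
The handler list described for option (2) can be modeled in a few lines. This is only a toy sketch under stated assumptions: `VmContext`, `HandlerEntry`, and every method name are hypothetical, plain integers stand in for frame pointers and block addresses, and nothing here is Wasmtime or Cranelift API.

```rust
// Toy model of the option (2) handler list. All names are hypothetical.

/// One entry per active function that contains `try` blocks.
#[derive(Debug, Clone, Copy, PartialEq)]
struct HandlerEntry {
    /// Stand-in for the frame pointer of the registering function.
    frame_pointer: usize,
    /// Stand-in for the address of the dispatch block of the active `try`
    /// body; `None` while control is outside every `try` body.
    dispatch_block: Option<usize>,
}

/// Stand-in for the VM context that owns the list (innermost entry last).
#[derive(Default)]
struct VmContext {
    handlers: Vec<HandlerEntry>,
}

impl VmContext {
    /// Function prologue: register an entry for this frame.
    fn enter_function(&mut self, frame_pointer: usize) {
        self.handlers.push(HandlerEntry { frame_pointer, dispatch_block: None });
    }

    /// Function epilogue: unregister this frame's entry.
    fn leave_function(&mut self) {
        self.handlers.pop();
    }

    /// Control flow into or out of a `try` body: update the recorded address.
    fn set_dispatch_block(&mut self, dispatch_block: Option<usize>) {
        if let Some(entry) = self.handlers.last_mut() {
            entry.dispatch_block = dispatch_block;
        }
    }

    /// `throw`: discard frames that are not inside a `try` body and return
    /// the (frame pointer, dispatch block) pair to resume at, if any.
    fn throw(&mut self) -> Option<(usize, usize)> {
        while let Some(entry) = self.handlers.last().copied() {
            if let Some(block) = entry.dispatch_block {
                return Some((entry.frame_pointer, block));
            }
            self.handlers.pop();
        }
        None
    }
}
```

Note that `enter_function` and `set_dispatch_block` run whether or not anything is ever thrown, which is where the extra runtime cost of this approach comes from.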

What do you think?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 09 2021 at 18:02):

bjorn3 commented on issue #3427:

> Option (1) requires a large amount of platform-specific work. Wasmtime does already emit DWARF and SEH tables to be able to capture backtraces with Wasm frames, but a lot more work is necessary to extend that functionality to cover exception handling.

It will have to be implemented anyway at some point for cg_clif. In addition, I don't think it is much more complex than option (2). Both options require adding support for alternative exits from a call where all registers are clobbered and where the exit destination can be placed wherever Cranelift wants. Once that is implemented for DWARF (option (1)), all that is needed is to register the location in the .gcc_except_table section (or a custom format if preferred) and to write a personality function (or copy the Rust one if using the .gcc_except_table format). For option (2) there also needs to be code generated to update the linked list at every point.

> Wasmtime could maintain a linked list of registered exception handlers in VM context, and a function that has try blocks would append an entry to this list in the prologue, containing the frame pointer and the address of the basic block that dispatches an in-flight exception for a particular try body. (This is a bit similar to 32-bit SEH.)

This hurts performance even when not raising any exceptions. The Unix world switched from SjLj to DWARF unwinding for a reason.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 09 2021 at 18:13):

whitequark commented on issue #3427:

You still need to implement the SEH parts for option (1). But the rationale makes sense.

Closing this in favor of #1677.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 09 2021 at 18:13):

whitequark closed issue #3427:

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:17):

cfallin commented on issue #3427:

Hi @whitequark -- thank you so much for starting to look into this! Wasm EH will be an important feature to support for a bunch of reasons and I'm happy to hear you're interested in its implementation as well!

I do think that there is some more discussion that should happen with the relevant Wasmtime and Cranelift folks on this issue before we decide to go with one option or the other. What @bjorn3 says above regarding platform-native code generators (such as cg_clif) is true -- in such cases, it's best to use the platform-native mechanisms -- but in the scope of Wasmtime's VM, it's definitely not clear to me at least that this is the best option.

As one possibly useful anecdotal data point, SpiderMonkey implements exception unwinding (in JS and Wasm) via a custom frame format and unwinder, and this gives the engine runtime both platform orthogonality (they don't need to make every detail compatible with both SEH and DWARF unwind) and ability to optimize and tweak as needed. Note that the unwinding is still based on a PC-table lookup, afaik, so it doesn't require separate generated code to dynamically maintain a linked list.
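
To illustrate what a PC-table lookup means in contrast to a dynamically maintained list, here is a minimal sketch. The table layout and the names `TryRange` and `find_handler` are assumptions made for illustration; this is not SpiderMonkey's or Wasmtime's actual format, and nested `try` ranges are deliberately not modeled.

```rust
// Hypothetical PC-range handler table; layout and names are assumptions.

/// A code range covered by a `try` body and the handler that serves it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TryRange {
    start_pc: usize,   // inclusive
    end_pc: usize,     // exclusive
    handler_pc: usize, // where to resume if a throw happens inside the range
}

/// Find the handler for `pc`. `table` must be sorted by `start_pc` and its
/// ranges must not overlap. Nothing is executed on the non-throwing path:
/// the table is only consulted once an exception is in flight.
fn find_handler(table: &[TryRange], pc: usize) -> Option<usize> {
    // Number of ranges that start at or before `pc`.
    let idx = table.partition_point(|range| range.start_pc <= pc);
    // The last such range is the only candidate that can contain `pc`.
    let candidate = table.get(idx.checked_sub(1)?)?;
    if pc < candidate.end_pc {
        Some(candidate.handler_pc)
    } else {
        None
    }
}
```

An unwinder built this way would call something like `find_handler` once per frame with that frame's return address, moving to the caller's frame whenever the lookup returns `None`.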

I have some interest in cleaning up our unwinding story in general; in #2459 we discussed ways to potentially do GC stack-walking without relying on libunwind. IMHO, after implementing the DWARF and SEH unwind info generation for the new backends just far enough to enable stack traces and stackwalking, we lose a lot of flexibility trying to be generic over both; I had to go through some contortions to find a stackframe format that would be describable by both; and there's still a latent worry I have at least with placing the metadata path on the critical path for correctness, vs. a JIT frame format that we tightly control. libunwind in practice also has tended to be quite slow, compared to a custom walk that one could implement. For EH that's less important but it is something to consider I think.

I don't mean to say that we also must solve the above (GC) issue; just that there are multiple reasons why "build our own JIT-frame format" is interesting, both in terms of prior art (SpiderMonkey) and other things it will also clean up.

All of this is really to say that, at least from my point of view, and I think other Wasmtime folks' as well, this isn't quite a closed decision yet, and it would be great to discuss further!

The Wasmtime biweekly call might be a good venue for that; would you be interested in joining and discussing this (the next one is this Thursday at 16:00 UTC; I think @tschneidereit manages the event and you'd definitely be welcome, as is any other contributor who is reading here)? Alternately we can continue to discuss in this issue, of course.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:17):

cfallin reopened issue #3427:

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:21):

cfallin edited a comment on issue #3427:

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:38):

alexcrichton commented on issue #3427:

One part I would add to what @cfallin already mentioned is that I think designing an implementation in Wasmtime for the exceptions proposal would be a great opportunity to rethink traps and their implementation. I don't think that the setjmp/longjmp strategy is set in stone at all and it has a significant downside in that it's got a fair deal of overhead entering into WebAssembly from the host (need to call setjmp and save regs, currently forces calling a foreign function that isn't inlined in optimized builds, etc).

Personally what I would shoot for is that the exception handling proposal would be zero-cost (or very close to zero) for wasm code which doesn't throw exceptions and then traps would use the same model, ideally making them zero-cost as well to enter from the host.

I'm not personally familiar with how other engines implement exceptions but I suspect we can draw a lot of inspiration from them. I'd be initially dubious that DWARF/SEH are our best options for implementing the exception handling proposal, but I wouldn't necessarily rule it out at the same time. I think it'd be good to gather information first and weigh pros/cons of various implementation strategies.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:45):

bjorn3 commented on issue #3427:

> The Wasmtime biweekly call might be a good venue for that; would you be interested in joining and discussing this (the next one is this Thursday at 16:00 UTC; I think @tschneidereit manages the event and you'd definitely be welcome, as is any other contributor who is reading here)? Alternately we can continue to discuss in this issue, of course.

I would like to join.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 16:53):

tschneidereit commented on issue #3427:

> I would like to join.

Invite sent!

Something I want to highlight: it's definitely not necessary to join this or any other calls in order to weigh in on this or similar decisions. We're more than happy to discuss here, on Zulip, or if that seems appropriate for the decision, in RFCs.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 17:50):

fitzgen commented on issue #3427:

To echo what others have said: thanks for filing an issue with lots of great context @whitequark.


@alexcrichton @cfallin: to be clear, we don't intend to ever unwind native frames, right?

That is, if we have a stack like wasm --call--> host --call--> wasm and the youngest Wasm frame on the stack throws an exception, we intend to return that as an error variant to the host, right? I am 99% sure the answer is "yes" but if it is ever "no" then we won't really have a way to avoid dealing with native DWARF/SEH unwinding for this stuff.

I guess this seems like the kind of thing we should answer definitively with an RFC.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 17:54):

cfallin commented on issue #3427:

> we don't intend to ever unwind native frames, right?

I think that's probably a good starting point (though this along with all the other details is up for discussion!). Doing otherwise would require us to think pretty carefully about the host/wasm transition in general (including the trap logic, as @alexcrichton mentioned) and the interactions that would have with unwinding. Not to mention that it is a public API change to say that a call from native code back into a Wasm function can throw a system exception and unwind past the caller; e.g. C code might not expect this and might not clean up properly.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 11 2021 at 17:59):

alexcrichton commented on issue #3427:

Yeah especially with C interop I don't think that we'll ever want wasm code to unwind host frames. Even if we do implement this via DWARF or SEH I would expect that all exceptions are caught at the wasm boundary unconditionally and raised from the entry into the host for when wasm calls the host.

This does leak into other API-looking questions, though. My initial naive thought for how we'd represent this is that we'd change all functions that return Result<T, Trap> to return Result<T, anyhow::Error> (like the rest of Wasmtime's APIs) and we'd add a new wasmtime::Exception type to map onto wasm exceptions. That way ? would naturally propagate exceptions/traps in Rust and you could still inspect the results of an invocation for just an Exception or just a Trap if you really wanted to.
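
A rough sketch of how that API shape could look from the embedder's side follows. Everything here is hypothetical: `Trap` and `Exception` are local toy types rather than the real `wasmtime` ones, `Box<dyn Error>` stands in for `anyhow::Error` so that the example needs only the standard library, and `call_wasm` merely pretends to invoke a Wasm function.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-ins for `wasmtime::Trap` and a future
// `wasmtime::Exception`; these are toy types defined only for this sketch.
#[derive(Debug)]
struct Trap(String);

#[derive(Debug)]
struct Exception {
    tag: u32,
}

impl fmt::Display for Trap {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "wasm trap: {}", self.0)
    }
}

impl fmt::Display for Exception {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "uncaught wasm exception with tag {}", self.tag)
    }
}

impl Error for Trap {}
impl Error for Exception {}

// `Box<dyn Error>` plays the role that `anyhow::Error` would play.
type CallResult<T> = Result<T, Box<dyn Error>>;

/// Pretend Wasm call: traps on 0, throws on negative input, else doubles.
fn call_wasm(input: i32) -> CallResult<i32> {
    match input {
        0 => Err(Box::new(Trap("unreachable".to_string()))),
        n if n < 0 => Err(Box::new(Exception { tag: 7 })),
        n => Ok(n * 2),
    }
}

/// Host code propagates both traps and exceptions with `?`.
fn host_function(input: i32) -> CallResult<i32> {
    let doubled = call_wasm(input)?;
    Ok(doubled + 1)
}

/// Callers that care can still tell the two failure kinds apart.
fn describe(result: &CallResult<i32>) -> String {
    match result {
        Ok(value) => format!("ok: {}", value),
        Err(err) => {
            if let Some(exception) = err.downcast_ref::<Exception>() {
                format!("exception with tag {}", exception.tag)
            } else if let Some(trap) = err.downcast_ref::<Trap>() {
                format!("trap: {}", trap.0)
            } else {
                format!("other error: {}", err)
            }
        }
    }
}
```

With `anyhow::Error` the inspection step would use its own `downcast_ref` in the same way; the point of the sketch is only that a single error type lets `?` carry both kinds of failure across the host boundary.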

view this post on Zulip Wasmtime GitHub notifications bot (Mar 23 2022 at 17:47):

alexcrichton commented on issue #3427:

I wanted to write down some further thoughts we've realized recently about libunwind and dwarf exception handling (at least on Linux). Local testing I've done shows that libunwind gets slower as more modules are loaded and is also significantly slower for the first backtrace in the process than subsequent ones:

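| number of modules | first backtrace | second backtrace |
|-------------------|-----------------|------------------|
| 1024              | 12.34ms         | 447.34µs         |
| 2048              | 24.26ms         | 1.49ms           |
| 4096              | 48.74ms         | 3.19ms           |
| 8192              | 95.92ms         | 6.06ms           |
| 16384             | 192.15ms        | 12.98ms          |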

The libunwind being used here is whatever is shipped with glibc, probably the one in libgcc_s. I haven't done much analysis of its own source or why these timings are as slow as they are, but this poses a significant obstacle for embeddings of Wasmtime that want high-performance wasm while also loading lots of modules. Backtrace capturing being on the order of milliseconds is also a far cry from the overhead of calling into wasm, which is on the order of nanoseconds. These performance numbers were the primary motivator for https://github.com/bytecodealliance/wasmtime/pull/3932.
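
As a rough reading of the timings above, dividing each first-backtrace time by the module count suggests the cost grows about linearly with the number of loaded modules. The small sketch below just does that arithmetic; the constant and function names are made up for illustration, and the only data used are the five measurements quoted above.

```rust
// (number of modules, first backtrace in microseconds), copied from the
// measurements above. The names here are hypothetical.
const FIRST_BACKTRACE_US: [(u32, f64); 5] = [
    (1024, 12_340.0),
    (2048, 24_260.0),
    (4096, 48_740.0),
    (8192, 95_920.0),
    (16384, 192_150.0),
];

/// First-backtrace time divided by the number of loaded modules, in µs.
fn per_module_cost_us() -> Vec<f64> {
    FIRST_BACKTRACE_US
        .iter()
        .map(|&(modules, micros)| micros / f64::from(modules))
        .collect()
}
```

Each quotient comes out between 11.5µs and 12.5µs per module, so by this estimate doubling the number of modules roughly doubles the time of the first backtrace.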

It's also worth pointing out that "it's ok for traps to be slow" isn't necessarily a given with WASI's proc_exit(0) being implemented by raising a trap. That means that programs, as part of a normal execution, may raise a trap as they exit the process with a 0 return code. Additionally, libunwind performs quite badly at high parallelism, since unwinding currently takes a global lock in the Rust backtrace crate.

All this is basically to say that while libunwind/dwarf/seh/etc may be appealing from a compiler simplicity point of view, it may not be as appealing from a performance point of view. I believe there are a lot of implementations of libunwind, though, and we can probably investigate some other ones to see if they have reasonable performance compared to whatever the system glibc ships. That may change things, but if it's still somewhat the same we may, for performance reasons, be pushed to mirror SpiderMonkey's or another JS JIT's implementation of unwinding despite the increase in complexity.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 13 2024 at 13:50):

sdeleuze commented on issue #3427:

With the WasmGC support now available, I think this issue is the last blocker to get Kotlin (and Java as the GraalVM team is working on compiling JVM bytecode to WasmGC + EH) running on Wasmtime.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 13 2024 at 16:38):

fitzgen commented on issue #3427:

FWIW, we have an open RFC discussing Wasm exceptions and working towards consensus on implementation strategy and incremental milestones: https://github.com/bytecodealliance/rfcs/pull/36

view this post on Zulip Wasmtime GitHub notifications bot (Oct 13 2024 at 20:16):

bashor commented on issue #3427:

@sdeleuze, you can start experimenting (!) with Kotlin right away but without the ability to throw and catch exceptions.

To prevent using EH by Kotlin/Wasm toolchain, you need to add the following lines to your build.

tasks.withType<org.jetbrains.kotlin.gradle.dsl.KotlinJsCompile>().configureEach {
    compilerOptions.freeCompilerArgs.addAll(listOf("-Xwasm-use-traps-instead-of-exceptions"))
}

Warning: The option was added only to allow earlier experimentation with VMs with limited/lack of EH support. In this mode, throwing an exception will lead to a trap (~program termination).


Last updated: Nov 22 2024 at 16:03 UTC