cfallin opened PR #10510 from cfallin:lets-be-truly-exceptional to bytecodealliance:main:
This PR adds
try_callandtry_call_indirectinstructions, and lowerings on four of five ISAs (x86-64, aarch64, riscv64, pulley; s390x has its own non-shared ABI code that will need separate work).It extends CLIF to support these instructions as new kinds of branches, and extends block-calls to accept
retNandexnNblock-call args that carry the normal return values or exception payloads (respectively) into the appropriate successor blocks.It wires up the "normal return path" so that it continues to work. It updates the ABI so that unwinding is possible without an initial register state at throw: specifically, as per our RFC, all registers are clobbered. It also includes metadata in the
MachBufferthat describes exception-catch destinations. However, no unwinder exists to interpret these catch-destinations yet, so they are untested.
cfallin requested fitzgen for a review on PR #10510.
cfallin requested wasmtime-compiler-reviewers for a review on PR #10510.
cfallin requested wasmtime-core-reviewers for a review on PR #10510.
cfallin commented on PR #10510:
This builds on #10502; only the last commit is new.
cfallin commented on PR #10510:
Procedural note: I'm happy to go through review on this and try to land it with the "normal return" path working (ah, I should add runtests too!), or we can wait until I've implemented a test unwinder in the
clif-utilfunction runner. I'm happy to go either way.I'm also happy to add a Cranelift feature to gate this off by default until we've got it fully tested, if desired.
cfallin commented on PR #10510:
Also, I haven't built full Wasmtime since rebasing so some newer CLIF-producing code needs to be updated; I'll put this in draft until then (and the runtests).
cfallin updated PR #10510.
fitzgen commented on PR #10510:
Procedural note: I'm happy to go through review on this and try to land it with the "normal return" path working (ah, I should add runtests too!), or we can wait until I've implemented a test unwinder in the
clif-utilfunction runner. I'm happy to go either way.Testing the normal path only for the moment seems fine by me.
Will take a look at this PR tomorrow.
cfallin updated PR #10510.
fitzgen submitted PR review:
Looks great! A bunch of nitpicks and a couple more serious comments below.
fitzgen created PR review comment:
Nitpick: it shouldn't catch any thrown exception, just the exceptions specified in its exceptions table, so maybe rename this to
Call a function, catching the specified exceptions.
fitzgen created PR review comment:
I don't think we even need to specify this do we? The embedder can choose whether the reserved value is "catch all", or instead use
tag0, or even whether they have a "catch all" tag at all or not.It may still make sense to explicitly communicate this freedom by saying something like
The embedder may reserve a particular tag (for example, the reserved value or
tag0) to mean "catch all".
fitzgen created PR review comment:
Not prepended anymore, right? There is a special kind of block call parameter to specify where they go, no? (I haven't read through everything yet)
fitzgen created PR review comment:
If this is already going to be a generic function over the type of
args, it makes sense to also be generic over theItem, which (withFrom<Value> for BlockCallandFrom<&'a Value> for BlockCallimpls) should make a bunch of call sites a little less noisy:pub fn block_call<T>( &mut self, block: Block, args: impl IntoIterator<Item = T>, ) -> BlockCall where T: Into<BlockArg> {
fitzgen created PR review comment:
Maybe we should put the normal return last, so that
tags[i]is fortargets[i]?And maybe we should make the members of this struct not-
pub(maybe not evenpub(crate)?) and only expose accessors that we can ensure are correct, use the right indices, etc...? This would mean that only this module needs to check/maintain the length invariant, or be concerned with correct indexing, not all code that interacts exception tables.
fitzgen created PR review comment:
Care to just put this in the
entity_impl!macro? Then we could implement the constructors that assert thatn != u32::MAXwith unwrapping a call to this.
fitzgen created PR review comment:
Idle comment, not suggesting we actually do this or anything, but we could do a ref-counting + CoW kind of thing here. Not really worth the complexity tho.
fitzgen created PR review comment:
Ditto about making the iterator's
Itemgeneric here.
fitzgen created PR review comment:
And then this could be
pub fn append_argument(&mut self, arg: impl Into<BlockArg>, pool: &mut ValueListPool) {as well
fitzgen created PR review comment:
"ex" feels a little too brief and potentially ambiguous, maybe
entity_impl!(ExceptionTable, "exn_table");?
fitzgen created PR review comment:
This is the one thing that cranelift is constraining on the embedder's runtime: it must supply the exception values in the expected registers and this is inherently ISA specific. I don't know how to be document and communicate this. Maybe just a list of each ISA's registers in the docs for the
try_callinstruction? Or maybe in the docs for exception tags?But it is also per-calling convention, so maybe on the calling convention's docs? Bit of a mess...
fitzgen created PR review comment:
And here as well
fitzgen created PR review comment:
Would be kinda nice if
PRegSethad awith_rangemethod or something like that. I've vaguely wanted that method before, and this function would sure appreciate that method's existence.
fitzgen created PR review comment:
&Inst::Call { ref info } if info.try_call_info.is_some() => MachTerminator::Direct,right?
fitzgen created PR review comment:
It looks like we are laying the code out such that the exceptional blocks are coming before the normal return blocks here. Ideally the call would fall through to the normal return block. Obviously, with blocks that are both a
try_call's normal return and the successor of many other blocks, this isn't always possible/desirable, but that is not the case in this example, which means we have room for improvement.This isn't anything to fix before this PR can land, but is something we should think about for follow ups.
fitzgen created PR review comment:
Looks like the exception payload registers for each handler are not included in here. But I think they should be, or else the unwinder implementation will have to just know what we happen to use since, while system-v documents this, there is no such existing docs/constraints for our
tailconvention. We could even theoretically let regalloc pick the registers, if we wanted.
fitzgen created PR review comment:
It would also be good to test when both normal return blocks and exceptional return blocks don't only have
try_callpredecessors, but also regular control-flow predecessors.
fitzgen created PR review comment:
I guess we could include this in the unwind table data structures returned by cranelift after compilation. That actually seems easiest to me. Maybe that is indeed what is done and I haven't seen it yet or overlooked it.
fitzgen created PR review comment:
It would also be nice to have a runtest to exercise the normal return path, even before we have a filetest unwinder to exercise exceptional paths.
fitzgen created PR review comment:
Maybe some kind of mention here that it is the user's responsibility to coordinate with the runtime on the meaning of tags and how look up is done, and that all of that is outside of Cranelift's purview, itself here.
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Unfortunately this seems to create a ton of noise elsewhere: it means that
ins.jump(target, &[])leads to a type-inference error, becauseTis now unconstrained. I've tried a bunch of things here including a custom traitIntoBlockArgsetc and this seems to be the least-bad; other ideas welcome...
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin submitted PR review.
cfallin created PR review comment:
Ah yes, fixed, sorry, this is old text. Fixed.
cfallin submitted PR review.
cfallin created PR review comment:
Unfortunately runs into the same unconstrained-generic problem with empty args list as above...
cfallin submitted PR review.
cfallin created PR review comment:
Indeed; that's a type from
regalloc2so this would need a separate PR and release roundtrip, so unfortunately not in this PR, but will file this away...
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin submitted PR review.
cfallin created PR review comment:
Yes, that's true; bigger change for sure, and we'd want to do it for jumptables for consistency. Might be worth it if we end up with large lists of handler locations on lots of callsites someday.
cfallin submitted PR review.
cfallin created PR review comment:
That's technically true, but there are a number of reasons that pushed me toward "fixed payload registers" and "not listed in metadata":
- Letting these be dynamic and allocating them with regalloc is pretty complex, because non-fixed-reg-but-still-reg-constrained defs on calls are error-prone -- we have to carefully poke holes in the clobbers to try to force regalloc to use those holes, then we have to assert that it did the right thing (because otherwise everything is clobbered). We really want a "def after clobbers" constraint but that is a weird thing in its own way.
- Naming different registers per exception handler record per callsite is going to be quite bloated -- right now we have three
u32s here, vs. 5.- Naming a variable-length list of registers (we don't specify that there are exactly e.g. two payload regs; that's a property of the platform and calling convention) is also complex: do we have an owned smallvec/vec per record here? separate list and ranges in it?
- Letting the register choice be dynamic means that the unwinder needs a way to take an index and set an arbitrary register, rather than a fixed one. That rules out small handwritten longjmp-like trampolines/stubs (set rsp/rbp/rax and jump).
So basically it seemed like too much generality, leading to far too much complexity, vs. saying that each platform gets to specify fixed payload regs. I think I'm comfortable with specifying that the payload registers are a fixed choice and this choice becomes part of our public contract (we should document this somewhere; I'll add it to the doc-comments on
try_call/try_call_indirectand/or theCallConvarms...).
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
(comment below -- added docs on
CallConvand a note pointing to those docs ontry_call)
cfallin created PR review comment:
Actually removed this paragraph -- I ended up with
PackedOption<ExceptionTag>to allowNoneto mean "catch-all".
cfallin submitted PR review.
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Good call, went for
extable(underscores don't seem to be used elsewhere in these tags).
cfallin submitted PR review.
cfallin created PR review comment:
Sure, mind if I do this in a separate refactor? This is following the pattern that all the entities use -- happy to do them all at once.
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Interesting -- I took a look at where this is consumed and it seems we don't really need the distinction anymore (it's left over from older one-vs-two-armed conditional branches and branch finalization). So I collapsed direct/indirect/conditional into one "Branch" case.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, great idea, done on both counts. This also had the nice side-effect of putting the normal-return path first in block order (i.e. fallthrough) because the
BlockLoweringOrderprocesses out-edges in reverse order for particular heuristic reasons that I forget...
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Resolved by above!
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin has marked PR #10510 as ready for review.
cfallin commented on PR #10510:
@fitzgen I think I've addressed everything except for the unresolved comments above (see comments, happy to take suggestions). I think #10502 also needs a review (Alex deferred to you per [here]https://github.com/bytecodealliance/wasmtime/pull/10502#pullrequestreview-2733680435); I'll make the update that Ulrich suggests in a separate commit). Thanks!
cfallin updated PR #10510.
cfallin commented on PR #10510:
Rebased on top of latest #10502 (and had to squash the stack of review-feedback commits to avoid too many conflicts-at-every-step but they're still at this branch)
cfallin requested fitzgen for a review on PR #10510.
cfallin requested wasmtime-fuzz-reviewers for a review on PR #10510.
cfallin updated PR #10510.
fitzgen submitted PR review.
fitzgen created PR review comment:
Ah yeah, I wish we could provide default type parameters for functions too, not just types. Oh well.
fitzgen submitted PR review.
fitzgen created PR review comment:
Sure thing, thanks
fitzgen submitted PR review.
fitzgen created PR review comment:
That seems fair. Figuring out a good place to document this for users is going to be annoying.
fitzgen submitted PR review:
Thanks!
fitzgen created PR review comment:
And then somewhere around here is where we would actually implement that check.
fitzgen created PR review comment:
Since we are only defining the exception registers for the
tailcalling convention, can we also check thattry_calling a function of a different calling convention triggers verifier errors? At least for now, we can always loosen this later and document the registers used for other conventions at that point. I double checked thetry_call.clifverifier filetest and didn't see a case for this there.
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin updated PR #10510.
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin submitted PR review.
cfallin created PR review comment:
Done!
cfallin commented on PR #10510:
Thanks! All addressed; while looking over it I noticed I still had
try_call_indirectlowerings as a TODO and added the ~20 lines of final plumbing to wire it up. Separated that as last commit and wanted to flag before merging -- lemme know if good.
cfallin updated PR #10510.
fitzgen submitted PR review.
cfallin has enabled auto merge for PR #10510.
cfallin merged PR #10510.
uweigand commented on PR #10510:
Hi @cfallin , I'm looking to implement this for s390x, and I'm a bit confused about the two (different?) SigRef's in play in
gen_try_call: one passed as direct argument, and one taken from theet. Which one is supposed to be used when? The current code mixed them in unexpected ways:let caller_conv = self.lower_ctx.abi().call_conv(self.lower_ctx.sigs()); let sigref = self.lower_ctx.dfg().exception_tables[et].signature(); let sig = &self.lower_ctx.dfg().signatures[sigref]; let num_rets = sig.returns.len(); let caller = <$abicaller>::from_func( self.lower_ctx.sigs(), sig_ref, &extname, IsTailCall::No, dist, caller_conv, self.backend.flags().clone(), );Why do we compute
num_retsfrom oneSigRefbut pass the other tofrom_func? Also, why is this handled differently betweengen_try_callandgen_try_call_indirect?
bjorn3 submitted PR review.
bjorn3 created PR review comment:
FWIW I had put it in
MachCallSiteinstead, but this should also work fine.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
Cranelift doesn't care about this, right? Only the embedder should care. I don't see why they can't choose to do a different assignment, so should this really be documented and there be a
PackedOptionaround theExceptionTag. Wasmtime could assign exception tag 0 to the catch-all handler, right?
bjorn3 submitted PR review.
bjorn3 created PR review comment:
This isn't currently checked by the verifier, right? It probably should be.
bjorn3 deleted PR review comment.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
Why are all registers clobbered for
try_call? For regular systemv unwinding, all callee saved registers are preserved on the unwind path too.
cfallin submitted PR review.
cfallin created PR review comment:
Right, I suppose the alternative is to give the embedder a raw
u32to plumb through, or let it assign a special sentinel value by convention; I guess one advantage of this approach is that it distinguishes the tags in CLIF so they aren't just bare integers.
cfallin submitted PR review.
cfallin created PR review comment:
Ah, yes, this came out of our discussions regarding the Wasmtime side of the implementation: we didn't want to have to require all registers to be saved and accessible in all trampolines out to the host as a starting-point for the unwind register state (otherwise we are penalizing. all existing programs when the feature is turned on); and we didn't necessarily want to have the unwind-info be load-bearing for correctness. With this approach we can restart at the catch-point without having to set up any register state other than the payload.
This could definitely become a codegen flag for use in native code; not doing that is just an oversight in the initial PR, sorry. I'd be happy to review a PR for that!
bjorn3 submitted PR review.
bjorn3 created PR review comment:
It would be possible to make
ExceptionTaga wrapper around a u32 without niche, right?
bjorn3 submitted PR review.
bjorn3 created PR review comment:
It would have been nice if this accepted
impl IntoIterator<Item = BlockArg>.BlockArgisCopy, so you can use.copied()on the iterator if necessary to convert from an&BlockArgto aBlockArgiterator. However if you are trying to build theBlockArgss on the fly, you are forced to collect them into aVecbefore passing it toblock_callwith the current function signature.
cfallin commented on PR #10510:
Hi @uweigand,
Why do we compute
num_retsfrom oneSigRefbut pass the other tofrom_func? Also, why is this handled differently betweengen_try_callandgen_try_call_indirect?I think this is just an oversight, sorry! The signature ID needs to be in the exception table because
TryCallIndirectdoesn't have space in theInstructionDatadirectly for the ID, as theCallIndirectcase does. IMHO we should uniformly use the version from the exception table in bothtry_callandtry_call_indirecthandling (and thus not take a separate parameter). In the direct-call case, the verifier checks that the exception table's sigref matches the called function's signature, so it should be fine no matter the source we use.
cfallin commented on PR #10510:
(And: please feel free to update this as you need to when building out s390x support for this!)
cfallin submitted PR review.
cfallin created PR review comment:
IIRC, this was necessary to allow
&[a, b]-style arg-lists, which are everywhere in other crates in-repo, to continue to work. Although now that I think about it, I suppose just[a, b]might work withItem = BlockArg. I don't have strong feelings here and I'm happy to review a PR if you can find a way to make it nicer (including with existing callsites)...
Last updated: Dec 13 2025 at 20:04 UTC