Stream: git-wasmtime

Topic: wasmtime / PR #11210 Cranelift: introduce a function inliner


view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 21:28):

fitzgen requested alexcrichton for a review on PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 21:28):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 21:28):

fitzgen opened PR #11210 from fitzgen:cranelift-inliner to bytecodealliance:main:

This comit adds "inlining as a library" to Cranelift; it does _not_ provide a complete, off-the-shelf inlining solution. Cranelift's compilation context is per-function and does not encompass the full call graph. It does not know which functions are hot and which are cold, which have been marked the equivalent of #[inline(always)] versus #[inline(never)], etc... Only the Cranelift user can understand these aspects of the full compilation pipeline, and these things can be very different between (say) Wasmtime and cg_clif. Therefore, this infrastructure does not attempt to define hueristics for when inlining a particular call is likely beneficial. This module only provides hooks for the Cranelift user to tell Cranelift whether a given call should be inlined or not, and the mechanics to inline a callee into a particular call site when the user directs Cranelift to do so.

This commit also creates a new kind of filetest that will always inline calls to functions that have already been defined in the file. This lets us exercise the inliner in filetests.

Fixes https://github.com/bytecodealliance/wasmtime/issues/4127

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 21:29):

fitzgen requested cfallin for a review on PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 22:10):

cfallin created PR review comment:

Would it make sense to provide a Cow<'a, ir::Function> here? I can imagine use-cases that might want to synthesize a body to inline (for example: intrinsics, perhaps even dynamically generated based on parameters) rather than holding ownership of an existing function elsewhere.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 22:10):

cfallin created PR review comment:

This feels like a somewhat error-prone API -- it imposes an ordering requirement (or rather, the first push_catch is not really a catch but the normal return). Perhaps we could have a separate setter for the normal return entry starting from an empty exception table, and then assert here that the normal return is present?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 22:10):

cfallin created PR review comment:

This seems like a reasonable case to optimize here (will be a very frequent case) but it might be worth noting that it could go away, simplifying logic here, if we add a jump-threading optimization pass later...

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 22:10):

cfallin submitted PR review:

This looks great overall -- thanks a bunch for building this and the care taken in handling all the combinations of kinds of calls!

A few minor thoughts below. vmctx handling in particular is interesting and I'm wondering if we should punt some complexity by leaning on legalization instead; could be convinced either way though.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 09 2025 at 22:10):

cfallin created PR review comment:

A thought -- not sure how I feel about it, but worth noting to see what you think:

The vmctx handling here feels a little awkward, and certainly costs something; what do you think about requiring the inlined function to have been legalized first, and erroring out if we see a global_value operator during inlining?

On the one hand, it's a little less user-friendly (not just any function body can be inlined -- one must ensure it's in a particular state), but on the other hand, it feels much more orthogonal and well-factored from a compiler-pass point of view -- the point of legalization is to lower the implicit function-scoped GVs to ordinary (compositional, fully explicit) operators, so we might as well use that logic.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 10:27):

bjorn3 created PR review comment:

I don't think we can inline try_call at all without frontend specific knowledge. For example for cg_clif, a regular call to _Unwind_Resume is used to resume unwinding to the caller of the current function. The inliner would need to know to replace _Unwind_Resume calls with a jump to the right exception table entry.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 10:27):

bjorn3 submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 10:28):

bjorn3 edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:03):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:03):

fitzgen created PR review comment:

Good idea, will do :+1:

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:08):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:08):

fitzgen created PR review comment:

Yeah I wavered back and forth on this myself, but ultimately it didn't seem too hard to support global values, even with the different-vmctx case, so I went ahead and did it. I guess unless you feel strongly that we should require legalization before inlining, I would rather not put effort into changing what is already there, given that I don't think there is a clear winner between the two choices. But I am willing to make this change if you really think it is the better approach.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:08):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:08):

fitzgen created PR review comment:

True, will add a comment

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:10):

fitzgen created PR review comment:

Will add the assert. I don't have any use case for actually building a new exception table from scratch after it has been cleared, and no other place in the code base does either, so I'll avoid the push-normal-return method for now.

I think ideally this type would internally have the equivalent of enum { Cleared, Full(...) } so that if we have the vecs then we know we aren't cleared and can always rely on their length relationship invariant, but that is a bit outside the scope of this PR, IMO.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:10):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:13):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:13):

fitzgen created PR review comment:

without frontend specific knowledge

I think that should be fine because the frontend is in control of deciding whether to inline any given call, and has the information about which function is being called and with which call opcode is being used (try_call vs regular call vs return_call), so the frontend can simply tell Cranelift not to inline any call that would be problematic in the ways you describe, or in any other ways. This inliner just needs to be able to do the actual inlining, given the precondition that it is valid to do said inlining (which is what the frontend is promising by saying "yes inline this, here is a body").

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:42):

fitzgen submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:42):

fitzgen created PR review comment:

(Furthermore, because there is no "throw" instruction in CLIF, the only way of throwing an exception is via a call into some runtime function (libunwind, the Wasmtime runtime, C++'s _Unwind* runtime, etc...), and therefore it is invalid to inline that particular call because then the throw could not happen, which breaks the equivalent-semantics promise that the frontend is making by saying "please inline this call with this callee body." On the other hand, it is fine to inline a try_call of a function that does not throw and does not make any calls that could throw; in this case we won't end up using the original try_call's exception edges at all, which is the correct thing to do as those exception edges are unreachable from the inlined call's body -- although the exception edges' target blocks may be reachable from other edges -- as we can statically see that there is no call that could throw and unwind along those exceptional edges.)

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 16:57):

fitzgen updated PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 17:00):

fitzgen has enabled auto merge for PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 17:04):

cfallin submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 17:04):

cfallin created PR review comment:

On a little further thought: I do think it would actually be better to remove the vmctx-specific logic here, for a few maintenance- and risk-related reasons:

Finally I suspect that this requirement wouldn't place too much burden on "typical" use-cases: either one is doing a bottom-up-on-SCC-callgraph general inlining pass, and one will have done optimizations (which start with legalization) already on bodies that are considered for inlining; or one is building a specific compile-time builtin / intrinsification mechanism and has full mutable access to the special function bodies one is inlining as they are prepared. In other words I don't think it will be common to be stuck with a &ir::Function that's not legalized and have to clone it or whatnot.

All that said: still happy to discuss more if there are things I've missed!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 17:05):

cfallin edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 17:12):

cfallin commented on PR #11210:

(I think your add-to-queue crossed paths with my comment above -- see discussion on vmctx)

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 18:19):

fitzgen updated PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 18:19):

fitzgen has enabled auto merge for PR #11210.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 10 2025 at 18:54):

fitzgen merged PR #11210.


Last updated: Jan 09 2026 at 13:15 UTC