Kmeakin opened PR #9251 from Kmeakin:km/pulley/tail-calls
to bytecodealliance:main
:
Sorry for the delay, I was on holiday.
The proof of concept works, but codegen forbecome
is not actually implemented yet and so triggers an ICE:error: internal compiler error: /rustc/9b72238eb813e9d06e9e9d270168512fbffd7ee7/compiler/rustc_codegen_ssa/src/mir/block.rs:1372:17: `TailCall` terminator is not yet supported by `rustc_codegen_ssa` --> pulley/src/interp/interp_loop.rs:85:25 | 85 | become (next_handler.fun)($state, $pc); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
So for now
return
is used instead ofbecome
and we must hope that LLVM will be smart enough to recongize the tail call
Kmeakin requested fitzgen for a review on PR #9251.
Kmeakin requested wasmtime-default-reviewers for a review on PR #9251.
Kmeakin updated PR #9251.
github-actions[bot] commented on PR #9251:
Subscribe to Label Action
cc @fitzgen
<details>
This issue or pull request has been labeled: "pulley"Thus the following users have been cc'd because of the following labels:
- fitzgen: pulley
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
alexcrichton commented on PR #9251:
Thanks for this! I might jump in with a small recommendation on overall structure / CI here -- with a Cargo feature controlling this it unfortunately doesn't play well with our "test with all features enabled" in CI well because it enables the feature when a stable compiler is in use. To fix this I might recommend
#[cfg(pulley_tail_call)]
perhaps? That would require settingRUSTFLAGS
in CI but that could be added to the "Monolith Checks" perhaps as a singlecargo check
of pulley on nightly? That would also enable this to land despite codegen support not being in upstream rust-lang/rust yet.
fitzgen commented on PR #9251:
Thanks! I'll take a look either later today or tomorrow.
Mildly surprised about
become
not working anymore, as it was working when I did a few experiments around the time of the RFC. I guess rustc is doing a new ssa thing in the midend and hasn't extended support tobecome
yet?
fitzgen commented on PR #9251:
FWIW, it looks like https://github.com/rust-lang/rust/issues/127520 was filed about this regression, but closed as WONTFIX, so I also cross-posted the bug to the
explicit_tail_calls
tracking issue: https://github.com/rust-lang/rust/issues/112788#issuecomment-2353680606
WaffleLapkin commented on PR #9251:
@fitzgen I'm not sure how it could have worked for you --
become
was never fully implemented. (maybe you tried to use it when it was just an alternative syntax forreturn
?)either way the reason why https://github.com/rust-lang/rust/issues/127520 was WONTFIXED is that tails calls are simply not implemented yet, they are an incomplete feature. The error even says as much:
TailCall
terminator is not yet supported byrustc_codegen_ssa
there is no reason to bring any more awareness to this issue -- it will be fixed naturally once the implementation is complete.
fitzgen submitted PR review:
Looks great! Really happy with how this shaped up.
Feel free to split out subcommits if you want, but it isn't necessary. Not sure they will actually land first at this point, since the final commit doesn't have as much feedback. Probably good to do for future PRs tho, just to keep things nice and easy for reviewers :+1:
Big thing needed to land this is integrating with our CI infrastructure and replacing the cargo feature, as Alex mentioned.
Thanks!
fitzgen created PR review comment:
Ditto for this second commit: it could also be split out and land sooner.
Also, can this
debug_assert!(!return_addr.is_null())
?
fitzgen created PR review comment:
Nice! This first commit could be split out into its own PR and probably land sooner than the following bits, if you want.
fitzgen created PR review comment:
Maybe some docs for the module, even if we are skipping them for the individual functions?
fitzgen created PR review comment:
FWIW, the size of
ControlFlow<Done>
is going to be bigger than the size ofContinuation
was, because there will be twoenum
discriminant tags instead of one. But that is fine, and we can measure performance down the line and see if this actually matters. I suspect it might matter if it crosses an ABI threshold where returning these values goes from being in a register to going through a return pointer to a caller-supplied, stack-allocated space. But, as I said, we can always measure later.
fitzgen created PR review comment:
Can this take a generic
T: BytecodeStream
and return aResult<(...), T::Error>
? No need to foist the unsafety upon everyone. We can makeunwrap_uninhabited
a method on an exported extension trait, if we want to make it generally available to users of the decoder.pub trait UnwrapUninhabited<T> { fn unwrap_uninhabited(self) -> T; } impl<T> UnwrapUninhabited<T> for Result<T, Uninhabited> { ... } ... pub fn $snake_name<T>(pc: &mut T) -> Result<($($($field_ty,)*)?), T::Error> { ... }
fitzgen created PR review comment:
For caller ergonomics, this could still return a
ControlFlow::Continue(())
. This way, we wouldn't need to update every call site to also return that value.
fitzgen created PR review comment:
As Alex mentioned, to better integrate with our CI infrastructure (which passes
--all-features
to tests), let's replace allfeature = "tail_calls"
incfg(...)
s withcfg(pulley_tail_calls)
andcfg(not(pulley_tail_calls))
.Then we can turn it on and off via the
RUSTFLAGS
env var, for exampleRUSTFLAGS="--cfg pulley_tail_calls" cargo test --all-features
And then by default CI will test the regular-calls version, but we can also add a step to the "Monolith Checks" CI job that sets the above
RUSTFLAGS
to check that the tail-calls version compiles.
fitzgen created PR review comment:
Nice! I think this worked out really well, especially how each opcode handler doesn't need to explicitly decode immediates/operands, and everything Just Works
fitzgen created PR review comment:
And then we can actually use
become
here. I've tested locally andcargo check
works withbecome
, and avoids the ICE.
fitzgen created PR review comment:
Although maybe we want another
cfg(pulley_become)
or something so that if people want to rely on LLVM to optimize calls in tail position into tail calls, then they can?
Kmeakin submitted PR review.
Kmeakin created PR review comment:
ControlFlow<Done, ()>
andContinuation
are both 1 byte (it seems rustc is smart enough to use the "unused discriminants" fromDone
as niches forControlFlow
).
ControlFlow<Done, OpcodeHandler>
is 2 words. The alternative would be to returnControlFlow<Done, ()>
and set the next handler via an&mut OpcodeHandler
parameter, but I expect that might be harder for LLVM to optimize. In the worst case, returning a large aggregate will be lowered to returning via an out-parameter anyway, so it shouldn't be any worse performance wise
Kmeakin submitted PR review.
Kmeakin created PR review comment:
We could also squeeze
ControlFlow<Done, OpcodeHandler>
into a single word by using the lower 2 bits as the discriminant tag, since function pointers must be word aligned. I think there was a proposal for rustc to do this automatically but it has not yet been implemented.
Kmeakin submitted PR review.
Kmeakin created PR review comment:
cargo check
works because it doesn't actually generate LLVM/assembly, but if you trycargo build
orcargo test
it will
Kmeakin updated PR #9251.
fitzgen submitted PR review.
fitzgen created PR review comment:
Yep, that's the point: to be using
become
and still havecargo check
type check the Rust code that wecfg
on.Not sure if it is actually implemented yet, but that should check that we aren't relying on any
Drop
implementations running after the tail call (which would preclude it from being an actual tail call instead of a call in tail position).
fitzgen edited PR review comment.
Kmeakin updated PR #9251.
Kmeakin updated PR #9251.
Kmeakin submitted PR review.
Kmeakin created PR review comment:
So run
cargo check
/cargo clippy
withRUSTFLAGS="--cfg pulley_tail_calls"
, but notcargo test
?
Kmeakin updated PR #9251.
Kmeakin updated PR #9251.
Kmeakin submitted PR review.
Kmeakin created PR review comment:
Also, can this debug_assert!(!return_addr.is_null())?
NonNull::new_unchecked
already performs the check in debug builds: https://doc.rust-lang.org/1.81.0/src/core/ptr/non_null.rs.html#217
Kmeakin updated PR #9251.
fitzgen submitted PR review.
fitzgen created PR review comment:
Yeah just add a line something like
# Make sure that the tail-calls version of Pulley checks okay. - run: cargo check -p pulley-interpreter env: RUSTFLAGS: "--cfg pulley_tail_calls"
Kmeakin updated PR #9251.
Kmeakin updated PR #9251.
Kmeakin updated PR #9251.
fitzgen commented on PR #9251:
Once https://github.com/bytecodealliance/wasmtime/pull/9274 lands, is this ready to merge as well?
fitzgen commented on PR #9251:
Once #9274 lands, is this ready to merge as well?
@Kmeakin, now that #9274 landed, is this ready to rebase and land as well, or are there still more things left to do first?
Kmeakin commented on PR #9251:
Once #9274 lands, is this ready to merge as well?
@Kmeakin, now that #9274 landed, is this ready to rebase and land as well, or are there still more things left to do first?
I think it is ready to land, I don't have anything else I want to add to it.
Kmeakin updated PR #9251.
fitzgen submitted PR review:
Thanks!
fitzgen updated PR #9251.
fitzgen commented on PR #9251:
I pushed a commit moving the new CI check to the nightly tests CI job, since it requires a nightly rustc.
But it looks like there are some failures running pulley tests under miri now as well, taking a quick look.
fitzgen updated PR #9251.
fitzgen commented on PR #9251:
Pushed a fix for the miri tests:
Done::Trap
needed to have the trapping PC as a payload so that we could pass that trapping PC toself.trap(pc)
rather than the first PC passed torun
.
fitzgen has enabled auto merge for PR #9251.
fitzgen merged PR #9251.
fitzgen commented on PR #9251:
Thanks again @Kmeakin!
Rudxain submitted PR review.
Rudxain created PR review comment:
"the the" mistake (typo?)
fitzgen submitted PR review.
fitzgen created PR review comment:
@Rudxain good eye! Care to make a PR fixing it?
fitzgen submitted PR review.
fitzgen created PR review comment:
Nevermind, I see that you already did -- thanks!
Last updated: Dec 23 2024 at 12:05 UTC