cfallin opened PR #10087 from cfallin:s390x-one-armed-branch
to bytecodealliance:main
:
This is a followup to #10086, this time removing the one-armed branch variant for s390x. This branch was only used as the default-target branch in the
br_table
lowering. This PR incorporates the branch into theJTSequence
pseudo-instruction. Some care is needed to keep theProducesBool
abstraction; it is unwrapped into itsProducesFlags
and theJTSequence
becomes aConsumesFlags
, so the compare for the jump-table bound (for default target) is not part of the pseudoinst. (This is OK because regalloc-inserted moves never alter flags, by explicit contract; the same reason allows cmp/branch terminators.)Along the way I noticed that the comments on
JTSequence
claimed thattargets
included the default, but this is (no longer?) the case, as the targets are unwrapped byjump_table_targets
which peels off the first (default) separately. Aside from comments, this only affected pretty-printing; codegen was correct.With this, we have no more one-armed branches; hence, this fixes #9980.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
cfallin requested abrown for a review on PR #10087.
cfallin requested wasmtime-compiler-reviewers for a review on PR #10087.
cfallin updated PR #10087.
cfallin commented on PR #10087:
cc @uweigand if you are available to review? Otherwise we're probably OK with the usual reviewer rotation...
uweigand submitted PR review:
Looks generally good to me, but see inline for a couple of comments.
As a longer-term improvement, I'm wondering if the whole default case handling couldn't just be done by common code using regular basic blocks and branch instructions, enabling all the normal optimization machinery? The back-end would then only have to do the actual tablejump (simply assuming that no overflow can happen).
uweigand created PR review comment:
The generated code is slightly worse now as the shifting is done before the branch. This probably doesn't have a very significant impact on the pipeline, but it does increase register pressure. One alternative would be to compare the shifted value instead (assuming there are no overflow considerations).
But I guess this isn't serious enough that it should prevent merging this.
uweigand created PR review comment:
Couldn't we have a
jt_sequence_bool
that implicitly decomposes theProducesBool
like the above? Then we wouldn't have to introduce thebool_producer
/bool_cond
accessors just for this case ...
cfallin updated PR #10087.
cfallin updated PR #10087.
cfallin submitted PR review.
cfallin created PR review comment:
Great idea, done!
cfallin submitted PR review.
cfallin created PR review comment:
Hmm, that's true -- I did the code motion without looking at the instruction semantics in detail to optimize. We can optimize in a followup if needed I suppose.
cfallin commented on PR #10087:
Could I bother someone for a rubber-stamp here (@uweigand's not an official approver) -- @alexcrichton or @fitzgen maybe? Thanks!
cfallin edited a comment on PR #10087:
Could I bother someone for a rubber-stamp here (@uweigand's not an official approver) -- @alexcrichton or @fitzgen maybe? (Or @abrown as original reviewer?) Thanks!
alexcrichton submitted PR review.
cfallin has enabled auto merge for PR #10087.
cfallin merged PR #10087.
Last updated: Jan 24 2025 at 00:11 UTC