alexcrichton opened PR #7131 from alexcrichton:rv64-more-constants to bytecodealliance:main:
Currently any 32-bit constant can be materialized without a load from a constant pool on RISCV-64, but constants larger than this are always loaded from the constant pool. This commit adds another special case for loading constants, which appears to match what LLVM does: consider materializing a smaller constant and then shifting it left.
This is done by chopping off all trailing zeros from an immediate and then testing whether the result can be materialized as a 32-bit constant. This means that the current constant loading sequence can optionally be followed by a trailing slli instruction to shift the zeros back into the constant. Notably, this means that loading i64::MIN (1 << 63) no longer falls back to the constant pool.
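A minimal sketch of the check described above, written as a standalone Rust function rather than the actual Cranelift code. The name split_shifted_imm and its return shape are hypothetical and chosen only for illustration: it strips the trailing zeros and reports the 32-bit base plus the slli amount when the split works.

```rust
/// Hypothetical helper (not the Cranelift API): split `value` into a base
/// that fits in a signed 32-bit constant plus a left-shift amount.
/// Returns `None` when the base still needs more than 32 bits.
fn split_shifted_imm(value: u64) -> Option<(i32, u32)> {
    // Zero has no set bits to anchor the shift on, so handle it up front.
    if value == 0 {
        return Some((0, 0));
    }
    // Number of zero bits that a trailing `slli` can shift back in.
    let shift = value.trailing_zeros();
    // Arithmetic shift keeps the sign, so 1 << 63 becomes -1 rather than 1.
    let base = (value as i64) >> shift;
    if base >= i32::MIN as i64 && base <= i32::MAX as i64 {
        Some((base as i32, shift))
    } else {
        None
    }
}
```

Under these assumptions, split_shifted_imm(i64::MIN as u64) yields a base of -1 with a shift of 63, while a constant with more than 32 significant bits after stripping zeros returns None and would still need the constant pool.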
alexcrichton requested elliottt for a review on PR #7131.
alexcrichton requested wasmtime-compiler-reviewers for a review on PR #7131.
alexcrichton requested afonso360 for a review on PR #7131.
afonso360 submitted PR review:
LGTM! :+1:
alexcrichton updated PR #7131.
alexcrichton has enabled auto merge for PR #7131.
alexcrichton merged PR #7131.
a1phyr submitted PR review.
a1phyr created PR review comment:
lui accepts 20-bit immediates, so this could probably be optimized further:
    lui a0, 0xffff
    slli a0, a0, 0x14
afonso360 created PR review comment:
Oh right, that's a good catch! Since we always test materializing the immediate at position 0, there could be another shift amount that would produce a better immediate generation sequence.
I wonder how we could better match these other cases.
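One possible way to match the lui case, sketched in Rust. This is an illustration only and not necessarily what #7139 implements; the function name split_lui_slli is hypothetical. The idea is that lui already places its immediate at bit 12, so 12 of the trailing zeros can be left to lui and only the remainder needs a slli.

```rust
/// Hypothetical helper (not the Cranelift API): find `(imm20, shift)` such
/// that `lui rd, imm20` followed by `slli rd, rd, shift` produces `value`.
/// `imm20` is returned as the signed interpretation of the 20-bit field.
fn split_lui_slli(value: u64) -> Option<(i32, u32)> {
    let tz = value.trailing_zeros();
    // `lui` writes zeros into the low 12 bits, so at least 12 trailing
    // zeros are required; zero itself is excluded (tz would be 64).
    if value == 0 || tz < 12 {
        return None;
    }
    // Strip every trailing zero; the arithmetic shift preserves the sign.
    let imm = (value as i64) >> tz;
    // The immediate must fit in a signed 20-bit field.
    if imm < -(1 << 19) || imm >= (1 << 19) {
        return None;
    }
    // `lui` contributes 12 bits of shift; `slli` supplies the rest.
    Some((imm as i32, tz - 12))
}
```

For 0xffff_0000_0000 this returns an immediate of 0xffff and a shift of 20 (0x14), which is the two-instruction sequence suggested in the review comment.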
afonso360 submitted PR review.
a1phyr submitted PR review.
a1phyr created PR review comment:
Opened #7139
Last updated: Nov 22 2024 at 17:03 UTC