alexcrichton opened PR #8719 from alexcrichton:riscv64-store-zero-optimization
to bytecodealliance:main
:
This commit is similar to #8701 in that it adds a special case to
store
operations to use thezero
register when applicable.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
alexcrichton requested abrown for a review on PR #8719.
alexcrichton requested wasmtime-compiler-reviewers for a review on PR #8719.
afonso360 submitted PR review:
Neat!
I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.
afonso360 submitted PR review:
Neat!
I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.
Edit: Yeah, the RISC-V Manual Page 70 states:
For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.
There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.
afonso360 submitted PR review:
Neat!
I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.
Edit: Yeah, the RISC-V Manual (Page 70) states:
For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.
There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.
afonso360 submitted PR review:
Neat!
I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.
Edit: Yeah, the RISC-V Manual (Page 70) states:
For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.
There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here.
Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.
afonso360 submitted PR review:
Neat!
I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.
Edit: Yeah, the RISC-V Manual (Page 70) states:
For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.
There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. This still allows us to use the zero register for floating point compressed stores though!
Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.
alexcrichton updated PR #8719.
alexcrichton commented on PR #8719:
At least one issue I tracked to my misunderstanding of how
istore{8,16,32}
worked. I was inferring the type of store from the operand but that's not correct for those instructions. I'll take a look at the compressed bits now as well.
alexcrichton updated PR #8719.
alexcrichton updated PR #8719.
alexcrichton commented on PR #8719:
Ok I did a bit of an audit of the compressed stores. I added a new test in the zcb.clif test doing all the store widths to both an arbitrary register and the stack. My bugfix around my misinterpretation of
istore8
and friends fixed CI, so there was no "smoking gun" per se to the compressed loads.Compressed loads to arbitrary registers already don't work since that gates on the source being x8-x15 (as you pointed out). We are emitting compressed loads to the stack for
x0
though but at least capstone disassembles it so it seems like it might be intended to work? Do you think I should add a condition to avoid generating the instruction to be safe though to ensure that it doesn't get emitted?This still allows us to use the zero register for floating point compressed stores though!
By this do you mean that we could use a compressed
sw
instruction for a f32 store for example? I haven't done that yet because it would involve changing the type of the store to recognize that which is somewhat orthogonal to this. That being said I actually discovered a bug as well adding the f32/f64 case to the test here. Turns outi64_from_iconst
will matchf32const
andf64const
nodes, so I switched instead tou64_from_iconst
which actually only matchesiconst
nodes.
afonso360 commented on PR #8719:
My bugfix around my misinterpretation of istore8 and friends fixed CI, so there was no "smoking gun" per se to the compressed loads.
Nice! Good to know.
We are emitting compressed loads to the stack for x0 though but at least capstone disassembles it so it seems like it might be intended to work?
For loads we shouldn't be, at least we have that specific check in the emit code
For stores I would agree, If capstone disassembles it it's probably intended to work. It also seems like a very useful instruction to have, so I wouldn't be surprised that it is intended.
Do you think I should add a condition to avoid generating the instruction to be safe though to ensure that it doesn't get emitted?
I think this is fine as is. I'm fairly sure that if it wasn't intended the testsuites would fail in a bunch of places.
By this do you mean that we could use a compressed sw instruction for a f32 store for example?
No, just that the zero_reg in the emit for
Fsd (float store double)
, meansf0
and that is a perfectly normal register, but since we are not restricting the zero reg, it doesn't matter.Turns out i64_from_iconst will match f32const and f64const nodes
That's super weird, and I would not expect that at all.
afonso360 merged PR #8719.
alexcrichton commented on PR #8719:
For loads we shouldn't be, at least we have that specific check in the emit code
Oops sorry I misspoke, I meant stores!
Last updated: Dec 23 2024 at 12:05 UTC