Stream: git-wasmtime

Topic: wasmtime / PR #8719 riscv64: Optimize storing zero to mem...


view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 14:42):

alexcrichton opened PR #8719 from alexcrichton:riscv64-store-zero-optimization to bytecodealliance:main:

This commit is similar to #8701 in that it adds a special case to store operations to use the zero register when applicable.

<!--
Please make sure you include the following information:

Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.html

Please ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 14:42):

alexcrichton requested abrown for a review on PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 14:42):

alexcrichton requested wasmtime-compiler-reviewers for a review on PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 15:54):

afonso360 submitted PR review:

Neat!

I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:01):

afonso360 submitted PR review:

Neat!

I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.

Edit: Yeah, the RISC-V Manual Page 70 states:

For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.

There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:02):

afonso360 submitted PR review:

Neat!

I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.

Edit: Yeah, the RISC-V Manual (Page 70) states:

For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.

There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:03):

afonso360 submitted PR review:

Neat!

I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.

Edit: Yeah, the RISC-V Manual (Page 70) states:

For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.

There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here.

Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:04):

afonso360 submitted PR review:

Neat!

I also have a suspicion that the CI failures might be that we are now trying to encode a compressed store with the zero register. I don't remember if this is legal, but I'm going to investigate it a bit.

Edit: Yeah, the RISC-V Manual (Page 70) states:

For many RVC instructions, zero-valued immediates are disallowed and x0 is not a valid 5-bit register specifier. These restrictions free up encoding space for other instructions requiring fewer operand bits.

There is nothing specific about the store instructions falling under these restrictions and It's probably why we don't have the check here. This still allows us to use the zero register for floating point compressed stores though!

Note the regular stores (non-stack based) should not need an update since they use a compressed register which automatically excludes the zero register.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:40):

alexcrichton updated PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 16:41):

alexcrichton commented on PR #8719:

At least one issue I tracked to my misunderstanding of how istore{8,16,32} worked. I was inferring the type of store from the operand but that's not correct for those instructions. I'll take a look at the compressed bits now as well.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 17:05):

alexcrichton updated PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 17:06):

alexcrichton updated PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 17:10):

alexcrichton commented on PR #8719:

Ok I did a bit of an audit of the compressed stores. I added a new test in the zcb.clif test doing all the store widths to both an arbitrary register and the stack. My bugfix around my misinterpretation of istore8 and friends fixed CI, so there was no "smoking gun" per se to the compressed loads.

Compressed loads to arbitrary registers already don't work since that gates on the source being x8-x15 (as you pointed out). We are emitting compressed loads to the stack for x0 though but at least capstone disassembles it so it seems like it might be intended to work? Do you think I should add a condition to avoid generating the instruction to be safe though to ensure that it doesn't get emitted?

This still allows us to use the zero register for floating point compressed stores though!

By this do you mean that we could use a compressed sw instruction for a f32 store for example? I haven't done that yet because it would involve changing the type of the store to recognize that which is somewhat orthogonal to this. That being said I actually discovered a bug as well adding the f32/f64 case to the test here. Turns out i64_from_iconst will match f32const and f64const nodes, so I switched instead to u64_from_iconst which actually only matches iconst nodes.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 17:49):

afonso360 commented on PR #8719:

My bugfix around my misinterpretation of istore8 and friends fixed CI, so there was no "smoking gun" per se to the compressed loads.

Nice! Good to know.

We are emitting compressed loads to the stack for x0 though but at least capstone disassembles it so it seems like it might be intended to work?

For loads we shouldn't be, at least we have that specific check in the emit code

For stores I would agree, If capstone disassembles it it's probably intended to work. It also seems like a very useful instruction to have, so I wouldn't be surprised that it is intended.

Do you think I should add a condition to avoid generating the instruction to be safe though to ensure that it doesn't get emitted?

I think this is fine as is. I'm fairly sure that if it wasn't intended the testsuites would fail in a bunch of places.

By this do you mean that we could use a compressed sw instruction for a f32 store for example?

No, just that the zero_reg in the emit for Fsd (float store double), means f0 and that is a perfectly normal register, but since we are not restricting the zero reg, it doesn't matter.

Turns out i64_from_iconst will match f32const and f64const nodes

That's super weird, and I would not expect that at all.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 18:18):

afonso360 merged PR #8719.

view this post on Zulip Wasmtime GitHub notifications bot (May 31 2024 at 18:26):

alexcrichton commented on PR #8719:

For loads we shouldn't be, at least we have that specific check in the emit code

Oops sorry I misspoke, I meant stores!


Last updated: Nov 22 2024 at 16:03 UTC