Stream: git-cranelift

Topic: cranelift / Issue #1350 Determine best way to encode SIMD...


GitHub (Jan 14 2020 at 23:50):

abrown opened Issue #1350:

What is the feature or code improvement you would like to do in Cranelift?

I would like to implement the SIMD load_extend instructions.

This is necessary for Wasm SIMD spec compliance.

I see that shared/instructions.rs currently includes instructions such as uload32 and sload32; one option is to add the additional instructions [u/s]load8x8, [u/s]load16x4, and [u/s]load32x2. Alternatively, @bnjbvr and I had discussed using the IR->IR legalization infrastructure to implement peephole optimizations; if this were in place and at the right level, I could translate, e.g., the Wasm load8x8_s to Cranelift's load.i8x8 and sextend and then write a peephole optimization that generates the appropriate x86 PMOVSXBW to encode both of these instructions (perhaps with a new Cranelift x86_pmovsxb instruction). Which approach is better?
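For reference, a minimal scalar sketch of what the proposed load-and-extend instructions compute, assuming the Wasm SIMD semantics (load 64 bits, then widen each lane to twice its width). The function names mirror the proposed instruction names, but the signatures and the byte-slice memory model are illustrative assumptions, not Cranelift API:

```rust
/// Sign-extending variant: read 8 bytes starting at `addr` and widen each
/// one to a 16-bit lane. Panics if fewer than 8 bytes are available.
fn sload8x8(mem: &[u8], addr: usize) -> [i16; 8] {
    let mut lanes = [0i16; 8];
    for (lane, &byte) in lanes.iter_mut().zip(&mem[addr..addr + 8]) {
        // Reinterpret the byte as signed first so the widening cast
        // replicates the sign bit.
        *lane = byte as i8 as i16;
    }
    lanes
}

/// Zero-extending variant: same load, but the upper half of every
/// 16-bit lane is filled with zeros.
fn uload8x8(mem: &[u8], addr: usize) -> [u16; 8] {
    let mut lanes = [0u16; 8];
    for (lane, &byte) in lanes.iter_mut().zip(&mem[addr..addr + 8]) {
        *lane = byte as u16;
    }
    lanes
}
```

The 16x4 and 32x2 variants follow the same pattern with wider source lanes. On x86, PMOVSXBW performs the signed widening shown in `sload8x8` in a single instruction.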

GitHub (Jan 14 2020 at 23:51):

abrown commented on Issue #1350:

cc: @bnjbvr, @sunfishcode

GitHub (Jan 14 2020 at 23:51):

abrown edited Issue #1350:

What is the feature or code improvement you would like to do in Cranelift?

I would like to implement the SIMD load_extend instructions.

What is the value of adding this in Cranelift?

This is necessary for Wasm SIMD spec compliance.

Do you have an implementation plan, and/or ideas for data structures or algorithms to use?

I see that shared/instructions.rs currently includes instructions such as uload32 and sload32; one option is to add the additional instructions [u/s]load8x8, [u/s]load16x4, and [u/s]load32x2. Alternatively, @bnjbvr and I had discussed using the IR->IR legalization infrastructure to implement peephole optimizations; if this were in place and at the right level, I could translate, e.g., the Wasm load8x8_s to Cranelift's load.i8x8 and sextend and then write a peephole optimization that generates the appropriate x86 PMOVSXBW to encode both of these instructions (perhaps with a new Cranelift x86_pmovsxb instruction). Which approach is better?

GitHub (Feb 06 2020 at 19:38):

sunfishcode commented on Issue #1350:

Significant peephole-style optimization will be best done once https://github.com/bytecodealliance/cranelift/issues/1344 is in place, and we can build up from there. So my suggestion here is to just add the additional instructions, [u/s]load8x8 and so on, for now; we can revisit them later once we're ready to do more general peephole optimizations.
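To make the two options in this thread concrete, here is a rough sketch of the peephole alternative on a toy IR. Everything in it is hypothetical: the `Inst` enum, the value numbering, and `fuse_load_extend` are illustrative stand-ins and do not reflect Cranelift's actual IR or legalization API. It assumes SSA-style values (each defined once) and only fuses when the loaded value has no other use:

```rust
/// Toy instruction set; values are identified by plain numbers.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    /// Plain 64-bit load of eight i8 lanes (stands in for `load.i8x8`).
    LoadI8x8 { dst: u32, addr: u32 },
    /// Sign-extend every lane of `src` to 16 bits (stands in for `sextend`).
    Sextend { dst: u32, src: u32 },
    /// Fused load-and-sign-extend (stands in for a dedicated `sload8x8`).
    Sload8x8 { dst: u32, addr: u32 },
}

/// Returns true if `inst` reads the given value as an operand.
fn reads(inst: &Inst, value: u32) -> bool {
    match inst {
        Inst::LoadI8x8 { addr, .. } | Inst::Sload8x8 { addr, .. } => *addr == value,
        Inst::Sextend { src, .. } => *src == value,
    }
}

/// Rewrites an adjacent `LoadI8x8` + `Sextend` pair into one `Sload8x8`
/// when the extend is the only consumer of the loaded value.
fn fuse_load_extend(insts: &[Inst]) -> Vec<Inst> {
    let mut out = Vec::with_capacity(insts.len());
    let mut i = 0;
    while i < insts.len() {
        if let (Inst::LoadI8x8 { dst: loaded, addr }, Some(Inst::Sextend { dst, src })) =
            (&insts[i], insts.get(i + 1))
        {
            // The narrow value must not be needed by any later instruction.
            let only_use = !insts[i + 2..].iter().any(|inst| reads(inst, *loaded));
            if loaded == src && only_use {
                out.push(Inst::Sload8x8 { dst: *dst, addr: *addr });
                i += 2;
                continue;
            }
        }
        out.push(insts[i].clone());
        i += 1;
    }
    out
}
```

With dedicated instructions, the Wasm translator would emit the fused form directly and no such pass is needed, which is why adding the instructions is the simpler path until general peephole support exists.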

GitHub (Feb 28 2020 at 23:28):

alexcrichton transferred Issue #1350.


Last updated: Jan 24 2025 at 00:11 UTC