Stream: git-wasmtime

Topic: wasmtime / issue #2531 v8x16.shuffle optimizations needed


Wasmtime GitHub notifications bot (Jun 22 2023 at 15:03):

alexcrichton commented on issue #2531:

Can confirm that all of these shuffles are now implemented, on aarch64 too. All i8x16.shuffle instructions present in the above module are compiled to single-instruction lowerings on both x86_64 and aarch64. Given that, I'm going to close this.

Wasmtime GitHub notifications bot (Jun 22 2023 at 15:03):

alexcrichton closed issue #2531:

I translated the IDCT SSE code into Wasm. The algorithm uses many different punpckxxxx instructions, whereas WebAssembly has only the generic v8x16.shuffle. V8 lowers these shuffles into their native SSE2 equivalents by matching the immediate argument. I cannot find whether we do this for any of the Cranelift backends.
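To illustrate what "matching the immediate argument" could look like, here is a minimal Rust sketch. It is not taken from Cranelift or V8; the function names `interleave_mask` and `match_unpack` are hypothetical. It assumes the usual shuffle convention that lane indices 0-15 select bytes from the first operand and 16-31 from the second, and it only recognizes the eight SSE2 punpck* interleave patterns.

```rust
/// Build the 16-byte shuffle immediate that interleaves `width`-byte elements
/// taken from the low (or high) 8 bytes of two 16-byte operands.
/// Indices 0-15 refer to the first operand, 16-31 to the second.
fn interleave_mask(width: usize, high: bool) -> [u8; 16] {
    let base = if high { 8 } else { 0 };
    let mut mask = [0u8; 16];
    for (i, slot) in mask.iter_mut().enumerate() {
        let elem = i / (2 * width); // which element pair this byte belongs to
        let from_second = (i / width) % 2 == 1; // odd groups come from operand 2
        let byte = i % width; // byte offset inside the element
        let src = base + elem * width + byte;
        *slot = (src + if from_second { 16 } else { 0 }) as u8;
    }
    mask
}

/// Return the name of the SSE2 unpack instruction whose effect equals the
/// given shuffle immediate, or `None` if a generic lowering is needed.
/// (Hypothetical helper for illustration only.)
fn match_unpack(mask: &[u8; 16]) -> Option<&'static str> {
    // (element width in bytes, use high half?, instruction mnemonic)
    const CANDIDATES: [(usize, bool, &str); 8] = [
        (1, false, "punpcklbw"),
        (1, true, "punpckhbw"),
        (2, false, "punpcklwd"),
        (2, true, "punpckhwd"),
        (4, false, "punpckldq"),
        (4, true, "punpckhdq"),
        (8, false, "punpcklqdq"),
        (8, true, "punpckhqdq"),
    ];
    for &(width, high, name) in CANDIDATES.iter() {
        if interleave_mask(width, high) == *mask {
            return Some(name);
        }
    }
    None
}
```

For example, under these assumptions the immediate 0,16,1,17,2,18,3,19,4,20,5,21,6,22,7,23 would be recognized as punpcklbw, while an arbitrary permutation such as a full byte reversal would fall through to `None` and need a general shuffle lowering.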

STR:

  1. Use test case at https://github.com/yurydelendik/zbar-wasm/raw/0083a9a48c8c06e5555424d85f71ce5a4b560145/zbar_jpeg/test.wasm
  2. Run: time wasmtime run --enable-simd test.wasm --invoke test500

Observe the time; it is about 15 sec here. Node runs test.wasm (_initialize + test500) in about 11 sec here.

It is expected that wasmtime/cranelift can improve performance by 40-50% by using specialized SSE2 instructions.


Last updated: Oct 23 2024 at 20:03 UTC