Stream: cranelift

Topic: simd is fast


Alex Crichton (Jul 13 2020 at 21:52):

@Andrew Brown you might be interested in this: I'm updating Rust's simd support for wasm to the latest spec. One of the examples in the simd repo is a hex encoder that uses sse/avx on x86 and such, so I copied one of those and translated it to wasm intrinsics. Below, "default" is using the simd intrinsics and "fallback" is the code you would write today (i.e. no intrinsics). Also, "large" is processing 1MB and "small" is processing < 128 bytes.

test benches::large_default  ... bench:     213,961 ns/iter (+/- 5,108) = 4900 MB/s
test benches::large_fallback ... bench:   3,108,434 ns/iter (+/- 75,730) = 337 MB/s
test benches::small_default  ... bench:          52 ns/iter (+/- 0) = 2250 MB/s
test benches::small_fallback ... bench:         358 ns/iter (+/- 0) = 326 MB/s
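For readers who have not seen the example, the "fallback" path is the kind of byte-at-a-time loop sketched below. This is only a minimal illustration of a scalar hex encoder; the function name `hex_encode_fallback` is hypothetical and the code is not taken from the stdarch example.

```rust
/// Scalar hex encoder: maps each input byte to two lowercase ASCII hex digits.
/// Hypothetical illustration of a "no intrinsics" path, not the stdarch code.
fn hex_encode_fallback(src: &[u8]) -> String {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let mut out = String::with_capacity(src.len() * 2);
    for &byte in src {
        // High nibble first, then low nibble.
        out.push(DIGITS[(byte >> 4) as usize] as char);
        out.push(DIGITS[(byte & 0x0f) as usize] as char);
    }
    out
}
```

Calling `hex_encode_fallback(b"wasm")` yields `"7761736d"`. A SIMD version does the same nibble split and digit lookup, but on 16 input bytes per step instead of one.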

Alex Crichton (Jul 13 2020 at 21:53):

basically wasmtime's implementation of SIMD, for hex encoding, is a 7x-15x speedup
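The 7x-15x range can be checked against the ns/iter figures from the first benchmark run. The helper below is a hypothetical sketch of that arithmetic, dividing fallback time by SIMD time:

```rust
/// Speedup of the SIMD path: fallback ns/iter divided by SIMD ns/iter.
/// Hypothetical helper for illustration only.
fn speedup(fallback_ns: f64, simd_ns: f64) -> f64 {
    fallback_ns / simd_ns
}
```

With the numbers above, `speedup(3_108_434.0, 213_961.0)` is roughly 14.5 for the large input and `speedup(358.0, 52.0)` is roughly 6.9 for the small one.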

Alex Crichton (Jul 13 2020 at 21:53):

not exactly a clever benchmark since who's bottlenecked on hex encoding, but I figured this was pretty neat :)

Andrew Brown (Jul 13 2020 at 21:57):

Nice! Yeah, thanks for showing me that. Where's the code for those benchmarks?

Alex Crichton (Jul 13 2020 at 22:04):

gimme one min, will post soon

Alex Crichton (Jul 13 2020 at 22:09):

@Andrew Brown https://github.com/rust-lang/stdarch/pull/874, notably https://github.com/rust-lang/stdarch/pull/874/files#diff-179577566f4ea187af5abf39056532cb

Lots of time and lots of things have happened since the simd128 support was first added to this crate. Things are starting to settle down now so this commit syncs the Rust intrinsic definitions wit...

Andrew Brown (Jul 13 2020 at 23:35):

That's pretty cool... I guess I never thought through how I would compile Rust to Wasm SIMD--but there it is!

Johnnie Birch (Jul 14 2020 at 00:24):

Hi @Alex Crichton, can you share that translation? The before and after?

Alex Crichton (Jul 14 2020 at 01:15):

@Johnnie Birch I think https://gist.github.com/alexcrichton/f9f10a1e2ce56c246fb449df45c3f113 is it


Alex Crichton (Jul 14 2020 at 01:15):

I previously used jitdump to get this stuff out but jitdump isn't working for me right now

Alex Crichton (Jul 14 2020 at 01:15):

perf report isn't getting any symbols showing up and it's not figuring out where jit code lives

Johnnie Birch (Jul 14 2020 at 02:14):

@Alex Crichton Got it thanks. Sorry, I'll take a look at jitdump and perf report. Need to figure out a way to have proper testing for those.

Alex Crichton (Jul 14 2020 at 04:44):

hm ok I bisected a bit and it looks like https://github.com/bytecodealliance/wasmtime/pull/1565 breaks perf when the module is loaded from the cache. Either going back to before that commit or passing --disable-cache fixes the perf issues I was having

This implements the new WASI ABI described here: https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md It adds APIs to Instance and Linker with support for running WASI programs...

Alex Crichton (Jul 14 2020 at 04:45):

I'll investigate tomorrow more, no idea what that PR would be doing...

Joey Gouly (Jul 14 2020 at 10:02):

@Alex Crichton any chance you can run it on aarch64 too? :)

Alex Crichton (Jul 14 2020 at 12:42):

@Joey Gouly

thread '<unnamed>' panicked at 'Vector ops not implemented.', cranelift/codegen/src/isa/aarch64/lower_inst.rs:1624:13

:(

Alex Crichton (Jul 14 2020 at 12:58):

hex.wasm.gz -- this is the file I'm using:

$ ./target/release/wasmtime run --enable-simd -- hex.wasm --bench

running 9 tests
test tests::avx_works ... ignored
test tests::big ... ignored
test tests::empty ... ignored
test tests::encode_equals_fallback ... ignored
test tests::odd ... ignored
test benches::large_default  ... bench:     214,676 ns/iter (+/- 2,050) = 4884 MB/s
test benches::large_fallback ... bench:   3,447,077 ns/iter (+/- 78,646) = 304 MB/s
test benches::small_default  ... bench:          54 ns/iter (+/- 0) = 2166 MB/s
test benches::small_fallback ... bench:         397 ns/iter (+/- 7) = 294 MB/s

test result: ok. 0 passed; 0 failed; 5 ignored; 4 measured; 0 filtered out

Joey Gouly (Jul 14 2020 at 13:41):

@Alex Crichton aww, well we're working on simd, so hopefully it'll all be implemented soon!

Joey Gouly (Jul 14 2020 at 14:49):

@Alex Crichton btw, I had to run 'gunzip' twice on that file... am I doing something weird?

Alex Crichton (Jul 14 2020 at 14:49):

uh...

Alex Crichton (Jul 14 2020 at 14:50):

it looks like zulip maybe ran another layer of gz after I uploaded

Alex Crichton (Jul 14 2020 at 14:50):

or I ran gz twice by accident

Alex Crichton (Jul 14 2020 at 14:50):

it should be 16MB ish

Joey Gouly (Jul 14 2020 at 14:50):

yeah I got it working, but was very confused for a little bit :-)


Last updated: Jan 24 2025 at 00:11 UTC