Hello,
I'd like to reduce instantiation time, which is currently ~10s for a 2MB .wasm file.
Are there safety checks at instantiation (or during execution for that matter) that I could disable to improve performance?
I'm working with the WasmCert-Coq formalization and have a (WIP) formal proof that the module instantiates according to the spec.
Do you know about wasmtime compile?
also using InstancePre (https://docs.rs/wasmtime/latest/wasmtime/struct.Linker.html#method.instantiate_pre) and maybe the pooling allocator will help
but yeah if you are seeing 10 second "instantiations" then I am pretty sure you are measuring compile time as well, and compiling your Wasm modules ahead of time will be the biggest single improvement you can do
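For concreteness, a rough (untested) sketch of how those pieces could fit together with the Rust wasmtime crate, assuming a recent release; the "module.cwasm" path and the default pooling limits are placeholders, not anything from the benchmarks in this thread:

```rust
use anyhow::Result;
use wasmtime::{Config, Engine, InstanceAllocationStrategy, Module, PoolingAllocationConfig};

fn load_precompiled() -> Result<Module> {
    // Opt into the pooling instance allocator so instantiation mostly just
    // hands out a pre-allocated slot; limits are tunable on the config.
    let mut config = Config::new();
    config.allocation_strategy(InstanceAllocationStrategy::Pooling(
        PoolingAllocationConfig::default(),
    ));
    let engine = Engine::new(&config)?;

    // "module.cwasm" would be produced offline, e.g. by `wasmtime compile`
    // (or Engine::precompile_module), so no Cranelift work happens here.
    // deserialize_file is unsafe because wasmtime trusts that the file
    // really is an artifact it produced earlier.
    let module = unsafe { Module::deserialize_file(&engine, "module.cwasm")? };
    Ok(module)
}
```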
Sorry for the late response.
Thanks for the suggestions!
You're right that it's compilation, not instantiation (as I thought at first), that takes so long, so wasmtime compile indeed speeds things up a lot.
I thus measure startup time now (loading the file + compilation + instantiation).
As we finally have a proper testing setup, I included some numbers below:
- for the color benchmark with wasmtime-compile, we can't pretty print the result, as that would require importing a function write_char, which we can't provide for wasmtime run it seems (we don't support wasi-io)
- we run wasm-opt --coalesce-locals on our binaries first
thanks!
I'd be quite interested to learn why Node is quite a bit better on our benchmarks...?
Node uses V8, which has a multi-phase compilation pipeline: https://v8.dev/docs/wasm-compilation-pipeline
This includes a very quick startup baseline compiler (Liftoff) that has "pretty good" performance
There is a very work-in-progress backend for wasmtime called winch which I believe has similar goals
Would you be able to share what you're benchmarking in terms of code/setup/etc? Sounds like the lion's share of improvements, separating compile time from what you're benchmarking, worked well but further improvements may require a bit more careful analysis of what exactly node is doing and how Wasmtime is setup and/or configured.
Sure!
Everything for testing is in this repo; it includes the binaries generated by different versions of our compiler (best) and benchmark.py, which provides a CLI for benchmarking.
It calls run-wasmtime.py and run-node.js in the same folder, which measure one run each. Multiple runs are aggregated by benchmark.py.
I obtained the above numbers like this (in the folder evaluation):
$ ./benchmark.py --folder binaries/non-cps-grow-mem-func-mrch-24-24/ --engine=node --wasm-opt --coalesce-locals
$ ./benchmark.py --folder binaries/non-cps-grow-mem-func-mrch-24-24/ --engine=wasmtime --wasm-opt --coalesce-locals
Some more background information:
- we need --coalesce-locals because, in particular for color, the main function has >20k locals
- wasmtime-compile means we pre-compile the module with wasmtime compile and run the resulting .cwasm with wasmtime run
I have a separate but related, concrete performance issue as well:
If we, instead of this fragment
some_check   ;; leaves the branch condition on the stack
if           ;; early-return when the condition is true
  return
end
...
generate
some_check   ;; same condition
br_if i      ;; branch out to label i instead of returning
...
we measure a noticeable slowdown with wasmtime. Note that i is quite small, typically <= 5.
I am of the opinion that generating the br_if is the correct way to do it, but I don't want to include this change if it makes wasmtime that much slower.
Ah ok this is interesting! Cranelift is known to not do the same degree of optimizations as other compilers, for example LLVM and v8, and it's generally expected that Cranelift's performance will only be on par if the input code has been run through an optimizer beforehand. For LLVM-generated code that's typically the case, but for hand-generated code we recommend running it through an optimizer like wasm-opt first (with all of its bits and pieces turned on).
Not to say that what you're finding with br_if vs if return end isn't a bug of course. That'd still be good to fix, but in general it's expected that Cranelift will lose out performance-wise against v8 if the input wasm wasn't itself optimized
FWIW (for Wolfgang), Cranelift has been gaining a bunch of optimization infrastructure over the past few years (and in some benchmarks is seen to be ~at parity with V8); so "much simpler and needs pre-optimization" is becoming less true. Strictly speaking the only missing "fundamental optimization" is inlining; most everything else one expects from e.g. -O2 is there (GVN, LICM, constant prop, alias analysis and related transforms, a bunch of simplification rules).
That is to say: we're definitely not in the realm of "extremely simple and limited compiler that will fall down with trivially different branch patterns", at least that's the expectation! So I'm very curious what's going on above with the br_if -- @Wolfgang Meier are you sure that the target (i to br_if) is the outermost block? Or is the branch actually to some tail code? The control-flow graph is technically more complex (many edges into one return-block) so it wouldn't surprise me to see some slowdown due to additional processing, but zeroing in on this could be useful...
Reading over the results, it looks like ack might be the one with the largest discrepancy? I notice as well that you're enabling tail-call, and if ack stands for "ackermann" then it's known that tail-call historically has had a perf hit with function calls in wasmtime (even non-tail-call ones). That was fixed recently (I believe at least), so recent versions of wasmtime should perform better. Locally with what I think was a development build I saw only very small differences between v8 and wasmtime.
Do you know what version of wasmtime-py you're using?
Also, I'll note that with the --preload argument you can get write_{int,char} working with the wasmtime CLI, although you're only able to invoke one function so you can't implement the driver script you've got as part of the wasmtime CLI.
And finally, are there other benchmarks you're particularly interested in? For example is there another one that you see a large discrepancy on for wasmtime and v8?
Also if you've got a branch with the br_if vs if return end change I can try to poke around that as well and see if I can't see any low-hanging fruit for wasmtime
Thanks for looking into it, much appreciated.
Quick reply:
I'll experiment some more and report again later in the week...
Quick update, regarding br_if vs return.
I tried a bunch of things in our code generation that didn't amount to anything. Also: we indeed have branches for these two versions, but currently no documentation on how to set them up.
I'll have some time in a month to look into it some more.
We spent a bit of time optimizing our compiler; here is a bar plot comparing Wasmtime against Node.js.
Was hoping for some more ideas on what we could try?
(I figured a bar plot would be more helpful than the table with raw data.)
What is the y-axis unit (there is no label)?
And what is the distinction between wasmtime and wasmtime-compile? Are you doing separate AOT compilation in the former? (Why isn't the non-hatched part of the bar exactly equivalent between the two in that case?)
as far as "why is it slow to compile", this question is fairly impossible to answer without more detail -- have you tried profiling, and observing where the time is going within Wasmtime? If Cranelift, where in Cranelift?
Wasmtime is unfortunately still much slower than Node.js
are you still measuring all of process creation, wasm compilation, and wasm instantiation?
in general, no one has put in work to optimize how long it takes a wasmtime process to start up because no one using wasmtime in production has needed that yet. doesn't mean we wouldn't appreciate PRs improving the current state of things, just that no one has really looked at it or has any need to improve it.
fwiw, wasmtime's happy path for low latency start up, which has been optimized a ton because it is what production users actually do, is roughly the following:
- compile your Wasm modules to .cwasm offline
- create wasmtime::Modules from the offline-compiled .cwasms
- create wasmtime::InstancePres for those modules, early-binding their imports so that we don't have to do string lookups at instantiation time
- call instance_pre.instantiate(...) whenever you need a new instance (see the rough sketch just below)
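A rough Rust sketch of those steps, assuming a module already compiled to "module.cwasm" offline and a hypothetical exported function "_start" (both names are placeholders, not anything from the benchmarks above):

```rust
use anyhow::Result;
use wasmtime::{Engine, InstancePre, Linker, Module, Store};

fn setup(engine: &Engine) -> Result<InstancePre<()>> {
    // Load the offline-compiled artifact; unsafe because wasmtime trusts
    // that this file is something it produced earlier.
    let module = unsafe { Module::deserialize_file(engine, "module.cwasm")? };

    // Define any host imports on the linker (linker.func_wrap(...)), then
    // resolve them once up front so instantiation skips string lookups.
    let linker: Linker<()> = Linker::new(engine);
    linker.instantiate_pre(&module)
}

fn run_once(engine: &Engine, pre: &InstancePre<()>) -> Result<()> {
    // Per run: a fresh store plus a cheap instantiation.
    let mut store = Store::new(engine, ());
    let instance = pre.instantiate(&mut store)?;
    let start = instance.get_typed_func::<(), ()>(&mut store, "_start")?;
    start.call(&mut store, ())?;
    Ok(())
}
```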
we also have benchmarks in the repo that you can run via cargo bench --bench instantiate. these benchmarks roughly reflect this shape of workload. last time I ran them on my laptop, I was seeing ~5 microseconds per instantiation, regardless of the size of the wasm binary
on the other hand, if you want to reduce compile time, I'd suggest looking into using Winch (e.g. -C compiler=winch in the CLI). it is a single-pass "baseline" compiler comparable to V8's Liftoff tier. (as mentioned before, if you are comparing node and wasmtime in compile-and-instantiate, then you are comparing apples and oranges unless you switch wasmtime to using Winch or force node to skip Liftoff and go straight to its optimizing tier, somehow)
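For anyone using the embedding API rather than the CLI, selecting Winch looks roughly like this (a sketch; it assumes the wasmtime crate is built with its winch feature enabled):

```rust
use anyhow::Result;
use wasmtime::{Config, Engine, Strategy};

fn winch_engine() -> Result<Engine> {
    // Use the single-pass Winch baseline compiler instead of Cranelift,
    // trading generated-code quality for much faster compilation.
    let mut config = Config::new();
    config.strategy(Strategy::Winch);
    Engine::new(&config)
}
```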
@Chris Fallin time in ms. Yes, they are separate: Wasmtime uses the Python API; Wasmtime-compile is a pre-compiled .cwasm file that we run with wasmtime run.
Good point, I'll look more into profiling, thanks!
@fitzgen (he/him)
I'm happy with pre-compiling to .cwasm, that makes total sense.
We just couldn't use wasmtime run previously because we had some custom function imports (which we got rid of now).
And the python bindings don't (yet?) support .cwasm files
The title of this thread is perhaps misleading, it's just compilation time, not instantiation time.
I renamed the thread to reflect this
For benchmarking purposes, you should be able to invoke wasmtime compile
independently of the Python API. If you have surprisingly slow compilations, we're definitely interested; for any report to be actionable by us, though, it would either need profiling output and general info about the input ("this callstack in Cranelift is slow with input of this shape"), or ideally an actual .wasm file we can reproduce the issue with
Wolfgang Meier said:
And the python bindings don't (yet?) support
.cwasm
files
For this I think you can use Module.deserialize{,_file}
methods, but if you run into issues with those feel free to file an issue and/or feature request on wasmtime-py!
It's worth mentioning though that most optimization work in Wasmtime has been focused on the Rust-based wasmtime
crate, so for example InstancePre
isn't part of the Python API (yet)
Chris Fallin said:
For benchmarking purposes, you should be able to invoke
wasmtime compile
independently of the Python API.
Yes, that's what we do.
We're mostly happy with wasmtime compile
and wasmtime run
for now, I think.
Alex Crichton said:
For this I think you can use
Module.deserialize{,_file}
methods, but if you run into issues with those feel free to file an issue and/or feature request on wasmtime-py!
I'll look into this.
But I see now that our benchmarking setup is quite different from what you'd actually use in production
Thank you so much for your help!
In the future, if you're curious, custom host functions can sort of be supported through --preload
on the CLI. That'll load a module and use its exports as imports, so you could define your own custom functions in terms of WASI doing that, for example. That doesn't work well for passing chunks of memory (like strings) to imports though. Also if you've otherwise removed the need for host functions that's additionally not as applicable, but figured I could note here at least.
Hi @Alex Crichton a quick question: is 100ms of compilation time (without parallel compilation, None optimization level) typical for a 160KB Wasm module (generated from Rust code in release mode) that does simple SIMD bit-unpacking? My use case cannot use AOT compilation, so I just want to know if this overhead is expected and whether there is any way to optimize it. Thanks
And I did not find the info in this thread workable for me :(. But the numbers in the figure by @Wolfgang Meier match my experiments (the vs_easy and vs_hard).
Whether or not it's typical depends on a lot of factors, e.g. how powerful the cpu is and how big the functions are. I'm not sure many of us are benchmarking on single-threaded compiles ourselves. Would you be able to share the module (or a similar-ish module) so we can test locally? Also have you tried using Winch?
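As a sketch of how one might reproduce that kind of measurement with the Rust API (the knobs below mirror the setup described in the question — single-threaded compilation, no optimization — and the timing is plain std::time, nothing wasmtime-specific):

```rust
use anyhow::Result;
use wasmtime::{Config, Engine, Module, OptLevel};

fn time_compile(wasm_bytes: &[u8]) -> Result<()> {
    let mut config = Config::new();
    // Single-threaded compilation with Cranelift optimizations turned off,
    // matching the configuration described above.
    config.parallel_compilation(false);
    config.cranelift_opt_level(OptLevel::None);
    let engine = Engine::new(&config)?;

    let start = std::time::Instant::now();
    let _module = Module::new(&engine, wasm_bytes)?;
    println!("compile took {:?}", start.elapsed());
    Ok(())
}
```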
Alex Crichton said:
Whether or not it's typical depends on a lot of factors, e.g. how powerful the cpu is and how big the functions are. I'm not sure many of us are benchmarking on single-threaded compiles ourselves. Would you be able to share the module (or a similar-ish module) so we can test locally? Also have you tried using Winch?
Thanks for the reply! The code contains V128 and Winch does not support it yet. The code is a wrapper around https://github.com/spiraldb/fastlanes and the machine is an Intel(R) Xeon(R) Platinum 8474C.
Ah ok yeah our general solution for "you want fast compiles" is indeed not applicable here since Winch doesn't fully support simd yet.
Would you be able to share the wasm binary you're working with?
I have a copy here: https://drive.google.com/file/d/1jtWsDfDEh_uADDqhem4besFAi8NaMnzA/view?usp=sharing
Poking around at this I don't see anything out of the ordinary myself, so what you're seeing is probably expected.
Ok, thanks!