igoriakovlev opened issue #13279:
With Wasmtime 44.0.1, we are seeing very slow benchmark execution.
The benchmark allocates and copies many GC arrays, and Wasmtime seems to handle this inefficiently (other engines, such as V8, are much faster). To reproduce, use the .wasm file and the benchmark config attached to this issue.
To run the benchmark, execute the following CLI command:
wasmtime -W gc,exceptions,function-references --dir=/ ./kotlin-wasm-benchmarks-wasmWasiBenchmark.wasm STUB $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
To compare this with V8, run the following CLI command:
node ./kotlin-wasm-benchmarks-wasmWasiBenchmark.mjs $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.cissueopyInSameArray
Here you can find the Kotlin source code for this benchmark.
fitzgen added the wasm-proposal:gc label to Issue #13279.
fitzgen commented on issue #13279:
Thanks for filing an issue! It is expected that we won't be on par with engines that have put hundreds more engineering years into their GCs than we have, but at the same time we shouldn't be multiple orders of magnitude slower in general. Can you provide relative timings here? (That's generally a good bit of information to provide when filing performance-related issues.)
Probably much faster with the copying collector (-Ccollector=copying), but I don't think that has made it into a release yet, and it only exists on main currently.
In any case, I'll get around to taking a closer look soon-ish.
alexcrichton commented on issue #13279:
@fitzgen FWIW I get, with node 24.15.0:
$ node ./kotlin-wasm-benchmarks-wasmWasiBenchmark.mjs $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
(node:1050346) ExperimentalWarning: WASI is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Warm-up #0: 435.285 ms/op
Warm-up #1: 384.315 ms/op
Warm-up #2: 585.851 ms/op
Warm-up #3: 1,615.40 ms/op
Warm-up #4: 2,232.04 ms/op
Iteration #0: 1,664.03 ms/op
Consumed blackhole value: -1805010299
<RESULT>4744766272456097792</RESULT>
and with current main:
$ cargo run --release -- --dir / -Wgc,function-references,exceptions kotlin-wasm-benchmarks-wasmWasiBenchmark.wasm STUB $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
...
(never prints anything). I waited for ~1 minute where node fully completed in 8s.
and with -Ccollector=copying
$ cargo run --release -- -C collector=copying --dir / -Wgc,function-references,exceptions kotlin-wasm-benchmarks-wasmWasiBenchmark.wasm STUB $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
Finished `release` profile [optimized] target(s) in 0.11s
Running `target/release/wasmtime -C collector=copying --dir / -Wgc,function-references,exceptions kotlin-wasm-benchmarks-wasmWasiBenchmark.wasm STUB /home/alex/code/wasmtime/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray`
thread 'main' (1052828) panicked at crates/environ/src/gc.rs:507:32:
invalid `VMGcKind`: 0b000000000000000000000000000000
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(after a few seconds)
Also, @igoriakovlev there's a typo in the node reproduction above, which uses cissueopyInSameArray instead of copyInSameArray like in the Wasmtime snippet, and I'm assuming copyInSameArray is what's intended. I say this because pasting the node command as-is completes instantaneously, which looks like it's just not running anything.
alexcrichton commented on issue #13279:
A quick perf record of the drc collector shows that 70% of the time is in the array.copy libcall. That's known to be extremely slow in Wasmtime as it's lifting and lowering between high-level and VM types, so I suspect the lion's share of the improvement here is "optimize array.copy".
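To make the lifting-and-lowering cost concrete, here is a toy model in Rust. It is only a sketch and not Wasmtime's code: `Val`, `Raw`, `lift`, `lower`, and both copy functions are hypothetical names invented for illustration. It contrasts a copy that converts every element to a high-level value and back with one overlap-safe bulk move of the raw bits.

```rust
// Toy model only: `Val`, `Raw`, `lift`, `lower`, and both copy functions are
// hypothetical names for illustration, not Wasmtime's types or its libcall.

/// High-level value, standing in for an embedder-facing value type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Val {
    I32(i32),
}

/// Raw in-heap representation of one array element.
type Raw = u32;

/// Convert raw heap bits into a high-level value.
fn lift(raw: Raw) -> Val {
    Val::I32(raw as i32)
}

/// Convert a high-level value back into raw heap bits.
fn lower(val: Val) -> Raw {
    match val {
        Val::I32(x) => x as u32,
    }
}

/// Slow path: every element is lifted to `Val` and then lowered back.
fn copy_via_lift_lower(heap: &mut [Raw], dst: usize, src: usize, len: usize) {
    // Buffer the lifted values first so overlapping ranges stay correct.
    let lifted: Vec<Val> = heap[src..src + len].iter().map(|&r| lift(r)).collect();
    for (i, v) in lifted.into_iter().enumerate() {
        heap[dst + i] = lower(v);
    }
}

/// Fast path: a single bulk move of the raw bits with memmove semantics.
fn copy_bulk(heap: &mut [Raw], dst: usize, src: usize, len: usize) {
    heap.copy_within(src..src + len, dst);
}

fn main() {
    let mut a: Vec<Raw> = (0..8).collect();
    let mut b = a.clone();
    // Overlapping copy inside the same array, as in a "copy in same array" case.
    copy_via_lift_lower(&mut a, 2, 0, 4);
    copy_bulk(&mut b, 2, 0, 4);
    assert_eq!(a, b);
    println!("{:?}", b);
}
```

Both paths produce the same array; the difference is that the first does per-element conversion work (plus a temporary buffer in this sketch), while the second is one contiguous move.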
igoriakovlev edited issue #13279:
With Wasmtime 44.0.1, we are seeing very slow benchmark execution.
The benchmark allocates and copies many GC arrays, and Wasmtime seems to handle this inefficiently (other engines, such as V8, are much faster). To reproduce, use the .wasm file and the benchmark config attached to this issue.
To run the benchmark, execute the following CLI command:
wasmtime -W gc,exceptions,function-references --dir=/ ./kotlin-wasm-benchmarks-wasmWasiBenchmark.wasm STUB $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
To compare this with V8, run the following CLI command:
node ./kotlin-wasm-benchmarks-wasmWasiBenchmark.mjs $PWD/config 5 microBenchmarks.ArrayCopyBenchmark.copyInSameArray
Here you can find the Kotlin source code for this benchmark.
alexcrichton commented on issue #13279:
With the optimizations from https://github.com/bytecodealliance/wasmtime/pull/13382, plus the bug fixes in https://github.com/bytecodealliance/wasmtime/pull/13381, I've got the following results. The results in this table are total runtime for the entire wasm module to finish running. The module itself prints information and timing information but I'm not entirely sure what it is. I figure this is all indicative though:
runtime              time
wasmtime gc=null     3.455
wasmtime gc=drc      14.424
wasmtime gc=copying  3.407
node                 6.984
So, effectively, I believe that #13382 solves this issue.
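Since relative timings were requested earlier in the thread, the totals reported above can be turned into ratios against node. This is a minimal sketch: `relative_to` is a hypothetical helper, the numbers are copied from the comment above, and the unit is whatever that comment measured (it is not stated).

```rust
/// Hypothetical helper: ratio of each entry's time to a chosen baseline.
fn relative_to(
    baseline: f64,
    times: &[(&'static str, f64)],
) -> Vec<(&'static str, f64)> {
    times.iter().map(|&(name, t)| (name, t / baseline)).collect()
}

fn main() {
    // Total runtimes copied from the comment above; the unit is not stated there.
    let times = [
        ("wasmtime gc=null", 3.455),
        ("wasmtime gc=drc", 14.424),
        ("wasmtime gc=copying", 3.407),
        ("node", 6.984),
    ];
    let node = 6.984;
    for (name, ratio) in relative_to(node, &times) {
        println!("{name:<20} {ratio:.2}x of node");
    }
}
```

With these inputs, the drc collector comes out at roughly 2.07 times node's total, while the null and copying collectors land at roughly 0.49 times.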
alexcrichton closed issue #13279.
Last updated: Jun 01 2026 at 09:49 UTC