Stream: git-wasmtime

Topic: wasmtime / PR #12230 Cranelift(egraphs): Maintain the ins...


view this post on Zulip Wasmtime GitHub notifications bot (Dec 29 2025 at 18:39):

fitzgen requested cfallin for a review on PR #12230.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 29 2025 at 18:39):

fitzgen opened PR #12230 from fitzgen:inst-set-cost-function to bytecodealliance:main:

Fixes https://github.com/bytecodealliance/wasmtime/issues/12156

view this post on Zulip Wasmtime GitHub notifications bot (Dec 29 2025 at 18:39):

fitzgen requested wasmtime-compiler-reviewers for a review on PR #12230.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 29 2025 at 18:39):

fitzgen requested wasmtime-default-reviewers for a review on PR #12230.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 29 2025 at 18:48):

fitzgen updated PR #12230.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 30 2025 at 01:38):

cfallin commented on PR #12230:

Thanks for the PR! (Nick knows but for anyone else watching) I'm out on PTO then at a conference, so I'll be able to review this Mon, Jan 12 or later.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 03 2026 at 00:45):

cfallin submitted PR review:

Got time today to review this actually so: LGTM and I'm really excited about this!

If you don't mind, could we post Sightglass results here before merging? Specifically I'd like to see cycles measured for compilation and execution time; hopefully we see no or few regressions (as I know you've already seen during some experimentation).

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2026 at 20:04):

fitzgen commented on PR #12230:

Hrm, yeah cycles is looking very different from what I previously saw with insts-retired, unfortunately. Here is the default and spidermonkey suite's results, with statistically insignificant entries removed:

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 38816006.88 ± 4557789.09 (confidence = 95%)

  main.so is 1.22x to 1.28x faster than inst-set.so!

  [173565996 191159750.88 226710599] inst-set.so
  [134897941 152343744.00 181128243] main.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 30708277.27 ± 5797934.73 (confidence = 95%)

  main.so is 1.11x to 1.16x faster than inst-set.so!

  [227209069 254002445.00 308181456] inst-set.so
  [194211038 223294167.73 282640832] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-regex.wasm

  Δ = 632063507.40 ± 84792443.24 (confidence = 95%)

  main.so is 1.09x to 1.11x faster than inst-set.so!

  [6551514230 7032180956.90 7890587513] inst-set.so
  [6198102237 6400117449.50 7008087519] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-markdown.wasm

  Δ = 628333545.93 ± 98363544.32 (confidence = 95%)

  main.so is 1.08x to 1.11x faster than inst-set.so!

  [6494087498 7067526581.28 7805742475] inst-set.so
  [6158498649 6439193035.35 7315715259] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 562993166.38 ± 115506969.41 (confidence = 95%)

  main.so is 1.07x to 1.10x faster than inst-set.so!

  [6486850237 7033140774.35 8340342367] inst-set.so
  [6044689306 6470147607.97 7743603904] main.so

execution :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 6534858.80 ± 5891940.54 (confidence = 95%)

  main.so is 1.00x to 1.02x faster than inst-set.so!

  [504126227 528608965.58 576654072] inst-set.so
  [496952074 522074106.78 604679005] main.so

I guess instructions retired isn't a good proxy for wall time in this case because we are switching from pretty much an array-of-scalars representation of cost to an array-of-trees representation, which is going to involve more pointer chasing and memory latency that isn't reflected in instructions retired.

Pretty clear to me that this PR cannot land as-is. Might be worth some profiling to see if there is any low-hanging fruit, but if we can't just engineer our way out of this would-be regression, then it is probably best to stop pursuing the instruction-set cost function any further :-/

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2026 at 20:05):

fitzgen edited a comment on PR #12230:

Hrm, yeah cycles is looking very different from what I previously saw with insts-retired, unfortunately. Here is the default and spidermonkey suites' results, with statistically insignificant entries removed:

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 38816006.88 ± 4557789.09 (confidence = 95%)

  main.so is 1.22x to 1.28x faster than inst-set.so!

  [173565996 191159750.88 226710599] inst-set.so
  [134897941 152343744.00 181128243] main.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 30708277.27 ± 5797934.73 (confidence = 95%)

  main.so is 1.11x to 1.16x faster than inst-set.so!

  [227209069 254002445.00 308181456] inst-set.so
  [194211038 223294167.73 282640832] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-regex.wasm

  Δ = 632063507.40 ± 84792443.24 (confidence = 95%)

  main.so is 1.09x to 1.11x faster than inst-set.so!

  [6551514230 7032180956.90 7890587513] inst-set.so
  [6198102237 6400117449.50 7008087519] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-markdown.wasm

  Δ = 628333545.93 ± 98363544.32 (confidence = 95%)

  main.so is 1.08x to 1.11x faster than inst-set.so!

  [6494087498 7067526581.28 7805742475] inst-set.so
  [6158498649 6439193035.35 7315715259] main.so

compilation :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 562993166.38 ± 115506969.41 (confidence = 95%)

  main.so is 1.07x to 1.10x faster than inst-set.so!

  [6486850237 7033140774.35 8340342367] inst-set.so
  [6044689306 6470147607.97 7743603904] main.so

execution :: cycles :: benchmarks/spidermonkey/spidermonkey-json.wasm

  Δ = 6534858.80 ± 5891940.54 (confidence = 95%)

  main.so is 1.00x to 1.02x faster than inst-set.so!

  [504126227 528608965.58 576654072] inst-set.so
  [496952074 522074106.78 604679005] main.so

I guess instructions retired isn't a good proxy for wall time in this case because we are switching from pretty much an array-of-scalars representation of cost to an array-of-trees representation, which is going to involve more pointer chasing and memory latency that isn't reflected in instructions retired.

Pretty clear to me that this PR cannot land as-is. Might be worth some profiling to see if there is any low-hanging fruit, but if we can't just engineer our way out of this would-be regression, then it is probably best to stop pursuing the instruction-set cost function any further :-/

view this post on Zulip Wasmtime GitHub notifications bot (Jan 05 2026 at 20:06):

fitzgen updated PR #12230.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2026 at 13:03):

cfallin commented on PR #12230:

Ah, that's too bad! Very worthwhile exploration regardless...

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2026 at 17:48):

fitzgen commented on PR #12230:

Idea: We could potentially try to do the current scalar cost first, but bail out if we ever saturate to infinity, and only then use the instruction-set cost function. Kind of heavy-handed, but might avoid compile-time regressions.


Last updated: Jan 09 2026 at 13:15 UTC