Stream: StarlingMonkey

Topic: Reviving weval


view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 10:35):

Hey, I'm working on bringing AOT through weval back to StarlingMonkey. Long story short: I was curious what the weval failures looked like after we switched to a much more recent version of SpiderMonkey. I applied the weval patches on top of Firefox 140 and, to my surprise, the whole StarlingMonkey test suite, including the wevaled runtime, passed.

I tried to repeat this success and updated Firefox to the latest stable, 147_0_4, and that worked as well. I then ran the SpiderMonkey test suite with the wevaled JS shell and there were no weval-specific failures.

I then moved on to updating the wasi-sdk that we use to build StarlingMonkey to the latest v30, and that worked as well.

Although the situation is confusing, I’m cautiously optimistic that we can offer AOT again. Ideally, I’d like a definitive explanation of what changed to resolve the AOT issues, but I’m also very aligned with the pragmatic approach of simply re‑enabling it.

You can find the StarlingMonkey branch with reenabled weval here: https://github.com/bytecodealliance/StarlingMonkey/pull/303

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 10:42):

In the meantime I will work on rolling out the patches for updating SpiderMonkey to the latest stable.

To use weval with the latest wasi-sdk I had to make some small changes to waffle and upgrade weval to use the latest wizer. Both are now merged, but I will need a new waffle release on crates.io so that weval can bump the version. @Chris Fallin I can help with whatever is needed to publish the latest waffle to crates.io :slight_smile:

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:16):

Unfortunately, and I hate to be the bearer of bad news, that PR doesn't actually enable weval: it's missing the config flag to the SpiderMonkey build that turns on the support. (comment here)

In addition to that, it will need a rebuilt IC corpus (that has to be done on every SM update, and is inherent to any AOT approach since we don't know IC bodies ahead of time) -- there's a readme in js/src/ics/ describing how.

In general for a "it's successful" result I'd expect a performance validation showing that compilation occurred -- happy to help get you to that point!

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:18):

Damn... that's interesting, because I did run the performance tests and got a pretty consistent boost on various JetStream tests -- could that be from just using the IC cache?

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:20):

Also, I did rebuild the IC corpus, following the instructions you mentioned.

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:22):

Huh, interesting, on the same build before and after --enable-aot?

(and sorry, didn't realize the ICs in your patch were rebuilt; usually I do it as a separate commit to keep rebases clean)

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:24):

Yeah, I can split the ICs from other changes, that is probably a good idea.

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:25):

Oh wait! I know what we are missing, check here:
https://github.com/bytecodealliance/StarlingMonkey/blob/main/cmake/spidermonkey.cmake#L149

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:26):

The flags you mention are already in StarlingMonkey@main; they were never removed. They are injected when you configure StarlingMonkey with WEVAL=ON.

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:28):

Also if you look at the CI weval step, there is this:

Specializing functions...
Inserting results into cache...
Updatimg memory image...
Function func11751: 1079 blocks, 8233 insts)
   specialized (328 times): 94992 blocks, 328275 insts
   virtstack: 9993 reads (2103 mem), 10427 writes (4730 mem)
   locals: 1240 reads (1017 mem), 568 writes (550 mem)
   live values at block starts: 850523 (8.953627673909383 per block)
Function func11753: 1840 blocks, 12827 insts)
   specialized (7 times): 140 blocks, 259 insts
   virtstack: 0 reads (0 mem), 0 writes (0 mem)
   locals: 0 reads (0 mem), 0 writes (0 mem)
   live values at block starts: 640 (4.571428571428571 per block)
Serializing back to binary form...
Performing post-filter pass to remove intrinsics...
Writing output file...

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:30):

ah! fantastic! OK I didn't look carefully at what was deleted so I didn't realize that was still there

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:30):

so, yeah, you're weval'ing

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:30):

sorry for the too-cautious skepticism and thanks for doing this work :-)

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:33):

I'm somewhat surprised that the JS op interpreter (the second function) is not using state intrinsics (virtstack/locals), but I can stare at the rebased patch a bit when I get time to see if anything went wrong there (probably next week, after the plumber's summit).

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:36):

Thank you! In the meantime I will work on jit-tests and jstests in the CI. I'm seeing a strange issue where the tests sometimes produce an empty IC file. I will try to get to the bottom of it.

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 17:49):

One more question: should I also collect ICs from jstests? The ICs readme only mentions jit-tests, but I guess if I want to run AOT jstests in CI I have to collect ICs from those tests as well?

view this post on Zulip Chris Fallin (Feb 24 2026 at 17:54):

short answer: jit-tests should be sufficient.

longer answer: this is a little unlike classical PGO, in that we don't necessarily need representativeness of particular workloads or distributions. Rather all that is desired is that each particular path that generates some IC is hit at least once so we have that IC in the corpus and can AOT-compile it. The theory here is that the unit test suite should have a test for every case added to the compiler (in a healthy codebase), so collecting a corpus over that is sufficient.

and one more important bit: we don't strictly need anything in the corpus; the IC interpreter is still there, so a missing case still works, just without AOT code (but we shouldn't have missing cases, and in my tests with a corpus built just from jit-tests, we don't)
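The corpus-as-cache behavior described above can be illustrated with a toy sketch (all names here are made up for illustration, not SpiderMonkey's actual API): a compiled IC body is used when the corpus has one, and a missing entry simply falls back to the slower interpreter path rather than failing.

```python
# Toy model of the IC corpus fallback: AOT-compiled IC bodies are a
# pure optimization, so a cache miss is correct, just slower.

def run_ic(ic_key, aot_corpus, interpret):
    """Run an inline cache: prefer the AOT-compiled body if present."""
    compiled = aot_corpus.get(ic_key)
    if compiled is not None:
        return compiled()      # fast path: AOT-compiled IC body
    return interpret(ic_key)   # fallback: IC interpreter, same semantics


# A corpus collected from the test suite covers the common cases...
corpus = {"add_int": lambda: "fast:add_int"}

# ...and anything missing still executes via the interpreter.
print(run_ic("add_int", corpus, lambda k: f"slow:{k}"))    # fast path
print(run_ic("load_prop", corpus, lambda k: f"slow:{k}"))  # fallback
```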

view this post on Zulip Tomasz Andrzejak (Feb 24 2026 at 21:25):

OK, I got the full SpiderMonkey test matrix passing on the weval branch: https://github.com/bytecodealliance/firefox/pull/1/checks

view this post on Zulip Chris Fallin (Mar 02 2026 at 16:36):

I'm going to do a release of waffle+weval today to allow your work to land in StarlingMonkey -- if anyone with rubber-stamp privileges could help by stamping https://github.com/bytecodealliance/waffle/pull/12 that'd be much appreciated :-) (Then I'll have two for weval, to bump waffle and to release)

view this post on Zulip Chris Fallin (Mar 02 2026 at 16:37):

(right now "rubber stamp privs" is the weval-core group plus BA admins; happy to consider adding folks to weval-core, maybe Tomasz is interested?)

view this post on Zulip Chris Fallin (Mar 02 2026 at 16:38):

(or, it looks like I can't modify the group; I guess that's up to a BA admin or TSC member)

view this post on Zulip Tomasz Andrzejak (Mar 02 2026 at 16:40):

Thank you!

happy to consider adding folks to weval-core, maybe Tomasz is interested?

Sure!

view this post on Zulip Chris Fallin (Mar 02 2026 at 16:45):

(cc @Till Schneidereit if you're ok with it to add Tomasz to weval-core ^^^ -- thanks!)

view this post on Zulip Tomasz Andrzejak (Mar 02 2026 at 16:57):

I think we can hold off on https://github.com/bytecodealliance/firefox/pull/1 until weval is released; then I can bump wasi-sdk and test the whole chain again.

view this post on Zulip Chris Fallin (Mar 02 2026 at 16:58):

ok, @Tomasz Andrzejak , looks like you can rubber-stamp the above now (then I'll keep going down the list)

view this post on Zulip Chris Fallin (Mar 02 2026 at 17:04):

and then https://github.com/bytecodealliance/weval/pull/24

view this post on Zulip Ralph (Mar 02 2026 at 17:19):

nice!

view this post on Zulip Chris Fallin (Mar 02 2026 at 17:20):

and finally https://github.com/bytecodealliance/weval/pull/25 (thanks!)

view this post on Zulip Chris Fallin (Mar 02 2026 at 17:46):

ok, weval 0.4.0 is published (https://www.npmjs.com/package/@bytecodealliance/weval/v/0.4.0, https://github.com/bytecodealliance/weval/releases/tag/v0.4.0) with your changes -- thanks for the patience! should be good to use that in your PR now

view this post on Zulip Tomasz Andrzejak (Mar 02 2026 at 17:56):

Thank you! :bow:

view this post on Zulip Tomasz Andrzejak (Mar 02 2026 at 19:09):

FYI: wasi-sdk@30 adds support for -pthread, which the Firefox build system recognizes as supported, so it adds the flag. This makes LLVM generate atomic ops, which waffle does not support yet. For now I'm stripping the flag from the Firefox config when the target is WASI. I'm happy to add support for atomics down the road, but I don't think we have any use case for wasi-threads in SpiderMonkey.

view this post on Zulip Tomasz Andrzejak (Mar 02 2026 at 19:15):

Unfortunately I couldn't find a way to strip the -pthread from within StarlingMonkey, so I will have to add a tiny patch to ff :(

--- a/build/moz.configure/libraries.configure
+++ b/build/moz.configure/libraries.configure
@@ -93,7 +93,7 @@ moz_use_pthreads = (
     | check_symbol_in_libs(
         [None, "pthread"],
         symbol="pthread_create",
-        when=building_with_gnu_compatible_cc & ~target_is_darwin,
+        when=building_with_gnu_compatible_cc & ~target_is_darwin & ~target_is_wasi,
     ).found
 )

view this post on Zulip Chris Fallin (Mar 02 2026 at 19:17):

that seems like a reasonable patch to carry for now and yes, it should be pretty trivial to pass through the atomic ops in waffle eventually

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 13:09):

This was a bit more of a chore than I expected, but I finally got everything working in the PR. That includes:

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 13:14):

@Till Schneidereit This is a lot of changes, but really most of them come from fixing linter issues that appeared after updating wasi-sdk, and all of those are in a separate commit.

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 14:37):

@Chris Fallin One thing that surprised me is that the wevaled component is roughly 10M larger than the release component. Looking at the module memory, it seems like a lot of it is just zeros. I see that wizer handles memory snapshotting by writing sparse segments, but weval then re-encodes the memory as a single dense segment, IIUC.

An immediate fix in StarlingMonkey is to use wasm-opt --memory-packing, which brings the size down from 21M to 14M. But I think we could also adapt the wizer::snapshot_memories logic for the weval::image::update function -- does that make sense?

https://github.com/bytecodealliance/wasmtime/blob/main/crates/wizer/src/snapshot.rs#L120
https://github.com/bytecodealliance/weval/blob/main/src/image.rs#L73

view this post on Zulip Till Schneidereit (Mar 03 2026 at 15:05):

@Tomasz Andrzejak understood, and no problem: I'll review by commit. Is now a good time to do the review, or should I hold off still? I guess: un-draft whenever you think it's ready :slight_smile:

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 15:07):

I think we need the ff PR to land first: https://github.com/bytecodealliance/firefox/pull/1

Then I can switch to the SpiderMonkey from the bytecodealliance repo and un-draft the Starling PR :)

view this post on Zulip Till Schneidereit (Mar 03 2026 at 15:12):

reviewed, and should be ready to merge

view this post on Zulip Chris Fallin (Mar 03 2026 at 15:34):

Tomasz Andrzejak said:

Chris Fallin One thing that surprised me is that the wevaled component is roughly 10M larger than the release component. Looking at the module memory, it seems like a lot of it is just zeros. I see that wizer handles memory snapshotting by writing sparse segments, but weval then re-encodes the memory as a single dense segment, IIUC.

An immediate fix in StarlingMonkey is to use wasm-opt --memory-packing, which brings the size down from 21M to 14M. But I think we could also adapt the wizer::snapshot_memories logic for the weval::image::update function -- does that make sense?

https://github.com/bytecodealliance/wasmtime/blob/main/crates/wizer/src/snapshot.rs#L120
https://github.com/bytecodealliance/weval/blob/main/src/image.rs#L73

Yes, we should definitely do something better here! I wrote this when initially bringing up weval as a separate tool from wizer and didn't want to replicate the complex hole-finding/compression logic, hence "the entire memory image goes in one segment"; but if we can factor that logic out and reuse it somehow, all the better.
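The hole-finding idea under discussion can be sketched roughly as follows (a minimal illustration, not wizer's or weval's actual code; the function name and the gap threshold are made up): walk the flat memory image and emit sparse (offset, data) segments, turning sufficiently long zero runs into gaps, since WebAssembly memory is zero-initialized anyway.

```python
# Sketch of sparse memory packing: split a dense memory image into
# (offset, bytes) data segments, dropping long runs of zeros.
# `min_gap` is an illustrative threshold: a tiny zero run may be cheaper
# to keep inline than to pay a per-segment header for.

def pack_memory(image: bytes, min_gap: int = 16) -> list[tuple[int, bytes]]:
    """Return sparse (offset, data) segments covering all nonzero bytes."""
    segments = []
    start = None  # start offset of the segment currently being built
    zeros = 0     # length of the current run of zero bytes
    for i, b in enumerate(image):
        if b == 0:
            zeros += 1
            # Close the current segment once the zero run is long enough
            # to be worth turning into a gap between segments.
            if start is not None and zeros >= min_gap:
                segments.append((start, bytes(image[start:i - zeros + 1])))
                start = None
        else:
            if start is None:
                start = i
            zeros = 0
    if start is not None:
        # Trim any trailing zeros from the final segment.
        segments.append((start, bytes(image[start:len(image) - zeros])))
    return segments


# Example: two small data islands in a mostly-zero image become two
# segments instead of one dense 209-byte segment.
image = bytes(100) + b"hello" + bytes(100) + b"world"
print(pack_memory(image))  # [(100, b'hello'), (205, b'world')]
```

Because the gaps are all zeros, writing the segments back into a zero-filled memory reconstructs the original image exactly, which is the invariant any real packing pass has to preserve.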

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 15:36):

OK, I can give it a try :fingers_crossed:

view this post on Zulip Tomasz Andrzejak (Mar 03 2026 at 15:37):

It's a non-blocker since we can fall back to wasm-opt for now.

view this post on Zulip Tomasz Andrzejak (Mar 04 2026 at 18:59):

So weval is now officially back in StarlingMonkey:
https://github.com/bytecodealliance/StarlingMonkey/pull/303
Also weval memory packing has been added:
https://github.com/bytecodealliance/weval/pull/26

view this post on Zulip Chris Fallin (Mar 04 2026 at 19:00):

Thanks so much for driving this! Happy that the speedups can live on :-)

view this post on Zulip Chris Fallin (Mar 04 2026 at 19:00):

if you want a release with your memory-packing PR, lemme know and I'm happy to drive another one

view this post on Zulip Tomasz Andrzejak (Mar 04 2026 at 19:03):

It would be nice to get the release, yes. :bow:

view this post on Zulip Tomasz Andrzejak (Mar 04 2026 at 19:04):

I will in the meantime work on bringing those changes to jco and componentize-js.

view this post on Zulip Chris Fallin (Mar 04 2026 at 19:05):

https://github.com/bytecodealliance/weval/pull/28

view this post on Zulip Chris Fallin (Mar 04 2026 at 19:45):

0.4.1 published to npm, crates.io, GHA


Last updated: Mar 23 2026 at 18:16 UTC