Updating Gecko version · StarlingMonkey

Till Schneidereit (Apr 11 2025 at 17:26):

@Chris Fallin without urgency: can you tell me which of the gecko-dev patches we can drop because they have since been upstreamed? It's awfully hard to figure that out with pointer-hopping from git to hg to BMO.

Chris Fallin (Apr 11 2025 at 17:31):

Hi Till! So the work is upstreamed through AOT ICs (0044c7a in that series); however, there is some divergence due to upstream reviews and reorganization, and the rest will likely not apply cleanly, unfortunately.

Chris Fallin (Apr 11 2025 at 17:31):

I started to work on this at one point a month ago and I was running into mysterious test failures -- I suspect this is a deep rabbit hole. Unfortunately I don't have the cycles to push it forward right now, sorry :-(

Till Schneidereit (Apr 11 2025 at 17:34):

It's a bit unfortunate, as we currently can't update the version of wasi-sdk used by StarlingMonkey, because for that we'd have to rebuild the SpiderMonkey object files—which doesn't work with newer wasi-sdks at the current revision. Which wouldn't be the end of the world, except CMake 4.0 was released recently, and the current version of wasi-sdk uses a config file that CMake doesn't like anymore :frown:

Till Schneidereit (Apr 11 2025 at 17:35):

Chris Fallin (Apr 11 2025 at 17:36):

Oh dear, that's quite the interconnected mess... yeah, I need to get this done eventually, I know it's holding things back; I'm likely to have more free time in August or September (after an internal demo/sprint) unless I reshuffle things some other way

Till Schneidereit (Apr 11 2025 at 17:40):

yeah, I know you're busy with internal things. For now I'll see if updating to the last Gecko LTS is an alternative

Chris Fallin (Jul 02 2025 at 21:24):

An update on this effort: I've spent the past three days on this and have a branch on which I'm testing things. The version of PBL here passes all JIT-tests still in interpreter mode. Unfortunately when wevaling, I'm running into some really odd code-patterns that modern LLVM is generating in the Wasm -- in particular, it seems to be tail-merging specialized and non-specialized paths in a way that breaks weval's context tracking and causes processing errors. I'm trying to work out a way to either guide the compiler with intrinsics (possibly workable, but brittle) or build stronger optimizations/preprocessing steps into weval to make this work.

(In a little more detail: there is an extremely maddening code sequence that LLVM is generating where, on checking for error after an IC return, we're supposed to be in shared tail-call-into-interpreter slowpath code; but it's merging this path with the happy path, then doing another redundant check of the error code and then branching to the slowpath. I suspect this may be some weird relooper thing to avoid irreducible control flow, somehow reusing some path-dependent value for the condition/flag. It will require something like path-sensitive constant propagation to peel it apart again.)

GitHub - cfallin/gecko-dev at cfallin/upstream-weval-annotations

Chris Fallin (Jul 03 2025 at 00:02):

(OK, in typical darkest-before-dawn fashion that seems to repeatedly happen when I hit a wall -- it turns out that weval is a nice context-specializing machine; that's its whole deal; so I can context-specialize my way out of LLVM's tail-merging. Some other hiccups with AOT ICs it seems but maybe will have an update working in a bit that I can then rebase the rest of StarlingMonkey's patches onto and test with integration tests...)

Chris Fallin (Jul 03 2025 at 20:26):

So I've got a bit more working on that branch, but I'm still hitting unweval'able IC bodies that seem to be coming from some out-of-sync definition in PBL's IC interpreter not caught by upstream tests (leading to garbage during partial eval). I'll keep working on this.

In the meantime, once I get the PBL rebase working, it would be useful to have the rest-of-the-owl ready for me to combine with it to run StarlingMonkey's integration tests -- i.e., all of the BYOB-stream patches, etc. @Till Schneidereit or others, is there a branch somewhere with a recent rebase of all of that? E.g. a branch with everything but the weval patches rebased to latest release or main would be a very useful starting point for me. I can tackle this at some point if others don't but I'll be more out of my depth. Thanks!

Chris Fallin (Jul 04 2025 at 00:15):

OK, I'm going to give a somewhat unfortunate update here: I don't think updating the Gecko version that we support with AOT compilation is going to be feasible without substantial additional engineering.

The problem I've been banging into now is that LLVM seems to be doing some much more clever relooper-pass transforms (Wasm structured control flow reconstruction): it merges different control flow paths together statically, then reuses existing branch conditions to separate them again dynamically. So when we have the weval traversal visit a particular opcode case that has an early out (e.g. an IC guard failure), we see something like: "if (x != k) goto bail; fallthrough..." but the "goto bail" path goes through a common exit block shared with the fallthrough, and the "x != k" test is repeated further down to actually bail.
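A rough illustration of that shape, written as Python rather than Wasm. The function names and block labels are made up for this sketch and are not taken from SpiderMonkey, LLVM output, or weval. The first function is the guard as the interpreter source expresses it; the second is the merged form, where both outcomes flow through one shared block and the same condition is tested a second time to split them.

```python
def guard_as_written(x, k):
    # Source-level shape: a guard failure leaves immediately.
    if x != k:
        return ["guard_failed", "bail"]
    return ["guard_passed", "loop_top"]


def guard_as_merged(x, k):
    # Merged shape: both outcomes pass through one shared block...
    visited = []
    if x != k:
        visited.append("guard_failed")
    else:
        visited.append("guard_passed")
    visited.append("shared_exit")
    # ...and the same condition is re-tested to separate them again.
    if x != k:
        visited.append("bail")
    else:
        visited.append("loop_top")
    return visited
```

Both forms end in the same place for any given input; the difference is only that the merged form has a statically visible edge from the shared block to both "bail" and "loop_top".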

To maintain context specialization that allows weval to statically lay out the compiled code, it needs to reason about the two paths through that common exit block separately. But even if I use a nested context to try to do that, when we visit the common block in the exit context, we see a statically possible edge back to the top of the interpreter loop. This edge is never actually taken, because we're already in the x != k case and the edge is guarded with x == k. But weval doesn't have SCCP or equivalent that can track path-specific constant facts.
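A toy model of why path-specific facts matter here. This is only a sketch: the CFG, the block names, and the `walk` function are hypothetical and much simpler than weval's real traversal. It walks a two-branch CFG either ignoring or remembering the outcome of each condition along the path.

```python
# Hypothetical CFG: block -> (condition, target_if_true, target_if_false),
# or None for a block with no successors.
CFG = {
    "guard": ("x != k", "shared_exit", "shared_exit"),
    "shared_exit": ("x != k", "bail", "loop_top"),
    "bail": None,
    "loop_top": None,
}


def walk(cfg, entry, track_facts):
    """Return the set of (block, facts) states a static walk visits."""
    seen = set()
    stack = [(entry, frozenset())]
    while stack:
        block, facts = stack.pop()
        if (block, facts) in seen:
            continue
        seen.add((block, facts))
        term = cfg[block]
        if term is None:
            continue
        cond, if_true, if_false = term
        known = dict(facts).get(cond)  # outcome already fixed on this path?
        for outcome, target in ((True, if_true), (False, if_false)):
            if known is not None and known != outcome:
                continue  # edge contradicts a fact established earlier
            new_facts = facts | {(cond, outcome)} if track_facts else facts
            stack.append((target, new_facts))
    return seen


def blocks_when_guard_fails(seen):
    """Blocks visited on paths where `x != k` is known to be true."""
    return {block for block, facts in seen if ("x != k", True) in facts}
```

With `track_facts=True`, `blocks_when_guard_fails(walk(CFG, "guard", True))` contains only `shared_exit` and `bail`. With `track_facts=False` every block lands in a single fact-free context, so the walk has no way to rule out the edge to `loop_top`.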

In this branch I've been building out some infrastructure that tries to do that, but the problem is that we don't directly pass a value through the branch-args that can be value-specialized; we see a branch on x != k and then we have to scan all other values in-scope and reason about them and potentially update their abstract analysis state. That's an ugly and non-scalable O(n^2) analysis problem.

So the tl;dr here is: LLVM got smarter in its generated Wasm; weval needs too much (non-scalable, non-feasible, non-tractable?) analysis to see through it; it's unclear whether this is possible at all, and the exploration to know for sure is too much of a deep-dive for me to reasonably spend time on it. (A real fix here would probably need months of full-time work, which is not something I have.)

Anyway, my recommendation is that if the SpiderMonkey update is more important, let's drop AOT; or if losing the 3-5x speedup would be too painful, then we're stuck on v127 and an older wasi-sdk for a while more. Sorry!

GitHub - bytecodealliance/weval at cfallin/value-specialization

Chris Fallin (Jul 04 2025 at 01:01):

Hmm, actually, there might be a way to throw some big-hammer intrinsics at this -- "unreachable if [context condition]" and basically force the specialization walk to terminate if it follows an edge we know is dynamically invalid. (I'm vacillating between "grad student mode" and "actually take the three-day weekend", sigh)

Chris Fallin (Jul 04 2025 at 01:21):

OK, yeah, that leads to further problems. There's a meta-problem of brittleness here where we rely too much on the shape of the toolchain in ways that can change; that only works if there's a full team tracking this, not little slices of my time. This is pretty insurmountable. Sorry!

Till Schneidereit (Jul 07 2025 at 08:35):

The conclusions seem pretty unfortunate, but like something we'll have to live with for now. One thing that might still be worth looking into: is there maybe some flag we can pass to llvm to change the codegen? Perhaps this is the result of a new pass, which we could disable?

Regardless, I'll prepare the branch you mentioned, of all non-PBL patches on the current FF release. We can then either ship that without weval, or maybe we get really lucky and my suggestion above or some other hail-mary allows us to use weval after all.

Chris Fallin (Jul 07 2025 at 22:47):

After a weekend of being really really bothered by this, I'm digging in more, and I'm just implementing SCCP. Branch linked above has weval with propagation of == / != conditions; debugging a few more is showing I need inequality propagation too (one branch tests x < y, another tests z < w where z and w are aliases via blockparams for x and y, and we need to only visit paths where these branches go the same way). This is making the cprop lattice more complex too (already have "this concrete value" and "some value that is not this concrete value", now will need ranges too) but this is tractable -- would rather make the tool more general like this. 75% chance of working -- no promises yet...
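As a sketch of what such a lattice could look like (an assumption for illustration, not weval's actual data structures): an abstract value is an inclusive unsigned range plus at most one excluded constant, which covers "this concrete value", "not this concrete value", and ranges. The `Abs`, `refine`, and `decide` names are invented here, and the sketch only handles comparisons against a constant; the `x < y` case between two variables that alias through blockparams needs relational facts that this does not model.

```python
from dataclasses import dataclass
from typing import Optional

U32_MAX = 2**32 - 1


@dataclass(frozen=True)
class Abs:
    lo: int = 0
    hi: int = U32_MAX
    excluded: Optional[int] = None  # one value known not to occur


def refine(v, op, k, taken):
    """Narrow v given that `x op k` evaluated to `taken` on this path."""
    if op == "eq":
        # On the taken side x is exactly k; otherwise only k is ruled out.
        return Abs(k, k) if taken else Abs(v.lo, v.hi, k)
    if op == "lt_u":
        if taken:
            return Abs(v.lo, min(v.hi, k - 1), v.excluded)
        return Abs(max(v.lo, k), v.hi, v.excluded)
    raise ValueError(op)


def decide(v, op, k):
    """True/False if v already fixes the outcome of `x op k`, else None."""
    if op == "eq":
        if v.lo == v.hi == k:
            return True
        if k < v.lo or k > v.hi or v.excluded == k:
            return False
        return None
    if op == "lt_u":
        if v.hi < k:
            return True
        if v.lo >= k:
            return False
        return None
    raise ValueError(op)
```

A walk would call `refine` when it follows a branch edge and `decide` when it meets a later branch on the same value, skipping the edge that `decide` rules out. A range with `lo > hi` would mean the path is unreachable.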

Chris Fallin (Jul 08 2025 at 07:45):

OK, yeah, this is turning into an even deeper rabbithole. I implemented range-tracking (including the hairy off-by-one cases because LLVM is apparently smart enough to generate i32.le_u with k then i32.lt_u with k + 1 elsewhere) but it looks like I also need conditional/path-sensitive blockparam elimination (!!) to make this work -- omg llvm why?!
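The off-by-one case can be seen by normalizing each comparison to the inclusive range of values for which it holds. A minimal sketch, with `true_range` as a made-up helper name and unsigned 32-bit operands assumed:

```python
U32_MAX = 2**32 - 1


def true_range(op, k):
    """Inclusive (lo, hi) of unsigned x for which `x op k` holds, or None if empty."""
    if op == "le_u":
        return (0, k)
    if op == "lt_u":
        return (0, k - 1) if k > 0 else None
    raise ValueError(op)
```

Here `true_range("le_u", 9)` and `true_range("lt_u", 10)` describe the same set, so an analysis that compares ranges rather than instruction text treats the two tests as the same condition. The equivalence does not extend to `k == U32_MAX` for `le_u`, since `k + 1` would wrap.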

I suspect that the annotate-unreachable-with-intrinsics approach could work with a bit more thought, but I'm gonna timebox myself here. (Couldn't find options to change the behavior in LLVM either -- this seems like some deep optimizer/Wasm backend change, not an optional pass doing something.) Wasm exceptions also need doing and have less variance in completion time, so I'll spend time on that, and come back to this later.

Ralph (Jul 08 2025 at 12:15):

preserve the brain. this is a hard problem. you can come back to it having relaxed the brain a bit.

Chris Fallin (Aug 28 2025 at 00:18):

An update on this: I put a few more days into it; through some truly wacky path-sensitive constant propagation magic, as well as some just-so tweaks to context updates, I have a working SpiderMonkey-on-weval again for one Octane benchmark, Richards, at this gecko-dev branch and this weval branch.

Unfortunately after getting that working, I then expanded to the rest of Octane, and found a number of other "partial evaluation ran off the end because of path-merging" kinds of errors, and...

... I think I'm going to admit defeat here: new LLVM is doing stuff that is too weird, and it's not viable to partial-evaluate its output. Anyone who wants the 3-5x perf improvements from weval should use the old StarlingMonkey. Sorry!

Long-term, the more maintainable solution is probably to spend ~1-1.5 years fulltime building a dedicated AOT compiler in the "traditional" way, after building out debugging facilities in Wasmtime so that we have adequate abilities to bring it up. I picked partial eval because it was the right way to get a fully compatible, fully AOT thing working "quickly" (~1 year), was viable to develop the JS semantics part mostly on native (without good Wasm debug facilities), and should have been great for shared maintenance effort with non-Wasm folks. Unfortunately I didn't foresee LLVM becoming too clever here, and didn't foresee the debugging difficulties that would have me grepping through 1M-line logs of compiler traces and IR for days...

GitHub - cfallin/gecko-dev at cfallin/upstream-weval-annotations

GitHub - bytecodealliance/weval at cfallin/value-specialization

bjorn3 (Aug 28 2025 at 07:33):

How feasible would it be to add a flag to LLVM to disable this path merging optimization?

Chris Fallin (Aug 28 2025 at 15:59):

Possibly? I did root around a bit to see if I could identify any recently-changed bits in the Wasm backend, and didn't see anything obvious at least. Perhaps @Dan Gohman could help at some point (I know weval-based perf is important to you!)

Tomasz Andrzejak (Aug 28 2025 at 16:06):

I was wondering if bisecting LLVM to pin the last working commit would be both feasible and valuable? For this, we would likely need a minimal reproducible example that can serve as a test case.

Chris Fallin (Aug 28 2025 at 16:42):

Sure, yep, that would be ideal -- right now the repro is "build SpiderMonkey and then try to weval Octane". At some point in another timebox (not now!) I can try to reproduce the path merging with a little standalone C program or something

Tomasz Andrzejak (Aug 28 2025 at 17:57):

Well, if you think it's worthwhile, I could try to help and bisect using this repro. Building both clang and SpiderMonkey "shouldn't" be too bad with ccache -- I hope :fingers_crossed:. Do you have any scripts/environment that would be useful? :slight_smile:

Chris Fallin (Aug 28 2025 at 18:03):

$ cd weval/; cargo build --release
$ cd ../gecko-dev/; MOZCONFIG=weval-mozconfig ./mach build
$ ../weval/target/release/weval weval --show-stats -w -i obj-weval/dist/bin/js -o octane.weval.wasm < octane.js

this should fail with a bunch of debug output from weval, with the version of clang/LLVM pulled down by the mozbuild infrastructure. You'll need to override the C/C++ compiler used in the mozconfig if building with a custom LLVM, I think
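For the bisection idea, the commands above could be wrapped in a predicate for `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. The following is only a sketch under several assumptions: that the clang for the LLVM commit under test is already built and selected by the mozconfig, that weval exits non-zero when partial evaluation fails, and that the paths passed to `main` are absolute. The function names are invented for illustration.

```python
import os
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def bisect_exit_code(build_ok, weval_ok):
    """Map the two step results to a bisect verdict."""
    if not build_ok:
        return SKIP  # this LLVM commit cannot be tested at all
    return GOOD if weval_ok else BAD


def succeeded(cmd, cwd, env=None, stdin_path=None):
    """Run cmd in cwd and report whether it exited with status 0."""
    stdin = open(stdin_path, "rb") if stdin_path else None
    try:
        return subprocess.run(cmd, cwd=cwd, env=env, stdin=stdin).returncode == 0
    finally:
        if stdin is not None:
            stdin.close()


def main(gecko_dir, weval_bin, octane_js):
    # Same two steps as the commands above: build the js shell, then weval it.
    env = dict(os.environ, MOZCONFIG="weval-mozconfig")
    build_ok = succeeded(["./mach", "build"], cwd=gecko_dir, env=env)
    weval_ok = build_ok and succeeded(
        [weval_bin, "weval", "--show-stats", "-w",
         "-i", "obj-weval/dist/bin/js", "-o", "octane.weval.wasm"],
        cwd=gecko_dir,
        stdin_path=octane_js,
    )
    return bisect_exit_code(build_ok, weval_ok)
```

To use it as a bisect predicate, a wrapper script would call `sys.exit(main(...))` with the local paths filled in.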

Chris Fallin (Aug 28 2025 at 18:05):

it's entirely possible that in my messing around with path-sensitive constant prop, I will have broken something else once the path-merging issue is taken away; the only real way to know is to look at the trace output (RUST_LOG=weval=trace RAYON_NUM_THREADS=1 weval ...) which is gigabytes of meaningless mush for anyone who isn't me :-)

Stream: StarlingMonkey

Topic: Updating Gecko version

Till Schneidereit (Apr 11 2025 at 17:26):