blog post: compilation of JS to Wasm, AOT vs. JIT · wasm

Stream: wasm

Topic: blog post: compilation of JS to Wasm, AOT vs. JIT

Chris Fallin (Aug 27 2024 at 15:34):

I've just posted a blog post here: https://cfallin.org/blog/2024/08/27/aot-js/

(this is the first in a pair of posts about the weval-based JS compilation work I've been doing; second half will come tomorrow)

Ralph (Aug 28 2024 at 13:00):

as it was morning here, I'm WAITING, CHRIS

Chris Fallin (Aug 28 2024 at 16:41):

Unfortunately my blog is only open 9am-5pm Pacific time, weekdays, with federal holidays and other random vacations excluded; your request is being processed :-)

Chris Fallin (Aug 28 2024 at 16:42):

Alright, a followup with the next part, about weval itself: https://cfallin.org/blog/2024/08/28/weval/

Victor Adossi (Aug 30 2024 at 05:27):

I think you might be out of business hours right now but a humble feature request for your blog: automatic dark mode

Chris Fallin (Aug 30 2024 at 05:47):

if anyone can tell me how to do that with a hacked up Jekyll site generator tree from 2014, that only works with a very specific version of Ruby and various gems locked in place via nix-shell, I'm all ears!

Chris Fallin (Aug 30 2024 at 05:47):

(my html skills are current circa 1998)

fitzgen (he/him) (Aug 30 2024 at 15:04):

Specific Jekyll and specific ruby version, both outdated, is way too real :sob:

Milan (Aug 30 2024 at 15:16):

Chris Fallin said:

if anyone can tell me how to do that with a hacked up Jekyll site generator tree from 2014, that only works with a very specific version of Ruby and various gems locked in place via nix-shell, I'm all ears!

FWIW, it is a CSS only change to be responsive to the system light-dark preference. Override the styles for the prefers-color-scheme media query :)

prefers-color-scheme - CSS: Cascading Style Sheets | MDN

The prefers-color-scheme CSS media feature is used to detect if a user has requested light or dark color themes. A user indicates their preference through an operating system setting (e.g. light or dark mode) or a user agent setting.

Victor Adossi (Aug 31 2024 at 02:49):

The trick for that being... you'll have to reconfigure all those styles :)

Also if the source is open I'm happy to contribute!

Victor Adossi (Aug 31 2024 at 03:48):

Wow just realized my own blog didn't have it, and it was amazingly easy to add with prefers-color-scheme. CSS in 2024, amazing

Olivier FAURE (Sep 08 2024 at 09:04):

These blog posts aren't for the faint of heart. I think I've read the "JavaScript compilation" parts 1 and 2 about three times already and I'm still a bit hazy on the details.

Olivier FAURE (Sep 08 2024 at 09:16):

Looking at the second post, I'm still confused about how pre-compiling a corpus of ICs can possibly work for object shapes.

Olivier FAURE (Sep 08 2024 at 09:21):

Say I have this code:

const obj = { foo: 42, bar: "Hello world" };

// ...

const x = obj.foo + 3;

To properly optimize this code, the JIT needs the following information:

obj has the shape (Number, String).
foo is the first element of that shape.
Therefore obj.foo is a Number.

Olivier FAURE (Sep 08 2024 at 09:29):

How can you possibly encode this in pre-compiled ICs? What do the ICs look like? I can imagine an IC that's like:

GuardShape obj, (Number, String)
GuardField obj, "foo", 0
LoadNumberField obj, 0

But then wouldn't you need thousands of ICs for any field access anywhere in your code? Even if you're only covering shapes with up to four fields that's already thousands of possible shapes. Doing a linked list search through those can't possibly be more efficient than a hashmap access.

Olivier FAURE (Sep 08 2024 at 10:10):

Second question

Olivier FAURE (Sep 08 2024 at 10:13):

Regarding weval, I'm a little surprised you had to go through so much trouble with the update_context intrinsic.

Olivier FAURE (Sep 08 2024 at 10:13):

Does SpiderMonkey's bytecode not have a concept of EBBs? Because if so, it feels like you could just have a run_ebb function that takes a slice of bytecode instructions as a parameter and loops over them, and add an annotation to unroll that loop.

Olivier FAURE (Sep 08 2024 at 10:15):

(Plus the annotation you currently use to mark that slice as constant.)

bjorn3 (Sep 08 2024 at 14:40):

I believe polymorphic inline caches can have arguments which indicate eg the offset of the field to access and the expected type and then the inline cache itself would use those arguments instead of hard coding them. This allows generating a single polymorphic inline cache for each shape of inline cache we want rather than fully specializing it.

Olivier FAURE (Sep 08 2024 at 15:22):

This allows generating a single polymorphic inline cache for each shape of inline cache

If the field offset and type are passed as arguments, I'm not sure I have a model of what you mean by "each shape of inline cache" here.

Chris Fallin (Sep 08 2024 at 21:36):

Hi @Olivier FAURE, I'd recommend you read the 2023 paper on CacheIR, linked in my blog posts, for more information on how that part of SpiderMonkey works.

In a little more detail: your conception of what an "object shape" is is a little off, and also you're missing the notion of parameterized ICs. The idea is that the IC code is constant, and known ahead of time, but it is parameterized on runtime values in the "stub data" that are attached when the IC chain is filled out. The IC for a simple property access (no weird corner cases) is: check the shape pointer (encodes mapping from field names to offsets); load offset. And both the shape pointer and offset are in the stub data, so we have one IC body that we can use for every property access everywhere.
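A minimal sketch of the parameterized-IC idea described above, as a toy model in plain JavaScript rather than anything taken from SpiderMonkey. The names `makeShape`, `makeObject`, `getPropIC`, and the `{ shape, offset }` stub-data layout are all hypothetical, chosen only for illustration. The point it tries to show is that the IC body is one fixed piece of code, while the shape pointer and the slot offset arrive as data.

```javascript
// Toy model (assumption): a shape maps property names to slot offsets,
// and an object is a shape pointer plus a flat array of slots.
function makeShape(names) {
  return Object.freeze({ offsets: new Map(names.map((n, i) => [n, i])) });
}

function makeObject(shape, values) {
  return { shape, slots: values };
}

// The single shared IC body: guard on the shape pointer, then load the slot.
// Both the expected shape and the offset come from the stub data.
function getPropIC(obj, stubData) {
  if (obj.shape !== stubData.shape) {
    return { hit: false }; // guard failed: the caller must try something else
  }
  return { hit: true, value: obj.slots[stubData.offset] };
}

// Two unrelated shapes that both happen to have a property named "foo".
const shapeA = makeShape(["foo", "bar"]);
const shapeB = makeShape(["x", "y", "foo"]);
const a = makeObject(shapeA, [42, "Hello world"]);
const b = makeObject(shapeB, [1, 2, 99]);

// Stub data is filled in at runtime; the code above never changes.
const stubForA = { shape: shapeA, offset: shapeA.offsets.get("foo") };
const stubForB = { shape: shapeB, offset: shapeB.offsets.get("foo") };

console.log(getPropIC(a, stubForA).value); // 42
console.log(getPropIC(b, stubForB).value); // 99
console.log(getPropIC(a, stubForB).hit);   // false
```

In this model the guard is a single pointer comparison, and supporting a new shape only means creating a new stub-data record, not new code.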

Chris Fallin (Sep 08 2024 at 21:38):

Re: SpiderMonkey bytecode and EBBs: I think there may be such a property but it's fairly irrelevant: the entire point of the weval-based approach is that we do not know anything about the interpreter or the code it's interpreting, beyond the one intrinsic to denote "change of PC". More complex intermeshing of properties limits the scope of applicability and makes reasoning about bugs much harder.

Catherine (whitequark) (Sep 10 2024 at 19:22):

@Chris Fallin The weval transform is incredibly cool. Some of my earlier work (on Foundry and ARTIQ) used similar principles, but it didn't go as far, and I'm really happy to see this research. I hope one day I'll be able to apply it.

Olivier FAURE (Nov 06 2024 at 13:06):

So, reading the 2023 paper...

In addition to operands, CacheIR stubs have stub fields, which are values associated with and used within the stub. For Baseline ICs, stub fields facilitate the sharing of native code for stubs that are identical except for offsets and pointer values, and simplify the process of integrating stubs into the garbage collector.

So... My understanding of what happens when you execute let x = foo.bar:

The interpreter / compiled code reaches a dynamic jump instruction pointing to the IC chain.
The first stub in the IC chain has a pointer to some x86/ARM/etc code which is jumped to.
The code starts with some "guard" instructions that check that the foo object has the expected shape/hidden-class/etc. "Shape" in that context means a specific set of field names and types, so if the IC was generated for a shape baz: Number, bar: Object and foo currently has shape baz: String, bar: Object, you're out of luck, even though the binary code would work the same in principle. (On the plus side, because shapes are immutable, that check is a single pointer compare.)
If the guard instruction exits, it jumps to the next IC in the chain.
If the guard instruction falls through, the rest of the IC's binary is executed. In that case, the binary being executed is "add an offset to the base pointer, load the resulting address, return the value to the main code".

Regarding stub fields specifically, the IC stub includes both a pointer to x86 executable code, and some additional variables that the executable code can read; in the case of the field access above, those additional variables are "a pointer to the immutable shape, and a field offset".
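The walk through the IC chain described above can be sketched roughly as follows. This is a toy model under stated assumptions, not SpiderMonkey's actual data structures: `makeSite`, `getProp`, `guardShapeLoadSlot`, and the `{ code, fields }` stub layout are hypothetical names, a JavaScript function stands in for the precompiled machine code, and appending new stubs at the end of the chain is an arbitrary choice made for simplicity.

```javascript
// Toy shapes and objects (assumption): shape = name -> offset map.
function makeShape(names) {
  return Object.freeze({ offsets: new Map(names.map((n, i) => [n, i])) });
}
function makeObject(shape, values) {
  return { shape, slots: values };
}

// Stands in for the one shared chunk of precompiled code:
// compare shape pointers, then load from the slot at the given offset.
function guardShapeLoadSlot(obj, fields) {
  if (obj.shape !== fields.shape) return undefined; // guard failed
  return { value: obj.slots[fields.offset] };       // wrapped so a stored undefined is still a hit
}

// One IC site per property access in the program; it owns a chain of stubs.
function makeSite(name) {
  return { name, chain: [] };
}

function getProp(site, obj) {
  // Try each stub in order; a failed guard moves on to the next stub.
  for (const stub of site.chain) {
    const result = stub.code(obj, stub.fields);
    if (result !== undefined) return result.value;
  }
  // Fallback: do the slow lookup, then attach a stub for this shape.
  const offset = obj.shape.offsets.get(site.name);
  if (offset === undefined) return undefined;
  site.chain.push({
    code: guardShapeLoadSlot,            // same code for every stub
    fields: { shape: obj.shape, offset } // per-stub data
  });
  return obj.slots[offset];
}

const site = makeSite("foo");
const a = makeObject(makeShape(["foo", "bar"]), [42, "Hello world"]);
const b = makeObject(makeShape(["x", "y", "foo"]), [1, 2, 99]);

console.log(getProp(site, a)); // 42 (slow path, attaches a stub)
console.log(getProp(site, b)); // 99 (first guard fails, attaches a second stub)
console.log(getProp(site, a)); // 42 (served by the first stub)
console.log(site.chain[0].code === site.chain[1].code); // true
```

Both stubs in the chain point at the same function and differ only in their fields, which is the property that makes it plausible to compile the code part ahead of time.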

Olivier FAURE (Nov 06 2024 at 13:11):

So to answer my own question, the "executable code" parts of the ICs can be precompiled because they're massively shared between ICs. Even if your program has thousands of shapes, it only needs one brneq reg1 reg2; load reg1+reg3; ret; chunk of executable code. Each callsite can store pointers to that code alongside different values for reg2 (shape pointer) and reg3 (offset), and these values can be generated dynamically.

Chris Fallin (Nov 06 2024 at 15:38):

Yep, that’s pretty much it!

Last updated: Apr 18 2025 at 07:03 UTC