Stream: git-wasmtime

Topic: wasmtime / PR #8125 cranelift-wasm: Emit constant bounds ...


view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:07):

jameysharp opened PR #8125 from jameysharp:table-constant-bounds to bytecodealliance:main:

WebAssembly tables have a minimum size, but may optionally also have a maximum size. If the maximum is set to the same number of elements as the minimum, then we can emit an iconst instruction for bounds-checks on tables, rather than loading the bounds from memory.

That's a win in its own right, but when optimization is enabled, this also allows bounds-checks to constant-fold if the table index is provided as an integer constant.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:07):

jameysharp requested wasmtime-core-reviewers for a review on PR #8125.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:07):

jameysharp requested alexcrichton for a review on PR #8125.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:07):

jameysharp requested abrown for a review on PR #8125.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:07):

jameysharp requested wasmtime-compiler-reviewers for a review on PR #8125.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:16):

alexcrichton submitted PR review:

Looks good to me!

Before I hit the big green button though, two questions/observations:

To be clear I don't think either of these need to be addressed before this PR lands, so you can feel free to hit the button as well, figured this'd be a good place to ask anyway!

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:43):

cfallin commented on PR #8125:

One thing I'm a little sad about here is that we get to give Cranelift a bunch of constants but because they're involved in control flow it may not end up helping a whole lot. For example in the examples you added here the brif can become br, but I don't think we currently optimize that. In lieu of adding more optimizations, do you think that the frontend will be able to detect two constants and omit the control flow entirely? Basically doing that form of constant folding in the frontend rather than in the middle-end.

Having just talked to @jameysharp about his plans my understanding is that the next steps will address this: removing the branch-on-OOB in exchange for a select-null-pointer approach, and then separately, optimizing away the select if (and only if!) the choice-input is constant. The overall motivation/arc here is that we want to make i32.iconst N ; call_indirect ... as fast as possible, and in particular if we have non-nullable typed funcrefs and use the typed variant of call_indirect, with a bit more too (e.g. removing the null check) we should eventually have load-the-function-pointer-and-call machine code.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 13 2024 at 23:44):

cfallin edited a comment on PR #8125:

One thing I'm a little sad about here is that we get to give Cranelift a bunch of constants but because they're involved in control flow it may not end up helping a whole lot. For example in the examples you added here the brif can become br, but I don't think we currently optimize that. In lieu of adding more optimizations, do you think that the frontend will be able to detect two constants and omit the control flow entirely? Basically doing that form of constant folding in the frontend rather than in the middle-end.

Having just talked to @jameysharp about his plans my understanding is that the next steps will address this: removing the branch-on-OOB in exchange for a select-null-pointer approach, and then separately, optimizing away the select if (and only if!) the choice-input is constant. The overall motivation/arc here is that we want to make i32.iconst N ; table.get; call_ref ... as fast as possible, and in particular if we have non-nullable typed funcrefs and use the typed variant of call_ref, with a bit more too (e.g. removing the null check) we should eventually have load-the-function-pointer-and-call machine code.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 14 2024 at 00:12):

jameysharp commented on PR #8125:

Chris' response accurately reflects my plans for how to address your second question, Alex. I'm currently working on getting rid of the trapnz, and it works on x64, but I need to do a bit more work for the rest of the backends. #8122 was the first step on that. Once that is out of the way, the existing constant-folding optimizations get us most of the way to where we want to be, and as Chris said, eliminating select_spectre_guard when its condition is constant does the rest.

To your first question, I've been trying to push as much as I can into cranelift-wasm shared code, but many details (like "how do I discover the current table bounds", lazy table initialization, and GC refcount management) are unfortunately specific to a particular runtime, so I haven't been able to find much opportunity to share more than this. I'd love to hear suggestions though!

view this post on Zulip Wasmtime GitHub notifications bot (Mar 14 2024 at 00:28):

alexcrichton commented on PR #8125:

Oh nice, yeah changing to select{,_spectre} sounds like a nice way to solve this without much work in the frontend. And agreed that it'd be great for i32.const N + call_indirect to be load/call in the most optimal case.

I'd love to hear suggestions though!

Ah yeah if all the major bits go in cranelift-wasm then I think that's fine, I think I over-rotated on the let bound = ... bit that's basically the same in func_environ.rs and dummy.rs.

AFAIK there's no use case for keeping dummy.rs though other than "we haven't figure out a way to test Wasmtime". I might be just so inspired by this to take a stab at migrating all the tests to using Wasmtime's codegen...

view this post on Zulip Wasmtime GitHub notifications bot (Mar 14 2024 at 00:30):

jameysharp commented on PR #8125:

I would love to have CLIF and compiled-output tests that run through Wasmtime instead of using the dummy environment, personally! My current sense is there are plenty of reasons why that's difficult, but if you get it working I'm all for it.

view this post on Zulip Wasmtime GitHub notifications bot (Mar 14 2024 at 02:06):

jameysharp merged PR #8125.


Last updated: Jan 24 2025 at 00:11 UTC