Stream: general

Topic: WASM mixins


view this post on Zulip Sekoia (May 15 2024 at 11:53):

WASM Mixins

view this post on Zulip Sekoia (May 15 2024 at 11:53):

Argh, didn't mean to send that yet. Hold on.

view this post on Zulip Sekoia (May 15 2024 at 12:06):

Hiya!
I've had an idea for a project that I've been casually exploring for a little bit: WASM Mixins.

Essentially, "safe" code injection. I don't think there's any practical use for this kind of thing beyond "modding" programs that allow it, so it's really just for fun. I'm basing this on Minecraft modding mixins, which inject JVM bytecode. JVM bytecode is much higher level than WASM, so basic injection is relatively easy: simply add a call!

In WASM, the call would need to target a new function. For very basic code injection this is fine, because we can simply append a new function. However, this does not work for injected code that uses memory (because currently only one memory is defined per module), other functions (because the function indices would need to be shifted, which is very complex), etc.
The only solution I can see is to import the injected function and then call it (that's a little slower, but it's okay).

Problem: this also shifts the entire function space of the injected module. Possible solution: "reserve" some function indices with fake functions that can be removed freely?
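To make the index shift concrete, here is a minimal sketch in plain Rust (no WASM tooling involved). It assumes a toy module model in which imports occupy the low function indices and defined functions follow, which is how the WebAssembly function index space is laid out. `ToyModule`, `name_of`, and `add_import` are hypothetical names invented for this illustration, not part of any real library.

```rust
/// Toy model of a module's function index space: imports come first,
/// then locally defined functions.
#[derive(Debug, Clone, PartialEq)]
struct ToyModule {
    imports: Vec<String>,
    /// Each defined function is a name plus the function indices it calls.
    defined: Vec<(String, Vec<u32>)>,
}

impl ToyModule {
    /// Resolve a function index to the name of the function it refers to.
    fn name_of(&self, index: u32) -> Option<&str> {
        let i = index as usize;
        if i < self.imports.len() {
            Some(self.imports[i].as_str())
        } else {
            self.defined
                .get(i - self.imports.len())
                .map(|(name, _)| name.as_str())
        }
    }

    /// Append a new import and rewrite every direct call target that
    /// referred to a defined function, since those indices all move up by one.
    /// Returns the index assigned to the new import.
    fn add_import(&mut self, name: &str) -> u32 {
        let new_index = self.imports.len() as u32;
        for (_, calls) in &mut self.defined {
            for target in calls.iter_mut() {
                if *target >= new_index {
                    *target += 1;
                }
            }
        }
        self.imports.push(name.to_string());
        new_index
    }
}
```

With one import `log` and two defined functions `main` (calling indices 0 and 2) and `helper`, adding an import moves `helper` from index 2 to index 3, so the call in `main` has to be rewritten to match. This sketch only patches direct call targets; a real rewriter would also have to handle exports, element segments, the start function, and any function index that has been stored as data.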

If you see any other issues, or have other solutions, I'd be interested to hear them (or even if you don't and just wanna play around with the idea too)!

view this post on Zulip Lann Martin (May 15 2024 at 12:41):

In some ways I think this would actually be easier than you suspect; binary WASM (or equivalent textual WAT) is much easier to manipulate than typical native ASM (not sure how it would compare to JVM bytecode). You are correct that mucking around with a module's linear memory could be tricky, but it seems like multi-memory actually would be a reasonable solution in this scenario by twiddling memory indexes as part of the manipulation. :shrug:

view this post on Zulip Sekoia (May 15 2024 at 13:39):

Yeah, definitely easier than native ASM, which is why I'm actually interested in implementing this :P

I don't think you can twiddle memory indexes; currently all base instructions implicitly refer to memory index 0 (though, if I had a WASM runtime that has multiple-memory support, I could make all injected functions refer to their respective memories). The other problem is function indexes, since function "pointers" (indices, really) exist and can be manipulated as data... before every call in an injected function, I would have to offset the index by that injection's function index offset.

Actually, a lot of these problems can be strongly reduced with a custom runtime (each index space can be offset whenever an injected function uses it), but I'm not sure how much I want to veer into that, because non-compliant binaries would break all existing tooling.

view this post on Zulip Sekoia (May 15 2024 at 13:42):

Somebody else suggested injecting a single new import, which would be a delegating function. The delegating function would call the injected functions, and it would be constructed at injection time. Different injected functions would have different signatures, so some form of arbitrary message passing would be needed (and since I'm working in raw WASM, it needs to be relatively simple and agnostic, not a malloc or something).
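A rough sketch of the delegating-function idea, in plain Rust rather than WASM. It assumes one particular uniform convention: every injected function takes a slice of `i64` values and returns a single `i64`. That convention, and the names `Hook`, `Delegator`, `add_hook`, and `negate_hook`, are hypothetical choices for illustration only; the thread does not settle on a message format.

```rust
/// Hypothetical uniform ABI: arguments arrive as a flat slice of i64,
/// and every hook returns a single i64.
type Hook = fn(&[i64]) -> i64;

/// Stand-in for the single injected import, built at injection time.
struct Delegator {
    hooks: Vec<Hook>,
}

impl Delegator {
    fn new() -> Self {
        Delegator { hooks: Vec::new() }
    }

    /// Register an injected function and return the id callers must pass.
    fn register(&mut self, hook: Hook) -> u32 {
        self.hooks.push(hook);
        (self.hooks.len() - 1) as u32
    }

    /// The one entry point: look up the hook by id and forward the arguments.
    /// Returns None for an unknown id instead of trapping.
    fn delegate(&self, id: u32, args: &[i64]) -> Option<i64> {
        self.hooks.get(id as usize).map(|hook| hook(args))
    }
}

/// Example hook: sums all of its arguments.
fn add_hook(args: &[i64]) -> i64 {
    args.iter().sum()
}

/// Example hook: negates its first argument (0 if none was passed).
fn negate_hook(args: &[i64]) -> i64 {
    -args.first().copied().unwrap_or(0)
}
```

Because only one import is added, the target module's function index space shifts by exactly one regardless of how many functions are injected. The cost is that every call site has to marshal its arguments into the shared format.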

view this post on Zulip Lann Martin (May 15 2024 at 13:47):

if I had a WASM runtime that has multiple-memory support

https://docs.rs/wasmtime/20.0.2/wasmtime/struct.Config.html#method.wasm_multi_memory :smile:

view this post on Zulip Sekoia (May 15 2024 at 13:56):

Well, analogously I would need function and table offsets. Alternatively, right before any call or table fetch, I could offset the index by the function's offset in that index space, but that's difficult to implement and not the fastest.
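The "offset the index right before a table fetch" alternative can be sketched as follows, again in plain Rust as a model rather than real WASM. It assumes every module's table entries are appended to one merged table, and that each injection remembers where its entries start. `MergedTable`, `Injection`, and the example functions are hypothetical names made up for this sketch.

```rust
type Func = fn() -> &'static str;

/// One shared table holding the entries of the host and every injection.
struct MergedTable {
    slots: Vec<Func>,
}

/// Where a given module's entries live inside the merged table.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Injection {
    table_base: u32,
    table_len: u32,
}

impl MergedTable {
    fn new() -> Self {
        MergedTable { slots: Vec::new() }
    }

    /// Append a module's table entries and record their base offset.
    fn append(&mut self, funcs: &[Func]) -> Injection {
        let injection = Injection {
            table_base: self.slots.len() as u32,
            table_len: funcs.len() as u32,
        };
        self.slots.extend_from_slice(funcs);
        injection
    }

    /// Translate a module-local table index into the merged index space,
    /// then call through it. Out-of-range local indices yield None so one
    /// injection cannot reach another module's entries.
    fn call_indirect(&self, injection: Injection, local_index: u32) -> Option<&'static str> {
        if local_index >= injection.table_len {
            return None;
        }
        let global_index = injection.table_base + local_index;
        self.slots.get(global_index as usize).map(|f| f())
    }
}

fn host_fn() -> &'static str { "host" }
fn mixin_a_fn() -> &'static str { "mixin_a" }
fn mixin_b_fn() -> &'static str { "mixin_b" }
```

The extra bounds check and addition on every indirect call is the runtime cost mentioned above; the implementation difficulty comes from having to find and rewrite every call site in the injected code.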

view this post on Zulip Pat Hickey (May 15 2024 at 14:47):

Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table

view this post on Zulip Sekoia (May 15 2024 at 14:57):

oo, I haven't! That seems really relevant, thank you!

view this post on Zulip Sekoia (May 16 2024 at 14:30):

Pat Hickey said:

Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table

I think I've understood it now. In the dynamic version, the main module dynamically generates a new module with a patch function. Also, a new function is appended to the main module. The... handler, for lack of a better word, instantiates this new module, then calls patch. patch imports a table from the main module and injects the newly added function into that table, so when the main module runs, it'll call_indirect the newly added function.

Very neat!
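The table-patching flow described above can be modeled in a few lines of plain Rust. This is only an analogy, assuming a `Vec` of function pointers stands in for the WASM table and a free function stands in for the generated module's patch export. `Instance`, `patch`, and the slot layout are hypothetical; `fn_zero` and `fn_one` are just placeholder function names.

```rust
type Func = fn() -> &'static str;

fn fn_zero() -> &'static str { "From fn_zero" }
fn fn_one() -> &'static str { "From fn_one" }

/// Stand-in for the main module's instance: it owns the table.
struct Instance {
    table: Vec<Func>,
}

impl Instance {
    /// The main module never calls its target directly; it always goes
    /// through table slot 0, the way call_indirect would.
    fn run(&self) -> &'static str {
        (self.table[0])()
    }
}

/// Stand-in for the generated module's patch function: it receives the
/// shared table and overwrites one slot with the new function.
/// Returns false if the slot does not exist.
fn patch(table: &mut [Func], slot: usize, replacement: Func) -> bool {
    match table.get_mut(slot) {
        Some(entry) => {
            *entry = replacement;
            true
        }
        None => false,
    }
}
```

Since the caller only ever holds a table index, swapping the slot changes what runs without touching the caller's code or shifting any function indices.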

view this post on Zulip Sekoia (May 18 2024 at 23:21):

I ended up finding walrus, which helps a lot. Anyway, I now have function redirection!

INITIAL MODULE:
From fn_zero
MIXED MODULE:
From fn_one

view this post on Zulip Soni L. (May 19 2024 at 14:40):

Pat Hickey said:

Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table

oh this is neat!


Last updated: Oct 23 2024 at 20:03 UTC