WASM Mixins
Argh, didn't mean to send that yet. Hold on.
Hiya!
I've had an idea for a project that I've been casually exploring for a little bit: WASM Mixins.
Essentially, "safe" code injection. I don't think there's any practical use for this kind of thing, beyond "modding" programs that allow it, so it's really just for fun. I'm basing this on Minecraft modding mixins, which inject JVM bytecode. JVM bytecode is much higher level than WASM, so basic injection is relatively easy there: simply add a call!
In WASM, the call would need to call a new function. For very basic code injection, this is fine, because we can simply append a new function. However, this does not work when the injected code uses memory (because only one memory can be defined per module, currently), calls other functions (because the function indices would need to be shifted, which is very complex), etc.
The only solution I can see is to import the injected function and then call it. That's a little slower, but it's okay.
Problem: this also shifts the entire function space of the injected module. Possible solution: "reserve" some function indices with fake functions, that can be removed freely?
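To make the index-shifting problem concrete, here is a minimal sketch in plain Rust. It assumes a toy module model: `Module`, `Instr`, and `add_func_import` are hypothetical names for illustration, not the API of any real tool. Imported functions come first in the function index space, so adding one import bumps every index at or above the old import count.

```rust
/// Toy instruction set: only instructions that carry a function index matter here.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Call(u32),
    Other,
}

/// Hypothetical, heavily simplified module model (an assumption for illustration).
struct Module {
    /// Imported functions occupy indices 0..num_imported_funcs.
    num_imported_funcs: u32,
    /// Bodies of the functions defined in the module.
    bodies: Vec<Vec<Instr>>,
    /// Function indices stored in table element segments.
    elements: Vec<u32>,
    /// Exported functions as (name, function index).
    exports: Vec<(String, u32)>,
}

impl Module {
    /// Adds one function import and returns its index.
    /// Every reference to an index at or above the new import is shifted by one.
    fn add_func_import(&mut self) -> u32 {
        let new_index = self.num_imported_funcs;
        let bump = |idx: &mut u32| {
            if *idx >= new_index {
                *idx += 1;
            }
        };
        for body in &mut self.bodies {
            for instr in body.iter_mut() {
                if let Instr::Call(idx) = instr {
                    bump(idx);
                }
            }
        }
        for idx in &mut self.elements {
            bump(idx);
        }
        for (_, idx) in &mut self.exports {
            bump(idx);
        }
        self.num_imported_funcs += 1;
        new_index
    }
}
```

A real module would need the same pass over the start function and any `ref.func` instructions. Reserving placeholder functions up front, as suggested above, would trade this rewrite for a few unused indices.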
If you see any other issues, or have other solutions, I'd be interested to hear them (or even if you don't and just wanna play around with the idea too)!
In some ways I think this would actually be easier than you suspect; binary WASM (or equivalent textual WAT) is much easier to manipulate than typical native ASM (not sure how it would compare to JVM bytecode). You are correct that mucking around with a module's linear memory could be tricky, but it seems like multi-memory actually would be a reasonable solution in this scenario by twiddling memory indexes as part of the manipulation. :shrug:
Yeah, definitely easier than native ASM, which is why I'm actually interested in implementing this :P
I don't think you can twiddle memory indexes; currently all base instructions implicitly refer to memory index 0 (though, if I had a WASM runtime that has multiple-memory support, I could make all injected functions refer to their respective memories). The other problem is function indexes, since function "pointers" (indices, really) exist, and can be manipulated as data... before every call in an injected function, I would have to offset the index by that injection's function index offset.
Actually a lot of these problems can be strongly reduced with a custom runtime (all index spaces can be offset when in their respective cases), but I'm not sure how much I want to veer into that, because non-compliant binaries would break all existing tooling.
Somebody else suggested injecting a single new import, which would be a delegating function. The delegating function would call the injected functions, and it would be constructed at injection time. Different injected functions would have different signatures, so some form of arbitrary message passing would be needed (and since I'm working in raw WASM, it needs to be relatively simple and agnostic, not a malloc or something).
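One possible shape for that delegating function, sketched in plain Rust rather than WASM. Everything here is an assumption for illustration: the fixed scratch region at `SCRATCH_BASE`, the choice of `i64` slots, and the names `Dispatcher`, `write_args`, `sum_args`, and `negate_first`. The idea is that the call site writes its arguments into a reserved region of linear memory and passes only a handler id and an argument count, so no allocator is needed.

```rust
/// Assumed offset of a scratch region reserved in linear memory.
const SCRATCH_BASE: usize = 0;

/// Every injected handler sees its arguments as a slice of i64 slots.
type Handler = fn(&[i64]) -> i64;

/// Stand-in for the single delegating import, built at injection time.
struct Dispatcher {
    handlers: Vec<Handler>,
}

impl Dispatcher {
    /// Reads `argc` little-endian i64 slots from the scratch region and
    /// forwards them to handler `id`. Returns None on a bad id or range.
    fn dispatch(&self, memory: &[u8], id: u32, argc: usize) -> Option<i64> {
        let handler = *self.handlers.get(id as usize)?;
        let mut args = Vec::with_capacity(argc);
        for slot in 0..argc {
            let start = SCRATCH_BASE + slot * 8;
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(memory.get(start..start + 8)?);
            args.push(i64::from_le_bytes(bytes));
        }
        Some(handler(&args))
    }
}

/// What an injected call site would do before calling the import.
/// Panics if the scratch region is too small for `args`.
fn write_args(memory: &mut [u8], args: &[i64]) {
    for (slot, arg) in args.iter().enumerate() {
        let start = SCRATCH_BASE + slot * 8;
        memory[start..start + 8].copy_from_slice(&arg.to_le_bytes());
    }
}

/// Two example handlers with different arities.
fn sum_args(args: &[i64]) -> i64 {
    args.iter().sum()
}

fn negate_first(args: &[i64]) -> i64 {
    -args[0]
}
```

Widening everything to `i64` is only one option; floats would have to be bit-cast into the slots, and multiple return values would need a second scratch region.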
"if I had a WASM runtime that has multiple-memory support"
https://docs.rs/wasmtime/20.0.2/wasmtime/struct.Config.html#method.wasm_multi_memory :smile:
Well, analogously I would need function and table offsets. Alternatively, right before any call or table fetch, I could offset the index by the function's offset in that index space, but that's difficult to implement and not the fastest.
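The "offset right before the call" variant can be sketched as a rewrite pass over an injected function body. This is a toy model: `Op` and `offset_indirect_calls` are hypothetical names, and only a handful of instructions are represented. Since `call_indirect` takes its table index from the top of the stack, inserting `i32.const base; i32.add` immediately before it shifts the index at run time.

```rust
/// Toy subset of WASM instructions (an assumption for illustration).
#[derive(Debug, Clone, PartialEq)]
enum Op {
    I32Const(i32),
    I32Add,
    LocalGet(u32),
    CallIndirect { type_index: u32 },
}

/// Returns a copy of `body` in which every indirect call first adds
/// `table_base` to the table index sitting on top of the stack.
fn offset_indirect_calls(body: &[Op], table_base: i32) -> Vec<Op> {
    let mut out = Vec::with_capacity(body.len());
    for op in body {
        if let Op::CallIndirect { .. } = op {
            out.push(Op::I32Const(table_base));
            out.push(Op::I32Add);
        }
        out.push(op.clone());
    }
    out
}
```

This costs two extra instructions per indirect call, which matches the "not the fastest" concern. Direct `call` instructions carry their index as an immediate, so those could be rewritten once at injection time instead.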
Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table
oo, I haven't! That seems really relevant, thank you!
Pat Hickey said:
Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table
I think I've understood it now. In the dynamic version, the main module dynamically generates a new module, with a patch function. Also, a new function is appended to the main module. The... handler, for lack of a better word, instantiates this new module, then calls patch. patch imports a table from the main module, injects the newly added function to that table, and when the main module runs, it'll call_indirect the newly added function.
Very neat!
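A rough model of that table trick, in plain Rust rather than real WASM. The `Table` struct and its methods are hypothetical stand-ins: in the actual scheme the table belongs to the main module and a second module imports and writes to it. The names `fn_zero` and `fn_one` are borrowed from the output further down in this thread.

```rust
/// Stand-in for a WASM function reference.
type Func = fn() -> &'static str;

/// Stand-in for a funcref table shared between the main module and the patch module.
struct Table {
    slots: Vec<Option<Func>>,
}

impl Table {
    /// Models `call_indirect`: look up the slot, then call whatever is in it.
    /// Returns None where real WASM would trap (out of bounds or empty slot).
    fn call_indirect(&self, index: usize) -> Option<&'static str> {
        let func = (*self.slots.get(index)?)?;
        Some(func())
    }

    /// Models the patch step: overwrite a slot with a new function.
    /// Returns false if the index is out of bounds.
    fn patch(&mut self, index: usize, func: Func) -> bool {
        match self.slots.get_mut(index) {
            Some(slot) => {
                *slot = Some(func);
                true
            }
            None => false,
        }
    }
}

fn fn_zero() -> &'static str {
    "From fn_zero"
}

fn fn_one() -> &'static str {
    "From fn_one"
}
```

Because callers only ever hold a table index, redirecting a function is a single slot write and none of the calling code has to be rewritten.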
I ended up finding walrus, which helps a lot. Anyway, I now have function redirection!
INITIAL MODULE:
From fn_zero
MIXED MODULE:
From fn_one
Pat Hickey said:
Have you seen https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly ? Andy outlined a scheme for adding new functions to a wasm instance by manipulating the table
oh this is neat!
Last updated: Nov 22 2024 at 17:03 UTC