Stream: wasi

Topic: wasi-preview1-component-adapter size


view this post on Zulip bjorn3 (Jun 21 2024 at 13:25):

wasi-preview1-component-adapter currently results in a 78K wasm file. While pretty small already, it would be nice to get it even smaller. Some (relatively) easy wins may be:

view this post on Zulip bjorn3 (Jun 21 2024 at 13:26):

view this post on Zulip bjorn3 (Jun 21 2024 at 13:28):

view this post on Zulip bjorn3 (Jun 21 2024 at 13:31):

view this post on Zulip bjorn3 (Jun 21 2024 at 13:34):

cc @Yoshua Wuyts to increase the gap between native and wasip2 for your TCP echo server even more.

view this post on Zulip Lann Martin (Jun 21 2024 at 13:36):

I've been thinking about adapter size recently; having a standard approach to trapping with some output without std::fmt would be really helpful.

view this post on Zulip Lann Martin (Jun 21 2024 at 13:37):

Would it make sense to have an e.g. wasi-nostd-helpers crate or something?

view this post on Zulip bjorn3 (Jun 21 2024 at 13:39):

Re using bulk-memory: RUSTFLAGS="-Ctarget-feature=+bulk-memory" saves 1481 bytes total (79409 -> 77928).
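As a rough illustration of where bulk-memory helps: with the feature enabled, LLVM can lower plain byte copies to a single `memory.copy` instruction instead of a call into a copy loop. The sketch below is not adapter code; `bulk_memory_enabled` and `copy_bytes` are hypothetical names used only for illustration.

```rust
/// Reports whether this build was compiled with the wasm `bulk-memory`
/// target feature (e.g. via RUSTFLAGS="-Ctarget-feature=+bulk-memory").
/// On non-wasm targets this is simply false.
fn bulk_memory_enabled() -> bool {
    cfg!(target_feature = "bulk-memory")
}

/// Copies as many bytes as fit from `src` into `dst` and returns the count.
/// `copy_from_slice` is the kind of call that can become one `memory.copy`
/// when bulk-memory is available.
fn copy_bytes(dst: &mut [u8], src: &[u8]) -> usize {
    let n = dst.len().min(src.len());
    dst[..n].copy_from_slice(&src[..n]);
    n
}
```

The numbers quoted above are consistent: 79409 - 77928 = 1481 bytes.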

view this post on Zulip Alex Crichton (Jun 21 2024 at 13:41):

I think these would all be quite reasonable to implement, even bulk-memory is so widespread nowadays I don't think there's any particular reason to leave it turned off

view this post on Zulip bjorn3 (Jun 21 2024 at 13:42):

Replacing the unreachable!() in trapping_unwrap with core::arch::wasm32::unreachable() reduces the size by another 2925 bytes (77928 -> 75003).
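A minimal sketch of what such a replacement might look like, assuming a `TrappingUnwrap` trait shaped roughly like the adapter's `trapping_unwrap` helper (the exact shape in the adapter is not shown in this thread, so treat the trait and the `trap` function as hypothetical). `core::arch::wasm32::unreachable()` emits the wasm `unreachable` instruction directly; the non-wasm fallback exists only so the sketch builds on other targets.

```rust
/// Trap without any message. On wasm32 this is a single `unreachable`
/// instruction, so no printing code is pulled in.
#[cfg(target_arch = "wasm32")]
fn trap() -> ! {
    core::arch::wasm32::unreachable()
}

/// Fallback so the sketch also compiles on native targets.
#[cfg(not(target_arch = "wasm32"))]
fn trap() -> ! {
    std::process::abort()
}

/// Hypothetical stand-in for the adapter's trapping unwrap helper.
trait TrappingUnwrap<T> {
    fn trapping_unwrap(self) -> T;
}

impl<T> TrappingUnwrap<T> for Option<T> {
    fn trapping_unwrap(self) -> T {
        match self {
            Some(value) => value,
            None => trap(),
        }
    }
}

impl<T, E> TrappingUnwrap<T> for Result<T, E> {
    fn trapping_unwrap(self) -> T {
        match self {
            Ok(value) => value,
            Err(_) => trap(),
        }
    }
}
```

The trade-off is that a bare trap carries no line number, which is the debugging information discussed later in the thread.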

view this post on Zulip bjorn3 (Jun 21 2024 at 13:43):

Enabling LTO saves another 2247 bytes (75003 -> 72756).

view this post on Zulip bjorn3 (Jun 21 2024 at 13:50):

I just noticed that unreachable!() is already not the libstd version, but one provided by the component adapter itself, so all the wins for the unreachable!() replacement change are likely just caused by skipping the pretty failure message using eprint!().

view this post on Zulip bjorn3 (Jun 21 2024 at 13:52):

Most of the unreachable!() macro can probably be moved to a new function to deduplicate the code between call sites.
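One way to picture this deduplication, as a sketch rather than the adapter's actual code: the macro expands to nothing more than a call with the line number, and a single out-of-line function holds the printing and trapping logic. The names `unreachable_at`, `adapter_unreachable!`, and `digit_value` are hypothetical, and `eprintln!` is used here only for brevity; the real adapter avoids `std::fmt`.

```rust
/// Shared slow path: one copy of the message code for every call site.
/// `#[cold]` and `#[inline(never)]` keep it out of the callers.
#[cold]
#[inline(never)]
fn unreachable_at(line: u32) -> ! {
    // Assumption: std printing stands in for the adapter's own stderr code.
    eprintln!("unreachable executed at adapter line {line}");
    std::process::abort()
}

/// Each call site now only passes its line number to the shared function.
macro_rules! adapter_unreachable {
    () => {
        unreachable_at(line!())
    };
}

/// Example call site: the macro is only reached on invalid input.
fn digit_value(c: u8) -> u8 {
    match c {
        b'0'..=b'9' => c - b'0',
        _ => adapter_unreachable!(),
    }
}
```

With this shape, adding another `adapter_unreachable!()` call costs roughly one constant and one call instruction instead of a full copy of the message code.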

view this post on Zulip Lann Martin (Jun 21 2024 at 13:53):

We could still have pretty output without std I think since wasi stderr doesn't require a lock?

view this post on Zulip bjorn3 (Jun 21 2024 at 13:53):

Also the eprint!("unreachable executed at adapter line "); crate::macros::eprint_u32(line!()); can be replaced with eprint!(concat!("unreachable executed at adapter line ", line!())) to remove the eprint_u32 function.
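A small sketch of the `concat!` idea: since `line!()` expands to an integer literal, `concat!` can fold it into the message at compile time, so no runtime integer formatting is needed. The macro name `unreachable_message!` and the function `example_message` are hypothetical. One caveat, given the data-section concern raised elsewhere in this thread: each distinct message becomes its own string literal.

```rust
/// Builds the full trap message as a single `&'static str` at compile time.
macro_rules! unreachable_message {
    () => {
        concat!("unreachable executed at adapter line ", line!())
    };
}

/// Example call site; the returned string ends with this line's number.
fn example_message() -> &'static str {
    unreachable_message!()
}
```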

view this post on Zulip bjorn3 (Jun 21 2024 at 13:54):

Lann Martin said:

We could still have pretty output without std I think since wasi stderr doesn't require a lock?

Turns out that is exactly what is done already. It is just that most of this code is duplicated at each unreachable!() call site.

view this post on Zulip Lann Martin (Jun 21 2024 at 13:55):

Oh sure enough...I didn't scroll up :sweat_smile:

view this post on Zulip bjorn3 (Jun 21 2024 at 13:57):

Just bulk-memory + LTO saves 4573 bytes.

view this post on Zulip Yoshua Wuyts (Jun 21 2024 at 13:58):

@bjorn3 out of curiosity: does this save anything on the base binary too - or just on the adapter?

view this post on Zulip Pat Hickey (Jun 21 2024 at 13:59):

beware, many of the macro (and other) shenanigans in the adapter are done in a non-obvious, non-idiomatic way in order to dance around llvm creating anything that ends up in the data section

view this post on Zulip Alex Crichton (Jun 21 2024 at 14:00):

One thing I'll note as well is that the adapter is automatically GC'd, e.g. it exports every single wasip1 function but most modules don't import all of them. The wit-component adapter process will remove all exports that aren't needed and then GC the wasm module itself, so the full size of the adapter is unlikely to go into a final component. Alternatively, if the importing module reduces its own imports, that is another way to shrink the adapter.
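The GC step described above can be pictured as a reachability walk rooted at the exports the main module actually imports. This is a toy sketch, not wit-component's implementation: `live_functions` and the call-graph representation are assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns every function reachable from `roots` in the call graph `calls`.
/// Anything not in the result could be dropped from the adapter.
fn live_functions<'a>(
    calls: &BTreeMap<&'a str, Vec<&'a str>>,
    roots: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut live = BTreeSet::new();
    let mut worklist: Vec<&'a str> = roots.to_vec();
    while let Some(name) = worklist.pop() {
        // `insert` returns true only the first time a function is seen.
        if live.insert(name) {
            if let Some(callees) = calls.get(name) {
                worklist.extend(callees.iter().copied());
            }
        }
    }
    live
}
```

For example, if a module imports only `fd_write`, helpers used solely by `path_open` never become live and are removed along with that export.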

view this post on Zulip Alex Crichton (Jun 21 2024 at 14:00):

One thing that might be worth testing is that LLVM is known to produce pretty suboptimal binaries size-wise and running through wasm-opt can probably shave off 30-40%

view this post on Zulip Yoshua Wuyts (Jun 21 2024 at 14:00):

bjorn3 said:

cc Yoshua Wuyts to increase the gap between native and wasip2 for your TCP echo server even more.

by the way, for context on this - here are the numbers I found the other day

view this post on Zulip Yoshua Wuyts (Jun 21 2024 at 14:01):

The results: async-std comes in at half a Megabyte for the echo server. WASI 0.2 comes in at just 100 Kilobytes. And in even better news: it currently still uses a WASI 0.1 adapter that weighs 80 Kilobytes. WASI binaries are small.

view this post on Zulip Pat Hickey (Jun 21 2024 at 14:03):

if you want to optimize the adapter down to 0, there is the remaining work in wasi-libc to use p2 for filesystem and everything else. the p2 support in there is right now limited to sockets

view this post on Zulip Pat Hickey (Jun 21 2024 at 14:05):

that would additionally unlock using rust, c, c++ to target the new single-module representation of a component that luke has been presenting

view this post on Zulip Pat Hickey (Jun 21 2024 at 14:05):

which saves even more bytes by not encoding any of the component type information

view this post on Zulip bjorn3 (Jun 21 2024 at 15:46):

Opened https://github.com/bytecodealliance/wasmtime/pull/8858 for LTO + bulk-memory

This reduces the size of wasi_snapshot_preview1.command.wasm from 79625 bytes to 75029 bytes for a total win of 4596 bytes. Of this reduction enabling LTO is responsible for 3103 bytes, while enabl...

view this post on Zulip bjorn3 (Jun 21 2024 at 16:29):

Managed to save another 21154 bytes (75029 -> 53875) by changing the unreachable!() and assert!() macros. This is without losing any information that may be useful for debugging, but with a slight tweak to the assertion failure message from "unreachable executed at adapter line ...: assertion failure" to "assertion failed at adapter line ...", which is slightly shorter. This tweak is only a small part of the saved bytes, but I figured I would still make it.

view this post on Zulip Alex Crichton (Jun 21 2024 at 16:31):

wow that's a huge reduction of 30%?!

view this post on Zulip bjorn3 (Jun 21 2024 at 16:32):

Yeah!

view this post on Zulip bjorn3 (Jun 21 2024 at 16:33):

https://github.com/bytecodealliance/wasmtime/pull/8859

This reduces the size of wasi_snapshot_preview1.command.wasm from 75029 bytes to 53875 bytes for a total win of 21154 bytes. This is done by deduplicating most of the trap messages and the code for...

view this post on Zulip bjorn3 (Jun 21 2024 at 16:39):

Got another easy 1.6k win, will push in a bit.

view this post on Zulip bjorn3 (Jun 21 2024 at 16:42):

Pushed

view this post on Zulip Pat Hickey (Jun 21 2024 at 16:45):

awesome!

view this post on Zulip bjorn3 (Jun 21 2024 at 16:50):

For reference removing all unreachable and assertion message printing brings the size down to 47162 bytes, which is only 5050 bytes less than with wasmtime#8859. IMHO winning 5050 bytes is not enough to justify making it harder to find the root cause when someone reports an issue caused by the adapter.
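The size figures quoted in this thread can be cross-checked with a few lines of arithmetic. The helper names below are hypothetical, and the 52212-byte figure is an inference (47162 + 5050), not a number stated in the thread; it would be consistent with the extra "1.6k" win having been pushed to the PR.

```rust
/// Bytes saved going from `before` to `after`.
fn bytes_saved(before: u32, after: u32) -> u32 {
    before - after
}

/// Relative reduction in percent.
fn percent_saved(before: u32, after: u32) -> f64 {
    f64::from(before - after) / f64::from(before) * 100.0
}

/// Size implied by "47162 bytes, which is only 5050 bytes less".
/// Assumption: this is the adapter size after the follow-up push.
fn implied_size_after_followup() -> u32 {
    47162 + 5050
}
```

By this arithmetic the macro change alone is a reduction of about 28%, close to the "30%" mentioned above.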


view this post on Zulip Juniper Tyree (Jun 23 2024 at 03:47):

Pat Hickey said:

that would additionally unlock using rust, c, c++ to target the new single-module representation of a component that luke has been presenting

I hadn't heard of that before and I'm intrigued - could you please point me to an explanation of the proposal? Thanks :)

view this post on Zulip Pat Hickey (Jun 25 2024 at 14:01):

https://github.com/WebAssembly/meetings/blob/main/wasi/2024/WASI-06-13.md#adding-core-wasm-build-targets-to-the-component-model

WebAssembly meetings (VC or in-person), agendas, and notes - WebAssembly/meetings

view this post on Zulip bjorn3 (Jun 25 2024 at 19:40):

So that would work in the browser with a shim similar to browser_wasi_shim without requiring a wasm component parser like jco?

view this post on Zulip Pat Hickey (Jun 25 2024 at 20:01):

presumably

view this post on Zulip bjorn3 (Jun 25 2024 at 20:27):

That would be great!

view this post on Zulip bjorn3 (Jun 25 2024 at 20:30):

Somewhat relatedly, is there any plan to write a tool going from a wasm component to a (possibly multi-memory) core wasm module with the same kind of custom section as would be used for building a wasm component from a core wasm module?

view this post on Zulip Pat Hickey (Jun 25 2024 at 20:38):

that could be done, yes. i think when we last discussed it in early 2023, the reason it wasn't pursued was that multi-memory hadn't shipped in enough places to make it worthwhile

view this post on Zulip Pat Hickey (Jun 25 2024 at 20:39):

it's still behind a flag in chrome

view this post on Zulip Pat Hickey (Jun 25 2024 at 20:40):

at any rate, since jco provides something equivalent implemented in JS rather than multi-memory wasm, i'd consider it just an implementation detail that jco could use to optimize binary size


Last updated: Nov 22 2024 at 17:03 UTC