Is there any research on what languages generate the smallest wasm binaries?
I understand that this depends a lot on the application being compiled, but I'm sure certain languages make it easier than others to get small binaries.
Here's what I was able to gather from sources online. Note this is not a real apples-to-apples comparison, as they all deal with completely different applications.
Rust: Rust is able to get pretty small, but you have to bend over backwards to get those benefits, and won't be able to use a lot of the tools most Rust applications use. https://github.com/johnthagen/min-sized-rust
C/C++: not too big, if you don't use some features https://stackoverflow.com/a/74982101/5253155
Go: seems like Go wasm binaries are quite large https://www.fermyon.com/blog/optimizing-tinygo-wasm, much better: https://dev.bitolog.com/minimizing-go-webassembly-binary-size
Swift: Swift binary sizes are huge https://github.com/swiftwasm/swift/issues/7 https://forums.swift.org/t/swift-wasm-binary-sizes/51533
C#: C#/blazor apps are also gigantic https://www.meziantou.net/optimizing-a-blazor-webassembly-application-size.htm
AssemblyScript: My understanding is that AssemblyScript is able to generate fairly small binaries, but couldn't easily find any sources.
Porffor: porffor.dev explicitly states that they want to reduce binary sizes, which is refreshing. They make some big claims on the website, but not sure if this has been independently verified.
StarlingMonkey: IIUC StarlingMonkey is aiming to be a full JS engine, which is at odds with creating small binaries.
Any thoughts on this? Did I miss anything? Has anyone done this in a more methodical way?
note that that Rust post is mostly concerned with non-wasm binaries. On wasm, one major way to save lots of space is to target wasm32-unknown-unknown, though note that there is no libc, so it works best when you only use Rust code and no C/C++. Also, most of `std` is unavailable (as in it'll panic if you use it), but you can still use `core`, `alloc`, the stuff re-exported in `std`, and `HashMap`/`HashSet`.
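As a quick, hypothetical illustration of staying inside that subset (this particular snippet happens to run on any target, and per the note above everything it uses also works on wasm32-unknown-unknown):

```rust
use std::collections::HashMap; // usable on wasm32-unknown-unknown per the note above

fn main() {
    // Vec lives in `alloc`; allocation works on wasm32-unknown-unknown
    // because it only needs memory.grow, not an operating system.
    let v: Vec<u32> = (1..=4).collect();
    assert_eq!(v.iter().sum::<u32>(), 10);

    // HashMap/HashSet are among the std collections that still work there.
    let mut counts: HashMap<&str, u32> = HashMap::new();
    for word in ["wasm", "rust", "wasm"] {
        *counts.entry(word).or_insert(0) += 1;
    }
    assert_eq!(counts["wasm"], 2);

    // What to avoid on that target: std::fs, std::net, std::time::Instant,
    // and friends, which panic or error at runtime rather than at compile time.
    println!("ok");
}
```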
MoonBit? https://www.moonbitlang.com/
One reason that historically rustc produced native executables larger than 1MB is the support for backtraces via `RUST_BACKTRACE=1`. On wasm there is no way for a guest to get a backtrace in the first place, and as such the backtrace symbolizer is omitted too, saving about 400kb.

When compiling a Rust hello world for wasm32-wasip1 in release mode I get a 64kb binary. With fat LTO and stripping symbols that is already brought down to 45kb without loss of functionality: you still get panic messages, are able to allocate memory, and can use libstd as usual. If you are fine with not having useful panic messages, you can use `-Zbuild-std -Zbuild-std-features=panic_immediate_abort` to get the binary size down to 14kb (or 11kb with wasm-opt), with most of the remaining space being taken by the memory allocator (which you aren't going to get around).
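For reference, the LTO and symbol-stripping settings mentioned above can be expressed as a Cargo release profile. These are all standard Cargo keys, though the exact sizes you end up with will of course vary by application:

```toml
[profile.release]
opt-level = "z"    # optimize aggressively for size
lto = "fat"        # whole-program link-time optimization
strip = "symbols"  # drop the symbol table from the binary
codegen-units = 1  # better cross-module optimization, slower compile
panic = "abort"    # skip unwinding machinery
```

Note that `panic = "abort"` alone still keeps panic messages; the `panic_immediate_abort` feature mentioned above is what removes the formatting machinery.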
Here's an example of a 1.3KB `no_std` Rust module with a dummy panic handler and no memory allocator, for reference: source; binary.
BTW, a tiny bump allocator might make sense for short-lived instances (e.g. serverless functions). Naturally, the minimum binary size is going to depend on the minimum feature set your application requires.
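A minimal sketch of the bump-allocator idea (hypothetical code, not from this thread): it hands out offsets from a fixed-size arena and never frees, relying on the instance being torn down to reclaim memory. A real `GlobalAlloc` impl would wrap something like this around a static buffer; this version is host-runnable for illustration.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// Bump allocator over a fixed-size arena. Allocations are never freed;
/// the whole arena is reclaimed when the instance is torn down.
struct BumpAlloc {
    next: AtomicUsize, // offset of the next free byte
    size: usize,       // total arena size in bytes
}

impl BumpAlloc {
    const fn new(size: usize) -> Self {
        Self { next: AtomicUsize::new(0), size }
    }

    /// Returns the offset of a fresh block, or None if the arena is full.
    /// `align` must be a power of two.
    fn alloc(&self, len: usize, align: usize) -> Option<usize> {
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            let start = (cur + align - 1) & !(align - 1); // round up to alignment
            let end = start.checked_add(len)?;
            if end > self.size {
                return None; // arena exhausted
            }
            match self
                .next
                .compare_exchange(cur, end, Ordering::Relaxed, Ordering::Relaxed)
            {
                Ok(_) => return Some(start),
                Err(actual) => cur = actual, // another thread raced us; retry
            }
        }
    }
}

fn main() {
    let a = BumpAlloc::new(1024);
    let p1 = a.alloc(10, 8).unwrap();
    let p2 = a.alloc(10, 8).unwrap();
    assert_eq!(p1, 0);
    assert_eq!(p2, 16); // 10 rounded up to the next 8-byte boundary
    assert!(a.alloc(2048, 8).is_none()); // arena exhausted
    println!("ok");
}
```

The whole thing is a couple dozen instructions, which is why it costs so much less binary size than a general-purpose allocator with free lists.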
Re Rust: seems like a hello world example can get quite small. However, when building real-world applications, Rust has a few disadvantages compared to other languages:
a) Monomorphization can grow a binary quickly (e.g. anything making heavy use of serde will grow fast).
b) `std::fmt` seems to blow up binary sizes (there are some libraries that handle this better than `std::fmt` does).
c) Panic unwinding is heavy (you can get around this with `panic_immediate_abort` as mentioned, but you lose panic hooks, and hence the ability to let the user know that something went wrong).
You can write your own panic handler which prints a message to stderr prior to calling `unreachable`.
Agreed that monomorphization can lead to bloat very quickly, which requires discipline (and being picky about dependencies) to avoid.
Joel Dice said:
You can write your own panic handler which prints a message to stderr prior to calling `unreachable`.
Good point. But that only covers cases where you panic, not when libraries panic.
It won't catch:

```rust
let a = vec![true];
a[1];
```
My understanding is that panic_handler affects the whole binary (including libraries). I could be mistaken, though.
I think you're right. But I don't think it'll run if you set `panic_immediate_abort`.
Yeah; I think you need to either set a custom handler or use `panic_immediate_abort`; either one will give you small binaries, but only the former lets you print a message to stderr or otherwise customize the behavior.
Interesting. I wasn't aware that setting a custom handler by itself can reduce binary size.
That's what the 1.3KB example I gave above does.
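To make the custom-handler approach concrete, here's a host-runnable sketch of just the message-formatting piece (hypothetical code, not the 1.3KB example above): in an actual `no_std` wasm module, a `core::fmt::Write` buffer like this would live inside `#[panic_handler]`, get filled from the `PanicInfo`, be flushed to stderr (e.g. via `wasi::fd_write`), and then the handler would trap.

```rust
use core::fmt::Write;

/// Fixed-size, no-allocation sink for `core::fmt::Write`. In a real
/// `#[panic_handler]` you'd format the `PanicInfo` into this, write the
/// bytes to stderr, and then call `core::arch::wasm32::unreachable()`.
struct Buf {
    data: [u8; 256],
    len: usize,
}

impl Write for Buf {
    fn write_str(&mut self, s: &str) -> core::fmt::Result {
        // Truncate silently if the message doesn't fit; a panic path must
        // never panic itself.
        let n = s.len().min(self.data.len() - self.len);
        self.data[self.len..self.len + n].copy_from_slice(&s.as_bytes()[..n]);
        self.len += n;
        Ok(())
    }
}

fn main() {
    let mut buf = Buf { data: [0; 256], len: 0 };
    // In the real handler this would be `write!(buf, "{}", info)` with the
    // `&PanicInfo` argument.
    let _ = write!(buf, "panicked: {}", "index out of bounds");
    let msg = core::str::from_utf8(&buf.data[..buf.len]).unwrap();
    assert_eq!(msg, "panicked: index out of bounds");
    println!("ok");
}
```

The point is that this pulls in only the `core::fmt` machinery your messages actually use, rather than std's full panic/backtrace plumbing.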
Ryuta Suzuki said:
MoonBit? https://www.moonbitlang.com/
Seems like small binary sizes are an explicit goal here, which is great!
https://www.moonbitlang.com/blog/first-announce#compact---tiny-wasm-output
I made a hello-world that compiles to 703 bytes using Rust 1.82.0:

`Cargo.toml`:

```toml
[package]
name = "wasm-test"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
wasi = "0.11.0"
```

`src/lib.rs`:

```rust
#[unsafe(no_mangle)]
pub extern "C" fn _start() {
    let hello_world = "Hello World!\n";
    unsafe {
        let _ = wasi::fd_write(
            1,
            &[wasi::Ciovec {
                buf: hello_world.as_ptr(),
                buf_len: hello_world.len(),
            }],
        );
    }
}
```

Build with (output is in `target/wasm32-unknown-unknown/release/wasm_test.wasm`):

```shell
cargo build --release --target=wasm32-unknown-unknown
```
It should be able to be optimized further: looking at the binary, a bunch of it is rustc version and target-feature strings that could be omitted, and some of the symbol names are around 40 bytes long, so they could be shrunk.
Curious if you can get anything close to this with an allocator.
Well, probably possible with a bump allocator. I'm thinking more of a 'real' allocator.
Despite targeting wasm32-unknown-unknown, there's still std in there, and it has the ability to allocate (since memory.grow is an ISA concern, not an operating-system concern), but since it's not reachable in that code it gets dead-code-eliminated away.
Jacob Lifshay said:
it should be able to be optimized further, since looking at the binary, a bunch of it is rustc version and target feature strings that should be able to be omitted, and some of the symbol names are like 40 bytes long so could be shrunk
`wasm-tools strip` and `wasm-tools strip --delete name` might help with that.
IMO if the goal is to make small binaries you want Rust, C++, or C. If you do anything nontrivial you probably don't want C, and if you want it to be approachable to most folks you probably want Rust. Rust, like any other language discussed here, is no silver bullet. That being said Rust has solutions for binary size things for most of what you'd run into. Applying said solutions can range from "massive refactor" to "use this flag", however.
My personal gut feeling is that if you were to code-golf other languages, the results would basically look the same if you zoom out enough. Some of this gets into language choice, though, which is sort of orthogonal to this.
Code golf is its own sport that is almost completely distinct from software engineering. I think it's valuable to come up with a spec for a real program that solves a real problem, and then implement that in a variety of languages, if you want to compare how toolchains do.
(c.f. lies, damn lies, and microbenchmarks)
You people just taught me code golfing; I love it.
Pedantic note on C++: templates and static initializers do add a significant amount of bloat at scale. The linked SO question is technically correct (the best kind of correct) about iostream, but that is not because of template bloat; rather, stream operations (including stringstream) must implicitly link against functions from <locale> to do all sorts of culture-sensitive parsing and conversion. This is a problem on native against statically linked C runtimes as well. See Dan's excellent answer here: https://github.com/WebAssembly/wasi-sdk/issues/87#issuecomment-567925947
I've done a lot of optimization work in C and C++ Wasm land, and generally, raw C scales pretty linearly between Wasm and native binary size. With C++, the STL, templates, conversions, and non-elided copies (even moves) inflate Wasm binaries far more than native ones, where these features add some bloat but not a significant amount. Just doing something seemingly innocuous like std::make_shared in an initializer and passing the shared_ptr by value can add around 1kb of wasm binary size.
I wish godbolt actually supported wasm (their clang wasm doesn't ship with a libc++) - would be very useful for more directly comparing output
Zig tends to produce the fastest and smallest WebAssembly modules out of the box. MoonBit and AssemblyScript also produce small binaries, but performance is meh.
Note that for Go, "standard" Go generates much larger binaries than TinyGo due mostly to the more complicated runtime.
```shell
~/go/src/github.com/dgryski/hw $ tinygo build -target=wasip1 -o hw.tinygo main.go
~/go/src/github.com/dgryski/hw $ GOOS=wasip1 GOARCH=wasm go build -o hw.biggo main.go
~/go/src/github.com/dgryski/hw $ ls -l hw.*
-rwxr-xr-x  1 dgryski  staff  1581105 11 Mar 13:00 hw.biggo*
-rw-r--r--  1 dgryski  staff   105383 11 Mar 12:59 hw.tinygo
```
Last updated: Apr 08 2025 at 23:03 UTC