I just gave a talk today at the WebAssembly Workshop (WAW 2025) at POPL on correctness efforts in Wasmtime+Cranelift (and a tiny bit about how weval can also be seen as a correctness effort). The recording is here if anyone is interested in watching; I'll post slides later as well.
(That's a timestamped link at 40m53s into the whole day workshop recording, which has a bunch of other interesting talks I'd recommend watching too -- full program here)
Great talk and very interesting topic!
One thing that confused me a bit was here: YouTube - - YouTube
You spoke about how important is linear time verifiability because one would not want a Wasm module to load in quadratic time or worse. However, from what I know (and I might very well be wrong) is that Wasm validation (part of loading) already is quadratic since multi-value
Wasm was enabled.
E.g. checking the type of stack manipulating control flow and calls is non-linear with multi-value
Wasm. At least in all implementations I know it is quadratic in the worst-case.
@Robin Freyler good question! I actually was not aware of quadratic complexity of multi-value validation, though I can see how that could be the case with a particular implementation (linear-time equality on signature rather than constant interned signature ID). I believe wasmparser
should be able to do the check in constant time (so linear-time validation overall) because it has an interned ID for the block type (here) rather than an explicit list of params/results. @Alex Crichton could confirm for sure that the implementation is as I suspect...
I'll note in the talk though that I was coming from the Wasm standards angle, where the spec has definitely been careful to ensure linear-time validation is possible, and that's a topic that comes up from time to time in discussion of extensions. Cranelift overall is definitely not a linear-time compiler but with a validator it's a desirable property not to add new superlinear blowups, which was my main point :-)
Oh I wasn't aware of any quadratic behavior in wasmparser's validator myself, but I could also very well have missed something
A slightly more nuanced additional thought as well here after reading some code: a block with a signature, when branched to, will involve pushing/popping types on the type stack (e.g. here for block
); but when branching, a valid module will also have needed to produce those values somehow, and the block consume them; those instructions overall are linear, so the module size also must scale up. In other words, the validation cost is still linear overall in code size
(seen another way: consider any per-parameter work when handling block signature validation to be amortized/accounted against the producer and consumer of that particular value instead)
so I think we're good, and I'd be very surprised if the Wasm spec itself missed a superlinear validation issue like this (@Robin Freyler please do let us know if we're missing something!)
I will try to come up with a Wasm file that I think could be problematic:
(module
(func
(param
i32 i32 i32 i32 i32
i32 i32 i32 i32 i32
)
(result
i32 i32 i32 i32 i32
i32 i32 i32 i32 i32
)
(local.get 0)
(local.get 1)
(local.get 2)
(local.get 3)
(local.get 4)
(local.get 5)
(local.get 6)
(local.get 7)
(local.get 8)
(local.get 9)
(block ;; block is not even necessary but makes it simpler to follow maybe
(param
i32 i32 i32 i32 i32
i32 i32 i32 i32 i32
)
(result
i32 i32 i32 i32 i32
i32 i32 i32 i32 i32
)
;; Some conditional returns with multiple inputs/outputs
;; A Wasm validation implementation might be quadratic here,
;; because encoding a new conditional branch will have constant
;; size but validation overhead in terms of the number of inputs
;; and outputs defined once above. The validation stack has to be
;; manipulated for each conditional branch in the general case for
;; each input and output to check if all inputs and outputs match
;; their expected types.
;;
;; This is a super simplified example that could easily be mutated
;; in order to make it hard to use heuristics to avoid such quadratic
;; cases.
(br_if 0 (local.get 0))
(br_if 0 (local.get 0))
(br_if 0 (local.get 0))
;; etc..
(br_if 0 (local.get 0))
(br_if 0 (local.get 0))
)
)
)
(Not checked if this is syntactically valid Wasm tbh.)
Fascinating, yeah, I had forgotten that br_if
effectively duplicates its args (leaves them on the stack for the not-taken path). I might recommend detaching this thread / putting it under a separate topic since it is a general Wasm validation performance question/issue rather than specific to my talk.
It'd be interesting to see how much blowup this can actually cause (aside from the bad theoretical asymptotic bound) -- I tried expanding your example to 50 params and 1M lines of br_if
, and wasm-tools validate
still validates the wat in 0.41s on my (M1 Air) laptop
Wanted to say as well great talk Chris! Just listened to it and it's a great explanation of the space!
thanks! now I just need to acquire free time somehow to tie up all the loose ends and actually ship PCC...
great talk!
Last updated: Feb 28 2025 at 03:10 UTC