Stream: general

Topic: can wasm be as fast as native?


Diego Antonio Rosario Palomino (Sep 29 2023 at 11:12):

A game project I am interested in (Unvanquished) is going to switch to wasm for its scripting code, which it runs in its own process. I am wondering whether safety checks (function pointer and bounds checks, among others) could be turned off in that situation to improve performance.

Alex Crichton (Sep 29 2023 at 13:31):

There's some discussion in a sibling thread about this too, but in general the answer is "no", wasm safety checks can't be turned off because then that wouldn't be wasm but something else entirely.

That being said if you have test cases which perform significantly worse than native (e.g. 50%+) please feel free to open an issue and we can take a look. For example just the other day we fixed a 4x slowdown for a small loop which turned out to be a low-level register issue.

This commit is a result of the investigation on #7085. The int-to-float conversion instructions used right now on the x64 backend will implicitly source the upper bits of the result from a differen...

Ondřej Čertík (Sep 29 2023 at 14:05):

@Alex Crichton there are sandboxing checks that you have to make in order to guarantee safety, but if you want to give up safety for performance, a lot can be done. There are still some fundamental limitations with the WASM instruction set, but compiling let's say via LLVM (I think WasmEdge is doing that) with no runtime checks and all optimizations enabled can get quite far I think.

Alex Crichton (Sep 29 2023 at 14:06):

Certainly yes wasm has some checks which can be omitted, but at least in my opinion it's not wasm at that point, it's something else entirely.

Ondřej Čertík (Sep 29 2023 at 14:07):

It is on this point that I want some clarity. If it starts with WASM, and it runs in wasmtime, and then it runs in this "something else entirely" and produces the same answer, why is it not WASM anymore?

Ondřej Čertík (Sep 29 2023 at 14:08):

In other words, why can't WASM be defined as "a given .wasm file runs in wasmtime and doesn't fail"? As long as this is satisfied, we are free to create a "fast runtime" that gives up safety for performance. You can always run the same wasm file in wasmtime and get all the safety guarantees.

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:10):

There are many situations where wasm would run in a dedicated process. If the OS is of high quality and hardware bugs are absent or mitigated, why not disable wasm security checks then?

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:11):

That is what I am saying, unless disabling wasm security checks can change the semantics of the code or something like that.

Alex Crichton (Sep 29 2023 at 14:12):

In my mind the difference is the possible inputs. To be the same as wasm you have to behave exactly the same as the wasm under all possible inputs, not just one. For example, if one input causes wasm to report a call_indirect signature mismatch but an "unchecked" build went and crashed a different way, that's how they're different.

If call_indirect never crashes for a wasm build, then an "unchecked" build was applying an optimization that the wasm compiler wasn't smart enough to do but could have done, but that's a different realm of optimization as opposed to simply turning off checks.
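A minimal Rust sketch of the check being discussed, under stated assumptions: `FuncType`, `TableEntry`, `Trap`, and this `call_indirect` function are hypothetical names for a toy model, not wasmtime's API, and all values are simplified to `i64`. It shows the three conditions a conforming engine has to test before an indirect call: the index is inside the table, the slot is initialized, and the callee's signature equals the expected one.

```rust
/// Toy value types (hypothetical, much smaller than real wasm).
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum ValType {
    I32,
    I64,
}

/// Toy function signature: parameter and result types.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FuncType {
    pub params: Vec<ValType>,
    pub results: Vec<ValType>,
}

/// The ways an indirect call can trap in this model.
#[derive(Debug, PartialEq, Eq)]
pub enum Trap {
    UndefinedElement,
    UninitializedElement,
    IndirectCallTypeMismatch,
}

/// A table slot: the function plus the signature it was defined with.
pub struct TableEntry {
    pub ty: FuncType,
    pub func: fn(&[i64]) -> i64,
}

/// Checked indirect call: every failure becomes a deterministic trap
/// instead of a jump through a bad pointer or a call with the wrong ABI.
pub fn call_indirect(
    table: &[Option<TableEntry>],
    index: u32,
    expected: &FuncType,
    args: &[i64],
) -> Result<i64, Trap> {
    // 1. The index must be inside the table.
    let slot = table.get(index as usize).ok_or(Trap::UndefinedElement)?;
    // 2. The slot must hold a function.
    let entry = slot.as_ref().ok_or(Trap::UninitializedElement)?;
    // 3. The callee's signature must match what the call site expects.
    if entry.ty != *expected {
        return Err(Trap::IndirectCallTypeMismatch);
    }
    Ok((entry.func)(args))
}
```

An "unchecked" build would skip steps 1 to 3 and call whatever the slot contains, so the two builds agree only on inputs where none of the three conditions ever fails.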

Alex Crichton (Sep 29 2023 at 14:13):

Disabling wasm checks can indeed change semantics, for example it could cause crashes to happen at different times which means that any amount of other code could run in the meantime. For example host imports could be called after what would have otherwise been a crash

Alex Crichton (Sep 29 2023 at 14:13):

an overly simple example would be:

if array[huge_out_of_bounds_index] == 0 {
   tell_the_host_my_secret_key(&key);
}

Alex Crichton (Sep 29 2023 at 14:14):

wasm guarantees that tell_the_host_my_secret_key will never run, whereas an "unchecked build" would probably run that
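The guarantee can be sketched in Rust as a toy model. Everything here is an assumption for illustration: `load_u8`, `guest`, and the `host_calls` log are hypothetical names, linear memory is a plain byte slice, and a trap is modelled as `Err`. The point is only that the checked load fails before the branch is evaluated, so the host call is unreachable on that input.

```rust
/// Models a wasm out-of-bounds memory trap.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfBounds;

/// Checked 1-byte load from a toy linear memory.
pub fn load_u8(memory: &[u8], addr: u64) -> Result<u8, OutOfBounds> {
    if addr < memory.len() as u64 {
        Ok(memory[addr as usize])
    } else {
        Err(OutOfBounds)
    }
}

/// Guest code shaped like the snippet above. Host calls are recorded in
/// `host_calls` so a caller can observe whether the secret was sent.
pub fn guest(
    memory: &[u8],
    index: u64,
    host_calls: &mut Vec<&'static str>,
) -> Result<(), OutOfBounds> {
    // The `?` returns early on a trap, so nothing after it runs.
    if load_u8(memory, index)? == 0 {
        host_calls.push("tell_the_host_my_secret_key");
    }
    Ok(())
}
```

With a 16-byte memory and `index = u64::MAX`, `guest` returns `Err(OutOfBounds)` and `host_calls` stays empty. An unchecked build would instead read whatever byte lies at that address, and the host call would run whenever that byte happens to be zero.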

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:15):

What if we use a safe language like, let's say, Haskell or something even safer? Would the semantics still change in practice?

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:16):

Haskell, Idris 2, etc. already manage memory safely and check function pointer usage, so wouldn't wasm checks be redundant then?

Alex Crichton (Sep 29 2023 at 14:18):

IMO no, that doesn't change the calculus. Languages have bugs, and although they're probably rare for big languages like that, they still do happen. Wasm's checks are an extra layer of protection against them.

Alex Crichton (Sep 29 2023 at 14:18):

For example if you're comfortable trusting Haskell why not run Haskell natively?

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:19):

Portable executables and being able to target the web with the same setup

Alex Crichton (Sep 29 2023 at 14:21):

True! You won't ever be able to turn off the wasm checks on the web, however, and isn't part of the portability story the same behavior on-and-off the web?

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:26):

Ideally, safe languages like Haskell, Idris 2, Rust, etc. could ask the WebAssembly runtime to disable wasm checks, telling the runtime that it's safe code so the semantics won't change.

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:28):

Even runtimes with this feature could refuse to turn off checks, so the code promises it's equivalent whether or not they are present.

Diego Antonio Rosario Palomino (Sep 29 2023 at 14:30):

I think this is in the same spirit as enabling multithreading. Code with race conditions will behave differently with threading than without.

Ondřej Čertík (Sep 29 2023 at 15:24):

I have a lot of experience with precisely this thing:

if array[huge_out_of_bounds_index] == 0 {
   tell_the_host_my_secret_key(&key);
}

and in Fortran for example (and LFortran), the workflow is that you first compile in Debug mode with bounds checking on. If your code runs for a given input without any runtime errors, then you can turn the bounds checking off and run fast. Then if in production things misbehave due to out of bounds (quite rare actually in my experience for computational physics codes), you recompile and re-run in Debug mode.
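A rough Rust analogue of that workflow, as a sketch only: `dot_checked` and `dot_unchecked` are hypothetical functions that stand in for a Debug build and a ReleaseFast build of the same kernel. It is not LFortran's implementation. The contract is that whenever the checked version returns `Ok`, the unchecked version returns the same value for the same input. For inputs where the checked version reports an error, the unchecked version has undefined behavior and promises nothing.

```rust
/// "Debug mode": every index is checked, and a violation is reported.
pub fn dot_checked(a: &[f64], b: &[f64], n: usize) -> Result<f64, String> {
    let mut sum = 0.0;
    for i in 0..n {
        let x = a
            .get(i)
            .ok_or_else(|| format!("index {} out of bounds for a", i))?;
        let y = b
            .get(i)
            .ok_or_else(|| format!("index {} out of bounds for b", i))?;
        sum += x * y;
    }
    Ok(sum)
}

/// "ReleaseFast mode": no checks at all.
///
/// # Safety
/// The caller must guarantee `n <= a.len()` and `n <= b.len()`, for
/// example by having run `dot_checked` on the same input first.
pub unsafe fn dot_unchecked(a: &[f64], b: &[f64], n: usize) -> f64 {
    let mut sum = 0.0;
    for i in 0..n {
        // SAFETY: in bounds by the caller's guarantee above.
        sum += unsafe { *a.get_unchecked(i) * *b.get_unchecked(i) };
    }
    sum
}
```

The two functions perform the same floating-point operations in the same order, so they agree exactly on any input the checked version accepts.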

Here is a similar example. LPython has the following guarantee / contract: if your Python code compiles in LPython and runs in Debug mode without any runtime errors (such as bounds checking), then we guarantee that exactly the same code on exactly the same input also runs in CPython and produces the same answer (as well as when recompiled with LPython in ReleaseFast mode). Would it mean that LPython is not Python? Well, it's not exactly the same as CPython, but given the contract, it definitely guarantees some subset of Python.

Coming back to WASM. Take our existing fast WASM->x64 (x86_64) backend. We guarantee the following (assuming no bugs in our compiler, which currently is not satisfied): if a given LFortran code compiles and runs in Debug mode via our WASM->x64 backend and produces the right answer without any runtime errors, then the same a.wasm should run in wasmtime and produce the same answer (as well as running in the browser!), as well as in our WASM->x64 backend in ReleaseFast mode, for the same inputs. In my mind we are supporting a subset of WASM, in the well-defined sense of this paragraph.

Lann Martin (Sep 29 2023 at 15:36):

I think the point is that your "WASM->x64 backend" isn't a wasm backend, it's a "subset of wasm produced by certain compilers" backend. It could not (presumably) compile any arbitrary wasm input into a binary that follows the wasm execution spec. That is fine in an integrated toolchain, but you wouldn't want to use it for potentially-malicious input as many wasm runtimes expect.

Ondřej Čertík (Sep 29 2023 at 15:39):

@Lann Martin yes, you definitely can't use it for malicious input. You can't use our ReleaseFast backend (even via LLVM) with malicious input. Only Debug and ReleaseSafe. Yes, I think you are right: it's not a WASM backend that can accept a malicious wasm file or malicious input. It's a "subset of WASM produced by certain compilers".

Lann Martin (Sep 29 2023 at 15:42):

So to the other points above, it could be fine to carve out some subset of Wasm to target as an LLVM-like IR, just please don't call it "web assembly" :smile:

Ondřej Čertík (Sep 29 2023 at 15:58):

@Lann Martin if you can help find some good name for this that would be great. I have a feeling that when you (by "you" I mean the WASM community) say "web assembly" you mean not just the ".wasm" format, but also all the guarantees and ability to accept and run malicious code. Correct?

Lann Martin (Sep 29 2023 at 16:00):

I would mostly defer to others here on something like that, but something like "Unsafe Wasm" (.uwasm) might be ok :shrug:

Ondřej Čertík (Sep 29 2023 at 16:07):

"Unsafe WASM" is fine with me. It would be nice to have a term for this, to streamline all discussions related to it.

Lann Martin (Sep 29 2023 at 16:11):

and just to clarify, I strongly doubt that wasmtime would support this in the foreseeable future

Ondřej Čertík (Sep 29 2023 at 16:16):

We can continue maintaining our own backend for "Unsafe WASM", I am not asking that wasmtime does it.

Chris Fallin (Sep 29 2023 at 16:33):

One thing that I think is missing from the above discussion: the "bounds checks" are actually built into the semantics of Wasm. Even a subset of Wasm -- say, Wasm modules that would not have trapped in the original execution -- require the following:

So what I'm trying to say is: there is no actual way to turn off bounds checks, because the result wouldn't be Wasm. I don't mean that in a philosophical "it wouldn't have the guarantees we like to talk about" way, I mean that in a "it wouldn't run your program correctly because it wouldn't implement the same instruction set" way. Does that make sense?

Ondřej Čertík (Sep 29 2023 at 18:34):

So what I'm trying to say is: there is no actual way to turn off bounds checks, because the result wouldn't be Wasm. I don't mean that in a philosophical "it wouldn't have the guarantees we like to talk about" way, I mean that in a "it wouldn't run your program correctly because it wouldn't implement the same instruction set" way. Does that make sense?

Yes, the "implicit" bounds checks required by the WASM instructions themselves cannot be turned off (you gave two examples, the "hidden heap base" and br_table, and there are probably more), and even "Fast/Unsafe WASM" must implement those. Consequently, these might impose some (slight) limitations on the maximum performance for "ReleaseFast" mode (fastest runtime, no checks), and so WASM might not be the best fit there. On the other hand, for ReleaseSafe (fast runtime and checks) this might be a perfect fit, using wasmtime for example. And for Debug mode (fast compilation, checks, slower runtime), having our custom fast-to-compile WASM backend might be a good fit.
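The br_table case can be sketched in a few lines of Rust. The helper name `br_table_target` is hypothetical, and branch targets are modelled as plain label numbers. What it relies on is the instruction's defined behavior: an operand inside the label vector selects that label, and any other operand selects the default label. The range test is therefore part of what the instruction computes, not a safety check layered on top.

```rust
/// Picks the branch target for a br_table-style instruction.
/// `targets` is the label vector, `default` the fallback label.
pub fn br_table_target(targets: &[u32], default: u32, index: u32) -> u32 {
    // An out-of-range index is not an error: it selects `default`.
    targets.get(index as usize).copied().unwrap_or(default)
}
```

A correct program can depend on the fallback, for example a `switch` whose `default` arm is reached through an out-of-range index. A jump table emitted without the range test would send such a program to the wrong place.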

Diego Antonio Rosario Palomino (Aug 04 2024 at 13:45):

Maybe instead of supporting a subset of wasm (programs that will never go out of bounds or otherwise have memory errors), extra memory instructions could be added whose checking cost completely goes away when safety checks are not necessary. I imagine such instructions would be slower to implement otherwise, but they could still be useful.


Last updated: Oct 23 2024 at 20:03 UTC