Stream: general

Topic: `br_if` vs `if` performance


Piotr Sarnacki (Nov 18 2024 at 12:57):

I've been looking through some WASM code generated by compiling a Rust program for the wasm32-wasip1 target, and I've noticed that it contains a lot of blocks that are used as ifs. As far as I understand, instead of an if like:

(if (i32.eqz (local.get $foo))
  (then
    (call $bar)))

Some of the generated code used br_if:

(block $b1
  ;; if $foo is *not* equal to zero, we skip to the end of the block,
  ;; otherwise we handle the if case
  (br_if $b1 (i32.ne (i32.const 0) (local.get $foo)))
  (call $bar)
)

I've created a simple benchmark that does the following for ifs:

```wat
(if (i32.eqz (i32.const 0)) (then
(if (i32.eqz (i32.const 0)) (then
  ;; 248 more ifs here
))))
```

and the following benchmark for blocks with `br_if`:

```wat
(block $foo1
(block $foo2
;; 248 more blocks
(br_if $foo2 (i32.ne (i32.const 1) (i32.const 0))))
(br_if $foo1 (i32.ne (i32.const 1) (i32.const 0))))
```

Then I looped both pieces of code and measured the runtime. When executing the files in Wasmtime, I consistently see the block version being faster than the if version: the full runtime is ~300ms for blocks vs ~540ms for ifs. I've also checked on Node.js and it's ~1s vs ~1.5s, so it seems like it's not only a detail of the Wasmtime implementation.

Is this expected? Does anyone more familiar with the VM implementation know why this would be the case?
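For readers who want to reproduce the two shapes described above, here is a minimal sketch of a generator script. It assumes Python as the scripting language; the function names, the `_start` export, the `$again` loop label, the `$i` counter and the iteration count are all illustrative assumptions, not the harness actually used for the measurements.

```python
def nested_ifs(depth: int) -> str:
    """Return `depth` nested `if`s, each with an always-true condition."""
    opener = "(if (i32.eqz (i32.const 0)) (then\n"
    # Each opener leaves two parentheses open: one for `if`, one for `then`.
    return opener * depth + "))" * depth


def nested_blocks(depth: int) -> str:
    """Return `depth` nested blocks, each exited by an always-taken `br_if`."""
    opens = "".join(f"(block $foo{i}\n" for i in range(1, depth + 1))
    # The innermost block closes first, so labels run from `depth` down to 1.
    # The final `)` on each line closes the corresponding block.
    exits = "".join(
        f"(br_if $foo{i} (i32.ne (i32.const 1) (i32.const 0))))\n"
        for i in range(depth, 0, -1)
    )
    return opens + exits


def module(body: str, iterations: int) -> str:
    """Wrap `body` in a counted loop inside an exported function (assumed harness)."""
    return f"""(module
  (func (export "_start")
    (local $i i32)
    (loop $again
{body}
      (local.set $i (i32.add (local.get $i) (i32.const 1)))
      (br_if $again (i32.lt_u (local.get $i) (i32.const {iterations}))))))
"""
```

Writing `module(nested_ifs(250), N)` and `module(nested_blocks(250), N)` to two `.wat` files, for some iteration count `N`, should give two modules that differ only in the control-flow construct under test.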

Chris Fallin (Nov 18 2024 at 19:19):

Definitely not expected. One thing I notice is that the conditions aren't the same -- though both should be constant-folded away. Could you dump the CLIF and/or machine code from this? (For the latter, wasmtime compile then objdump)

Piotr Sarnacki (Nov 18 2024 at 21:08):

Yeah, in order to go through all of the br_if statements, the condition has to be reversed.

I added the source and binaries here: https://gist.github.com/drogus/019496b172bd33c8b936cbbd168254c2


Chris Fallin (Nov 18 2024 at 21:30):

oh sorry, I meant an objdump of the .cwasm file, showing native assembly

Piotr Sarnacki (Nov 18 2024 at 21:54):

How do I do that? I tried wasm-objdump, but it works only with .wasm files. Wasmtime doesn't seem to have objdump as a subcommand and when I try wasm-tools objdump it says:

error: failed to parse `blocks.cwasm`: input bytes aren't valid utf-8

Joel Dice (Nov 18 2024 at 21:56):

I think he means the objdump from e.g. GNU Binutils or similar, i.e. one meant for native binaries, not Wasm.
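The two steps mentioned in the thread (`wasmtime compile`, then a native `objdump`) could be scripted roughly as follows. This is only a sketch: it assumes `wasmtime` and a binutils-style `objdump` are on `PATH`, and that `-d` is enough to disassemble the code in the `.cwasm` file; the helper names and file paths are made up for illustration.

```python
import subprocess


def disassembly_commands(wasm_path: str, cwasm_path: str) -> list[list[str]]:
    """Build the two command lines: precompile, then disassemble natively."""
    return [
        # Precompile the module to a native artifact (.cwasm).
        ["wasmtime", "compile", wasm_path, "-o", cwasm_path],
        # Disassemble it with a native objdump (assumed: GNU Binutils or similar).
        ["objdump", "-d", cwasm_path],
    ]


def disassemble(wasm_path: str, cwasm_path: str) -> str:
    """Run both steps and return objdump's output as text."""
    compile_cmd, objdump_cmd = disassembly_commands(wasm_path, cwasm_path)
    subprocess.run(compile_cmd, check=True)
    result = subprocess.run(objdump_cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, `print(disassemble("blocks.wasm", "blocks.cwasm"))` would print the native assembly for the blocks benchmark, provided both tools are installed.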

Piotr Sarnacki (Nov 18 2024 at 22:11):

Oh, ok, I thought it needed to be a WASM-specific tool :sweat_smile: I updated the gist

Chris Fallin (Nov 18 2024 at 22:22):

woah, fascinating -- in the first (blocks) case all the branches are elided, while in the second case they're still there. We don't do any branch-folding in the mid-end, so probably something is different in the Wasm translation itself

Chris Fallin (Nov 18 2024 at 22:23):

I don't have a ton of time to dig into this right now but if you could file an issue so we can track this, that'd be very useful!


Last updated: Nov 22 2024 at 16:03 UTC