wasmtime / Issue #972 Naive fibonacci benchmark over 2x s... · git-wasmtime

While this is a pretty awful benchmark "in the large", I was surprised, playing around locally, by how slow the naive fibonacci program was relative to native performance. Because fibonacci benchmarks barely touch linear memory, this may at least be a decent benchmark of Cranelift and/or the code generator in use.

Given an input file like:

```rust
fn main() {
    std::process::exit(run() as i32);
}

#[no_mangle]
pub extern "C" fn run() -> u32 {
    fib(43)
}

fn fib(n: u32) -> u32 {
    if n <= 1 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}
```

Native execution looks like this (for me at least):

```
$ rustc -O fib.rs
$ time ./fib
./fib  0.90s user 0.00s system 99% cpu 0.902 total
```

Whereas the wasm execution looks like:

```
$ rustc -O fib.rs --target wasm32-wasi --crate-type cdylib
$ time wasmtime --disable-cache -O --invoke run fib.wasm
warning: using `--invoke` with a function that returns values is experimental and may break in the future
701408733
wasmtime --disable-cache -O --invoke run fib.wasm  2.20s user 0.02s system 104% cpu 2.123 total
```

Here the wasm is over 2x slower than native (2.123s vs. 0.902s wall-clock, roughly 2.35x), which was a bit surprising to me!
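The `time` figures above cover the whole process, which for wasmtime with `--disable-cache` also includes compiling the module. To isolate the cost of `fib` itself, one option is to time the call in-process. The following is only a minimal sketch, not something from the original report: `time_fib` is a hypothetical helper, it assumes `std::time::Instant` and `std::hint::black_box` are available on the target, and it uses a smaller input than the issue's `fib(43)` so it stays quick in unoptimized builds.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Same naive definition as in the issue: fib(0) = fib(1) = 1.
fn fib(n: u32) -> u32 {
    if n <= 1 {
        1
    } else {
        fib(n - 1) + fib(n - 2)
    }
}

// Hypothetical helper: times a single call to `fib(n)`.
// `black_box` discourages the optimizer from constant-folding the call away.
fn time_fib(n: u32) -> (u32, Duration) {
    let start = Instant::now();
    let result = black_box(fib(black_box(n)));
    (result, start.elapsed())
}

fn main() {
    // The issue uses n = 43; 35 is an arbitrary smaller value for a quick run.
    let n = 35;
    let (result, elapsed) = time_fib(n);
    println!("fib({n}) = {result} in {elapsed:?}");
}
```

Built once natively and once for the wasm target, this would compare only the time spent inside `fib`, at the cost of going through whatever clock the host exposes.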

Some other reference information is below.

<details>
<summary>Native assembly for the fib function</summary>

```
0000000000004d40 <_ZN3fib3fib17h2bacf53cb3845acfE>:
    4d40:   55                      push   %rbp
    4d41:   53                      push   %rbx
    4d42:   50                      push   %rax
    4d43:   bd 01 00 00 00          mov    $0x1,%ebp
    4d48:   83 ff 02                cmp    $0x2,%edi
    4d4b:   72 25                   jb     4d72 <_ZN3fib3fib17h2bacf53cb3845acfE+0x32>
    4d4d:   89 fb                   mov    %edi,%ebx
    4d4f:   bd 01 00 00 00          mov    $0x1,%ebp
    4d54:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    4d5b:   00 00 00
    4d5e:   66 90                   xchg   %ax,%ax
    4d60:   8d 7b ff                lea    -0x1(%rbx),%edi
    4d63:   e8 d8 ff ff ff          callq  4d40 <_ZN3fib3fib17h2bacf53cb3845acfE>
    4d68:   83 c3 fe                add    $0xfffffffe,%ebx
    4d6b:   01 c5                   add    %eax,%ebp
    4d6d:   83 fb 01                cmp    $0x1,%ebx
    4d70:   77 ee                   ja     4d60 <_ZN3fib3fib17h2bacf53cb3845acfE+0x20>
    4d72:   89 e8                   mov    %ebp,%eax
    4d74:   48 83 c4 08             add    $0x8,%rsp
    4d78:   5b                      pop    %rbx
    4d79:   5d                      pop    %rbp
    4d7a:   c3                      retq
    4d7b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
```

</details>

<details>
<summary>WebAssembly of the fib function</summary>

```wat
  (func $_ZN3fib3fib17hedcc9d2af68c6e00E (type 1) (param i32) (result i32)
    (local i32)
    i32.const 1
    local.set 1
    block  ;; label = @1
      local.get 0
      i32.const 2
      i32.lt_u
      br_if 0 (;@1;)
      i32.const 1
      local.set 1
      loop  ;; label = @2
        local.get 0
        i32.const -1
        i32.add
        call $_ZN3fib3fib17hedcc9d2af68c6e00E
        local.get 1
        i32.add
        local.set 1
        local.get 0
        i32.const -2
        i32.add
        local.tee 0
        i32.const 1
        i32.gt_u
        br_if 0 (;@2;)
      end
    end
    local.get 1)
```

</details>
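Both listings above have the same shape: the compiler kept the `fib(n - 1)` call as a real recursive call but turned the `fib(n - 2)` call into a loop that accumulates into a local. As a reading aid, here is a rough Rust rendering of that shape. The names `fib_naive` and `fib_as_compiled` are made up for this sketch, and the guard-plus-loop in the wasm is written as a plain `while`, which behaves the same way.

```rust
// The source-level definition from the issue.
fn fib_naive(n: u32) -> u32 {
    if n <= 1 {
        1
    } else {
        fib_naive(n - 1) + fib_naive(n - 2)
    }
}

// Hypothetical hand-written equivalent of the compiled code:
// one recursive call per iteration, the other folded into the loop.
fn fib_as_compiled(mut n: u32) -> u32 {
    let mut acc: u32 = 1; // local 1 in the wasm, %ebp in the native code
    while n >= 2 {
        // call fib(n - 1) and add it to the accumulator (i32.add wraps)
        acc = acc.wrapping_add(fib_as_compiled(n - 1));
        // continue with n - 2 instead of making a second call
        n -= 2;
    }
    acc
}

fn main() {
    for n in 0..30 {
        assert_eq!(fib_naive(n), fib_as_compiled(n));
    }
    println!("both shapes agree for n in 0..30");
}
```

Since the input wasm already has this form, any remaining gap to native presumably comes from how the call, the loop, and the surrounding prologue/epilogue are compiled rather than from the shape of the recursion.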

Unfortunately, I don't currently know of a great way to get at the assembly generated by Cranelift, but I'm hoping others may know of an easy way to do so!

Wasmtime GitHub notifications bot (Mar 25 2020 at 13:57):
