Stream: git-wasmtime

Topic: wasmtime / PR #1660 Add benchmarks


Wasmtime GitHub notifications bot (May 05 2020 at 15:33):

Vurich opened PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift w.r.t. compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e. Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:35):

Vurich edited PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift w.r.t. compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e.

I'd love any criticism of the methodology used by these benchmarks and the likely reliability of the results. I tried to dig into the code to make sure that what I thought was being run was actually being run (and likewise, that what I didn't think would be run would definitely not be run), but I'd love it if people could poke holes in the way that these benchmarks have been written.

Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:35):

Vurich edited PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift in compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e.

I'd love any criticism of the methodology used by these benchmarks w.r.t. the likely reliability of the results. I tried to dig into the code to make sure that what I thought was being run was actually being run (and likewise, that what I didn't think would be run would definitely not be run), but I'd love it if people could poke holes in the way that these benchmarks have been written.

Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:36):

Vurich edited PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift in compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e.

I'd love any criticism of the methodology used by these benchmarks w.r.t. the likely reliability of the results. I tried to dig into the code to make sure that what I thought was being run was actually being run (and what I didn't think would be run was actually not being run), but I'd love it if people could poke holes in the way that these benchmarks have been written.

Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:36):

Vurich edited PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift in compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e.

I'd love any criticism of the methodology used by these benchmarks w.r.t. the likely reliability of the results. I tried to dig into the code to make sure that, in the loop which is actually measured, what I thought was being run was actually being run (and what I didn't think would be run was actually not being run), but I'd love it if people could poke holes in the way that these benchmarks have been written.

Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:36):

Vurich edited PR #1660 from master to master:

So I wrote these benchmarks with the aim of getting a rough idea of how Lightbeam compares to Cranelift in compile-time and runtime performance. You can see the results in this gist: https://gist.github.com/Vurich/8696e67180aa3c93b4548fb1f298c29e.

I'd love any criticism of the methodology used by these benchmarks w.r.t. the likely reliability of the results. I tried to dig into the code to make sure that, in the loop which is being measured, what I thought was being run was actually being run (and what I didn't think would be run was actually not being run), but I'd love it if people could poke holes in the way that these benchmarks have been written.

Unfortunately, this PR changes the API of wasmtime-wast to make public a couple of functions (and one type returned by one of those functions) that the benchmarks need. I considered making these functions public only when cfg(test) is active, or inlining the benchmarks into that file itself instead of making a separate benches binary, but I figured it would be more helpful to ask here what people consider the best way to go about that. I personally didn't see anything majorly problematic with just making these methods public, and it was the easiest to implement, so that's what I chose for the original version of this PR, with the understanding that I can change it later if necessary.

Wasmtime GitHub notifications bot (May 05 2020 at 15:36):

Vurich requested sunfishcode for a review on PR #1660.

Wasmtime GitHub notifications bot (May 05 2020 at 15:56):

bjorn3 submitted PR Review.

Wasmtime GitHub notifications bot (May 05 2020 at 15:56):

bjorn3 created PR Review Comment:

Will this disable the verifier for fuzzing?

Wasmtime GitHub notifications bot (May 06 2020 at 07:21):

Vurich submitted PR Review.

Wasmtime GitHub notifications bot (May 06 2020 at 07:21):

Vurich created PR Review Comment:

Great point, although I'm not sure if it actually will disable these checks in fuzzing - this file appears to only be for the spec test suite, and fuzzing should really have debug_assertions turned on anyway. Either way, I can easily revert this change, as I made it before I split the benchmarks into their own file - before, the tests and benchmarks were identical except that the benchmarks had b.iter(|| {}) around wast_context.run_file(wast). It's now not really necessary.
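The relationship described above, where a benchmark is the test body wrapped in `b.iter(|| { ... })`, might look roughly like the sketch below. Everything here is an assumption for illustration: `Bencher` is a tiny stable stand-in for the nightly-only `test::Bencher`, and `run_file` is a hypothetical placeholder for `wast_context.run_file(wast)`, not the real wasmtime-wast API.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the nightly-only `test::Bencher`.
struct Bencher {
    iterations: u32,
    elapsed: Duration,
}

impl Bencher {
    fn new(iterations: u32) -> Self {
        Bencher { iterations, elapsed: Duration::ZERO }
    }

    /// Runs `body` `iterations` times and records the total wall-clock time.
    fn iter<T>(&mut self, mut body: impl FnMut() -> T) {
        let start = Instant::now();
        for _ in 0..self.iterations {
            // `black_box` keeps the optimizer from discarding the result.
            black_box(body());
        }
        self.elapsed = start.elapsed();
    }
}

/// Hypothetical placeholder for `wast_context.run_file(wast)`:
/// it only counts the non-empty lines of the script.
fn run_file(script: &str) -> usize {
    script.lines().filter(|l| !l.trim().is_empty()).count()
}

/// Test shape: run the script once and return the outcome for checking.
fn spec_test(script: &str) -> usize {
    run_file(script)
}

/// Benchmark shape: the same call, wrapped in `b.iter`.
fn spec_bench(b: &mut Bencher, script: &str) {
    b.iter(|| run_file(script));
}

fn main() {
    let script = "(module)\n(assert_return (invoke \"f\"))\n";
    assert_eq!(spec_test(script), 2);

    let mut b = Bencher::new(1_000);
    spec_bench(&mut b, script);
    println!("{} iterations took {:?}", b.iterations, b.elapsed);
}
```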

Wasmtime GitHub notifications bot (May 06 2020 at 07:21):

Vurich updated PR #1660 from master to master.

Wasmtime GitHub notifications bot (May 06 2020 at 07:33):

Vurich updated PR #1660 from master to master.

Wasmtime GitHub notifications bot (May 07 2020 at 15:30):

alexcrichton submitted PR Review.

Wasmtime GitHub notifications bot (May 07 2020 at 15:30):

alexcrichton created PR Review Comment:

Could this file be renamed to perhaps spec.rs or something that indicates that it's benchmarking the spec test suite?

FWIW I don't think the spec test suite is really the most interesting of benchmarks, but at least for our own internal tracking and performance monitoring it probably isn't so bad to track!

Wasmtime GitHub notifications bot (May 07 2020 at 15:30):

alexcrichton created PR Review Comment:

Could this be deduplicated with the wasmtime-wast crate? Perhaps something where a function to handle each directive is provided and then we say "run this file", or something like that?
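One possible shape for the "provide a function to handle each directive, then run this file" idea is sketched below. It is only an illustration under loud assumptions: `Directive` and `run_script` are hypothetical names, and the line-based "parser" is a toy stand-in for real `.wast` parsing, so none of this reflects the actual wasmtime-wast API.

```rust
/// Hypothetical, heavily simplified directive type; real `.wast` scripts
/// have many more kinds of directives.
#[derive(Debug, Clone, PartialEq)]
enum Directive {
    Module(String),
    AssertReturn(String),
}

/// Hypothetical shared driver: walks the script once and hands every
/// directive to the caller-supplied handler. Returns how many directives
/// were handled, or the first error the handler reports.
fn run_script<E>(
    script: &str,
    mut on_directive: impl FnMut(&Directive) -> Result<(), E>,
) -> Result<usize, E> {
    let mut handled = 0;
    for line in script.lines().map(str::trim).filter(|l| !l.is_empty()) {
        // Toy parsing: "module <name>" is a module, anything else an assertion.
        let directive = match line.strip_prefix("module ") {
            Some(name) => Directive::Module(name.to_string()),
            None => Directive::AssertReturn(line.to_string()),
        };
        on_directive(&directive)?;
        handled += 1;
    }
    Ok(handled)
}

fn main() {
    let script = "module m\nassert_return f\n";

    // Test-style handler: record every directive so it can be checked.
    let mut seen = Vec::new();
    let handled = run_script(script, |d| -> Result<(), String> {
        seen.push(d.clone());
        Ok(())
    })
    .unwrap();
    assert_eq!(handled, 2);
    assert_eq!(seen[0], Directive::Module("m".to_string()));

    // Benchmark-style handler: only count modules and skip assertions.
    let mut modules = 0;
    run_script(script, |d| -> Result<(), String> {
        if let Directive::Module(_) = d {
            modules += 1;
        }
        Ok(())
    })
    .unwrap();
    println!("handled {handled} directives, {modules} module(s)");
}
```

With a driver like this, the test suite and the benchmarks would differ only in the closure they pass, which is the deduplication the comment asks about.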

Wasmtime GitHub notifications bot (May 07 2020 at 15:30):

alexcrichton created PR Review Comment:

Would it be possible to use a hand-rolled harness = false scheme or something like criterion that works on stable?
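A hand-rolled scheme of the kind suggested here could be as small as the sketch below. It assumes the bench target is declared with `harness = false` in Cargo.toml, so that `cargo bench` simply runs this file's `main` on stable Rust. The `bench` helper and the `workload` function are hypothetical placeholders (the workload stands in for compiling and running a spec test file), and the timing is far cruder than what criterion reports.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times `body` over `iters` runs after one warm-up run and returns the
/// mean wall-clock time per run.
fn bench<T>(iters: u32, mut body: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "need at least one iteration");
    black_box(body()); // warm-up run, not measured
    let start = Instant::now();
    for _ in 0..iters {
        // `black_box` keeps the optimizer from discarding the result.
        black_box(body());
    }
    start.elapsed() / iters
}

/// Hypothetical workload standing in for one spec test file.
fn workload(n: u64) -> u64 {
    (1..=n).fold(0, |acc, x| acc ^ x.wrapping_mul(x))
}

fn main() {
    // With `harness = false`, this `main` is the whole benchmark binary.
    for (name, n) in [("small", 1_000u64), ("large", 100_000)] {
        let mean = bench(100, || workload(black_box(n)));
        println!("{name:<8} {mean:?} per iteration");
    }
}
```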

Wasmtime GitHub notifications bot (May 07 2020 at 15:30):

alexcrichton created PR Review Comment:

I think this is duplicated with the testing code as well as with the function below. Could this all be consolidated into one function, perhaps in the wasmtime-wast crate itself?

Wasmtime GitHub notifications bot (Jun 25 2020 at 18:49):

alexcrichton edited PR #1660 from master to main.

Wasmtime GitHub notifications bot (Aug 19 2020 at 07:29):

Vurich updated PR #1660 from master to main.

Wasmtime GitHub notifications bot (Aug 19 2020 at 09:16):

Vurich updated PR #1660 from master to main.

Wasmtime GitHub notifications bot (Aug 21 2020 at 15:17):

Vurich updated PR #1660 from master to main.

Wasmtime GitHub notifications bot (Sep 09 2020 at 14:46):

Vurich updated PR #1660 from master to main.

Wasmtime GitHub notifications bot (Sep 16 2020 at 10:38):

Vurich updated PR #1660 from master to main.

Wasmtime GitHub notifications bot (Sep 27 2021 at 17:30):

alexcrichton closed without merge PR #1660.


Last updated: Jan 24 2025 at 00:11 UTC