Stream: wasmtime

Topic: Metering options


spino17 (Dec 29 2024 at 06:57):

I wanted to know about the various metering options in Wasmtime and which one is the most performant. My aim is to get the number of instructions executed (at runtime, not the number of instructions in the WASM binary) when I do call_async. I see there is something called fuel, but I've read that it might have performance concerns.

Victor Adossi (Dec 30 2024 at 08:07):

Fuel was initially the only option, but epoch-based interruption should be more performant for most use cases, especially those centered around preventing DoS/infinite-looping components.

I think that if you want to get the exact number of instructions executed you may have to use fuel, but the mapping isn't quite so easy -- set_fuel docs note that *most* instructions consume one unit, but there are some that consume 0, so not sure how pedantic you want to be in that area.
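A minimal sketch of the accounting described above: give the run a fuel budget, execute, and treat "budget minus what is left" as the work metric. The `FuelMeter` and `Op` types are hypothetical stand-ins written for illustration, not Wasmtime's implementation, and the choice of which operations are free is an assumption made for the example. In recent Wasmtime versions the corresponding calls are `Config::consume_fuel`, `Store::set_fuel`, and `Store::get_fuel`.

```rust
/// Hypothetical stand-in for a store's fuel tank (not Wasmtime's type).
#[derive(Debug)]
struct FuelMeter {
    initial: u64,
    remaining: u64,
}

/// Toy instruction set: most operations cost 1 unit, a few cost 0.
#[derive(Clone, Copy)]
enum Op {
    Add,
    Load,
    Nop,  // modeled as free (assumption for this sketch)
    Drop, // modeled as free (assumption for this sketch)
}

impl Op {
    fn cost(self) -> u64 {
        match self {
            Op::Add | Op::Load => 1,
            Op::Nop | Op::Drop => 0,
        }
    }
}

impl FuelMeter {
    fn new(budget: u64) -> Self {
        FuelMeter { initial: budget, remaining: budget }
    }

    /// Charge for one op; Err means the budget ran out (a trap in a real engine).
    fn charge(&mut self, op: Op) -> Result<(), &'static str> {
        self.remaining = self.remaining.checked_sub(op.cost()).ok_or("out of fuel")?;
        Ok(())
    }

    /// Fuel consumed so far: the number to report or bill on.
    fn consumed(&self) -> u64 {
        self.initial - self.remaining
    }
}

fn main() {
    let mut meter = FuelMeter::new(1_000);
    let program = [Op::Load, Op::Nop, Op::Add, Op::Drop, Op::Add];
    for op in program {
        meter.charge(op).expect("budget is large enough here");
    }
    println!("executed {} ops, consumed {} fuel", program.len(), meter.consumed());
}
```

The consumed figure is lower than the op count whenever free operations run, which is the "not quite one-to-one" caveat mentioned above.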

spino17 (Dec 30 2024 at 08:12):

thanks @Victor Adossi for the reply. I may not need the exact number of instructions executed; a measure that is linearly proportional to it would be fine, so for example I might be okay with zero-cost instructions when using fuel. Is there a way to get that with epoch-based interruption?

Victor Adossi (Dec 30 2024 at 08:14):

So epochs are time based -- they don't really work at the instruction level, so it's completely divorced. Depends on your use case of course, but if you can identify some period of time that makes sense to segment as epochs, then you should be able to achieve a similar result... I think epoch-based limitation also fits quite well with a lot of scheduling theory/how various schedulers (below WASM in "the stack") work.

spino17 (Dec 30 2024 at 08:17):

yeah that's a good suggestion. I was interested in this naive instruction count because it gives a concrete number to attach pricing to (for the product). For epochs, I think I could count how many epochs passed while executing the wasm function and have pricing per epoch.

spino17 (Dec 30 2024 at 08:19):

what if I don't want to interrupt but just get the count of epochs at the end of the execution? Something like the fuel API in this example: https://github.com/bytecodealliance/wasmtime/blob/main/examples/fuel.rs

Victor Adossi (Dec 30 2024 at 08:19):

Ah so if the idea is tied to pricing I think you could definitely do better with a time-based restriction -- this is how most compute platforms will charge right, the idea of "compute seconds" or something similar

spino17 (Dec 30 2024 at 08:20):

yeah, this makes sense. It's more performant so that's definitely a pro

Victor Adossi (Dec 30 2024 at 08:20):

If you'd like to maintain metrics around epochs used/timing, then I think you'd have to add that yourself on top of the epoch-based approach

spino17 (Dec 30 2024 at 08:21):

ohh okay, let me take a look at it. Much thanks @Victor Adossi for the help!

Victor Adossi (Dec 30 2024 at 08:22):

Ah, what's interesting is you could actually use a combination of fuel and epochs --

the epoch_deadline_callback you pass in gets a StoreContextMut which actually can get access to the amount of fuel remaining

Victor Adossi (Dec 30 2024 at 08:22):

You should have a lot of latitude for making this stuff performant with those tools and the ability to change stuff like the yield intervals and deadlines dynamically (as long as that callback stays fast of course!)

Victor Adossi (Dec 30 2024 at 08:23):

Good luck -- please feel free to contribute examples/issues/documentation if you come across anything confusing!

spino17 (Dec 30 2024 at 08:25):

Victor Adossi said:

Ah, what's interesting is you could actually use a combination of fuel and epochs --

the epoch_deadline_callback you pass in gets a StoreContextMut which actually can get access to the amount of fuel remaining

That's a really great suggestion! So epochs, which are performant, would be used for interruption, and the fuel can be read in the callback.

spino17 (Dec 30 2024 at 08:25):

Victor Adossi said:

Good luck -- please feel free to contribute examples/issues/documentation if you come across anything confusing!

yeah absolutely

Chris Fallin (Dec 30 2024 at 17:11):

I think this discussion has misunderstood some of the performance characteristics of the options: in particular, if you are getting fuel remaining in an epoch-interrupt handler, that implies fuel instrumentation is compiled in, at which point also using epochs is only added overhead

Chris Fallin (Dec 30 2024 at 17:12):

Also, for what it's worth, epochs are not time-based, necessarily: they are counter-based. The usual mode of use is to have some external thing (a thread on the side, or some other part of your runtime) increment the epoch counter regularly. But I would strongly caution against using its notion of "time" for anything related to real money (billing): it's not precise at all

Chris Fallin (Dec 30 2024 at 17:12):

Or not necessarily tied to a real system clock
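A small sketch of the "counter-based" point, under stated assumptions: `Epoch`, `run_until_deadline`, and `measure` are hypothetical names for a std-only model, not Wasmtime's API (there, the counter is bumped with `Engine::increment_epoch` and the deadline is set with `Store::set_epoch_deadline`). A side thread bumps a shared counter, and the "guest" loop checks that counter on every iteration until a deadline is reached.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical model of an engine-wide epoch counter (not Wasmtime's type).
#[derive(Default)]
struct Epoch(AtomicU64);

impl Epoch {
    fn increment(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn current(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Model of guest code: spins until the epoch reaches `deadline`, checking
/// once per loop iteration. Returns how many iterations it managed to run.
fn run_until_deadline(epoch: &Epoch, deadline: u64) -> u64 {
    let mut iterations = 0u64;
    while epoch.current() < deadline {
        iterations += 1; // stand-in for real work
    }
    iterations
}

/// Runs the guest for `epochs` ticks of roughly `tick` each; reports work done.
fn measure(epochs: u64, tick: Duration) -> u64 {
    let epoch = Arc::new(Epoch::default());
    let stop = Arc::new(AtomicBool::new(false));

    // The "external thing": a side thread that bumps the counter regularly.
    let ticker = {
        let (epoch, stop) = (Arc::clone(&epoch), Arc::clone(&stop));
        thread::spawn(move || {
            while !stop.load(Ordering::Relaxed) {
                thread::sleep(tick);
                epoch.increment();
            }
        })
    };

    let done = run_until_deadline(&epoch, epochs);
    stop.store(true, Ordering::Relaxed);
    ticker.join().unwrap();
    done
}

fn main() {
    // Same number of epochs each run, but the work completed will differ.
    for run in 0..3 {
        let iterations = measure(5, Duration::from_millis(2));
        println!("run {}: {} iterations in 5 epochs", run, iterations);
    }
}
```

The iteration count per run depends on thread scheduling and machine load, so "epochs elapsed" bounds how long the guest ran but says little about how much work it did.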

Notification Bot (Dec 30 2024 at 17:12):

Chris Fallin has marked this topic as unresolved.

Chris Fallin (Dec 30 2024 at 17:13):

There is also the aspect of "noisy neighbor" interference: billing based on wallclock time gives one an unpredictable amount of work for given cost because we're timeslicing on the async executor threads. If you want deterministic and explainable billing, fuel alone is the only way to go

Chris Fallin (Dec 30 2024 at 17:13):

(at least, for billing based on "work done", rather than some other cost model, e.g. per-request up to a limit or ...)
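To make "deterministic and explainable" concrete, here is a hedged sketch of a charge computed purely from fuel consumed. Everything in it is an assumption for illustration: the `Pricing` type, the micro-cent unit, and the rate are made up, and the budget and remaining values stand in for numbers a host would read from its store before and after a call.

```rust
/// Hypothetical price list: all values here are made up for illustration.
struct Pricing {
    /// Price in micro-cents per 1_000_000 fuel units.
    microcents_per_mega_fuel: u64,
}

impl Pricing {
    /// Charge for `consumed` fuel units, rounded up to the next micro-cent.
    /// Uses u128 so that even u64::MAX fuel cannot overflow the multiply.
    fn charge(&self, consumed: u64) -> u128 {
        let total = consumed as u128 * self.microcents_per_mega_fuel as u128;
        (total + 999_999) / 1_000_000
    }
}

fn main() {
    let pricing = Pricing { microcents_per_mega_fuel: 250 };
    // Stand-ins for the fuel set before the call and the fuel left after it.
    let (budget, remaining) = (10_000_000u64, 7_500_000u64);
    let consumed = budget - remaining;
    println!(
        "consumed {} fuel, charge {} micro-cents",
        consumed,
        pricing.charge(consumed)
    );
}
```

Because the input is a count of executed work rather than elapsed time, the same guest call with the same inputs yields the same charge regardless of how busy the host is.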

Victor Adossi (Dec 31 2024 at 03:21):

Chris Fallin said:

I think this discussion has misunderstood some of the performance characteristics of the options: in particular, if you are getting fuel remaining in an epoch-interrupt handler, that implies fuel instrumentation is compiled in, at which point also using epochs is only added overhead

Maybe this wasn't clear -- my interpretation of the request was that while fuel is the only precise way to count the actual instructions used, setting the instruction count low and actually stopping frequently was probably not desired.

If I understood @spino17's desire right, they wanted an accurate count of the number of instructions executed, and I responded that fuel was the way to do that (with some caveats on instruction counting) -- I don't think there was a misunderstanding there. Then the question was asked about how this might interplay with epoch-based interruption, and we discussed that.

The final idea (which I probably didn't explain enough) about using them both in the context of billing was that it might make more sense to enable fuel but not actually try to trap frequently (i.e. set the fuel really high) but only check the used fuel periodically on an epoch determined time schedule.

Maybe I'm wrong here and simply having the fuel instrumentation will be the majority of the negative performance impact, but the idea was that you have the fuel counting happening but do not have to do the fine-grained management, with some ability to stop computations from running an unreasonable amount of time and blocking the executor.

Chris Fallin said:

Also, for what it's worth, epochs are not time-based, necessarily: they are counter-based. The usual mode of use is to have some external thing (a thread on the side, or some other part of your runtime) increment the epoch counter regularly. But I would strongly caution against using its notion of "time" for anything related to real money (billing): it's not precise at all

I think we might be splitting hairs here -- I still consider a virtual/logical clock that is incremented in line with a physical one to be time-based (never mind that you still have access to the actual system clock during the callback), but that's neither here nor there.

While we're here though, I'd love to hear how you would actually solve this, @Chris Fallin? Would you go for purely fuel as a solution?

Chris Fallin (Dec 31 2024 at 04:44):

Victor Adossi said:

The final idea (which I probably didn't explain enough) about using them both in the context of billing was that it might make more sense to enable fuel but not actually try to trap frequently (i.e. set the fuel really high) but only check the used fuel periodically on an epoch determined time schedule.

Maybe I'm wrong here and simply having the fuel instrumentation will be the majority of the negative performance impact, but the idea was that you have the fuel counting happening but do not have to do the fine-grained management, with some ability to stop computations from running an unreasonable amount of time and blocking the executor.

Right, yes, I understood the proposal; my main point is that this will be slower. I studied the performance of fuel-counting mechanisms extensively when I came up with and implemented epochs; the overhead exists whether you use fuel to actually interrupt, or just to count execution steps. Additionally adding the epoch checks -- which are another set of instructions on every backedge -- will only slow down execution further.

So: we want to count execution precisely; so we need to flip the "fuel" switch on. Even if we set the fuel reserves to u64::MAX and never take a fuel interrupt, we're still paying the cost. Now we need a way to periodically interrupt and context-switch back to the caller. We can use fuel -- which we've already paid the instrumentation cost of -- by setting some fuel reserve less than u64::MAX -- or we can turn on epochs, which will add another set of instructions to every backedge. Both have roughly the same interruption cost (a call back into the runtime), but epochs will add yet another slowdown factor.
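A sketch of the fuel-only arrangement described above, in which a single counter both measures work and hands control back to the host periodically. The `Guest` and `Step` types and the slice size are hypothetical and model the idea only. To my knowledge, Wasmtime's async support exposes this kind of periodic yield through `Store::fuel_async_yield_interval`, but check the documentation for your version.

```rust
/// Hypothetical model of a guest that only has a fuel counter (no epoch check).
struct Guest {
    fuel: u64,      // fuel remaining
    work_left: u64, // loop iterations still to run
}

/// Why control came back to the host.
enum Step {
    Yielded,   // burned one slice of fuel; host may reschedule
    Finished,  // guest ran to completion
    OutOfFuel, // budget exhausted (a trap in a real engine)
}

impl Guest {
    /// Run until `slice` units of fuel are burned, the work ends, or fuel runs out.
    fn resume(&mut self, slice: u64) -> Step {
        let mut burned = 0;
        while self.work_left > 0 {
            if self.fuel == 0 {
                return Step::OutOfFuel;
            }
            if burned == slice {
                return Step::Yielded;
            }
            // One unit of fuel per iteration of guest work.
            self.fuel -= 1;
            self.work_left -= 1;
            burned += 1;
        }
        Step::Finished
    }
}

fn main() {
    let budget = u64::MAX; // effectively "never trap"; we only want the count
    let mut guest = Guest { fuel: budget, work_left: 2_500 };
    let mut yields = 0;
    loop {
        match guest.resume(1_000) {
            Step::Yielded => yields += 1, // host regains control here
            Step::Finished => break,
            Step::OutOfFuel => panic!("budget was u64::MAX"),
        }
    }
    println!("yields: {}, fuel consumed: {}", yields, budget - guest.fuel);
}
```

The host gets both things it wanted from one counter: regular points at which it regains control, and an exact consumed total at the end.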

Fuel checks consist of two parts: load-decrement-store on "used fuel", and a compare-and-branch to see if that crosses zero. The latter is very cheap (almost always correctly predicted); almost all the cost is in the load-and-store traffic. In particular the latencies (three or four cycles each for ld and st) and the store-to-load forwarding are painful on modern CPUs, and hard for the instruction scheduler to predict. We need to store back to memory at least across every callsite, because we don't have a custom calling convention that pins the value in a register (doing so would be a perf tradeoff to evaluate; unclear if always a win). All that to say: the cost is in the counting, not the checking.
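The two parts of a fuel check can be written out by hand as a rough illustration. This is a source-level model with hypothetical names (`VmContext`, `sum_instrumented`, `OutOfFuel`), not the code Wasmtime generates, and an optimizing Rust compiler is free to keep the counter in a register here, which is exactly what the generated code cannot do across calls.

```rust
use std::cell::Cell;

/// Error returned when the counter crosses zero (a trap in a real engine).
#[derive(Debug, PartialEq)]
struct OutOfFuel;

/// Hypothetical stand-in for per-store runtime state that compiled code
/// reaches through a pointer: the fuel counter lives in memory.
struct VmContext {
    fuel: Cell<i64>,
}

/// A loop with a hand-written fuel check on every iteration.
fn sum_instrumented(ctx: &VmContext, data: &[u64]) -> Result<u64, OutOfFuel> {
    let mut acc = 0u64;
    for &x in data {
        // Counting: load the counter, decrement it, store it back.
        let fuel = ctx.fuel.get() - 1;
        ctx.fuel.set(fuel);
        // Checking: one compare-and-branch, almost always not taken.
        if fuel < 0 {
            return Err(OutOfFuel);
        }
        acc = acc.wrapping_add(x);
    }
    Ok(acc)
}

fn main() {
    let ctx = VmContext { fuel: Cell::new(1_000) };
    let data = [1u64, 2, 3, 4];
    let total = sum_instrumented(&ctx, &data).expect("enough fuel");
    println!("sum = {}, fuel left = {}", total, ctx.fuel.get());
}
```

Setting the initial fuel very high removes the taken branch but leaves the load, decrement, and store on every iteration, which matches the point that the cost is in the counting.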

While we're here though, I'd love to hear how you would actually solve this, @Chris Fallin? Would you go for purely fuel as a solution?

Yes, definitely, for all the reasons above. Epochs were designed to be used only when one doesn't need the deterministic counting of fuel.

Victor Adossi (Dec 31 2024 at 04:52):

Chris Fallin said:

Fuel checks consist of two parts: load-decrement-store on "used fuel", and a compare-and-branch to see if that crosses zero. The latter is very cheap (almost always correctly predicted); almost all the cost is in the load-and-store traffic. In particular the latencies (three or four cycles each for ld and st) and the store-to-load forwarding are painful on modern CPUs, and hard for the instruction scheduler to predict. We need to store back to memory at least across every callsite, because we don't have a custom calling convention that pins the value in a register (doing so would be a perf tradeoff to evaluate; unclear if always a win). All that to say: the cost is in the counting, not the checking.

Thanks for the thorough explanation here -- this really solved misconceptions I had about where the cost was.

spino17 (Dec 31 2024 at 16:40):

Thanks @Chris Fallin for the complete picture. This makes much more sense now, and I should move forward with the fuel approach only. Thanks again @Chris Fallin and @Victor Adossi for the help and the detailed discussion above.


Last updated: Jan 24 2025 at 00:11 UTC