Stream: git-wasmtime

Topic: wasmtime / issue #7550 Trap on excessive storage use


Wasmtime GitHub notifications bot (Nov 16 2023 at 16:39):

rahulksnv opened issue #7550:

Our Rust code (the Substrate executor in this case, which embeds wasmtime) currently calls wasmtime::TypedFunc::call() to execute some WASM code. The WASM code calls back (via the host functions/ABI) to get()/put() storage on the Substrate side (these APIs always succeed right now). What we want to achieve is this: when storage use during execution exceeds a threshold, stop execution. This is just like the out-of-gas scenario, except for a different reason.

  1. One approach could be for the put() APIs to return failure, with the calling WASM code checking the return value and exiting with an error. The downside is that we would have to change all the call sites, which is not easy, or not feasible in some cases (where we don't have the source).
  2. A cleaner approach would be for wasmtime to trap on return from the host function, with the changes confined to a few places.

It is not clear how (2) could be implemented, assuming we maintain a fork of wasmtime for now. Some mechanisms might be:

  1. Have the called host function signal wasmtime (via a flag, set by a callback from host function -> wasmtime) that it should stop executing.
  2. Associate a cost with the called host function, which can be queried by wasmtime on return from host function

Maybe there are other ways to add this support; please let me know.

Thanks
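
The threshold check described in this post can be sketched on the host side as below. This is only an illustration under stated assumptions: `MeteredStorage`, `StorageLimitExceeded`, and the byte-counting rule (key length plus value length, charged on every write) are hypothetical names and choices, not Substrate's or wasmtime's API.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error reported once the storage budget is exhausted.
#[derive(Debug, PartialEq, Eq)]
pub struct StorageLimitExceeded {
    pub used: usize,
    pub limit: usize,
}

impl fmt::Display for StorageLimitExceeded {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "storage limit exceeded: {} > {}", self.used, self.limit)
    }
}

impl std::error::Error for StorageLimitExceeded {}

/// Hypothetical host-side store that meters the bytes written during one execution.
pub struct MeteredStorage {
    data: HashMap<Vec<u8>, Vec<u8>>,
    used: usize,
    limit: usize,
}

impl MeteredStorage {
    pub fn new(limit: usize) -> Self {
        Self { data: HashMap::new(), used: 0, limit }
    }

    /// Reads never fail in this sketch; only writes are metered.
    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        self.data.get(key).map(Vec::as_slice)
    }

    /// Charges key + value bytes for every write (overwrites included) and
    /// refuses the write once the running total would exceed the limit.
    pub fn put(&mut self, key: &[u8], value: &[u8]) -> Result<(), StorageLimitExceeded> {
        let used = self.used + key.len() + value.len();
        if used > self.limit {
            return Err(StorageLimitExceeded { used, limit: self.limit });
        }
        self.used = used;
        self.data.insert(key.to_vec(), value.to_vec());
        Ok(())
    }
}
```

The `Err` value here is what the host function would need to surface to the runtime; how that error reaches the embedder is the subject of the replies that follow.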

Wasmtime GitHub notifications bot (Nov 16 2023 at 16:47):

alexcrichton commented on issue #7550:

It should be the case that all host function definitions are allowed to return a Result<T> where Err becomes a trap. Would that work for your use case? I'm not sure whether you've already tried that and found it didn't work, though.

Wasmtime GitHub notifications bot (Nov 16 2023 at 17:02):

rahulksnv commented on issue #7550:

Thanks, I guess that would be approach 1 mentioned above? Maybe I am missing something: if the host function returns Result<T>, would this automatically result in a trap? Or does the caller need to check for the error and exit?

The latter would be a problem, unfortunately, as it may not be possible/feasible to change all call sites.

Wasmtime GitHub notifications bot (Nov 16 2023 at 17:08):

alexcrichton commented on issue #7550:

No, I think it would be the (2) you mentioned above: if a host function returns Result<T>, then all Err payloads are interpreted as traps and are made available to the original call() invocation. WebAssembly doesn't get to run once the host returns a trap.
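
The behavior described here can be modeled with a toy interpreter, as a rough sketch rather than wasmtime's actual machinery. `GuestOp`, `Trap`, and this `call` function are hypothetical stand-ins: the guest never inspects the host function's result, yet an `Err` from the host ends execution and is handed to the caller of `call`.

```rust
/// One step of a toy "guest" program (hypothetical, for illustration only).
pub enum GuestOp {
    /// Call the host `put` function with a payload of this many bytes.
    Put(usize),
    /// Pure guest-side work that never touches the host.
    Compute,
}

/// Stand-in for a trap reported to the embedder.
#[derive(Debug, PartialEq, Eq)]
pub struct Trap(pub String);

/// Runs the guest program, returning the number of steps executed.
/// `host_put` plays the role of a host function returning a `Result`.
pub fn call<F>(program: &[GuestOp], mut host_put: F) -> Result<usize, Trap>
where
    F: FnMut(usize) -> Result<(), String>,
{
    let mut executed = 0;
    for op in program {
        match op {
            // The guest has no error check of its own: an `Err` from the
            // host becomes a trap and no further guest step runs.
            GuestOp::Put(bytes) => host_put(*bytes).map_err(Trap)?,
            GuestOp::Compute => {}
        }
        executed += 1;
    }
    Ok(executed)
}
```

In this model no call site in the guest program changes; only the host function decides when to stop, which is the property approach (2) asks for.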

Wasmtime GitHub notifications bot (Nov 16 2023 at 17:33):

rahulksnv commented on issue #7550:

Great, I am going to try the changes for (2) and will update here, thanks.

Wasmtime GitHub notifications bot (Dec 02 2023 at 21:05):

rahulksnv commented on issue #7550:

I was able to get it working. This needed a fix on the Substrate side (the issue: the Result<> from a host function gets ABI-encoded into a Result<u64, String>, which is always Ok() and thus does not result in a trap).
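
One way to picture the pitfall mentioned above is the following sketch. It is a guess at the general shape of the problem, not Substrate's actual code: `ENCODED_FAILURE`, `wrap_swallowing`, and `wrap_propagating` are hypothetical names, and the `u64` sentinel encoding is invented for illustration.

```rust
/// Sentinel used by this sketch to encode "the host call failed" as plain data.
pub const ENCODED_FAILURE: u64 = u64::MAX;

/// Problematic shape: the inner error is folded into the encoded value, so
/// the outer result is always `Ok` and the runtime never sees an error to
/// turn into a trap.
pub fn wrap_swallowing(inner: Result<u32, String>) -> Result<u64, String> {
    let encoded = match inner {
        Ok(value) => u64::from(value),
        Err(_) => ENCODED_FAILURE,
    };
    Ok(encoded)
}

/// Fixed shape: the inner error is passed through as the outer `Err`, which
/// is what lets the runtime raise a trap.
pub fn wrap_propagating(inner: Result<u32, String>) -> Result<u64, String> {
    inner.map(u64::from)
}
```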

But I have since discovered that this alone is not sufficient for our use case. We would need some way to catch exceptions. An example call stack when the trap happens (frame 0 causes the trap):
```
0: 0x171846 - <unknown>!sp_io::storage::extern_host_function_impls::get_with_limit_check::h068fc620f18bf7e4
1: 0xfcedd - <unknown>!frame_support::storage::unhashed::get::hb4c6c9ed6b10008c
2: 0x11570f - <unknown>!evm_runtime::eval::system::sload::he1390a565d392c58
3: 0xf0f8e - <unknown>!evm_runtime::eval::eval::h903150f53b82662b
4: 0x11c30c - <unknown>!evm::executor::stack::executor::StackExecutor<S,P>::execute_with_call_stack::h281d1dec751deb4a
5: 0x11aea0 - <unknown>!evm::executor::stack::executor::StackExecutor<S,P>::transact_call::h101f1e2ecb66bb30
6: 0x9f55c - <unknown>!<pallet_evm::runner::stack::Runner<T> as pallet_evm::runner::Runner<T>>::call::h75d468b40e2d2cf5
7: 0x87697 - <unknown>!pallet_ethereum::<impl pallet_ethereum::pallet::Pallet<T>>::apply_validated_transaction::hbddf3dce60009c61
8: 0xe58cd - <unknown>!frame_support::storage::transactional::with_transaction::hc324dea241f9d919
9: 0xf9f0c - <unknown>!environmental::local_key::LocalKey<T>::with::hbe7945445f325907
10: 0x106a87 - <unknown>!<evm_domain_runtime::RuntimeCall as frame_support::traits::dispatch::UnfilteredDispatchable>::dispatch_bypass_filter::hef04c77190b8d67e
11: 0x10633f - <unknown>!<evm_domain_runtime::RuntimeCall as fp_self_contained::SelfContainedCall>::apply_self_contained::h037bf4d32120f73a
12: 0xea80e - <unknown>!<fp_self_contained::checked_extrinsic::CheckedExtrinsic<AccountId,Call,Extra,SelfContainedSignedInfo> as sp_runtime::traits::Applyable>::apply::heeb43ab468d6407d
13: 0x9bb0f - <unknown>!domain_pallet_executive::Executive<>::apply_extrinsic::h4b45100c1f9650db <-- Catch
15: 0xa5bfe - <unknown>!BlockBuilder_apply_extrinsic
```

Currently, when the trap is raised as a result of the host function failing [1], it unwinds all the way and exits the runtime. This leaves some operations incomplete. Instead, what we need is to get back to frame 13 and handle some cleanup before exiting.

Question: does wasm allow executing a call under a `try { .. } catch { .. }` kind of construct? Or is there any equivalent mechanism that could be used? Thanks

[1] https://github.com/bytecodealliance/wasmtime/blob/v8.0.1/crates/wasmtime/src/func.rs#L1955

Wasmtime GitHub notifications bot (Dec 02 2023 at 21:27):

bjorn3 commented on issue #7550:

A trap is fatal. Once a trap happens it is no longer safe to enter the wasm module. What you may be looking for are wasm exceptions, which are not yet stable and currently unimplemented in wasmtime.

Wasmtime GitHub notifications bot (Dec 02 2023 at 22:50):

rahulksnv commented on issue #7550:

Yes, throw/catch exceptions would be what I am looking for, but unfortunately they look to be unavailable.

As a workaround, is there some way to start a nested/sandboxed runtime from within wasm code? (The sandbox would run the part that could possibly trap.) There may be performance implications with nested execution, but I wanted to check.

Wasmtime GitHub notifications bot (Dec 03 2023 at 18:47):

alexcrichton commented on issue #7550:

Wasm can technically call the host to call back into wasm which could catch a trap, but in general I would not recommend catching traps and resuming wasm. That's ripe for things like memory leaks because no destructors are ever run so nothing on the stack would get cleaned up.

I would otherwise agree with @bjorn3 that your use case seems tailor-made for wasm exceptions.
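
The leak concern can be illustrated with a toy model, offered as a sketch only. `Op`, `Instance`, and `live_allocations` are hypothetical: the counter stands in for guest-side state (such as heap allocations in linear memory) whose cleanup code sits after the point where the trap occurs.

```rust
/// Steps of a toy guest function (hypothetical, for illustration only).
pub enum Op {
    /// Guest acquires a resource, e.g. a heap allocation.
    Alloc,
    /// Guest calls a host function that writes this many bytes.
    HostPut(usize),
    /// Guest releases the resource; this is the "destructor".
    Free,
}

/// Guest state that survives between calls into the same instance.
#[derive(Default)]
pub struct Instance {
    pub live_allocations: usize,
}

impl Instance {
    /// Runs a guest function. If the host refuses a write, execution stops
    /// on the spot and any later `Free` step never runs.
    pub fn run(&mut self, program: &[Op], budget: &mut usize) -> Result<(), String> {
        for op in program {
            match op {
                Op::Alloc => self.live_allocations += 1,
                Op::Free => self.live_allocations = self.live_allocations.saturating_sub(1),
                Op::HostPut(bytes) => {
                    if *bytes > *budget {
                        return Err("storage limit exceeded".to_string());
                    }
                    *budget -= *bytes;
                }
            }
        }
        Ok(())
    }
}
```

If the embedder catches the error from a first `run` and then calls `run` on the same `Instance` again, the allocation made before the trap is still counted as live, because the matching `Free` was skipped.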

Wasmtime GitHub notifications bot (Dec 03 2023 at 18:58):

bjorn3 commented on issue #7550:

> That's ripe for things like memory leaks because no destructors are ever run so nothing on the stack would get cleaned up.

And UB for the wasm guest! LLVM assumes that traps will cause execution to stop and optimizes accordingly.

Wasmtime GitHub notifications bot (Dec 03 2023 at 19:10):

rahulksnv commented on issue #7550:

Thanks for the inputs. Any idea on this:

> As a workaround, is there some way to start a nested/sandboxed runtime from within wasm code? (The sandbox would run the part that could possibly trap.) There may be performance implications with nested execution, but I wanted to check.

Wasmtime GitHub notifications bot (Dec 03 2023 at 22:43):

rahulksnv commented on issue #7550:

I was also wondering whether the Wasm instance/engine can be signaled to stop in some scenarios (e.g. a host function returns an error code indicating that the limit was reached), somewhere around [1]. That sounds like a cleaner solution than having to deal with traps/exceptions.

Found this related thread: https://github.com/bytecodealliance/wasmtime/issues/860. From the thread I couldn't tell whether this is already supported (also, Substrate currently uses wasmtime version 8.0.1).

[1] https://github.com/bytecodealliance/wasmtime/blob/v8.0.1/crates/wasmtime/src/func.rs#L1955

Wasmtime GitHub notifications bot (Dec 04 2023 at 15:30):

alexcrichton commented on issue #7550:

My first paragraph above is addressing your idea. It's theoretically possible, but not recommended.

I'm not sure what you mean in your latest comment, though. That trap-raising function is what you were already experimenting with above, where you said it's not suitable, yet your latest comment suggests it may be suitable? Host functions can halt the execution of wasm at any time by returning an error, which is translated to a WebAssembly trap.

Wasmtime GitHub notifications bot (Dec 04 2023 at 15:45):

rahulksnv commented on issue #7550:

I was thinking of a 4th option upon return from the host function: stop the engine (right now we have Ok/Err/Panic). On further thought, this also won't work even if feasible, as the stack won't be unwound.

So yeah, it looks like exceptions are the only way to go, unfortunately.

Wasmtime GitHub notifications bot (Dec 04 2023 at 15:47):

rahulksnv commented on issue #7550:

Just to close the loop on this, please let me know about this question from above as well:

> As a workaround, is there some way to start a nested/sandboxed runtime from within wasm code? (The sandbox would run the part that could possibly trap.) There may be performance implications with nested execution, but I wanted to check.

Wasmtime GitHub notifications bot (Apr 08 2024 at 16:45):

rahulksnv closed issue #7550.


Last updated: Nov 22 2024 at 16:03 UTC