Stream: git-wasmtime

Topic: wasmtime / issue #4153 Call `memory_grow_failed` hook for...


view this post on Zulip Wasmtime GitHub notifications bot (May 13 2022 at 08:05):

jlkiri opened issue #4153:

Thanks for filing a feature request! Please fill out the TODOs below.

Feature

Call memory_grow_failed hook for custom ResourceLimiters.

Benefit

Currently, although a ResourceLimiter can prevent further memory growth, it is not notified of an error. The error returned from call methods on Linker for example is about reaching unreachable code(???) in this case.

Implementation

Turn this: https://github.com/bytecodealliance/wasmtime/blob/3dbdcfa220df1406b3627772194eff859104e17a/crates/runtime/src/memory.rs#L568-L571

into this:

        // Store limiter gets first chance to reject memory_growing.
        if !store.memory_growing(old_byte_size, new_byte_size, maximum)? {
            store.memory_grow_failed(&format_err!("Memory maximum size exceeded"));
            return Ok(None);
        }

Alternatives

view this post on Zulip Wasmtime GitHub notifications bot (May 13 2022 at 08:06):

jlkiri edited issue #4153:

Feature

Call memory_grow_failed hook for custom ResourceLimiters.

Benefit

Currently, although a ResourceLimiter can prevent further memory growth, it is not notified of an error. The error returned from call methods on Linker for example is about reaching unreachable code(???) in this case.

Implementation

Turn this: https://github.com/bytecodealliance/wasmtime/blob/3dbdcfa220df1406b3627772194eff859104e17a/crates/runtime/src/memory.rs#L568-L571

into this:

        // Store limiter gets first chance to reject memory_growing.
        if !store.memory_growing(old_byte_size, new_byte_size, maximum)? {
            store.memory_grow_failed(&format_err!("Memory maximum size exceeded"));
            return Ok(None);
        }

Alternatives

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2022 at 12:35):

alexcrichton commented on issue #4153:

The intention here is that something in ResourceLimiter is notified about memory growth and failures. Otherwise though I'm not sure it makes sense to do this since if memory_growing returns false we otherwise don't have an error to yield (as you've done in your example a new error is created) and additionally false can only originate from the memory_growing implementation itself so presumably the "grow failed" hook can be applied internally when memory_growing returns false?

view this post on Zulip Wasmtime GitHub notifications bot (May 17 2022 at 13:10):

jlkiri commented on issue #4153:

so presumably the "grow failed" hook can be applied internally when memory_growing returns false

that's what I temporarily did, but it seems like it is called in other cases as well, like when I timeout the instance using the deadline API, so I never really know 100% that memory growth was the cause

view this post on Zulip Wasmtime GitHub notifications bot (May 17 2022 at 16:33):

alexcrichton commented on issue #4153:

Do you have an example of the ambiguous case? Ideally that shouldn't happen and may indicate a bug elsewhere.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:50):

jlkiri commented on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        if !self.memory_limit_exceeded {
            let is_rejected = self.limiter.memory_growing(_current, desired, _maximum);
            if is_rejected {
                self.memory_limit_exceeded = true;
            }
            return is_rejected;
        }

        false
    }
}


let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:51):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        if !self.memory_limit_exceeded {
            let is_rejected = self.limiter.memory_growing(_current, desired, _maximum);
            if is_rejected {
                self.memory_limit_exceeded = true;
            }
            return is_rejected;
        }

        false
    }
}


let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:52):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_rejected = self.limiter.memory_growing(_current, desired, _maximum);
        if is_rejected {
            self.memory_limit_exceeded = true;
        }
        return is_rejected;
    }
}


let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:52):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_allowed = self.limiter.memory_growing(_current, desired, _maximum);
        if !is_allowed {
            self.memory_limit_exceeded = true;
        }
        return is_allowed ;
    }
}


let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:52):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_allowed = self.limiter.memory_growing(_current, desired, _maximum);
        if !is_allowed {
            self.memory_limit_exceeded = true;
        }
        is_allowed
    }
}


let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:54):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_allowed = self.limiter.memory_growing(_current, desired, _maximum);
        if !is_allowed {
            self.memory_limit_exceeded = true;
        }
        is_allowed
    }
}
let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable` trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:54):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_allowed = self.limiter.memory_growing(_current, desired, _maximum);
        if !is_allowed {
            self.memory_limit_exceeded = true;
        }
        is_allowed
    }
}
let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error return from call by the way is the Unreachable trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 17:55):

jlkiri edited a comment on issue #4153:

I have something like this:

impl ResourceLimiter for MemoryLimiter {
    fn memory_growing(&mut self, _current: usize, desired: usize, _maximum: Option<usize>) -> bool {
        let is_allowed = self.limiter.memory_growing(_current, desired, _maximum);
        if !is_allowed {
            self.memory_limit_exceeded = true;
        }
        is_allowed
    }
}
let mut store = create_wasm_store(&engine, wasi);
store.set_epoch_deadline(TICKS_BEFORE_TIMEOUT);

let timeout = run_wasm_timeout(engine.clone(), tx);

let mut linker: Linker<WasmStoreData> = Linker::new(engine);
add_to_linker(&mut linker, |state| &mut state.wasi)?;

linker.module(&mut store, "", module)?;
linker
    .get_default(&mut store, "")?
    .typed::<(), (), _>(&mut store)?
    .call(&mut store, ())
    .or_else(|_| {
        if rx.try_recv().is_ok() {
            return Err(SandboxError::Timeout);
        }

        if store.data().memory_limiter.memory_limit_exceeded {
            timeout.abort();
            return Err(SandboxError::OOM);
        }

        return Err(SandboxError::Internal);
    })?;

The important bits is the ResourceLimiter impl which sets the boolean if grow failed, and the timeout which just periodically calls engine.increment_epoch() and notifying the channel when it's done. The channel is the hack because normally I would expect the call to call on linker to fail with an error like DeadlineExceeded but it doesn't and on top of that even though the VM does terminate after the specified number of ticks, the memory_growing func DOES get called, returns false and sets my boolean. Without explicitly checking in the or_else closure that it was the timeout first, I would think that it was the memory problem. The error returned from call by the way is the Unreachable trap.

view this post on Zulip Wasmtime GitHub notifications bot (May 23 2022 at 18:27):

alexcrichton commented on issue #4153:

When dealing with asynchronous "interrupts" like epochs you'll want to stick to one source of truth which is what's happening in the wasm itself. Wasm can miss an epoch tick due to something like a trap which is expected. In the example you wrote you're reinterpreting an error from calling wasm by associating external data with what happened but that's where the race is introduced. Instead I think you'll want to inspect the error coming out of wasm and if it's an epoch error you leave it alone, but otherwise if it's a normal error you can attach information saying that OOM also happened before the error was generated.


Last updated: Dec 23 2024 at 12:05 UTC