Stream: git-wasmtime

Topic: wasmtime / PR #6867 Add kserve backend implementation for...


Wasmtime GitHub notifications bot (Aug 21 2023 at 09:53):

geekbeast opened PR #6867 from geekbeast:feature/kserve to bytecodealliance:main:

This implements a kserve backend that forwards wasi-nn calls over HTTP to servers implementing the kserve protocol (documented here: https://github.com/kserve/kserve/blob/master/docs/predict-api/v2/required_api.md).

This makes it easy to offload evaluation of inference workloads through async methods to an external service that can support all the various frameworks. Inference workloads tend to be resource heavy, and the most popular frameworks have large security attack surfaces. Being able to control when and where they run will make it easier for people to safely use wasmtime with wasi-nn without having to parse and execute models on arbitrary inputs in process.

This is a draft PR and still needs a few more items:

Wasmtime GitHub notifications bot (Aug 21 2023 at 10:55):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 21 2023 at 20:45):

geekbeast edited PR #6867:

This implements a kserve backend that forwards wasi-nn calls over HTTP to servers implementing the kserve protocol (documented here: https://github.com/kserve/kserve/blob/master/docs/predict-api/v2/required_api.md). It also makes certain API calls that are expected to be expensive async.

This makes it easy to offload evaluation of inference workloads through async methods to an external service that can support all the various frameworks. Inference workloads tend to be resource heavy, and the most popular frameworks have large security attack surfaces. Being able to control when and where they run will make it easier for people to safely use wasmtime with wasi-nn without having to parse and execute models on arbitrary inputs in process.

This is a draft PR and still needs a few more items:

Wasmtime GitHub notifications bot (Aug 23 2023 at 06:26):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 23 2023 at 06:31):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 23 2023 at 06:36):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 23 2023 at 10:23):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 24 2023 at 11:26):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Aug 26 2023 at 06:23):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Sep 11 2023 at 19:49):

geekbeast has marked PR #6867 as ready for review.

Wasmtime GitHub notifications bot (Sep 11 2023 at 19:49):

geekbeast requested pchickey for a review on PR #6867.

Wasmtime GitHub notifications bot (Sep 11 2023 at 19:49):

geekbeast requested wasmtime-default-reviewers for a review on PR #6867.

Wasmtime GitHub notifications bot (Sep 11 2023 at 19:49):

geekbeast requested wasmtime-core-reviewers for a review on PR #6867.

Wasmtime GitHub notifications bot (Sep 11 2023 at 20:00):

pchickey requested abrown for a review on PR #6867.

Wasmtime GitHub notifications bot (Sep 19 2023 at 20:45):

abrown submitted PR review:

Before I do a closer review, here might be some things to look at:

Wasmtime GitHub notifications bot (Nov 07 2023 at 08:07):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:16):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:19):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:24):

geekbeast edited PR #6867:

This implements a kserve backend that forwards wasi-nn calls over HTTP to servers implementing the kserve protocol (documented here: https://github.com/kserve/kserve/blob/master/docs/predict-api/v2/required_api.md). It also makes certain API calls that are expected to be expensive async.

This makes it easy to offload evaluation of inference workloads through async methods to an external service that can support all the various frameworks. Inference workloads tend to be resource heavy, and the most popular frameworks have large security attack surfaces. Being able to control when and where they run will make it easier for people to safely use wasmtime with wasi-nn without having to parse and execute models on arbitrary inputs in process.

This PR also implements a kserve registry so that models can be loaded by name, and adds better error reporting within what is possible in the current framework.
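
For readers unfamiliar with the protocol linked above, here is a rough sketch of what forwarding one tensor to a kserve v2 server could involve. The endpoint path (`/v2/models/{model_name}/infer`) and the `inputs` fields (`name`, `shape`, `datatype`, `data`) come from the v2 predict protocol; the `InferInput` struct and the function names are hypothetical and are not taken from the PR. The JSON is assembled by hand only to keep the sketch std-only, and it assumes the tensor name needs no escaping and the data contains no NaN or infinite values.

```rust
/// Hypothetical description of one FP32 input tensor (not the PR's type).
struct InferInput<'a> {
    name: &'a str,
    shape: &'a [usize],
    data: &'a [f32],
}

/// Path of the v2 inference endpoint for a given model name.
fn infer_path(model_name: &str) -> String {
    format!("/v2/models/{model_name}/infer")
}

/// Render a slice of numbers as a compact JSON array, e.g. `[1,2,3]`.
fn json_array<T: ToString>(items: &[T]) -> String {
    let parts: Vec<String> = items.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(","))
}

/// Build the JSON request body for a single FP32 input.
/// Assumption: `name` contains no characters that need JSON escaping.
fn infer_body(input: &InferInput) -> String {
    format!(
        r#"{{"inputs":[{{"name":"{}","shape":{},"datatype":"FP32","data":{}}}]}}"#,
        input.name,
        json_array(input.shape),
        json_array(input.data),
    )
}
```

A client would POST the string returned by `infer_body` to the path returned by `infer_path` on the configured server; production code would more likely use a JSON library than manual formatting.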

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:30):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:41):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 09:50):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 15:38):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 21:55):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 21:57):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 22:02):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 22:13):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 22:18):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 23:16):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 23:16):

github-merge-queue[bot] updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 23:20):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 07 2023 at 23:20):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 08 2023 at 02:09):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 08 2023 at 02:29):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 08 2023 at 02:31):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 08 2023 at 03:43):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 08 2023 at 04:21):

geekbeast updated PR #6867.

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey submitted PR review:

I don't have the context on wasi-nn and kserve to review this comprehensively, but from the Rust perspective I provided some idiomatic suggestions. In general, this needs a pass to replace all uses of expect outside of #[test] with error handling, and to clean out commented-out code left over from earlier stages of development.

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

delete commented out code

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

for these debugging strings, better to use tracing::error! over eprintln!

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

delete

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

this comment doesn't make sense, maybe it's dead code?

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

commented out code, not clear to me whether this still needs to be implemented as part of this PR?

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

Instead of this helper function, use a std::io::Cursor and then the byteorder::WriteBytesExt trait at each of the call sites.
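
The helper function under review is not shown in this log, so the following is only a guess at the pattern being suggested: writing numeric values into an in-memory buffer through `std::io::Cursor`. To stay std-only, the sketch uses `Write::write_all` with `to_le_bytes`; with the byteorder crate the loop body would be `cursor.write_f32::<LittleEndian>(v)?` from `WriteBytesExt`. The function name and the choice of little-endian `f32` are assumptions.

```rust
use std::io::{self, Cursor, Write};

/// Serialize `f32` values as little-endian bytes into an in-memory buffer.
/// Hypothetical helper; the byte order the PR needs is an assumption here.
fn f32s_to_le_bytes(values: &[f32]) -> io::Result<Vec<u8>> {
    // `Cursor<Vec<u8>>` implements `Write` and grows the vector as needed.
    let mut cursor = Cursor::new(Vec::with_capacity(values.len() * 4));
    for &v in values {
        // byteorder equivalent: cursor.write_f32::<LittleEndian>(v)?;
        cursor.write_all(&v.to_le_bytes())?;
    }
    // Hand back the underlying buffer once all values are written.
    Ok(cursor.into_inner())
}
```

Because the writes go through the `Write` trait, failures surface as `io::Result` values that the caller can propagate instead of unwrapping.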

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

not clear what these commented out members are about

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

this code panics whenever the server URL is malformed or the server doesn't respond correctly. Please change it to return errors instead, and handle those errors appropriately where this is invoked.
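
As an illustration of the change being requested, here is a minimal std-only sketch that returns errors for a malformed server address and propagates them with `?` rather than calling `expect`. The `EndpointError` type, both function names, and the simplified `http://host[:port]` format are hypothetical; the PR's actual code and URL handling are not shown in this log.

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not from the PR).
#[derive(Debug, PartialEq)]
enum EndpointError {
    MissingScheme,
    EmptyHost,
    BadPort(String),
}

impl fmt::Display for EndpointError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EndpointError::MissingScheme => write!(f, "server URL must start with http://"),
            EndpointError::EmptyHost => write!(f, "server URL has no host"),
            EndpointError::BadPort(p) => write!(f, "invalid port: {p}"),
        }
    }
}

impl std::error::Error for EndpointError {}

/// Split a simplified `http://host[:port]` address into host and port,
/// returning an error instead of panicking on malformed input.
fn parse_endpoint(url: &str) -> Result<(String, u16), EndpointError> {
    let rest = url
        .strip_prefix("http://")
        .ok_or(EndpointError::MissingScheme)?;
    // Default to port 80 when no port is given.
    let (host, port) = rest.split_once(':').unwrap_or((rest, "80"));
    if host.is_empty() {
        return Err(EndpointError::EmptyHost);
    }
    let port = port
        .parse::<u16>()
        .map_err(|_| EndpointError::BadPort(port.to_string()))?;
    Ok((host.to_string(), port))
}

/// Callers propagate the failure with `?` and decide how to report it.
fn infer_url(server: &str, model: &str) -> Result<String, EndpointError> {
    let (host, port) = parse_endpoint(server)?;
    Ok(format!("http://{host}:{port}/v2/models/{model}/infer"))
}
```

The same shape applies to the "server doesn't respond correctly" case: the function that performs the request would return a `Result`, and the code that invokes it would map the failure to whatever error the caller's interface expects.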

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

tracing::error! for this failure

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

tracing::debug! here

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

commented out code should be deleted?

Wasmtime GitHub notifications bot (Nov 09 2023 at 17:47):

pchickey created PR review comment:

? instead of expect

Wasmtime GitHub notifications bot (Nov 09 2023 at 20:33):

abrown submitted PR review:

I am in favor of merging this PR. Obviously there are a bunch of Rust-level programming issues to clean up before we do that (as @pchickey points out), but at the higher wasi-nn level this mostly makes sense. Maybe if we resolve most of that here, we can do the following in follow-up PRs:

Otherwise, I think the general idea of plugging in a new backend is a good one and validates the refactoring to these Box<dyn ...> wrappers. Let's just clean up the extra code here that @pchickey notes and we can move on from there; @geekbeast, let me know if you want to pair up to finish this.
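
To illustrate why `Box<dyn ...>` wrappers make a new backend easy to plug in, here is a small sketch of the general trait-object pattern. The `Backend` trait, both structs, and `select` are hypothetical stand-ins; the real wasi-nn traits have different names and richer signatures, and the remote backend here is a stub that performs no network I/O.

```rust
/// Hypothetical backend interface (not the real wasi-nn trait).
trait Backend {
    fn name(&self) -> &str;
    fn compute(&self, input: &[f32]) -> Result<Vec<f32>, String>;
}

/// Stand-in for an in-process framework backend: doubles every value.
struct LocalBackend;

impl Backend for LocalBackend {
    fn name(&self) -> &str {
        "local"
    }
    fn compute(&self, input: &[f32]) -> Result<Vec<f32>, String> {
        Ok(input.iter().map(|x| x * 2.0).collect())
    }
}

/// Stand-in for a remote backend. A real one would send an HTTP request;
/// this stub always reports that it is not connected.
struct RemoteBackend {
    endpoint: String,
}

impl Backend for RemoteBackend {
    fn name(&self) -> &str {
        "remote"
    }
    fn compute(&self, _input: &[f32]) -> Result<Vec<f32>, String> {
        Err(format!("not connected to '{}'", self.endpoint))
    }
}

/// Callers only see `dyn Backend`, so adding a backend means adding one
/// more boxed value to the list without touching this function.
fn select<'a>(backends: &'a [Box<dyn Backend>], name: &str) -> Option<&'a dyn Backend> {
    backends.iter().find(|b| b.name() == name).map(|b| b.as_ref())
}
```

The code that looks up and invokes a backend never names a concrete type, which is the property that lets an HTTP-forwarding backend sit alongside in-process ones.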

Wasmtime GitHub notifications bot (Mar 21 2024 at 22:58):

mtr-fastly updated PR #6867.

Wasmtime GitHub notifications bot (Jun 10 2024 at 16:47):

geekbeast updated PR #6867.


Last updated: Dec 23 2024 at 12:05 UTC