Stream: wasmtime

Topic: Interpreter support


view this post on Zulip Môshe van der Sterre (Aug 31 2020 at 23:45):

I'm wondering what the scope of missing features looks like for interpreter support in wasmtime. Has anyone already thought about this, beyond what is mentioned in the docs (that it is a desired feature)?

view this post on Zulip Yury Delendik (Sep 01 2020 at 14:00):

Running everything in an interpreter is not desired, so an additional step that can be made is tiering. Currently the entire module is compiled using a single compiler. It is preferable, though, to switch between compilers in the middle of execution.
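
As a rough illustration of the tiering idea, here is a minimal sketch in which a function starts out interpreted and is swapped for a faster implementation once it has been called often enough. All names (`Tier`, `TieredFunction`, `interpret`, `compiled`) are hypothetical and not part of wasmtime or cranelift. The switch happens only at call boundaries, so this does not show replacing code in the middle of execution, which is the harder part.

```rust
/// Which implementation currently backs a function (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Tier {
    /// Still interpreted; counts calls to decide when to tier up.
    Interpreted { calls: u32 },
    /// Replaced by what stands in for compiled code.
    Compiled,
}

/// Tiny instruction set standing in for interpreted bytecode.
enum Op {
    Square,
    AddOne,
}

/// Baseline tier: walks the instruction list one operation at a time.
fn interpret(x: i64) -> i64 {
    let program = [Op::Square, Op::AddOne];
    program.iter().fold(x, |acc, op| match op {
        Op::Square => acc * acc,
        Op::AddOne => acc + 1,
    })
}

/// Optimized tier: plain Rust standing in for JIT-compiled machine code.
fn compiled(x: i64) -> i64 {
    x * x + 1
}

pub struct TieredFunction {
    tier: Tier,
    threshold: u32,
}

impl TieredFunction {
    pub fn new(threshold: u32) -> Self {
        TieredFunction {
            tier: Tier::Interpreted { calls: 0 },
            threshold,
        }
    }

    /// Runs the function in its current tier and tiers up once it is hot.
    pub fn call(&mut self, x: i64) -> i64 {
        match &mut self.tier {
            Tier::Compiled => compiled(x),
            Tier::Interpreted { calls } => {
                *calls += 1;
                let hot = *calls >= self.threshold;
                let result = interpret(x);
                if hot {
                    // A real runtime would invoke the compiler here.
                    self.tier = Tier::Compiled;
                }
                result
            }
        }
    }

    pub fn is_compiled(&self) -> bool {
        self.tier == Tier::Compiled
    }
}
```

Both tiers must compute the same result, which is why shared data structures between the interpreter and the compiler matter for this design.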

view this post on Zulip Yury Delendik (Sep 01 2020 at 14:02):

Also, wasmtime depends heavily on the cranelift data structures. There is a desire to abstract them so that they can be used with an interpreter as well.

view this post on Zulip Till Schneidereit (Sep 01 2020 at 16:58):

I think there are real use cases for an interpreter-only build config, such as support for platforms our JITs don't support, or very low-memory environments.

Nevertheless, an interpreter should definitely support tiering, too, which imposes some requirements that otherwise might not be important, such as using the same data structures where required.

I don't think we have documentation of these requirements anywhere. @Môshe van der Sterre are you interested in working on this kind of project?

view this post on Zulip fitzgen (he/him) (Sep 01 2020 at 16:59):

also, worth mentioning that WAMR exists and is such an interpreter: https://github.com/bytecodealliance/wasm-micro-runtime

view this post on Zulip Môshe van der Sterre (Sep 01 2020 at 18:41):

We are currently using wasmtime as the preferred wasm runtime at my company. One relevant example is that we use wasm modules for a pluggable data-transform function. We kind of want this to be available on all platforms, including when the application is running in a browser. (The use-case isn't really low-memory or performance sensitive.)
I'm evaluating the different options, including WAMR, and also using the javascript API when in a browser. I'm worried about error handling when using the javascript API, and I'm worried about having to build hybrid Rust+C on multiple platforms when using WAMR. Combined with the fact that we are already using wasmtime, the wasmtime interpreter route would have by far the least impact on our internal codebase.

What direction we want to go in depends on the size of the work within wasmtime. If the scope is relatively small, implementing this myself could be one option. That would require me to dive into the wasmtime codebase, and I currently haven't done that. I have also internally discussed the option of sponsoring a feature bounty for this, in which case more information about what is needed also helps. If the effort to implement interpreter support for wasmtime is truly huge, then I might be better off just learning to live with one of the other options :-)

What are your thoughts about the different options?

view this post on Zulip Andrew Brown (Sep 02 2020 at 00:36):

@Wang Xin can talk more about the possibility of using WAMR as an interpreter and interfacing with Wasmtime. On the Wasmtime side, I want to mention that there is more than one option: 1) implement or import a Wasm interpreter built in Rust, 2) use a C/C++ interpreter (e.g. Wasm) through bindings, or 3) finish and wire up the existing Cranelift interpreter so that it can execute Wasm modules. I started implementing the Cranelift interpreter as an exploration and have not implemented all of the Cranelift instructions since I have not yet needed them. However, if this interpreter were completed and wired up to Wasmtime, it could interpret the Cranelift IR that Wasmtime produces and (theoretically) execute Wasm modules the same as when Wasmtime compiles modules to machine code.

view this post on Zulip Andrew Brown (Sep 02 2020 at 00:41):

I would guess option 3 would be less work than option 1 (and not a huge amount of work) but I am familiar with the Cranelift interpreter code so I may be biased. Option 2 might be more work than option 3 (but less than 1?) and would have to be done quite carefully but it could leverage already existing code (e.g. WAMR).

view this post on Zulip Andrew Brown (Sep 02 2020 at 01:00):

Correction above: in option 2) use a C/C++ interpreter (e.g. "Wasm" should read "WAMR")

view this post on Zulip Môshe van der Sterre (Sep 02 2020 at 20:50):

@Andrew Brown Thank you for that information. I think I can make a roundabout guess about what is needed to complete the cranelift interpreter, but I think I'm still lacking a bit of context regarding what it takes to wire it up with wasmtime. Perhaps I can remedy that myself by doing some digging around. I'm going to think about this a bit more (and read the code) and discuss it internally. My current thinking is that having this available in wasmtime prevents a lot of complexity creep into our internal code. What are the experiences within this project with having bounties for a feature like this? Although I'm now also considering to just make time available to work on this myself, as that perhaps has more immediate results.

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:28):

I don't have a crystal clear picture of what it would take to wire up an interpreter to Wasmtime (perhaps @Alex Crichton or @Dan Gohman or @fitzgen (he/him) do) but I would guess that it would involve changes to the wasmtime and wasmtime-runtime crates to allow an interpreter to execute modules instead of only the current "compiled module" paradigm. This type of work would have to be done for any interpreter to be integrated with Wasmtime--it is not special to option 1. Another question is what other Wasmtime developers think about supporting these changes; I'll ask tomorrow in the Wasmtime meeting. (Re: bounties, I haven't seen any of that for this project yet).

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:33):

I believe there are ideas of how we could integrate an interpreter, but the codebase in general is very much not ready for doing so yet

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:34):

We've generally up to this point not had a compelling use case to implement an interpreter, so we were waiting for that

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:34):

for example an interpreter which can tier up would be a very different design than one where you just don't want a jit compiler

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:35):

Foundation-wise there's lots of deep dependencies on cranelift itself for simple types, so we would require a lot of refactoring to get it such that you can build wasmtime without cranelift at all. Very much possible to do, just to say that it's likely not one PR away!

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:37):

Good points; sounds like @Môshe van der Sterre might have a use case for the non-tiering compiler, which I feel would be simpler. If we did want to discuss a tiering compiler, I do have a branch in which I can do some of what is necessary for on-stack replacement from the Cranelift interpreter to compiled code and back.

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:39):

Also, I wonder if some of the hard work to remove dependencies to cranelift could be avoided by using the Cranelift interpreter...

view this post on Zulip Môshe van der Sterre (Sep 02 2020 at 23:39):

@Alex Crichton I was wondering about the cranelift dependencies. How much of those are JIT-specific and still get in the way when the interpreter is itself cranelift based?

view this post on Zulip Môshe van der Sterre (Sep 02 2020 at 23:40):

Ok. Sorry to repeat you :-)

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:40):

There's two sets of dependencies I think. One is just a shared set of types (e.g. ValType and such), those are easy to extract.

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:40):

More worrisome, though, is that wasmtime-runtime assumes that there's a natively-callable function (e.g. a function pointer)

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:41):

which matches the platform native ABI

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:41):

which interpreters won't be using since they're not generating code on the fly

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:41):

so a great place to start would be to refactor the literal crate dependencies such that wasmtime-cranelift is the only crate in wasmtime that depends on cranelift

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:41):

but beyond that there's still work to be done in how modules are represented in wasmtime-runtime in a way that's friendly for interpreters too

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:42):

(probably something that's like an enum or an option which describes the ABI of function pointers)
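
A minimal sketch of what such an enum could look like, assuming hypothetical names throughout: `FunctionBody`, `Op`, and `call` are invented for illustration, and `VMContext` here is a stand-in struct, not the real wasmtime-runtime type. The point is only that callers dispatch on the variant instead of assuming a native function pointer.

```rust
/// Stand-in for wasmtime-runtime's VMContext (not the real type).
#[repr(C)]
pub struct VMContext {
    pub calls: u32,
}

/// Tiny instruction set standing in for an interpreter's bytecode.
pub enum Op {
    AddConst(i32),
    MulConst(i32),
}

/// Hypothetical description of how a function is executed.
pub enum FunctionBody {
    /// Compiled code that follows the platform's native ABI.
    Native(unsafe extern "C" fn(*mut VMContext, i32) -> i32),
    /// A body that an interpreter loop walks instead of jumping to it.
    Interpreted { body: Vec<Op> },
}

/// Calls a function without assuming it is a native code pointer.
pub fn call(func: &FunctionBody, vmctx: &mut VMContext, arg: i32) -> i32 {
    vmctx.calls += 1;
    match func {
        // SAFETY: the pointer was stored with exactly this signature.
        FunctionBody::Native(ptr) => unsafe { (*ptr)(vmctx as *mut VMContext, arg) },
        FunctionBody::Interpreted { body } => body.iter().fold(arg, |acc, op| match op {
            Op::AddConst(c) => acc.wrapping_add(*c),
            Op::MulConst(c) => acc.wrapping_mul(*c),
        }),
    }
}

/// Plain Rust standing in for JIT output: computes 2 * x + 1.
pub unsafe extern "C" fn native_double_plus_one(_vmctx: *mut VMContext, x: i32) -> i32 {
    x.wrapping_mul(2).wrapping_add(1)
}
```

With this shape, `FunctionBody::Native(native_double_plus_one)` and `FunctionBody::Interpreted { body: vec![Op::MulConst(2), Op::AddConst(1)] }` are interchangeable from the caller's point of view.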

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:42):

does that make sense?

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:44):

re: ABI, I think what I meant above is that, since the Cranelift IR encodes the ABI, a Cranelift interpreter could execute IR for any ABI without really noticing the difference

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:45):

but it still might make sense to reduce dependencies to cranelift, etc. and for sure we would need to make Module more flexible

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:47):

oh also to be clear I'm assuming that a wasmtime interpreter would not use cranelift

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:47):

even if cranelift has its own interpreter

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:47):

but we also haven't talked about the design of this at all yet

view this post on Zulip Alex Crichton (Sep 02 2020 at 23:47):

and this would be a very good point to bring up!

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:48):

yeah, I blurted out a few options far up above and we probably need to discuss more; I'll bring it up in the Wasmtime meeting tomorrow.

view this post on Zulip Andrew Brown (Sep 02 2020 at 23:50):

summary: use the Cranelift interpreter, use/build a Wasm interpreter hosted in Rust, use/build a Wasm interpreter hosted in C/C++ (e.g. WAMR)

view this post on Zulip Môshe van der Sterre (Sep 03 2020 at 01:28):

I'm reading through the code, and I did notice that the call to build_code_memory is baked into the code path for compilation. As far as I can tell this is the source of the eventual VMFunctionBody pointers. I can see mem::transmute::<*const VMFunctionBody, unsafe extern "C" fn(*mut VMContext, *mut VMContext, ...)> calls in a number of places, and I agree that an interpreter cannot really accommodate that.

I understand that it may be best to refactor some of this to make things more obviously interpreter friendly. But I'm entirely new to this codebase, so I'm currently trying to imagine the least impactful way one could write a proof-of-concept impl Compiler for AndrewsCraneliftInterpreter. Perhaps I will be better able to understand the pros/cons of the refactoring you suggest after understanding that.

With a proof of concept in mind, I think interpreters would be able to accommodate FnMut(*mut VMContext, *mut VMContext, ...) instead of the mem::transmute, right? Terribly shoehorned perhaps, but do you think there are any other pieces of code that a sufficiently creative interpreter implementation of trait Compiler cannot accommodate?
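
To make the contrast concrete, here is a minimal sketch of the two call paths, under stated assumptions: `VMContext` and `VMFunctionBody` are stand-in types rather than the real wasmtime-runtime definitions, the signature is fixed to two context pointers returning an `i64`, and every helper name is hypothetical. The first path transmutes a raw body pointer to a native function pointer; the second stores a boxed `FnMut` that can carry interpreter state.

```rust
use std::mem;

/// Stand-in for wasmtime-runtime's VMContext (not the real type).
#[repr(C)]
pub struct VMContext {
    pub value: i64,
}

/// Stand-in for the opaque type that compiled code is addressed through.
pub struct VMFunctionBody(u8);

type NativeFn = unsafe extern "C" fn(*mut VMContext, *mut VMContext) -> i64;

/// Plain Rust standing in for a JIT-compiled function body.
unsafe extern "C" fn compiled_body(vmctx: *mut VMContext, _caller: *mut VMContext) -> i64 {
    unsafe { (*vmctx).value + 1 }
}

/// Returns the compiled body as the raw pointer a code memory would hand out.
pub fn compiled_body_ptr() -> *const VMFunctionBody {
    compiled_body as NativeFn as *const VMFunctionBody
}

/// JIT-style path: reinterpret the raw pointer as a callable function.
pub fn call_native(body: *const VMFunctionBody, vmctx: &mut VMContext) -> i64 {
    // SAFETY: sound only if `body` really points at code with this signature.
    let f = unsafe { mem::transmute::<*const VMFunctionBody, NativeFn>(body) };
    let p: *mut VMContext = vmctx;
    unsafe { f(p, p) }
}

/// Interpreter-style path: a closure that owns whatever state it needs.
pub type InterpretedFn = Box<dyn FnMut(*mut VMContext, *mut VMContext) -> i64>;

pub fn make_interpreted() -> InterpretedFn {
    let mut steps = 0i64; // interpreter state captured by the closure
    Box::new(move |vmctx: *mut VMContext, _caller: *mut VMContext| -> i64 {
        steps += 1;
        // SAFETY: callers pass a pointer derived from a live VMContext.
        unsafe { (*vmctx).value + steps }
    })
}

pub fn call_interpreted(f: &mut InterpretedFn, vmctx: &mut VMContext) -> i64 {
    let p: *mut VMContext = vmctx;
    (*f)(p, p)
}
```

A boxed closure is a fat pointer with captured state, so it cannot be produced by transmuting a thin code pointer; every call site that currently transmutes would need to change, which is the refactoring cost being discussed.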

view this post on Zulip fitzgen (he/him) (Sep 03 2020 at 16:50):

Some background reading about making interpreters fast (beyond just using macro-ops):

view this post on Zulip fitzgen (he/him) (Sep 03 2020 at 16:51):

Although I suppose there isn't a ton of dynamic stuff in Wasm that would benefit from this kind of thing...

view this post on Zulip fitzgen (he/him) (Sep 03 2020 at 16:51):

maybe just indirect calls

view this post on Zulip Andrew Brown (Sep 03 2020 at 17:25):

"Terribly shoehorned perhaps, but do you think there are any other pieces of code that a sufficiently creative interpreter implementation of trait Compiler cannot accommodate?"

We talked about this in the meeting a bit: @fitzgen (he/him) mentioned that with an added void * parameter we could probably make things work but my impression was @Alex Crichton thought it would be better to refactor it so that it could be done cleanly. (Accurate summary?)
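
One way to read the `void *` suggestion is the usual C callback pattern: every function keeps a native-ABI signature, and an extra opaque pointer lets a shared trampoline find the interpreter's state. The sketch below assumes that reading. `BodyFn`, `InterpreterState`, `interpreter_trampoline`, and `invoke` are hypothetical names, and `VMContext` is a stand-in struct rather than the real type.

```rust
use std::ffi::c_void;

/// Stand-in for wasmtime-runtime's VMContext (not the real type).
#[repr(C)]
pub struct VMContext {
    pub value: i64,
}

/// Native-ABI signature extended with one opaque data pointer.
pub type BodyFn = unsafe extern "C" fn(*mut VMContext, *mut VMContext, *mut c_void) -> i64;

/// Whatever the interpreter needs in order to run one function.
pub struct InterpreterState {
    pub addend: i64,
    pub executed: u32,
}

/// A single native function shared by all interpreted bodies; it recovers
/// the interpreter state from `data` and "interprets" from there.
pub unsafe extern "C" fn interpreter_trampoline(
    vmctx: *mut VMContext,
    _caller: *mut VMContext,
    data: *mut c_void,
) -> i64 {
    // SAFETY: callers must pass a pointer to a live InterpreterState.
    let state = unsafe { &mut *(data as *mut InterpreterState) };
    state.executed += 1;
    unsafe { (*vmctx).value + state.addend }
}

/// Compiled code can ignore the extra parameter entirely.
pub unsafe extern "C" fn compiled_body(
    vmctx: *mut VMContext,
    _caller: *mut VMContext,
    _data: *mut c_void,
) -> i64 {
    unsafe { (*vmctx).value * 2 }
}

/// The call site stays uniform: it neither knows nor cares which kind it has.
pub fn invoke(f: BodyFn, vmctx: &mut VMContext, data: *mut c_void) -> i64 {
    let p: *mut VMContext = vmctx;
    // SAFETY: `f` and `data` must have been paired up by whoever created them.
    unsafe { f(p, p, data) }
}
```

The trade-off compared with a cleaner refactoring is that the pairing of function pointer and data pointer is checked by convention only, not by the type system.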

view this post on Zulip Alex Crichton (Sep 03 2020 at 17:27):

Oh I wouldn't say I think there's a best route to go forward, I think whatever works is a good place to start from and we can iterate from there

view this post on Zulip Andrew Brown (Sep 03 2020 at 17:29):

@Môshe van der Sterre, the conversation became a discussion on what type of interpreter Wasmtime even needs and perhaps your use case can help clarify that: I re-read what you said above and it is my understanding that you want an interpreter for portability reasons, not as much for performance reasons. Is that accurate? Can you comment more on whether memory constraints are an issue for interpretation? How important is interpreter performance? I assume the Wasm modules touch system interfaces in some way; how do you plan to make that work in the browser?

view this post on Zulip Môshe van der Sterre (Sep 03 2020 at 22:31):

@Andrew Brown: Yes, portability is pretty much my main concern.

The use-case I mentioned is about our data storage system. This system is distributed, and the data is typed. If a storage user wants to register a new datatype, they also provide a wasm module that has various exports. Besides validation, some of these exports are used by the storage nodes when communicating, for example if the datatype requires a logical clock. This is what currently exists. For added context, the user normally only writes a Rust struct that compiles into that wasm module. We provide primitives they can add to the struct (such as logical clocks) and macros for generating those exports that the storage nodes use.

We now want to allow (web)applications to use this storage system in "work offline" mode:

  1. When still online, the application instructs the storage client library (that it is linked with) to keep a small number of objects in local memory (or local disk). For example, this could be objects related to the logged-in user account.
  2. When the device goes offline, and the application calls into the storage client library for one of those objects, the client library should now respond as if it is still connected to a storage node (albeit one that is partitioned from the rest of the cluster). For datatypes that are partition-tolerant without quorum, this means that from the application's point of view the storage system continues to function normally.
  3. When the device comes back online, the client library syncs back to (the rest of) the cluster.

For this to work, the client library essentially has to behave as if it is a storage node itself (while the device is offline). Calling exports from the registered wasm modules is part of the normal storage node behaviour, so this is what I'm aiming for. Obviously we want to keep the client library as portable as possible, but browsers in particular are a target platform for the client library.

(I don't see performance or memory constraints here.)
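
The three offline-mode steps above can be sketched as a small client-side state machine. This is only an illustration under heavy assumptions: `StorageClient` and its methods are hypothetical, objects are reduced to `u64` values, the cluster is an in-memory map, and syncing simply replays queued writes. In the system described, merging would go through the exports of the registered wasm module (for example a logical clock), which is why the client needs a wasm runtime on every platform.

```rust
use std::collections::HashMap;

/// Hypothetical client library that can act as a partitioned storage node.
pub struct StorageClient {
    online: bool,
    /// Stand-in for the remote cluster.
    cluster: HashMap<String, u64>,
    /// Objects kept locally for offline use.
    pinned: HashMap<String, u64>,
    /// Writes accepted while offline, waiting to be synced.
    pending: Vec<(String, u64)>,
}

impl StorageClient {
    pub fn new(cluster: HashMap<String, u64>) -> Self {
        StorageClient {
            online: true,
            cluster,
            pinned: HashMap::new(),
            pending: Vec::new(),
        }
    }

    /// Step 1: while online, keep a local copy of an object.
    pub fn pin(&mut self, key: &str) -> bool {
        if !self.online {
            return false;
        }
        match self.cluster.get(key).copied() {
            Some(value) => {
                self.pinned.insert(key.to_string(), value);
                true
            }
            None => false,
        }
    }

    pub fn go_offline(&mut self) {
        self.online = false;
    }

    /// Step 2: offline reads are answered from the pinned copies.
    pub fn read(&self, key: &str) -> Option<u64> {
        if self.online {
            self.cluster.get(key).copied()
        } else {
            self.pinned.get(key).copied()
        }
    }

    /// Step 2: offline writes succeed for pinned objects and are queued.
    pub fn write(&mut self, key: &str, value: u64) -> Result<(), String> {
        if self.online {
            self.cluster.insert(key.to_string(), value);
            if let Some(local) = self.pinned.get_mut(key) {
                *local = value;
            }
            return Ok(());
        }
        match self.pinned.get_mut(key) {
            Some(local) => {
                *local = value;
                self.pending.push((key.to_string(), value));
                Ok(())
            }
            None => Err(format!("{key} is not available offline")),
        }
    }

    /// Step 3: back online, replay queued writes; returns how many were sent.
    pub fn go_online(&mut self) -> usize {
        self.online = true;
        let replayed = self.pending.len();
        for (key, value) in self.pending.drain(..) {
            self.cluster.insert(key, value);
        }
        replayed
    }
}
```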

view this post on Zulip Jiayi Hu (Sep 30 2020 at 19:55):

Joining the discussion late, but I also have a use case for the interpreter. In my case the goal is small size, so that it can be used on an embedded STM32F4 with 2MB flash and 192KB RAM. The idea is to use it as a WASM runtime for embedded nodes in a Kubernetes cluster, where https://github.com/deislabs/krustlet is installed on the devices.

view this post on Zulip Andrew Brown (Oct 06 2020 at 23:11):

Sounds like there is more interest in this interpreter topic: https://github.com/bytecodealliance/wasmtime/issues/156#issuecomment-703319935.

view this post on Zulip Andrew Brown (Oct 06 2020 at 23:13):

@Môshe van der Sterre, @Jiayi Hu: if you all still have interest in this topic the issue I just linked to (or another one like it) would be a good one to comment on and follow. I think there may be enough interest in this interpreter issue to start proposing ideas on GitHub.

view this post on Zulip Môshe van der Sterre (Oct 10 2020 at 23:45):

@Andrew Brown, yes, still very much interested. In my current estimation, wasmtime interpreter support is the best and possibly easiest way forward for the use-case I mentioned. (Even if it means implementing that support in-house.) Other areas of our product also need my attention, so I need to discuss with my colleagues to determine the priorities for what I work on.
In practical terms, I'm not really sure what steps I need to take (wrt: commenting on that issue / or proposing ideas on GitHub / or describing meaningful bounty requirements that my company could sponsor). I think the ticket you linked is about new backends much more broadly, and perhaps I should open an issue about an interpreter backend specifically. I also want to learn more about the wasmtime internals myself, perhaps by adding that void * parameter and exploring the problems I encounter.


Last updated: Nov 22 2024 at 17:03 UTC