I'm wondering what the scope of missing features looks like for interpreter support in wasmtime. Has anyone already thought about this, beyond what is mentioned in the docs (that it is a desired feature)?
Running everything in an interpreter is not desired. An additional step that could be taken is tiering: currently the entire module is compiled using a single compiler, but it would be preferable to switch between compilers in the middle of execution.
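For readers unfamiliar with tiering, here is a minimal sketch (with entirely made-up names and thresholds, not actual wasmtime code) of the usual counter-based tier-up policy: every function starts in a cheap tier, and hot functions get promoted once a call-count threshold is crossed.

```rust
// Illustrative sketch of counter-based tiering. All names and the
// threshold value are hypothetical, chosen only for this example.

enum Tier {
    Interpreted, // cheap tier: interpreter or baseline compiler
    Optimized,   // expensive tier: optimizing compiler
}

struct FunctionState {
    tier: Tier,
    call_count: u32,
}

// Arbitrary illustrative threshold.
const TIER_UP_THRESHOLD: u32 = 1_000;

impl FunctionState {
    fn new() -> Self {
        FunctionState { tier: Tier::Interpreted, call_count: 0 }
    }

    /// Called on function entry; returns true exactly when the runtime
    /// should kick off optimized compilation for this function.
    fn record_call(&mut self) -> bool {
        self.call_count += 1;
        if matches!(self.tier, Tier::Interpreted) && self.call_count >= TIER_UP_THRESHOLD {
            self.tier = Tier::Optimized;
            return true;
        }
        false
    }
}
```

The interesting design problem mentioned above is not the counter but the switch itself: moving a function that is *currently executing* from one tier to another (on-stack replacement) requires both tiers to agree on data layout, which is why tiering imposes shared-data-structure requirements on an interpreter.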
Also, wasmtime heavily depends on the cranelift data structures. There is a desire to abstract those so they can be used with an interpreter as well.
I think there are real use cases for an interpreter-only build config, such as support for platforms our JITs don't support, or very low-memory environments.
Nevertheless, an interpreter should definitely support tiering, too, which imposes some requirements that otherwise might not be important, such as using the same data structures where required.
I don't think we have documentation of these requirements anywhere. @Môshe van der Sterre are you interested in working on this kind of project?
also, worth mentioning that WAMR exists and is such an interpreter: https://github.com/bytecodealliance/wasm-micro-runtime
We are currently using wasmtime as the preferred wasm runtime at my company. One relevant example is that we use wasm modules for a pluggable data-transform function. We kind of want this to be available on all platforms, including when the application is running in a browser. (The use-case isn't really low-memory or performance sensitive.)
I'm evaluating the different options, including WAMR, and also using the javascript API when in a browser. I'm worried about error handling when using the javascript API, and I'm worried about having to build hybrid Rust+C on multiple platforms when using WAMR. Combined with the fact that we are already using wasmtime, the wasmtime interpreter route would have by far the least impact on our internal codebase.
What direction we want to go in depends on the size of the work within wasmtime. If the scope is relatively small, implementing this myself could be one option. That would require me to dive into the wasmtime codebase, and I currently haven't done that. I have also internally discussed the option of sponsoring a feature bounty for this, in which case more information about what is needed also helps. If the effort to implement interpreter support for wasmtime is truly huge, then I might be better off just learning to live with one of the other options :-)
What are your thoughts about the different options?
@Wang Xin can talk more about the possibility of using WAMR as an interpreter and interfacing with Wasmtime. On the Wasmtime side, I want to mention that there is more than one option: 1) implement or import a Wasm interpreter built in Rust, 2) use a C/C++ interpreter (e.g. Wasm) through bindings, or 3) finish and wire up the existing Cranelift interpreter so that it can execute Wasm modules. I started implementing the Cranelift interpreter as an exploration and have not implemented all of the Cranelift instructions since I have not yet needed them. However, if this interpreter were completed and wired up to Wasmtime, it could interpret the Cranelift IR that Wasmtime produces and (theoretically) execute Wasm modules the same as when Wasmtime compiles modules to machine code.
I would guess option 3 would be less work than option 1 (and not a huge amount of work) but I am familiar with the Cranelift interpreter code so I may be biased. Option 2 might be more work than option 3 (but less than 1?) and would have to be done quite carefully but it could leverage already existing code (e.g. WAMR).
Correction above: in option 2) use a C/C++ interpreter (e.g. "Wasm" should read "WAMR")
@Andrew Brown Thank you for that information. I think I can make a roundabout guess about what is needed to complete the cranelift interpreter, but I'm still lacking a bit of context regarding what it takes to wire it up with wasmtime. Perhaps I can remedy that myself by doing some digging around. I'm going to think about this a bit more (and read the code) and discuss it internally. My current thinking is that having this available in wasmtime prevents a lot of complexity creep in our internal code. What are the experiences within this project with having bounties for a feature like this? Although I'm now also considering just making time available to work on this myself, as that perhaps has more immediate results.
I don't have a crystal clear picture of what it would take to wire up an interpreter to Wasmtime (perhaps @Alex Crichton or @Dan Gohman or @fitzgen (he/him) do) but I would guess that it would involve changes to the wasmtime and wasmtime-runtime crates to allow an interpreter to execute modules instead of only the current "compiled module" paradigm. This type of work would have to be done for any interpreter to be integrated with Wasmtime--it is not special to option 1. Another question is what other Wasmtime developers think about supporting these changes; I'll ask tomorrow in the Wasmtime meeting. (Re: bounties, I haven't seen any of that for this project yet.)
I believe there are ideas of how we could integrate an interpreter, but the codebase in general is very much not ready for doing so yet
We've generally up to this point not had a compelling use case to implement an interpreter, so we were waiting for that
for example an interpreter which can tier up would be a very different design than one where you just don't want a jit compiler
Foundation-wise there are lots of deep dependencies on cranelift itself, even simply for types, so we would require a lot of refactoring to get it such that you can build wasmtime without cranelift at all. Very much possible to do, just to say that it's likely not one PR away!
Good points; sounds like @Môshe van der Sterre might have a use case for the non-tiering compiler, which I feel would be simpler. If we did want to discuss a tiering compiler, I do have a branch in which I can do some of what is necessary for on-stack replacement from the Cranelift interpreter to compiled code and back.
Also, I wonder if some of the hard work to remove dependencies to cranelift could be avoided by using the Cranelift interpreter...
@Alex Crichton I was wondering about the cranelift dependencies. How much of those are JIT-specific and still get in the way when the interpreter is itself cranelift based?
Ok. Sorry to repeat you :-)
There are two sets of dependencies, I think. One is just a shared set of types (e.g. ValType and such); those are easy to extract.
More worrisome, though, is that wasmtime-runtime assumes that there's a natively-callable function (e.g. a function pointer) which matches the platform-native ABI, which interpreters won't be using since they're not generating code on the fly.
so a great place to start would be to refactor the literal crate dependencies such that wasmtime-cranelift is the only crate in wasmtime that depends on cranelift
but beyond that there's still work to be done in how modules are represented in wasmtime-runtime in a way that's friendly for interpreters too (probably something like an enum or an option which describes the ABI of function pointers)
does that make sense?
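To make that enum idea concrete, here is a purely illustrative sketch (none of these types or names are real wasmtime API) of how a module's function bodies might be represented so that JIT-compiled and interpreted functions can coexist behind one representation:

```rust
// Hypothetical sketch only: how a runtime might represent a function body
// so that callers branch on the variant instead of unconditionally
// transmuting a raw pointer to an `extern "C" fn`.

/// Opaque handle to an interpreter's representation of a function
/// (e.g. an index into its bytecode or Cranelift IR storage).
struct InterpretedFunction {
    index: usize,
}

/// Describes how a function can be invoked.
enum FunctionBody {
    /// Natively-callable machine code produced by the JIT; callers can
    /// use the platform-native ABI directly.
    Compiled(*const u8),
    /// No machine code exists; calls must be routed through the
    /// interpreter's dispatch loop instead.
    Interpreted(InterpretedFunction),
}

impl FunctionBody {
    /// Callers check this before attempting a native-ABI call.
    fn is_native(&self) -> bool {
        matches!(self, FunctionBody::Compiled(_))
    }
}
```

The point of the sketch is that the "which ABI does this pointer have?" question becomes explicit data in the module representation, rather than an assumption baked into every call site.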
re: ABI, I think what I meant above is that, since the Cranelift IR encodes the ABI, a Cranelift interpreter could execute IR for any ABI without really noticing the difference
but it still might make sense to reduce dependencies on cranelift, etc., and for sure we would need to make Module more flexible
oh also to be clear I'm assuming that a wasmtime interpreter would not use cranelift
even if cranelift has its own interpreter
but we also haven't talked about the design of this at all yet
and this would be a very good point to bring up!
yeah, I blurted out a few options far up above and we probably need to discuss more; I'll bring it up in the Wasmtime meeting tomorrow.
summary: use the Cranelift interpreter, use/build a Wasm interpreter hosted in Rust, use/build a Wasm interpreter hosted in C/C++ (e.g. WAMR)
I'm reading through the code, and I did notice that the call to build_code_memory is baked into the code path for compilation. As far as I can tell this is the source of the eventual VMFunctionBody pointers. I can see mem::transmute::<*const VMFunctionBody, unsafe extern "C" fn(*mut VMContext, *mut VMContext, ...)> calls in a number of places, and I agree that an interpreter cannot really accommodate that.
I understand that it may be best to refactor some of this to make things more obviously interpreter-friendly. But I'm entirely new to this codebase, so I'm currently trying to imagine the least impactful way one could write a proof-of-concept impl Compiler for AndrewsCraneliftInterpreter. Perhaps I will be better able to understand the pros/cons of the refactoring you suggest after understanding that.
With a proof of concept in mind, I think interpreters would be able to accommodate FnMut(*mut VMContext, *mut VMContext, ...) instead of the mem::transmute, right? Terribly shoehorned perhaps, but do you think there are any other pieces of code that a sufficiently creative interpreter implementation of trait Compiler cannot accommodate?
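As an illustration of the FnMut idea, a hypothetical interpreter-backed "compiler" could hand back boxed closures in place of machine-code pointers. None of the names below match wasmtime's actual Compiler trait or VMContext type; this is only a shape sketch of the proposal.

```rust
// Illustrative sketch: "compilation" that returns a callable closure
// which dispatches into an interpreter, instead of emitting machine code
// and transmuting its address. All types here are stand-ins.

/// Stand-in for wasmtime's per-instance context pointer type.
struct VmContext;

/// A callable function body; the caller just invokes it either way.
type BoxedBody = Box<dyn FnMut(*mut VmContext, *mut VmContext, &[u64]) -> u64>;

/// Hypothetical interpreter state for one module.
struct Interpreter;

impl Interpreter {
    /// Instead of producing a native fn pointer, capture the function
    /// index and route calls through the interpreter loop.
    fn compile_function(&self, func_index: u32) -> BoxedBody {
        Box::new(move |_caller_vmctx, _callee_vmctx, args| {
            // A real implementation would execute the function's IR here;
            // this stub just demonstrates the calling shape.
            func_index as u64 + args.iter().sum::<u64>()
        })
    }
}
```

The cost of this shape is an indirect call and argument marshalling on every invocation, which is one reason the discussion above leans toward refactoring the module representation rather than shoehorning closures behind the existing pointer-based API.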
Some background reading about making interpreters fast (beyond just using macro-ops):
"Efficient Interpretation using Quickening" by Brunthaler: https://publications.sba-research.org/publications/dls10.pdf
"Inline Caching Meets Quickening" by Brunthaler: https://publications.sba-research.org/publications/ecoop10.pdf
Although I suppose there isn't a ton of dynamic stuff in Wasm that would benefit from this kind of thing...
maybe just indirect calls
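For instance, a per-call-site inline cache for call_indirect might look like the following sketch (illustrative types only; it also assumes the table is not mutated between calls, since a real table.set would have to invalidate the cache):

```rust
// Sketch of the inline-caching idea from the Brunthaler papers applied
// to Wasm's call_indirect: the dynamic work is a table lookup plus a
// signature check, so a per-call-site cache can skip both when the same
// table slot is hit repeatedly. All types are hypothetical.

#[derive(Clone, Copy)]
struct TableEntry {
    sig_id: u32,
    func_index: u32,
}

/// Per-call-site cache embedded in the interpreter's internal bytecode.
struct CallIndirectSite {
    expected_sig: u32,
    /// Last (table_slot, entry) resolved at this site, if any.
    cache: Option<(u32, TableEntry)>,
}

impl CallIndirectSite {
    /// Resolve the callee: fast path on a cache hit, full table lookup
    /// plus signature check on a miss. `None` models a trap.
    fn resolve(&mut self, table: &[TableEntry], slot: u32) -> Option<u32> {
        if let Some((cached_slot, entry)) = self.cache {
            if cached_slot == slot {
                // Fast path: slot already validated at this site.
                return Some(entry.func_index);
            }
        }
        let entry = *table.get(slot as usize)?;
        if entry.sig_id != self.expected_sig {
            return None; // signature mismatch would trap
        }
        self.cache = Some((slot, entry));
        Some(entry.func_index)
    }
}
```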
Terribly shoehorned perhaps, but do you think there are any other pieces of code that a sufficiently creative interpreter implementation of trait Compiler cannot accommodate?
We talked about this in the meeting a bit: @fitzgen (he/him) mentioned that with an added void * parameter we could probably make things work, but my impression was @Alex Crichton thought it would be better to refactor it so that it could be done cleanly. (Accurate summary?)
Oh I wouldn't say I think there's a best route to go forward, I think whatever works is a good place to start from and we can iterate from there
@Môshe van der Sterre, the conversation became a discussion on what type of interpreter Wasmtime even needs and perhaps your use case can help clarify that: I re-read what you said above and it is my understanding that you want an interpreter for portability reasons, not as much for performance reasons. Is that accurate? Can you comment more on whether memory constraints are an issue for interpretation? How important is interpreter performance? I assume the Wasm modules touch system interfaces in some way; how do you plan to make that work in the browser?
@Andrew Brown: Yes, portability is pretty much my main concern.
The use-case I mentioned is about our data storage system. This system is distributed, and the data is typed. If a storage user wants to register a new datatype, they also provide a wasm module that has various exports. Besides validation, some of these exports are used by the storage nodes when communicating, for example if the datatype requires a logical clock. This is what currently exists. For added context, the user normally only writes a rust struct that compiles into that wasm module. We provide primitives they can add to the struct (such as logical clocks) and macros for generating those exports that the storage nodes use.
We now want to allow (web)applications to use this storage system in "work offline" mode:
For this to work, the client library essentially has to behave as if it is a storage node itself (while the device is offline). Calling exports from the registered wasm modules is part of the normal storage node behaviour, so this is what I'm aiming for. Obviously we want to keep the client library as portable as possible, but browsers in particular are a target platform for the client library.
(I don't see performance or memory constraints here.)
Joining the discussion late, but I also have a use case for the interpreter. In my case the goal is size, in order to be used within an embedded STM32F4 with 2MB flash and 192KB RAM. The idea is to use it as a WASM runtime for embedded nodes in a Kubernetes cluster, where https://github.com/deislabs/krustlet is installed on the devices.
Sounds like there is more interest in this interpreter topic: https://github.com/bytecodealliance/wasmtime/issues/156#issuecomment-703319935.
@Môshe van der Sterre, @Jiayi Hu: if you all still have interest in this topic the issue I just linked to (or another one like it) would be a good one to comment on and follow. I think there may be enough interest in this interpreter issue to start proposing ideas on GitHub.
@Andrew Brown, Yes, I'm still very much interested. In my current estimation, wasmtime interpreter support is the best and possibly easiest way forward for the use-case I mentioned. (Even if it means implementing that support in-house.) Other areas of our product also need my attention, so I need to discuss with my colleagues to determine the priorities for what I work on.
In practical terms, I'm not really sure what steps I need to take (with regard to commenting on that issue, proposing ideas on GitHub, or describing meaningful bounty requirements that my company could sponsor). I think the ticket you linked is about new backends much more broadly, and perhaps I should open an issue about an interpreter backend specifically. I also want to learn more about the wasmtime internals myself, perhaps by adding that void * parameter and exploring the problems I encounter.
Last updated: Nov 22 2024 at 17:03 UTC