Winch & target features · wasmtime

To me these don't necessarily have obvious answers. A knee-jerk reaction might be to support everything in the limit of time, but that I feel like is quite a lot of duplication with Cranelift which already handles all of this sort of target-feature-specific lowerings. I feel like "support everything" in the long term is probably only viable if more is shared with Cranelift rather than duplicating the lowering rules effectively.

The other alternative of "support exactly this one feature set" seems most reasonable to me from a maintainability point of view, but it has the obvious drawback of supporting less hardware or not going out of its way to support better lowerings (but I suppose Winch isn't "all about speed" anyway).

I suppose more-or-less these PRs got me thinking about all the lowering rules we have for various supported CPU features on x86 and if there's a way it can be supported in Winch without duplication. I don't think this is trivial do to at the moment, but was curious if others had thought about this too.

Add clz and ctz instructions to Winch by jeffcharles · Pull Request #6557 · bytecodealliance/wasmtime

Add support for clz and ctz instructions to Winch. Also adds passing CPU flags to filetests.

wip - add i32.popcnt and i64.popcnt to winch by itsrainy · Pull Request #6531 · bytecodealliance/wasmtime

A fast and secure runtime for WebAssembly. Contribute to bytecodealliance/wasmtime development by creating an account on GitHub.

Chris Fallin (Jun 12 2023 at 20:36):

One thought that occurs to me: the way that SpiderMonkey handles some of this duplication is to put some lowering sequences into the MacroAssembler, and then share that between tiers. So one could in theory do masm.tricky_simd(regA, regB, regTemp) in both baseline and optimizing backends. One way of adopting that idea could be to have MachInst variants that are pseudoinsts encapsulating some whole sequence. I'm not sure how much I like that but it is one direction that could work

Chris Fallin (Jun 12 2023 at 20:37):

Regarding concrete support-points in the configuration space: I do think it'd be nice to at least support base-SSE2 and SSE4.2; maybe those two are enough overall for a baseline tier

Chris Fallin (Jun 12 2023 at 20:38):

if we have to pick one, "just SSE4.2" seems OK; we already have the ability to run Wasmtime on SSE2 (or will shortly anyway) with Cranelift

Alex Crichton (Jun 12 2023 at 20:44):

At least from my experience porting all the simd stuff to SSE2 is probably not worth the effort to put in two locations. It's somewhat error-prone and was "questionably worth it" the first time around and almost certainly not worth it to do again.

I do like though that today temporary registers are internal details of the lowerings where the masm approach would have to burn a register to reserve it for temporary use or, as you mentioned, expose it as an API implementation detail that a temporary is needed. That being said I don't really see an alternative to share code here.

Saúl Cabrera (Jun 12 2023 at 20:45):

Those are good points Alex, thanks for bringing this up. The ideal state that I've had in my mind is one in which Winch has full parity with Cranelift, so in this case my expectations lean towards the "support everything" approach that you've described. I'm not coming at this from the speed angle necessarily, but mainly from the embedder's usability and developer experience. To me it seems that giving the embedder a guarantee that both compilers behave the same under the same set of circumstances would be the ideal one. On the other hand, I must admit that I haven't put a lot of thought on how to get there, it was not until I started exploring these instructions when I realized that my assumptions were wrong, since initially I expected that x86_64 backend's emit module would handle any fallback implementations. All that said, I'm curious to know if you or Chris have any ideas on how we could share some of these lowerings; even though I think that the duplication introduced by the two PRs above is not as heavy today, my expectation is that with the introduction of proposals like SIMD, a more manageable solution is desirable, so I'd be open to exploring any potential avenues.

I can also add an agenda item for next week's meeting so that we can discuss this in a bit more detail if everyone is ok with that.

wasmtime/emit.rs at main · bytecodealliance/wasmtime

A fast and secure runtime for WebAssembly. Contribute to bytecodealliance/wasmtime development by creating an account on GitHub.

Saúl Cabrera (Jun 12 2023 at 20:46):

Chris Fallin (Jun 12 2023 at 20:47):

Alex Crichton (Jun 12 2023 at 20:53):

nah that all definitely makes sense, so I guess I would phrase my thinking as we shouldn't duplicate the lowerings across both winch and cranelift while still trying to achieve full support in both

Alex Crichton (Jun 12 2023 at 20:54):

Pseudo-instructions are probably the way to go but I think we may want to explore better ways of defining and modelling them as right now you've got to edit a lot of files to add a pseudo-instruction, but if they were more ergonomic and/or easier to understand I think it'd make more sense to have more "macro instructions" like this

Stream: wasmtime

Topic: Winch & target features

Alex Crichton (Jun 12 2023 at 17:46):