My current model for GUI applications:
The GUI world extends the CLI world, so the program has a main function that is run first, can access arguments, etc., but after the main function returns, the instance stays alive and an exported event handler function gets called when events on windows occur (The first windows have to be created in the main function, more may be created in the event handler). I don't know if that's necessarily the best model, some languages may not like being called after the main function returns and could not work properly. My initial example in Rust works though, and that event model also works with the browser event loop. I'd like to transition that to an event stream model for preview 3, that way everything can stay in an (async) main function.
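A minimal sketch of that model, for illustration only: the host runs `main` first, keeps the instance alive, and then calls an exported handler for each window event. All names here (`Event`, `App`, `handle_event`, `run_host`) are hypothetical and not taken from the proposal, and the host is simulated in plain Rust rather than through a real WASI runtime.

```rust
// Hypothetical sketch: none of these names come from the proposal itself.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Resized { window: u32, width: u32, height: u32 },
    CloseRequested { window: u32 },
}

/// State that must outlive `main`, because the instance stays alive afterwards.
#[derive(Default)]
struct App {
    open_windows: Vec<u32>,
    log: Vec<String>,
}

impl App {
    /// Plays the role of the CLI-style `main`: runs first and creates the first window.
    fn main(&mut self) {
        self.open_windows.push(1);
    }

    /// Plays the role of the exported event handler called after `main` returns.
    fn handle_event(&mut self, event: Event) {
        match event {
            Event::Resized { window, width, height } => {
                self.log.push(format!("window {window} resized to {width}x{height}"));
            }
            Event::CloseRequested { window } => {
                self.open_windows.retain(|w| *w != window);
            }
        }
    }
}

/// Stand-in for the host, which owns the event loop and drives the guest.
fn run_host(events: Vec<Event>) -> App {
    let mut app = App::default();
    app.main();
    for event in events {
        app.handle_event(event);
    }
    app
}

fn main() {
    let app = run_host(vec![
        Event::Resized { window: 1, width: 800, height: 600 },
        Event::CloseRequested { window: 1 },
    ]);
    println!("log: {:?}, open windows: {:?}", app.log, app.open_windows);
}
```

The point of the sketch is that all state the handler needs has to live outside the `main` call, which is what some languages may struggle with.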
Windows system capabilities are represented via bitflags that can be queried at runtime. Applications should not use unavailable functions, but actions should just be a no-op in that case and return values are invalid values or default values for the types, e.g. (0, 0) for position.
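As a rough illustration of the capability idea, assuming hypothetical flag names (`POSITION`, `RESIZE`) and a hand-rolled flags type instead of the component model `flags` type: unsupported getters return a default value and unsupported setters silently do nothing.

```rust
// Hypothetical capability flags; the names and bit values are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Capabilities(u32);

impl Capabilities {
    const POSITION: Capabilities = Capabilities(1 << 0);
    const RESIZE: Capabilities = Capabilities(1 << 1);

    /// True if every bit set in `other` is also set in `self`.
    fn contains(self, other: Capabilities) -> bool {
        self.0 & other.0 == other.0
    }
}

impl std::ops::BitOr for Capabilities {
    type Output = Capabilities;
    fn bitor(self, rhs: Capabilities) -> Capabilities {
        Capabilities(self.0 | rhs.0)
    }
}

struct Window {
    caps: Capabilities,
    position: (i32, i32),
    size: (u32, u32),
}

impl Window {
    /// Returns the default (0, 0) when the window system cannot report positions.
    fn position(&self) -> (i32, i32) {
        if self.caps.contains(Capabilities::POSITION) {
            self.position
        } else {
            (0, 0)
        }
    }

    /// A no-op when resizing is unsupported.
    fn set_size(&mut self, size: (u32, u32)) {
        if self.caps.contains(Capabilities::RESIZE) {
            self.size = size;
        }
    }
}

fn main() {
    let mut full = Window {
        caps: Capabilities::POSITION | Capabilities::RESIZE,
        position: (10, 20),
        size: (640, 480),
    };
    let mut limited = Window { caps: Capabilities(0), position: (10, 20), size: (640, 480) };

    full.set_size((800, 600));
    limited.set_size((800, 600));

    println!("full: {:?} {:?}", full.position(), full.size);
    println!("limited: {:?} {:?}", limited.position(), limited.size);
}
```

An application that cares about the difference would check `caps.contains(...)` first instead of relying on the default value.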
I still have to figure out how to best represent key events. Are there keys with no corresponding Unicode value (e.g. media keys)? If not, sending the code point should suffice. Otherwise these keys will be handled with their own enum type for the keys.
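One possible shape for the second option, sketched under the assumption that some keys (media keys, function keys, arrows) carry no code point. The `Key` and `NamedKey` types and their variants are hypothetical, not part of any existing interface.

```rust
// Hypothetical key representation: a code point where one exists, a named key otherwise.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NamedKey {
    MediaPlayPause,
    ArrowLeft,
    F1,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    /// A key that produces a Unicode scalar value.
    Character(char),
    /// A key with no corresponding Unicode value.
    Named(NamedKey),
}

impl Key {
    /// The code point for character keys, `None` for named keys.
    fn code_point(self) -> Option<u32> {
        match self {
            Key::Character(c) => Some(c as u32),
            Key::Named(_) => None,
        }
    }
}

fn main() {
    let keys = [
        Key::Character('a'),
        Key::Named(NamedKey::MediaPlayPause),
        Key::Named(NamedKey::ArrowLeft),
        Key::Named(NamedKey::F1),
    ];
    for key in keys {
        println!("{key:?} -> {:?}", key.code_point());
    }
}
```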
The underlying type for the window resource is currently an Arc<winit::Window>.
Also high-dpi scaling is a question, currently I'd just expose the scaling factor from winit to the applications.
@Mendy Berger Thoughts on this?
@Tarek Sander would love to discuss over video. Mind shooting me a DM?
Before I begin, have you had a chance to look at https://github.com/MendyBerger/wasi-webgpu/blob/main/wit/mini-canvas.wit? Seems to cover similar things as what you describe in MVP here. Wonder what you think of our different approaches and how we can combine efforts.
The GUI world extends the CLI...
What's the advantage of having the main function return? Why not have the main function listen for events?
Also, I'm not sure how I feel about having a single event handler for all event types. Don't wanna force all apps to wake up for every possible event type, when it only really cares about a few. (e.g. mouse-move will fire very often, but I don't expect most apps to care about it)
Windows system capabilities are represented via...
I think this can work, no strong opinions here.
I think that feature detection is needed in other places in wasi as well. Maybe make sure that everyone is taking the same approach?
I still have to figure out how to best represent key events...
Can you do something similar to what the web does with KeyboardEvents? I don't know very much about this, but I'm assuming that the web is the best abstraction over these kinds of things to draw from? Really not sure though.
The underlying type for the window resource is currently an Arc<winit::Window>.
Great choice! Just keep in mind that other implementations will wanna use other ways to represent windows. So make sure you don't tie the spec too much to a single library.
Finally, would you be interested in joining the webgpu effort? Don't let the name fool you, we're doing much more than webgpu.
We're meeting every Tuesday 5:30 UTC. I can shoot you an invite if you wanna join, even if just to listen in.
The GUI world extends the CLI...
What's the advantage of having the main function return? Why not have the main function listen for events?
Also, I'm not sure how I feel about having a single event handler for all event types. Don't wanna force all apps to wake up for every possible event type, when it only really cares about a few. (e.g. mouse-move will fire very often, but I don't expect most apps to care about it)
The Browser has its own event loop, so to support browsers before the component model gets async support, the event loop needs to lie in the host.
I still have to figure out how to best represent key events...
Can you do something similar to what the web does with KeyboardEvents? I don't know very much about this, but I'm assuming that the web is the best abstraction over these kinds of things to draw from? Really not sure though.
You're probably right, the Web already needs to work on all platforms.
The underlying type for the window resource is currently an Arc<winit::Window>.
Great choice! Just keep in mind that other implementations will wanna use other ways to represent windows. So make sure you don't tie the spec too much to a single library.
There's also raw_window_handle, but all its types are non-owning. The only important thing is that the GPU proposal and the Windowing proposal need to agree on one window type for the host implementation. What type that is is for the individual implementations to decide.
Finally, would you be interested in joining the webgpu effort? Don't let the name fool you, we're doing much more than webgpu.
Yeah, I'm interested in WebGPU anyways, I'm primarily using it with the wgpu crate in Rust. WebGPU + Rust finally fills the hole I had for some of my personal game projects: Low-level language with high performance and cross-platform Graphics and compute API.
Windows system capabilities are represented via...
I think this can work, no strong opinions here.
I think that feature detection is needed in other places in wasi as well. Maybe make sure that everyone is taking the same approach?
Is there already an existing convention for feature detection in WASI? The component model flags seem perfect for it.
Also, I'm not sure how I feel about having a single event handler for all event types. Don't wanna force all apps to wake up for every possible event type, when it only really cares about a few. (e.g. mouse-move will fire very often, but I don't expect most apps to care about it)
This is for API simplicity for now. Ideally once the component model gets streams, you'd be able to get event streams for all event types and subscribe to them. So the API will need a major overhaul for the final version of the component model.
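A rough sketch of what per-event-type subscription could look like, using `std::sync::mpsc` channels as a stand-in for component model streams (which are not available yet). `EventHub`, `EventKind`, and the event variants are hypothetical names; the point is only that an app which never subscribes to mouse moves is never woken for them.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical event types, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum EventKind {
    MouseMove,
    KeyDown,
}

#[derive(Debug, Clone, PartialEq)]
enum Event {
    MouseMove { x: i32, y: i32 },
    KeyDown { code_point: u32 },
}

impl Event {
    fn kind(&self) -> EventKind {
        match self {
            Event::MouseMove { .. } => EventKind::MouseMove,
            Event::KeyDown { .. } => EventKind::KeyDown,
        }
    }
}

/// Stand-in for the host side: one list of subscribers per event kind.
#[derive(Default)]
struct EventHub {
    subscribers: HashMap<EventKind, Vec<Sender<Event>>>,
}

impl EventHub {
    /// Returns a receiver that only ever sees events of the requested kind.
    fn subscribe(&mut self, kind: EventKind) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.entry(kind).or_default().push(tx);
        rx
    }

    /// Delivers the event to its subscribers and returns how many were notified.
    fn dispatch(&self, event: Event) -> usize {
        self.subscribers.get(&event.kind()).map_or(0, |subs| {
            subs.iter().filter(|tx| tx.send(event.clone()).is_ok()).count()
        })
    }
}

fn main() {
    let mut hub = EventHub::default();
    let keys = hub.subscribe(EventKind::KeyDown);

    // Nobody subscribed to mouse moves, so this wakes no one.
    hub.dispatch(Event::MouseMove { x: 1, y: 2 });
    hub.dispatch(Event::KeyDown { code_point: 97 });

    for event in keys.try_iter() {
        println!("{event:?}");
    }
}
```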
Mendy Berger said:
We're meeting every Tuesday 5:30 UTC. I can shoot you an invite if you wanna join, even if just to listen in.
Just noticed I made a mistake. We meet every Tuesday 5:00 UTC
If you didn't come along with this proposal, I probably would have suggested something like this around now lol. I'm working on (currently: have an idea for) a game engine that works by using WebAssembly modules as the user code and mods, supporting a safe mod system through the WASM guarantees and multiple languages through compilation to WASM. Because I need graphics for that, I was thinking about writing a wgpu API to pass through all safe WebGPU functions.
Tarek Sander said:
The GUI world extends the CLI...
What's the advantage of having the main function return? Why not have the main function listen for events?
Also, I'm not sure how I feel about having a single event handler for all event types. Don't wanna force all apps to wake up for every possible event type, when it only really cares about a few. (e.g. mouse-move will fire very often, but I don't expect most apps to care about it)
The Browser has its own event loop, so to support browsers before the component model gets async support, the event loop needs to lie in the host.
Good point.
But since this is going to be a solved problem before 1.0, can you maybe build it in a way that has everything in main? And maybe have some temporary method to deal specifically with web? Something like temp-event-handler that will be removed before 1.0? I don't want the whole spec to be fundamentally changed just because the current beta won't work on browsers.
Tarek Sander said:
Windows system capabilities are represented via...
I think this can work, no strong opinions here.
I think that feature detection is needed in other places in wasi as well. Maybe make sure that everyone is taking the same approach?
Is there already an existing convention for feature detection in WASI? The component model flags seem perfect for it.
Dunno. But I know that there was discussion about it.
Tarek Sander said:
The GUI world extends the CLI...
What's the advantage of having the main function return? Why not have the main function listen for events?
Also, I'm not sure how I feel about having a single event handler for all event types. Don't wanna force all apps to wake up for every possible event type, when it only really cares about a few. (e.g. mouse-move will fire very often, but I don't expect most apps to care about it)
The Browser has its own event loop, so to support browsers before the component model gets async support, the event loop needs to lie in the host.
Good point.
But since this is going to be a solved problem before 1.0, can you maybe build it in a way that has everything in main? And maybe have some temporary method to deal specifically with web? Something like temp-event-handler that will be removed before 1.0? I don't want the whole spec to be fundamentally changed just because the current beta won't work on browsers.
The component model types I'd use aren't finished though. Can I just use WASI IO pollable? Also I'm not sure if/when browsers will support async WASM. The events will stay the same, the window interaction, too, only the way the events are delivered would change in the final version with the async component model.
I'm actually using pollables in my mini-canvas example. Have you had a chance to look at it?
Yeah. Does it work well with pollables?
Yes
I looked a bit at the component model issues, and for now it seems my approach of a mandatory exported function is the standard approach for callbacks. I think proper browser support has to wait for the async component model if you want to stick everything into main, and for browser support for the component model. But given that the component model is the current pathway to supporting WASM as ES modules, browsers should jump on it pretty quickly once it's finished.
So what do you think? Should my Windowing proposal be merged with this one, or should it stay separate? Since I plan to make almost everything optional through capabilities, including it shouldn't have issues for headless compute applications.
@Mendy Berger ?
Tarek Sander said:
So what do you think? Should my Windowing proposal be merged with this one, or should it stay separate? Since I plan to make almost everything optional through capabilities, including it shouldn't have issues for headless compute applications.
Yes. I would love to join forces!
The windowing proposal would probably replace the mini-canvas in wasi-webgpu since both do the same thing
Have you had a chance to look at the graphics-context in our repo? It's how we connect the gpu or frame-buffer to the canvas/window.
graphics-context-buffer would be a TextureView to draw into? mini-canvas itself would then be a Canvas element's GPUCanvasContext? graphics-context seems weird, because it's sort of an intermediary, you create it independently of a WebGPU device or Canvas and just need to connect both and it magically works?
Why is there a need for graphics-context anyways? Can't you just draw to the Texture of a Canvas/window, or do you want to support the 2d graphics context in the browser, too?
I asked a bit about the Component Model and it's expected for jco to polyfill async support, too, so keeping everything inside main is a good option.
I just saw there's also the framebuffer you can draw into using the CPU. In that case, wouldn't you just create a frame-buffer from a method in mini-canvas? So mini-canvas can either have an active frame-buffer or an active WebGPU Texture lent out (so they don't get in each other's way during presentation).
Graphics context exists to keep graphics api and presentation layer separate. For now we're only considering two graphics apis (webgpu and frame-buffer) and one windowing api, but that's likely to change over time. New graphics apis show up every few years, and new display types also show up every few years.
On the graphics side:
On the presentations side:
So graphics-context exists to decouple the api from the presentation, so that one can evolve without the other.
Makes sense?
Ok, but many abstractions wouldn't hold true for other display types anyways. E.g. in VR, you need essentially 2 buffers for the screens of both eyes. frame-buffer seems only concerned with one buffer at the moment, and if you want proper 3D, don't you need to vary the camera position for each eye? WebGPU can't do that on its own, so doing that also needs special support. I think a common "Surface" primitive for a presentable buffer that a graphics API can use would be enough. The surface configuration should be immutable, and if the application wants to change it, it needs to go through the display API, e.g. applications in car infotainment systems may only support one specific resolution and provide no way of changing it. One other display API could be the raw presented framebuffer, as is common on embedded systems. The VR display would have 2 surfaces in that case, both with the same size and type, while a desktop window or a Canvas element has a single surface.
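The "Surface" idea could be sketched roughly like this. Everything here is hypothetical (`Surface`, `SurfaceConfig`, `DisplayApi`, the two display types, and the resolutions), and treating the VR headset as fixed-resolution is an assumption made only to show a display that refuses reconfiguration.

```rust
// Hypothetical types; nothing here is taken from wasi-webgpu or the windowing proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SurfaceConfig {
    width: u32,
    height: u32,
}

/// A presentable buffer. The configuration is read-only from the graphics API's side.
#[derive(Debug)]
struct Surface {
    config: SurfaceConfig,
}

impl Surface {
    fn config(&self) -> SurfaceConfig {
        self.config
    }
}

/// The display API owns the surfaces and is the only way to reconfigure them.
trait DisplayApi {
    fn surfaces(&self) -> &[Surface];
    fn request_resize(&mut self, width: u32, height: u32) -> Result<(), String>;
}

/// A desktop window or Canvas element: a single, resizable surface.
struct DesktopWindow {
    surface: [Surface; 1],
}

impl DisplayApi for DesktopWindow {
    fn surfaces(&self) -> &[Surface] {
        &self.surface
    }
    fn request_resize(&mut self, width: u32, height: u32) -> Result<(), String> {
        self.surface = [Surface { config: SurfaceConfig { width, height } }];
        Ok(())
    }
}

/// A VR display: one surface per eye, assumed here to have a fixed resolution.
struct FixedVrHeadset {
    eyes: [Surface; 2],
}

impl DisplayApi for FixedVrHeadset {
    fn surfaces(&self) -> &[Surface] {
        &self.eyes
    }
    fn request_resize(&mut self, _width: u32, _height: u32) -> Result<(), String> {
        Err("resolution is fixed".to_string())
    }
}

fn main() {
    let mut window = DesktopWindow {
        surface: [Surface { config: SurfaceConfig { width: 640, height: 480 } }],
    };
    let eye = SurfaceConfig { width: 1832, height: 1920 };
    let mut vr = FixedVrHeadset { eyes: [Surface { config: eye }, Surface { config: eye }] };

    println!("window resize: {:?}", window.request_resize(800, 600));
    println!("window surfaces: {:?}", window.surfaces());
    println!("vr resize: {:?}", vr.request_resize(800, 600));
    println!("vr surface count: {}", vr.surfaces().len());
}
```

A graphics API written against `DisplayApi::surfaces` would not need to know whether it is drawing for one window or two eyes.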
Mendy Berger said:
The Browser has its own event loop, so to support browsers before the component model gets async support, the event loop needs to lie in the host.
Just thinking out loud: can we require for now that it's run in a service worker, where we can just block?
Yes, that would be an option. Some runtimes like the WASI implementation in VSCode that are purely in JS have to go that route because blocking APIs are impossible otherwise in JS.
Last updated: Jan 24 2025 at 00:11 UTC