wasmtime / issue #6282 wasmtime: In-process sampling prof... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #6282 wasmtime: In-process sampling prof...

Wasmtime GitHub notifications bot (Apr 26 2023 at 00:32):

This PR now produces output that works as expected with https://profiler.firefox.com/, and I don't think it's fundamentally missing anything.

Before merging, I'd like to move the GuestProfiler struct somewhere in the wasmtime crate, instead of leaving it in wasmtime-cli where only wasmtime run can use it.

I don't think any of the implementation will change though, just move elsewhere, so I'm marking this as ready for review.

Wasmtime GitHub notifications bot (Apr 26 2023 at 21:44):

github-actions[bot] commented on issue #6282:

Subscribe to Label Action

cc @kubkon, @peterhuene

<details>
This issue or pull request has been labeled: "wasi", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

kubkon: wasi

peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (Apr 29 2023 at 01:22):

jameysharp commented on issue #6282:

I got rid of the need for Store in sample(). Then I discovered that I can't put a GuestProfiler directly in the Host structure because some of the WASI extensions require that Host must be cloneable.

As far as I can tell I end up needing an Arc and Mutex anyway. I would love for somebody to tell me how to avoid that.

It was still worth doing this because now any particular PC is only looked up once per profile and cached by fxprof_processed_profile. Different PCs within the same function need to be looked up separately but since we only collect stack traces at epoch interruption points there aren't so many distinct PCs. This should significantly reduce profiling overhead but I haven't measured how much impact it actually has.

I struggled with this a bunch because I was trying to make it automatically use all modules which have been registered within the current Store's ModuleRegistry. But for WASI commands the module isn't registered in the store until you actually try to invoke it, and I tried all sorts of ways to hook into that registration before realizing that the caller knows which modules they care about and I can just make it their problem to tell me.

Then I decided that it's a feature to be able to exclude modules from profiles: if you have some proprietary module that you're linking with your customer's modules, you may not want to reveal anything about the implementation of your module when providing them with a profile.

Wasmtime GitHub notifications bot (May 01 2023 at 15:26):

alexcrichton commented on issue #6282:

As far as I can tell I end up needing an Arc and Mutex anyway.

Ah this is due to the integration of wasi-threads where that's modeled by cloning the store's context to duplicate it on another thread. Otherwise though the cloning is not required. You can take a similar route as the WasiNnCtx integration, for example, which additionally does not implement Clone so what it does is it's bundled into an Arc where whenever it's required it calls Arc::get_mut. That fails if there's more than one clone, or more than one thread, and the error message says "this doesn't support threads yet". I think that profiling will fall into the same bucket?

In the long-term future I could imagine that there's a standalone profiler struct where you acquire thread-specific profilers from that structure. The Host could store the standalone profiler (which internally probably has an arc and a mutex) and additionally thread-specific profilers. The Clone for Host impl would then use the standalone profiler to get a new thread-specific profiler, allowing each thread to have mutable access to its profiler but no others. Something like that. No need to do here of course, mostly just modeling how this may look in the future with support for threads.

I can just make it their problem to tell me.

This seems like a neat idea to me! I like the idea of precomputing symbol tables. I do think this has the downside of that it has more setup required, but that's also perhaps not a bad thing and it's not exactly onerous at this time, so it seems fine by me.

I got rid of the need for Store in sample().

Personally I'm a bit wary of this because to me I wouldn't want to commit long-term to being able to capture a stack trace without any input. I'm not sure how we'll ever move away from those thread locals but I sure would love to one day if someone gets a good idea. In such a situation there's no way to start the backtrace without a contextual store being passed in. All that being said though it's not like we need to get the API 100% prefect to begin with, that's why we have major version bumps after all. I mostly wanted to put this out there, but I additionally think it's fine to land as-is with your current strategy. If and when we remove TLS (fingers crossed) we can figure out how to change this API at this point. I realize that passing the store into something stored within the store is not easy, so let's stick with what you've got.

Wasmtime GitHub notifications bot (May 02 2023 at 00:41):

jameysharp commented on issue #6282:

Thanks for the tips, Alex: that worked!

Okay, I've addressed most of the issues that you all have raised. What I haven't done is added documentation to the Wasmtime book, or API usage examples to the doc-comments. I'd kind of like to merge the implementation and then come back to that, if that's okay. I hope the doc-comments I've written so far will be a good start for the book, and the usage in run.rs will be a good start for the usage examples…

There are also still several to-do items, like measuring CPU time and noting wasm entry/leave events, which I also want to defer to future work.

Last updated: Apr 17 2025 at 08:04 UTC