Stream: general

Topic: wasmtime profiling + pprof-rs?


Xinyu Zeng (Sep 21 2024 at 13:00):

Is there any way to use the wasmtime profiler together with https://github.com/tikv/pprof-rs ? The problem is that I am using the Rust API, and calling perf directly cannot profile only some parts of the code. pprof-rs has that capability, but it seems the two cannot be used together.


bjorn3 (Sep 21 2024 at 13:09):

Perf allows JIT engines to write a file containing all the debuginfo for the code they generate. When processing the profile, perf reads this JIT-engine-generated file to resolve debuginfo for the JIT-generated code. It seems that pprof doesn't support reading this file.

bjorn3 (Sep 21 2024 at 13:14):

A bit of a hack, but maybe launching perf record -p $your_own_pid from within your program right before the point where you want to start recording, and killing it when you want to stop recording, would work? You probably need to add a sleep of a couple of ms after starting perf to account for the time it takes perf to initialize and start creating a profile.
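A minimal sketch of the hack described above, using only the Rust standard library. The helper names (perf_record_command, stop_perf), the perf.data output path, the 50 ms delay, and the placeholder workload are all assumptions for illustration, not anything Wasmtime or perf prescribes. It also assumes a Unix-like system with perf and kill on the PATH. It stops perf with SIGINT rather than Child::kill, since the latter sends SIGKILL and may leave perf no chance to finish writing its output.

```rust
use std::io;
use std::process::{Child, Command, Stdio};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: builds `perf record -p <pid> -o <output>`.
fn perf_record_command(pid: u32, output: &str) -> Command {
    let mut cmd = Command::new("perf");
    cmd.arg("record")
        .arg("-p")
        .arg(pid.to_string())
        .arg("-o")
        .arg(output);
    cmd
}

/// Hypothetical helper: asks perf to stop via SIGINT, then waits for it to exit.
fn stop_perf(mut child: Child) -> io::Result<()> {
    // Shelling out to `kill` avoids a libc dependency (Unix-only assumption).
    Command::new("kill")
        .arg("-INT")
        .arg(child.id().to_string())
        .status()?;
    child.wait()?;
    Ok(())
}

fn main() {
    // Attach perf to this very process.
    let mut cmd = perf_record_command(std::process::id(), "perf.data");
    cmd.stdout(Stdio::null()).stderr(Stdio::null());

    match cmd.spawn() {
        Ok(child) => {
            // Give perf time to initialize; the exact delay is a guess.
            thread::sleep(Duration::from_millis(50));

            // Placeholder for the wasm calls you actually want to profile.
            let sum: u64 = (0..10_000_000u64).sum();
            println!("workload result: {sum}");

            if let Err(e) = stop_perf(child) {
                eprintln!("could not stop perf cleanly: {e}");
            }
        }
        Err(e) => eprintln!("could not start perf: {e}"),
    }
}
```

Whether the samples line up with the region of interest depends on how quickly perf attaches, so the delay may need tuning on a given machine.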

bjorn3 (Sep 21 2024 at 13:17):

Also make sure to use either config.profiler(ProfilingStrategy::PerfMap) or config.profiler(ProfilingStrategy::JitDump) to generate the aforementioned debuginfo file. The PerfMap option produces smaller files but only includes function names. The JitDump option produces larger files but includes a lot more debuginfo (including line info, I believe), and it requires you to run perf inject --jit on the profile file to merge the jitdump file into it.
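For the JitDump route, the extra perf inject --jit step can be driven from Rust as well. The sketch below only builds and runs that command with the standard library. The helper name perf_inject_command and the file names perf.data and perf.jit.data are assumptions for illustration. It also assumes the profile was recorded while the Engine was configured with ProfilingStrategy::JitDump, so that the jitdump file exists.

```rust
use std::process::Command;

/// Hypothetical helper: builds `perf inject --jit -i <input> -o <output>`.
fn perf_inject_command(input: &str, output: &str) -> Command {
    let mut cmd = Command::new("perf");
    cmd.arg("inject")
        .arg("--jit")
        .arg("-i")
        .arg(input)
        .arg("-o")
        .arg(output);
    cmd
}

fn main() {
    // Merge the jitdump data into the recorded profile (file names are assumed).
    let mut cmd = perf_inject_command("perf.data", "perf.jit.data");
    match cmd.status() {
        Ok(status) if status.success() => println!("wrote perf.jit.data"),
        Ok(status) => eprintln!("perf inject exited with {status}"),
        Err(e) => eprintln!("could not run perf: {e}"),
    }
}
```

The merged file can then be inspected with perf report -i perf.jit.data. If the JIT symbols still do not resolve, it may be worth checking whether the recording needs perf record -k mono, which is what Wasmtime's own profiling documentation uses for jitdump.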

Xinyu Zeng (Sep 21 2024 at 13:56):

Thanks a lot! I grouped the wasm function calls I want to profile inside a single function, so I can view them in the flamegraph under that function's stack frame. A similar hack :)
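One way the grouping trick could look, as a sketch only: a marker function that is never inlined, so everything called through it shows up under a single recognizable frame in the flamegraph. The name profiled_wasm_region is hypothetical, and the closure stands in for the real wasm calls.

```rust
use std::hint::black_box;

/// Hypothetical marker function: run the wasm calls of interest through it
/// so they appear under one named frame in the profile.
#[inline(never)]
fn profiled_wasm_region<T>(work: impl FnOnce() -> T) -> T {
    let result = work();
    // Passing the result through black_box is meant to discourage the compiler
    // from turning the call above into a tail call, which would drop this frame.
    black_box(result)
}

fn main() {
    // Placeholder workload; in practice this closure would call into wasm.
    let total = profiled_wasm_region(|| (1..=10u64).sum::<u64>());
    println!("{total}");
}
```

Filtering the flamegraph on the marker function's name then isolates the wasm work from the rest of the program.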

Alex Crichton (Sep 21 2024 at 15:34):

I'm not familiar with pprof-rs itself, but it looks like integrating with it directly would require walking the stack from a signal handler. In theory the DWARF that Wasmtime emits for Cranelift-generated code supports this, because we use frame pointers for all functions and otherwise only have to describe the prologue (which I think we do correctly). In that sense it should work in theory (so long as backtrace works from a signal handler, which I thought it didn't...).

By default, though, if you're just using DWARF unwind tables for JIT code, you probably won't have good symbols for Cranelift-generated code. So even if profile collection works, you'd likely still need a postprocessing step to symbolicate what was found in Cranelift code. That's something we don't provide many APIs for today (e.g. the Module type can't be used for that), but there's no reason we couldn't add more support for it, since Wasmtime already does it internally.


Last updated: Nov 22 2024 at 17:03 UTC