Stream: wasmtime

Topic: jitdump support with multiple modules?


Benjamin Bouvier (Mar 13 2023 at 18:12):

Hi! I've tried the jitdump support in our embedding today, and I can say that something works \o/ I can profile with perf record -k 1, then see some guest symbols show up in hotspot caller/callee trees :partying_face:

However, I'm seeing very little, and most time is still spent under unknown symbols. Our embedding will load multiple wasm modules at the same time, and it seems that only one wasm module's symbols will be injected into the jitdump, as most of the information from other modules isn't available at all.

Is that expected/known? (I will investigate a bit more tomorrow, if no one knows!)

Alex Crichton (Mar 13 2023 at 18:38):

I know at least personally I've never exercised multiple modules, but otherwise with perf I rarely see unknown symbols

Alex Crichton (Mar 13 2023 at 18:39):

so my guess is you're running into a bug with multiple modules somehow (not that I know how though)

Benjamin Bouvier (Mar 14 2023 at 10:27):

I've found the reason: we're building one Engine per wasm module, and each jitdump profiler instance will recreate (thus clobber) the jitdump file with the same id every time. I'll look into reusing the same Engine in our codebase, but I wonder if we shouldn't still support having multiple jitdumps, or reuse the same jitdump profiler instance across all Engine instances.

Alex Crichton (Mar 14 2023 at 13:43):

Oh good point! I'd lean towards unique files for each engine if that works but one global file also seems reasonable

Benjamin Bouvier (Mar 14 2023 at 17:09):

Ok, I've tried suffixing the jitdump file name with a globally incremented atomic counter... then perf doesn't pick it up anymore. I've tried looking into why, and whether it was related to the mmapping done so that perf detects those files, but with no success. So I've used a single jitdump file for the entire process, and it seems to work now!
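The "single jitdump file for the entire process" approach can be sketched roughly as below. This is only an illustration, not wasmtime's actual profiling agent: the `JITDUMP` static, `jitdump_path`, and `append_to_shared_jitdump` are hypothetical names, and the `jit-<pid>.dump` file name is, as far as I know, the pattern perf looks for. A real jitdump also needs its header, its typed records, and the mmap of the file that lets perf notice it; all of that is left out here.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::process;
use std::sync::Mutex;

/// Hypothetical process-wide handle shared by every engine in the process.
static JITDUMP: Mutex<Option<File>> = Mutex::new(None);

/// The file name depends only on the pid (no per-engine counter suffix),
/// so every caller in this process agrees on it.
fn jitdump_path(dir: &Path, pid: u32) -> PathBuf {
    dir.join(format!("jit-{pid}.dump"))
}

/// Appends raw bytes to the shared file. The file is created (and truncated)
/// only on the first call; later calls reuse the same handle, so one engine
/// can no longer clobber what another engine already wrote.
fn append_to_shared_jitdump(dir: &Path, bytes: &[u8]) -> io::Result<()> {
    let mut guard = JITDUMP.lock().unwrap();
    if guard.is_none() {
        let file = OpenOptions::new()
            .write(true)
            .create(true)
            .truncate(true)
            .open(jitdump_path(dir, process::id()))?;
        *guard = Some(file);
    }
    guard.as_mut().expect("initialized above").write_all(bytes)
}
```

Holding the lock for the whole write keeps records from different engines from interleaving, at the cost of serializing all writers on one mutex.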

Alex Crichton (Mar 14 2023 at 17:23):

Ah, it looks like perf requires there to be only one file per process: for any given process it'll only recognize one filename

Benjamin Bouvier (Mar 14 2023 at 22:41):

Ok, after running for 5 hours and not completing (at 15% utilization of a single core :pensive:) the perf inject step is definitely taking too long, unfortunately. Either need to rewrite it in Rust and make it massively parallel, or I shall try vtune instead.

Benjamin Bouvier (Mar 14 2023 at 22:43):

It generated 179153 .so files lol

Alex Crichton (Mar 14 2023 at 22:44):

Yeah it generates a *.so per-jit-function, not exactly efficient...

Benjamin Bouvier (Mar 15 2023 at 09:26):

Wasn't there another dumb way to map code regions to symbols? I do recall something like this for Spidermonkey, where we'd generate a very simplistic file. At the scale of our wasm module, having a dumb mode like this would be pretty sweet :thinking:

Alex Crichton (Mar 15 2023 at 14:02):

I'm not aware of one myself, but if there's an easier option than perf I think that'd be great to have implemented!

Benjamin Bouvier (Mar 15 2023 at 15:20):

Oh I was talking about the simpler perf support that just assigns symbol names to code regions, with little granularity. Basically a file that contains lines that are {function start address} {code size} {symbol name}. That was used in the past in Spidermonkey (now SM also uses jitdump!), and I recall it was pretty effective, especially when there are many functions, so I'm tempted to try to implement this, as an additional profiling agent impl.
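For reference, the simple format described above is perf's map file: perf reads `/tmp/perf-<pid>.map`, where each line is a start address and a size in hexadecimal (no `0x` prefix) followed by the symbol name. A minimal sketch of a writer, assuming a hypothetical `JitFunction` record and helper names that are not part of wasmtime or Cranelift:

```rust
use std::io::{self, Write};
use std::path::PathBuf;

/// Hypothetical description of one compiled function.
struct JitFunction<'a> {
    start: u64,
    size: u64,
    name: &'a str,
}

/// Where perf looks for the map of the process with the given pid.
fn perf_map_path(pid: u32) -> PathBuf {
    PathBuf::from(format!("/tmp/perf-{pid}.map"))
}

/// One line of the map: `{start} {size} {name}`, both numbers in hex.
fn perf_map_line(f: &JitFunction) -> String {
    format!("{:x} {:x} {}\n", f.start, f.size, f.name)
}

/// Writes one line per function to any writer (a `File` in practice).
fn write_perf_map<W: Write>(out: &mut W, funcs: &[JitFunction]) -> io::Result<()> {
    for f in funcs {
        out.write_all(perf_map_line(f).as_bytes())?;
    }
    Ok(())
}
```

In an embedding, `out` would be a file opened at `perf_map_path(std::process::id())`. The sketch does not sanitize symbol names, so a name containing a newline would corrupt the file.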

bjorn3 (Mar 15 2023 at 15:58):

Cranelift-jit already supports it. This format doesn't handle reusing memory locations between functions though, so you'd have to leak all modules.

Benjamin Bouvier (Mar 15 2023 at 16:06):

Interesting, thanks. Turns out wasmtime-jit doesn't use cranelift-jit, but the format is simple enough that it might be fun trying it out separately.


Last updated: Nov 22 2024 at 16:03 UTC