lethalbit opened issue #11862:
Feature
As the title says, the ability to optionally append a prefix (or suffix, it doesn't really matter which) to the output `JitDump` or `PerfMap` files when profiling is enabled.
Benefit
The main benefit would be to allow wasmtime to live in a process that hosts a different JIT engine that also emits perf map or jitdump files.
While this is a niche use case, I've run into needing it myself on a few occasions, primarily when doing Python <-> Rust <-> WASM interop, as Python also emits perf map (or jitdump) files.
Granted, this would need post-processing, but that would need to happen regardless, as `perf` on Linux makes the (likely sane) assumption that only one JIT runtime will be in any given process.
Implementation
Implementation-wise, it wouldn't be too complex: the `jitdump` and `perfmap` profiling agents could take an `Option<T>` for the file prefix/suffix, and then just update the file name with that. The `ProfilingStrategy` enum would either need an API-breaking change to add said `Option<T>` to the `PerfMap`/`JitDump` elements or have 2 new "prefixed" variants added that take just the `T`.
In the case of the `PerfMapAgent`, we would also likely want to prefix `wasm::` to the sanitized name entries to properly namespace them. I've not looked too much into the `JitDumpAgent`, but it would likely need a similar change.
Alternatives
The primary alternative would be to not use this, and have each hosted JIT use a different method, one perf map and the other JIT dump. However, the `perf` tools will only really use the input from one or the other because, as mentioned previously, `perf` assumes only one JIT per process.
This also applies to doing the JIT dump processing and then trying to apply a perf map file to it after.
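As a rough illustration of the Implementation idea above, here is a minimal sketch of the "two new prefixed variants" option. Everything named here (`StrategySketch`, `output_path`, `perf_map_entry`, and the suffixed file names) is hypothetical and is not wasmtime's actual API; only the base names `/tmp/perf-<pid>.map` and `jit-<pid>.dump` reflect what `perf` looks for.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a profiling strategy enum; NOT wasmtime's real API.
#[derive(Debug, Clone, PartialEq)]
enum StrategySketch {
    PerfMap,
    JitDump,
    /// Hypothetical variants carrying an extra file name component.
    PerfMapSuffixed(String),
    JitDumpSuffixed(String),
}

/// Build the output file name for a given process id.
/// The suffixed names are an assumption made for illustration only.
fn output_path(strategy: &StrategySketch, pid: u32) -> PathBuf {
    match strategy {
        StrategySketch::PerfMap => PathBuf::from(format!("/tmp/perf-{pid}.map")),
        StrategySketch::JitDump => PathBuf::from(format!("jit-{pid}.dump")),
        StrategySketch::PerfMapSuffixed(s) => {
            PathBuf::from(format!("/tmp/perf-{pid}.map.{s}"))
        }
        StrategySketch::JitDumpSuffixed(s) => {
            PathBuf::from(format!("jit-{pid}.dump.{s}"))
        }
    }
}

/// Format one perf map entry (hex start, hex size, symbol name),
/// namespacing the symbol with `wasm::` as suggested in the issue.
fn perf_map_entry(addr: u64, len: u64, name: &str) -> String {
    format!("{addr:x} {len:x} wasm::{name}\n")
}
```

With this sketch, `output_path(&StrategySketch::PerfMapSuffixed("wasm".to_string()), 42)` yields `/tmp/perf-42.map.wasm`, a file `perf` would ignore until it is post-processed.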
Example
Here is a screenshot from a locally patched version of wasmtime that I had to use in order to get the wanted behaviour, showing a use case for this feature. In this case both Python and wasmtime emit a perf map file; I then post-process the WASM map to namespace it and merge the two into a single perf map file prior to using `perf report`.
<img width="1650" height="1085" alt="Image" src="https://github.com/user-attachments/assets/3653cf41-13da-4ca5-a9f0-ca5aa35fb433" />
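The post-processing step described above (namespace the WASM map, then merge it with the Python map) might look roughly like the following. This is a sketch under assumptions, not the script actually used: the helper names `namespace_map` and `merge_maps` are made up, and it relies only on the perf map line layout `START SIZE NAME`.

```rust
/// Prefix the symbol name of every perf map line with `prefix`.
/// Lines that do not have the `START SIZE NAME` shape are dropped.
fn namespace_map(map: &str, prefix: &str) -> String {
    let mut out = String::new();
    for line in map.lines() {
        // Split into at most three fields so names containing spaces survive.
        let mut parts = line.splitn(3, ' ');
        if let (Some(start), Some(size), Some(name)) =
            (parts.next(), parts.next(), parts.next())
        {
            out.push_str(&format!("{start} {size} {prefix}{name}\n"));
        }
    }
    out
}

/// Concatenate the Python map and the namespaced WASM map into one map.
fn merge_maps(python_map: &str, wasm_map: &str) -> String {
    let mut merged = String::from(python_map);
    // Make sure the first map ends with a newline before appending.
    if !merged.is_empty() && !merged.ends_with('\n') {
        merged.push('\n');
    }
    merged.push_str(&namespace_map(wasm_map, "wasm::"));
    merged
}
```

The merged text would then be written to the path `perf` expects for the process before running `perf report`.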
bjorn3 commented on issue #11862:
For perfmap would it work if both Wasmtime and Python open the perfmap file in append mode and try to atomically write a full entry at a time? That way you wouldn't need any post processing I think. That wouldn't work for jitdump though.
lethalbit commented on issue #11862:
In Python's case it makes the very much sane assumption that it is the only thing writing to the perf map file.
Plus there might be contention when flushing the file contents regardless, so I don't think that's a very viable option.
bjorn3 commented on issue #11862:
> Plus there might be contention when flushing the file contents regardless

If you only buffer full entries that wouldn't matter. At worst you'd get a non-deterministic order, not corrupt entries.

> In Pythons case it makes the very much sane assumption that it is the only thing writing to the perf map file.

I guess nobody thought about having multiple JITs in the same process when they designed this interface, so now everyone has to add workarounds. I presume the same kind of issues would occur if you try to use Python and C# or a JavaScript engine in the same process.
lethalbit commented on issue #11862:
Yeah, I mean, to be fair, having multiple different JITs in the same process is a bit of an esoteric situation, so it's not too surprising that it was overlooked.
On top of that, the `perf` tools have some long-outstanding issues in the first place. In the case of the JIT dump files, when post-processing them into the `.so` files, `perf` will try to do a symbol lookup for each object if you have `DEBUGINFOD_URLS` set, which will fail because of course the upstream debug info server won't have any symbols for them.
In general the perf tooling around JIT debug is pretty rough, and while I would love for there to be a better way, I think having prefixed/suffixed dumps/maps would be the current "best" solution for it, even if not optimal.
alexcrichton commented on issue #11862:
For perfmap support, unfortunately this isn't something we can fix, I believe. Perf itself hardcodes the filename. Similarly, for jitdump the filename to be recognized is hardcoded. Given that, all we can really do, I think, is what @bjorn3 is suggesting, which is to open files in `O_APPEND` mode and write out entire entries at once.
lethalbit commented on issue #11862:
Hence why I was suggesting a prefixed or postfixed variant so that it could be post-processed as needed.
But looking at the current implementation in Python, it looks like they also open in append mode:
So maybe that would be the best way, assuming the "namespacing" of the wasm entries is done.
While I would still personally like to be able to split it out, if this is the better solution then that works.
alexcrichton commented on issue #11862:
Is your goal to separate out the profiles so wasm stacks are excluded, for example when you're profiling Python? Or vice versa? Or is the goal to be able to profile everything together? In some ways our hands are tied here, as we don't really want to maintain or recommend nonstandard tooling for post-processing files (it's already confusing enough as is). So if it works for you to have everything in one file, then I think it's basically on us to work with preexisting files, ensure that we append entire records at once, and fail if a partial write is ever done.
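For illustration, the "append entire records at once and fail on a partial write" approach could be sketched as below. This is an assumption-laden sketch rather than wasmtime's implementation: `open_for_append` and `append_entry` are hypothetical helper names, and the only things relied on are the standard library's append mode (`O_APPEND` on Unix) and the perf map line layout.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Open (or create) a perf map file in append mode so that several
/// writers in the same process can share it.
fn open_for_append(path: &Path) -> io::Result<File> {
    OpenOptions::new().append(true).create(true).open(path)
}

/// Append one complete `START SIZE NAME` record using a single write call.
/// A short write is reported as an error rather than retried, so a record
/// is never completed in two pieces.
fn append_entry(file: &mut File, addr: u64, len: u64, name: &str) -> io::Result<()> {
    // Build the whole record in memory first.
    let entry = format!("{addr:x} {len:x} {name}\n");
    // `File::write` is unbuffered, so this is one write to the OS.
    let written = file.write(entry.as_bytes())?;
    if written != entry.len() {
        return Err(io::Error::new(
            io::ErrorKind::WriteZero,
            "partial perf map record written",
        ));
    }
    Ok(())
}
```

Deliberately using `write` instead of `write_all` is the design choice here: `write_all` would loop on a short write, which is exactly the split record this scheme wants to surface as a failure.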
lethalbit commented on issue #11862:
Ideally it would have been nice to be able to have them separate if I wanted to look at wasm-only or Python-only, but honestly, as long as the wasm entries are properly prefixed (with `wasm::` or something, as Python does `py::`) then they can be split out after if needed and it's not a huge deal.
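Splitting a shared map back out by prefix, as mentioned above, could be done along these lines. This is only a sketch: `split_by_prefix` is a hypothetical helper, and it assumes every entry follows the `START SIZE NAME` layout with the namespace at the start of the name.

```rust
/// Partition perf map lines into (matching, other) based on whether the
/// symbol name (third field) starts with `prefix`, e.g. "wasm::" or "py::".
fn split_by_prefix<'a>(map: &'a str, prefix: &str) -> (Vec<&'a str>, Vec<&'a str>) {
    map.lines()
        .filter(|line| !line.is_empty())
        .partition(|line| {
            // Lines without a third field are treated as non-matching.
            line.splitn(3, ' ')
                .nth(2)
                .map_or(false, |name| name.starts_with(prefix))
        })
}
```

Calling `split_by_prefix(map, "wasm::")` gives the wasm entries in the first vector and everything else (for example Python's `py::` entries) in the second.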
alexcrichton closed issue #11862.
Last updated: Dec 06 2025 at 06:05 UTC