alexcrichton opened issue #3547:
Upon thinking about this recently I believe we can shrink the `.addrmap` section of compiled artifacts a significant amount. Currently this section is used to translate from machine code addresses to addresses of instructions within the original wasm file itself. This information is used primarily for backtraces, to go from machine address to wasm address and then, via the wasm DWARF, from wasm address to filename and line number.
At this time, though, we have a mapping from machine code address to wasm address for every single wasm instruction in the entire module. I don't actually think that this is necessary. Instead I think we only need mappings for trapping instructions and instructions which call a function (not a wasm function but instead a Cranelift-level call, to include things like `memory.grow` and such). At this time we're not collecting "asynchronous backtraces" or anything like that, so there's no need to actually have an address map for every single wasm instruction in the module.
I suspect that this would lead to huge savings on the `.addrmap` section, which is currently sometimes even larger than the `.text` section. I don't think this will necessarily be trivially implemented, though, and will involve some trickery on the Cranelift side of things to correlate the source of all machine instructions, whether they're calling, and whether they can trap.
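As a rough illustration of the proposal above, the following Rust sketch filters a per-instruction address map down to only the entries for instructions that can trap or that perform a call. Every name here (`MachInst`, `AddrMapEntry`, `full_addrmap`, `backtrace_addrmap`) is hypothetical and invented for this example; none of it is Wasmtime's or Cranelift's actual data model, and the sketch does not cover how lookups against the smaller map would work.

```rust
/// Hypothetical, simplified view of one compiled machine instruction.
/// (Not a real Cranelift type; invented for this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MachInst {
    /// Offset of the instruction within the compiled machine code.
    code_offset: u32,
    /// Offset of the originating instruction within the wasm file.
    wasm_offset: u32,
    /// Whether this machine instruction can trap.
    can_trap: bool,
    /// Whether this is a Cranelift-level call (including libcalls
    /// such as the one behind `memory.grow`).
    is_call: bool,
}

/// One `(machine code offset, wasm offset)` entry of the address map.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AddrMapEntry {
    code_offset: u32,
    wasm_offset: u32,
}

fn entry(inst: &MachInst) -> AddrMapEntry {
    AddrMapEntry {
        code_offset: inst.code_offset,
        wasm_offset: inst.wasm_offset,
    }
}

/// The status quo described above: one entry per instruction.
fn full_addrmap(insts: &[MachInst]) -> Vec<AddrMapEntry> {
    insts.iter().map(entry).collect()
}

/// The proposed reduction: keep only the entries a backtrace can
/// ever land on, i.e. trapping instructions and calls.
fn backtrace_addrmap(insts: &[MachInst]) -> Vec<AddrMapEntry> {
    insts
        .iter()
        .filter(|inst| inst.can_trap || inst.is_call)
        .map(entry)
        .collect()
}
```

In this toy model the savings are simply the fraction of instructions that neither trap nor call; how large that fraction is for real modules would need to be measured.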
cfallin commented on issue #3547:
Great idea!
I'll note that the line-number-per-wasm-instruction (or actually one really wants line-number-per-compiled-instruction I think) is actually useful if one is single-stepping through code; the infra right now is I guess fully general because of this use-case. But of course backtraces are different, as you say!
It'd probably be reasonably easy to add a knob that implements your suggestion -- I'd do it by returning an `Option<SourceLoc>` here, probably...
fitzgen commented on issue #3547:
As discussed in private chat, we should actually be able to remove the `.addrmap` section completely by updating/fixing/special casing our `wasm -> source` to `native -> source` DWARF translation so that we can use the relevant DWARF sections directly without any `native -> wasm` translation that `.addrmap` is providing.
fitzgen commented on issue #3547:
(The DWARF translation would happen at module compilation time, not runtime.)
bjorn3 commented on issue #3547:
DWARF translation only happens when there is DWARF debuginfo in the source modules in the first place. `.addrmap` is also used for backtraces when there is no DWARF debuginfo at all or when DWARF translation is not enabled.
fitzgen commented on issue #3547:
That's correct. In the mode I am proposing, we would still only include the subset of DWARF that we are querying today to reconstruct backtraces after translating the native PC to a Wasm PC via `.addrmap`; this wouldn't imply including every DWARF section and all of their contents.
bjorn3 commented on issue #3547:
The DWARF `.debug_line` section can't encode a native pc -> wasm pc mapping as necessary when there is no debuginfo. It can only encode a native pc -> (file, line, column, flags) tuple mapping. Wasmtime shows module name + wasm pc when there is no debuginfo, right? On one hand, if you encode it as native pc -> (file, line=wasm pc, 0, no flags), that would be bigger than what you can do using `.addrmap` if you were to encode it using deltas. On the other hand, `.debug_line` always encodes it as deltas, which makes lookup much slower than the current scheme as you have to traverse the entire section. On native DWARF this is somewhat less painful as every compilation unit gets its own `.debug_line` mapping, but for generated wasm `.debug_line` you would likely put the entire wasm module in a single compilation unit.
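The lookup trade-off described above can be sketched in Rust as follows. This is a simplified model, not the real `.addrmap` layout and not the DWARF line-number program (which is a byte-coded state machine): it only contrasts a fixed-width table sorted by native offset, which can be binary searched, with a delta-encoded table, which has to be walked from the start to recover absolute values. The function names and the tuple layout are assumptions made for the example.

```rust
/// Fixed-width table of `(native offset, wasm offset)` pairs, sorted by
/// native offset. Returns the wasm offset of the last entry at or before
/// `pc`, found with a binary search.
fn lookup_sorted(table: &[(u32, u32)], pc: u32) -> Option<u32> {
    // Index of the first entry whose native offset is greater than `pc`.
    let idx = table.partition_point(|&(native, _)| native <= pc);
    // The entry just before that index (if any) covers `pc`.
    idx.checked_sub(1).map(|i| table[i].1)
}

/// Delta-encode a sorted table: each entry stores the difference from the
/// previous entry. Native deltas are non-negative because the table is
/// sorted; wasm deltas may be negative, so they are signed.
fn delta_encode(table: &[(u32, u32)]) -> Vec<(u32, i64)> {
    let mut prev = (0u32, 0u32);
    table
        .iter()
        .map(|&(native, wasm)| {
            let delta = (native - prev.0, i64::from(wasm) - i64::from(prev.1));
            prev = (native, wasm);
            delta
        })
        .collect()
}

/// Lookup in the delta-encoded form: absolute values only exist as running
/// sums, so the entries must be scanned linearly from the beginning.
fn lookup_delta(deltas: &[(u32, i64)], pc: u32) -> Option<u32> {
    let mut native = 0u32;
    let mut wasm = 0i64;
    let mut found = None;
    for &(native_delta, wasm_delta) in deltas {
        native += native_delta;
        wasm += wasm_delta;
        if native > pc {
            break;
        }
        found = Some(wasm as u32);
    }
    found
}
```

Both lookups return the same answer; the difference is that the first does a logarithmic number of comparisons while the second is linear in the number of entries preceding `pc`, which is the cost being pointed out for a single large compilation unit.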
Last updated: Nov 22 2024 at 16:03 UTC