wasmtime / issue #10336 Add a `wasmtime objdump` subcommand · git-wasmtime

Wasmtime GitHub notifications bot (Mar 05 2025 at 17:13):

alexcrichton added the wasmtime label to Issue #10336.

Wasmtime GitHub notifications bot (Mar 05 2025 at 17:13):

alexcrichton opened issue #10336:

We talked about this in today's Cranelift meeting but I wanted to write down an issue for this idea. The basic idea is that we have *.cwasm files which have binary tables/metadata in various sections, and we only have objdump -S today to look at the .text section. Otherwise most of our other sections are not easily debuggable. Most sections, however, have something to do with the .text section, so it'd be great to be able to debug all these at the same time.

The general idea is to do something roughly along these lines:

Add a new subcommand, wasmtime objdump, that takes a *.cwasm file as input.

This would primarily either use Capstone to disassemble the .text section or invoke llvm-objdump, objdump, or similar. My preference would be to use Capstone to have deterministic output regardless of what's installed on the system.

Disassembly would then be printed in a manner similar to objdump -S, one line per instruction. The main difference is that we would weave in debug views of other tables at the same time.

For example, .wasmtime.traps would be interleaved as ;; trap = CODE after each instruction.

Another example is that .wasmtime.addrmap would interleave ;; @0x0FF537 to show where the instruction maps back to in the original binary.

The reach goal would be to integrate both stack maps and exception tables from the exception-handling proposal into this view as well. That'd ideally make it much easier to view and debug issues with stack maps and exception tables.

Ideally this view would then be used for all disas tests instead of today's "just use capstone". We could then add various --flags to this subcommand to customize exactly what's disassembled (everything? just stack maps? nothing?) along with other options like "print the instruction bytes", etc.
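The interleaving described above can be sketched roughly as follows. This is only an illustration under assumptions, not how Wasmtime implements it: the `Inst` type, the `render` function, and the two map parameters are hypothetical stand-ins for a real disassembler's output and for already-decoded `.wasmtime.traps` and `.wasmtime.addrmap` tables. Printing both annotations after the instruction is also an assumption; the issue only specifies that for traps.

```rust
use std::collections::BTreeMap;

/// One already-decoded instruction: its offset within `.text` and its text.
/// (Hypothetical type; a real tool would get this from a disassembler.)
pub struct Inst {
    pub offset: u32,
    pub text: String,
}

/// Renders one instruction per line and weaves side-table annotations in.
///
/// `traps` maps a `.text` offset to a trap code name and `addrmap` maps a
/// `.text` offset to an offset in the original wasm binary. Both are assumed
/// to have been decoded from the corresponding sections elsewhere.
pub fn render(
    insts: &[Inst],
    traps: &BTreeMap<u32, String>,
    addrmap: &BTreeMap<u32, u32>,
) -> String {
    let mut out = String::new();
    for inst in insts {
        // The instruction itself, prefixed by its hex offset.
        out.push_str(&format!("{:8x}: {}\n", inst.offset, inst.text));
        // Where this instruction maps back to in the original binary, if known.
        if let Some(wasm_offset) = addrmap.get(&inst.offset) {
            out.push_str(&format!(";; @{:#x}\n", wasm_offset));
        }
        // The trap code associated with this instruction, if any.
        if let Some(code) = traps.get(&inst.offset) {
            out.push_str(&format!(";; trap = {}\n", code));
        }
    }
    out
}
```

Keying each table by `.text` offset keeps the main loop a simple lookup per instruction, so further tables (stack maps, exception tables) could be added as extra lookups in the same place.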

Wasmtime GitHub notifications bot (Mar 05 2025 at 17:21):

cfallin commented on issue #10336:

One more thought: once we do this, then one of the main reasons for our continued use of actual ELF as an on-disk container format -- convenient use of other tools like objdump -- goes away. We've had various user questions and confusion over the years that result from our use of ELF -- superficially it looks like Wasmtime AOT compilation produces "just an object/executable" that can be linked in to another program, or run on its own. In order to avoid that confusion, maybe it makes sense to switch to something not-quite-ELF. My favorite idea so far is to switch the magic from \x7FELF to WELF ("Wasm ELF"). (To evaluate: does this impact gdb or perf's implicit understanding of the image mapped in? If so, probably not worth it.)
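To make the trade-off concrete, here is a minimal sketch of how a loader might tell the two headers apart. The `WELF` magic is only the idea floated above, not an existing format, and the `Container` and `sniff` names are made up for this example. Any tool that checks for the standard ELF magic would take the "unknown" path for such a file.

```rust
/// The standard four-byte ELF magic.
const ELF_MAGIC: [u8; 4] = *b"\x7fELF";
/// Hypothetical replacement magic from the discussion; not a real format.
const WELF_MAGIC: [u8; 4] = *b"WELF";

/// What the first four bytes of a file look like.
#[derive(Debug, PartialEq)]
pub enum Container {
    Elf,
    Welf,
    Unknown,
}

/// Classifies a byte buffer by its leading magic bytes only.
pub fn sniff(bytes: &[u8]) -> Container {
    if bytes.starts_with(&ELF_MAGIC) {
        Container::Elf
    } else if bytes.starts_with(&WELF_MAGIC) {
        Container::Welf
    } else {
        Container::Unknown
    }
}
```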

Wasmtime GitHub notifications bot (Mar 05 2025 at 17:33):

SingleAccretion commented on issue #10336:

> To evaluate: does this impact gdb or perf's implicit understanding of the image mapped in? If so, probably not worth it

It would definitely impact LLDB's ability to understand the registered "JIT images", unless we give up on the opportunity to use mapped memory as-is (it is currently being copied, so it's not a problem to mutate the header in the copy, but this copy should ideally be unnecessary).

Besides objdump, there is also [llvm-]dwarfdump.

Wasmtime GitHub notifications bot (Mar 05 2025 at 18:11):

alexcrichton commented on issue #10336:

While it's a niche use case, I do think this would affect perf, specifically in the case where deserialize_file is used to map a *.cwasm from disk into the address space. That works as-is with perf surprisingly well (I think because it looks like a dlopen'd library). I suspect that if we changed the header, perf would stop using it by default and such frames would turn into [unknown]. The other "niche" part of this use case is that the perf-based integration in Wasmtime requires that perf is recording before the module is compiled, so if you attach perf to a process after the fact, the perf integration of Wasmtime doesn't work.

Basically, tl;dr: I do think that changing away from ELF entirely would break perf-attach-to-preexisting-process.

Wasmtime GitHub notifications bot (Mar 06 2025 at 17:52):

fitzgen commented on issue #10336:

We could have a Wasm module that has one custom section whose contents are the existing ELF .cwasms, and then when we map it, we map only the custom section. I think that gives us both our existing mappability and hides that we are internally using ELF from unsuspecting users.

Whether that is worth the trouble, however...

Wasmtime GitHub notifications bot (Mar 06 2025 at 18:51):

alexcrichton commented on issue #10336:

Oh, that's a good point that the container on disk doesn't have to be what we map in-process; it's always possible to map a subset of the container. I'll note, though, that the offset has to be page-aligned, which means on aarch64 the container has to be at least 64k large to accommodate all page sizes, which is relatively hefty to add to all files...

Personally I like having a native object for debugging purposes still, in the sense that even if we have wasmtime objdump, sometimes it's just more convenient to use native tooling (e.g. llvm-objdump might support more RISC-V instructions than Capstone).
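The size cost follows from simple alignment arithmetic, sketched below. The 64 KiB figure is the largest page size mentioned above; the helper names are made up for this example.

```rust
/// Largest page size the file would need to accommodate (64 KiB).
pub const MAX_PAGE_SIZE: u64 = 64 * 1024;

/// Rounds `offset` up to the next multiple of `align` (a power of two).
pub fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Padding needed after a `header_len`-byte container header so that the
/// mapped payload starts on a `MAX_PAGE_SIZE` boundary.
pub fn padding_before_payload(header_len: u64) -> u64 {
    align_up(header_len, MAX_PAGE_SIZE) - header_len
}
```

Under these assumptions, even a 16-byte wrapper header pushes the payload out to offset 65536, so every file carries almost 64 KiB of padding.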

Wasmtime GitHub notifications bot (Mar 06 2025 at 18:52):

cfallin commented on issue #10336:

Yeah, it doesn't seem worth it given all that; we'd be trying to solve a social/docs/support problem with a technical solution. Never mind the diversion, back to the primary topic of building this tool!

Wasmtime GitHub notifications bot (Mar 20 2025 at 19:45):

alexcrichton closed issue #10336.

Last updated: Apr 17 2025 at 00:13 UTC