Currently, UnwindInfo resides in the cranelift-codegen crate, e.g. https://github.com/bytecodealliance/wasmtime/blob/c9a3f05afd45961b0b397f97c4ad79cd7a7c807d/cranelift/codegen/src/isa/unwind/systemv.rs#L100
Nothing stops us from making these structures more generic so they can be used with other compilers
The main issue is knowing what stack unwind info can be provided by Lightbeam, and only Lightbeam knows how registers, IP, and SP are saved on the stack
As an easy hack, CallFrameInstruction in this file can be made public
Yury Delendik said:
As easy hack, CallFrameInstruction in this file, can be made public
Looks like a potential path forward, let's see what Jack has to say.
I'm investigating exactly how this works and how Lightbeam could use it, but we will also need Wasmtime to use CallFrameInstructions from both Lightbeam and Cranelift instead of only one or the other. Lightbeam is currently not quite up to date with the latest Wasmtime, because multi-value is causing tests to fail and I don't feel comfortable merging it back in while it's still causing test failures. If that work is done on the latest version of Wasmtime, we'll have to figure out a way to backport it into the version that I've been working on (I believe the last version I merged into my branch is around a month old at this point).
So it looks like Lightbeam will just be able to emit CallFrameInstruction::Cfa(RBP, 16) at the start of every function and that should be enough, right?
Maybe we could emit rsp + 8 at the start of the function and then rsp + 16 after push rbp, but we emit the prelude at the start of every function anyway right now. We used to never emit a prelude, and I might change it in the future, but for now our backtraces should be pretty simple.
can you post sample/typical function disassembly?
as a hack, a simple set of CallFrameInstruction might work, but in practice you will need to save all common registers
(I had plans to recover vmctx from parameters based on unwind info, though it might not be feasible just yet)
Yury Delendik said:
as a hack, simple set of CallFrameInstruction might work, but in practice you will need to save all common registers
I would say that the fact that we reserve rbp for the stack pointer is a hack, but just generating simple CallFrameInstructions doesn't sound like a hack if we know that we always generate a prelude. What do you mean by needing to save all common registers?
unwinders follow these instructions, starting from the moment of the signal. unwinders often want to know the values of specific registers (as in my example)
I never use callee-saved registers right now anyway. There's a refactor that I want to do that would make implementing saving callee-saved registers cleanly easy, but I haven't finished that refactor yet.
the FDE will help you walk the frames, but sometimes the data inside these frames is important too
Ah, I see what you mean
So I should be emitting Expression, ValExpression, Register and so forth for every instruction I emit?
/me is just saying that some effort should be made so it will not look like a hack
Right, exactly. I'm just trying to find where that boundary is between hack and MVP, because although I don't want it to be a hack, I don't want to implement absolutely full functionality right now because I'm the only one doing the programming on this project and there are other things I need to do.
normally only a handful of commands is needed, e.g. Cfa, Offset, maybe RememberState/RestoreState
Should I be annotating every time the stack pointer changes, or is it enough to just tell the debugger to use RBP?
Because I don't use rbp as a general-purpose register right now, I always push it at the start of the function and reserve it for the stack pointer (same reason as why I don't use callee-saved registers - I want to wait until some refactoring is done before I use it as a general-purpose register)
/me would like to see typical output from lightbeam
Sure, one moment
Wasm:
(func $fib (export "fib") (param $p0 i32) (result i32)
  (local $l1 i32)
  (local.set $l1
    (i32.const 1))
  (block $B0
    (br_if $B0
      (i32.lt_u
        (local.get $p0)
        (i32.const 2)))
    (local.set $l1
      (i32.const 1))
    (loop $L1
      (local.set $l1
        (i32.add
          (call $fib
            (i32.add
              (local.get $p0)
              (i32.const -1)))
          (local.get $l1)))
      (br_if $L1
        (i32.gt_u
          (local.tee $p0
            (i32.add
              (local.get $p0)
              (i32.const -2)))
          (i32.const 1)))))
  (local.get $l1))
Assembly:
131 bytes:
0: 55 push rbp
1: 48 89 e5 mov rbp, rsp
4: 40 81 fa 02 00 00 00 cmp edx, 2
b: 40 b8 01 00 00 00 mov eax, 1
11: 0f 82 63 00 00 00 jb 0x7a
17: 40 b8 01 00 00 00 mov eax, 1
1d: 48 8d a5 00 00 00 00 lea rsp, [rbp]
24: 48 89 d1 mov rcx, rdx
27: 40 81 c1 ff ff ff ff add ecx, 0xffffffff
2e: 40 52 push rdx
30: 40 50 push rax
32: 40 51 push rcx
34: 48 8d a4 24 f8 ff ff ff lea rsp, [rsp - 8]
3c: 48 8b 94 24 08 00 00 00 mov rdx, qword ptr [rsp + 8]
44: e8 b7 ff ff ff call 0
49: 40 03 84 24 10 00 00 00 add eax, dword ptr [rsp + 0x10]
51: 48 8b 8c 24 18 00 00 00 mov rcx, qword ptr [rsp + 0x18]
59: 40 81 c1 fe ff ff ff add ecx, 0xfffffffe
60: 40 81 f9 01 00 00 00 cmp ecx, 1
67: 48 89 ca mov rdx, rcx
6a: 48 89 ec mov rsp, rbp
6d: 0f 87 aa ff ff ff ja 0x1d
73: 48 8d a5 00 00 00 00 lea rsp, [rbp]
7a: 48 8d a5 00 00 00 00 lea rsp, [rbp]
81: 5d pop rbp
82: c3 ret
so at 1 you will need to change the offset of the CFA, and at 4 define the CFA at rbp
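To make the two steps above concrete, here is a minimal sketch of the frame instructions for the push rbp / mov rbp, rsp prelude in the listing. The Cfi enum is a hypothetical local stand-in that only mirrors the shape of a few CallFrameInstruction variants; it is not the cranelift-codegen type. The sketch assumes the CIE starts with the CFA at rsp + 8, and the function names are made up for illustration.

```rust
/// DWARF register numbers for x86-64 (System V ABI).
pub const RBP: u16 = 6;
pub const RSP: u16 = 7;

/// Hypothetical stand-in for a few CallFrameInstruction variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cfi {
    /// CFA = register + offset.
    Cfa(u16, i32),
    /// Keep the CFA register, change only the offset.
    CfaOffset(i32),
    /// Keep the CFA offset, change only the register.
    CfaRegister(u16),
    /// The caller's value of the register is saved at CFA + offset.
    Offset(u16, i32),
}

/// Assumed initial rule from the CIE: on entry only the return
/// address has been pushed, so CFA = rsp + 8.
pub fn initial_cfa() -> Cfi {
    Cfi::Cfa(RSP, 8)
}

/// (code offset, instruction) pairs for the prelude in the listing:
///   0: push rbp      (1 byte, ends at offset 1)
///   1: mov rbp, rsp  (3 bytes, ends at offset 4)
pub fn prologue_instructions() -> Vec<(u32, Cfi)> {
    vec![
        // After `push rbp`, rsp moved down by 8, so CFA = rsp + 16.
        (1, Cfi::CfaOffset(16)),
        // The caller's rbp now lives at CFA - 16.
        (1, Cfi::Offset(RBP, -16)),
        // After `mov rbp, rsp`, the CFA can be tracked through rbp,
        // so later changes to rsp need no further annotations.
        (4, Cfi::CfaRegister(RBP)),
    ]
}
```

With the CFA defined through rbp, the pushes and lea rsp adjustments in the body of the function do not need their own entries.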
Right, ok so that's what I expected. Is there anything else I'd need to do right now?
here is what cranelift does https://gist.github.com/yurydelendik/1dc5f78bb5edc67041100d7b3b837835#file-fib-frames-dump-L20-L24
BTW my lightbeam output is
0000000000000000 <__wasm_function_0>:
0: 40 81 fa 02 00 00 00 rex cmp $0x2,%edx
7: 40 b8 01 00 00 00 rex mov $0x1,%eax
d: 0f 82 52 00 00 00 jb 65 <__wasm_function_0+0x65>
13: 40 b8 01 00 00 00 rex mov $0x1,%eax
19: 48 89 d1 mov %rdx,%rcx
1c: 40 81 c1 ff ff ff ff rex add $0xffffffff,%ecx
23: 40 52 rex push %rdx
25: 40 50 rex push %rax
27: 40 51 rex push %rcx
29: 48 8b 94 24 00 00 00 mov 0x0(%rsp),%rdx
30: 00
31: e8 ca ff ff ff callq 0 <__wasm_function_0>
36: 40 03 84 24 08 00 00 rex add 0x8(%rsp),%eax
3d: 00
3e: 48 8b 8c 24 10 00 00 mov 0x10(%rsp),%rcx
45: 00
46: 40 81 c1 fe ff ff ff rex add $0xfffffffe,%ecx
4d: 40 81 f9 01 00 00 00 rex cmp $0x1,%ecx
54: 48 89 ca mov %rcx,%rdx
57: 48 8d a4 24 18 00 00 lea 0x18(%rsp),%rsp
5e: 00
5f: 0f 87 b4 ff ff ff ja 19 <__wasm_function_0+0x19>
65: c3 retq
So the upstream version of Lightbeam at bytecodealliance/wasmtime generates different code to the current version that I'm using. The main difference is that the upstream version generates the prelude and resets rsp by doing mov rsp, rbp, but that's not necessary. I might actually remove the prelude again, it doesn't look like it'd be too much more difficult to generate the CallFrameInstructions for the code that you've posted
in the case above the CFA needs to be offset at least after 29 and adjusted at 31/36, but preferably after each push + registers saved (since there is no dedicated frame base)
so if you are getting rid of the dedicated frame base (BP), then you will need to take care of maintaining a proper CFA
notice that the CIE says it is in SP, so if SP changes you need to have a CfaOffset
/me hopes it makes sense
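As a rough illustration of maintaining the CFA without a frame base, the sketch below records a CfaOffset every time rsp changes, using the pushes and the final lea from the objdump listing above. CfaTracker, Cfi and fib_frame_instructions are hypothetical names invented for this example (they are not part of Lightbeam or Cranelift), and the starting offset of 8 assumes the CIE defines the CFA as rsp + 8 on entry.

```rust
/// Hypothetical stand-in for the single variant needed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cfi {
    /// CFA = rsp + offset (the CFA register stays rsp, as in the CIE).
    CfaOffset(i32),
}

/// Tracks the distance from rsp to the CFA while code is emitted.
pub struct CfaTracker {
    cfa_offset: i32,
    out: Vec<(u32, Cfi)>,
}

impl CfaTracker {
    /// Assumes the CIE's initial rule is CFA = rsp + 8.
    pub fn new() -> Self {
        CfaTracker { cfa_offset: 8, out: Vec::new() }
    }

    /// Record that the instruction ending at `end_offset` changed rsp
    /// by `sp_delta` bytes (negative when the stack grows).
    pub fn sp_changed(&mut self, end_offset: u32, sp_delta: i32) {
        self.cfa_offset -= sp_delta;
        self.out.push((end_offset, Cfi::CfaOffset(self.cfa_offset)));
    }

    pub fn finish(self) -> Vec<(u32, Cfi)> {
        self.out
    }
}

/// Annotations for the rsp changes in the frame-pointer-free listing.
pub fn fib_frame_instructions() -> Vec<(u32, Cfi)> {
    let mut t = CfaTracker::new();
    t.sp_changed(0x25, -8); // push %rdx at 0x23
    t.sp_changed(0x27, -8); // push %rax at 0x25
    t.sp_changed(0x29, -8); // push %rcx at 0x27
    t.sp_changed(0x5f, 24); // lea 0x18(%rsp),%rsp at 0x57
    t.finish()
}
```

The sketch covers only the CFA offset. It does not emit Offset rules for saved registers, and it does not model the adjustment around the call at 31/36 mentioned above.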
cc @Peter Huene just to see if Windows will need more arrangements
So now we've got an idea of how Lightbeam's side of this will work, how do you think we should go about allowing Wasmtime to consume this metadata?
Currently lightbeam creates Compilation just from the (code) buffer (at crates/environ/src/lightbeam.rs). It is marked with a comment at https://github.com/bytecodealliance/wasmtime/blob/main/crates/environ/src/compilation.rs#L52 -- all generated UnwindInfo are consumed by Wasmtime if present in the CompiledFunction
I recommend moving the from_buffer logic to lightbeam.rs and just creating Compilation/CompiledFunction there with the UnwindInfo generated by Lightbeam.
to troubleshoot you can use wasmtime wasm2obj -g ... -- it will create an obj file that you can further inspect using objdump -d and dwarfdump -debug-frame
https://github.com/bytecodealliance/wasmtime/pull/2086 addresses removal of from_buffer
Any progress on the subject? https://github.com/bytecodealliance/wasmtime/pull/2117 has landed, will it affect the unwind info progress?
So to keep everyone informed, I quit Parity and it's unlikely that I will continue working on the unwind info before my last day of work. Keep this chat open though, it's very possible that Michael or Dmitriy will work on it in the future.
Unwind data structures were made public at https://github.com/bytecodealliance/wasmtime/pull/2357 -- it can make life easier for Lightbeam
Last updated: Dec 23 2024 at 12:05 UTC