SingleAccretion opened issue #9900:
Reproduction steps:
1) Check out #9898.
2) Modify the added test to stop passing-Oopt-level=0
to wasmtime.
3) Run the test.Expected result: the test passes or fails with 'expression unavailable'.
Actual result:
error: Execution was interrupted, reason: Exception 0xc0000005 encountered at address 0x7ff6fcf15f90: Access violation reading location 0x1ae198763ec. The process has been returned to the state before expression evaluation. * thread #1, name = 'main', stop reason = breakpoint 1.1 frame #0: 0x0000028d720c15b8 JIT(0x28d72210080)`DerivedType::InstanceMethod(__this=(__ptr = <read memory from 0x28de65f5f0c failed (0 of 4 bytes read)>)) at generic.cpp:26:5 23 long long DerivedValue = 2; 24 25 int InstanceMethod() { -> 26 debug_break(); 27 return BaseValue + DerivedValue; 28 } 29 (lldb) v (WasmtimeVMContext *) __vmctx = 0x0000028d72165f00 (WebAssemblyPtrWrapper<DerivedType>) __this = (__ptr = <read memory from 0x28de65f5f0c failed (0 of 4 bytes read)>)
It means the location descriptor for
__this
is incorrect.
SingleAccretion commented on issue #9900:
This is an interesting one. Here's how to reproducing the failure:
> lldb.exe -o "b InstanceMethod" -o "r" -- wasmtime.exe -Ccache=n -Ddebug-info generic.wasm 25 int InstanceMethod() { -> 26 debug_break(); 27 return BaseValue + DerivedValue; 28 } 29 (lldb) v (WasmtimeVMContext *) __vmctx = 0x000001dc1b2b2e10 (WebAssemblyPtrWrapper<DerivedType>) this = (__ptr = <read memory from 0x1dc38812e1c failed (0 of 4 bytes read)>)
Here's the expression for
this
/__this
:Variable: id = {0x0000048d}, name = "this", type = "WebAssemblyPtrWrapper<DerivedType>", valid ranges = <block>, location = DW_OP_breg4 RSI+0, DW_OP_plus_uconst 0xc, DW_OP_breg5 RDI+96, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, decl = {RSI + 0xc, [RDI + 96]} ==> {[RDI + 96], RSI + 0xc} ==> [RDI + 96] + (RSI + 0xc) & 0xffffffff ; [RDI + 96] is __vmctx->memory ; RSI is the linear memory frame pointer (WASM level frame base, sans offset)
Here's annotated disassembly:
0x262b49d1520 <+0>: push rbp 0x262b49d1521 <+1>: mov rbp, rsp 0x262b49d1524 <+4>: mov r10, qword ptr [rdi + 0x8] 0x262b49d1528 <+8>: mov r10, qword ptr [r10 + 0x10] 0x262b49d152c <+12>: add r10, 0x30 0x262b49d1530 <+16>: cmp r10, rsp 0x262b49d1533 <+19>: ja 0x262b49d15ae ; <+142> at generic.cpp:27:5 0x262b49d1539 <+25>: sub rsp, 0x20 0x262b49d153d <+29>: mov qword ptr [rsp], rbx 0x262b49d1541 <+33>: mov qword ptr [rsp + 0x8], r12 0x262b49d1546 <+38>: mov qword ptr [rsp + 0x10], r13 0x262b49d154b <+43>: mov qword ptr [rsp + 0x18], r15 0x262b49d1550 <+48>: mov r15d, dword ptr [rdi + 0xa0] ; r15 = __vmctx->__stack_pointer 0x262b49d1557 <+55>: mov rsi, r15 ; rsi = r15 (__stack_pointer) 0x262b49d155a <+58>: sub esi, 0x10 ; rsi -= 16 0x262b49d155d <+61>: mov dword ptr [rdi + 0xa0], esi ; __vmctx->__stack_pointer = rsi 0x262b49d1563 <+67>: mov r12, qword ptr [rdi + 0x60] 0x262b49d1567 <+71>: mov dword ptr [r12 + rsi + 0xc], edx 0x262b49d156c <+76>: mov rbx, rdi 0x262b49d156f <+79>: mov r13, rdx 0x262b49d1572 <+82>: mov rsi, rbx ; rsi = __vmctx -> 0x262b49d1575 <+85>: call 0x262b49d1180 ; debug_break at generic.h:9:28 0x262b49d157a <+90>: mov rdx, r13 0x262b49d157d <+93>: mov edx, edx 0x262b49d157f <+95>: mov ecx, dword ptr [r12 + rdx] 0x262b49d1583 <+99>: mov rdx, qword ptr [r12 + rdx + 0x8] 0x262b49d1588 <+104>: mov dword ptr [rbx + 0xa0], r15d ; __vmctx->__stack_pointer = r15 0x262b49d158f <+111>: lea eax, [rcx + rdx] 0x262b49d1592 <+114>: mov rbx, qword ptr [rsp] 0x262b49d1596 <+118>: mov r12, qword ptr [rsp + 0x8] 0x262b49d159b <+123>: mov r13, qword ptr [rsp + 0x10] 0x262b49d15a0 <+128>: mov r15, qword ptr [rsp + 0x18] 0x262b49d15a5 <+133>: add rsp, 0x20 0x262b49d15a9 <+137>: mov rsp, rbp 0x262b49d15ac <+140>: pop rbp 0x262b49d15ad <+141>: ret 0x262b49d15ae <+142>: ud2
So the culprit here is the fact
rsi
, which initially does indeed contain the original WASM frame base (local0
), from which everything else gets evaluated, gets overwritten just before the call we've stopped on, and this is either not reflected in the frame info, or is not accounted for correctly.BTW, here we can also see why the WASM frame base is so prone to being "optimized out" even as it is always used at the end of the method. Local zero gets split into two distinct values and the original, not referenced by anything DI-related, gets allocated into
r15
.
SingleAccretion edited a comment on issue #9900:
This is an interesting one. Here's how to reproducing the failure:
> lldb.exe -o "b InstanceMethod" -o "r" -- wasmtime.exe -Ccache=n -Ddebug-info generic.wasm 25 int InstanceMethod() { -> 26 debug_break(); 27 return BaseValue + DerivedValue; 28 } 29 (lldb) v (WasmtimeVMContext *) __vmctx = 0x000001dc1b2b2e10 (WebAssemblyPtrWrapper<DerivedType>) this = (__ptr = <read memory from 0x1dc38812e1c failed (0 of 4 bytes read)>)
Here's the expression for
this
/__this
:Variable: id = {0x0000048d}, name = "this", type = "WebAssemblyPtrWrapper<DerivedType>", valid ranges = <block>, location = DW_OP_breg4 RSI+0, DW_OP_plus_uconst 0xc, DW_OP_breg5 RDI+96, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, decl = {RSI + 0xc, [RDI + 96]} ==> {[RDI + 96], RSI + 0xc} ==> [RDI + 96] + (RSI + 0xc) & 0xffffffff ; [RDI + 96] is __vmctx->memory ; RSI is the linear memory frame pointer (WASM level frame base, sans offset)
Here's annotated disassembly:
0x262b49d1520 <+0>: push rbp 0x262b49d1521 <+1>: mov rbp, rsp 0x262b49d1524 <+4>: mov r10, qword ptr [rdi + 0x8] 0x262b49d1528 <+8>: mov r10, qword ptr [r10 + 0x10] 0x262b49d152c <+12>: add r10, 0x30 0x262b49d1530 <+16>: cmp r10, rsp 0x262b49d1533 <+19>: ja 0x262b49d15ae ; <+142> at generic.cpp:27:5 0x262b49d1539 <+25>: sub rsp, 0x20 0x262b49d153d <+29>: mov qword ptr [rsp], rbx 0x262b49d1541 <+33>: mov qword ptr [rsp + 0x8], r12 0x262b49d1546 <+38>: mov qword ptr [rsp + 0x10], r13 0x262b49d154b <+43>: mov qword ptr [rsp + 0x18], r15 0x262b49d1550 <+48>: mov r15d, dword ptr [rdi + 0xa0] ; r15 = __vmctx->__stack_pointer 0x262b49d1557 <+55>: mov rsi, r15 ; rsi = r15 (__stack_pointer) 0x262b49d155a <+58>: sub esi, 0x10 ; rsi -= 16 0x262b49d155d <+61>: mov dword ptr [rdi + 0xa0], esi ; __vmctx->__stack_pointer = rsi 0x262b49d1563 <+67>: mov r12, qword ptr [rdi + 0x60] 0x262b49d1567 <+71>: mov dword ptr [r12 + rsi + 0xc], edx 0x262b49d156c <+76>: mov rbx, rdi 0x262b49d156f <+79>: mov r13, rdx 0x262b49d1572 <+82>: mov rsi, rbx ; rsi = __vmctx -> 0x262b49d1575 <+85>: call 0x262b49d1180 ; debug_break at generic.h:9:28 0x262b49d157a <+90>: mov rdx, r13 0x262b49d157d <+93>: mov edx, edx 0x262b49d157f <+95>: mov ecx, dword ptr [r12 + rdx] 0x262b49d1583 <+99>: mov rdx, qword ptr [r12 + rdx + 0x8] 0x262b49d1588 <+104>: mov dword ptr [rbx + 0xa0], r15d ; __vmctx->__stack_pointer = r15 0x262b49d158f <+111>: lea eax, [rcx + rdx] 0x262b49d1592 <+114>: mov rbx, qword ptr [rsp] 0x262b49d1596 <+118>: mov r12, qword ptr [rsp + 0x8] 0x262b49d159b <+123>: mov r13, qword ptr [rsp + 0x10] 0x262b49d15a0 <+128>: mov r15, qword ptr [rsp + 0x18] 0x262b49d15a5 <+133>: add rsp, 0x20 0x262b49d15a9 <+137>: mov rsp, rbp 0x262b49d15ac <+140>: pop rbp 0x262b49d15ad <+141>: ret 0x262b49d15ae <+142>: ud2
So the culprit here is the fact
rsi
, which initially does indeed contain the original WASM frame base (local0
), from which everything else gets evaluated, gets overwritten just before the call we've stopped on, and this is either not reflected in the frame info, or is not accounted for correctly.BTW, here we can also see why the WASM frame base is so prone to being "optimized out" even as it is always used at the end of the method. Local zero gets split into two distinct values: one referenced in code and not referenced in DI (
r15
), the other - not referenced in code, but referenced in DI (rsi
).
SingleAccretion edited a comment on issue #9900:
This is an interesting one. Here's how to reproduce the failure:
> lldb.exe -o "b InstanceMethod" -o "r" -- wasmtime.exe -Ccache=n -Ddebug-info generic.wasm 25 int InstanceMethod() { -> 26 debug_break(); 27 return BaseValue + DerivedValue; 28 } 29 (lldb) v (WasmtimeVMContext *) __vmctx = 0x000001dc1b2b2e10 (WebAssemblyPtrWrapper<DerivedType>) this = (__ptr = <read memory from 0x1dc38812e1c failed (0 of 4 bytes read)>)
Here's the expression for
this
/__this
:Variable: id = {0x0000048d}, name = "this", type = "WebAssemblyPtrWrapper<DerivedType>", valid ranges = <block>, location = DW_OP_breg4 RSI+0, DW_OP_plus_uconst 0xc, DW_OP_breg5 RDI+96, DW_OP_deref, DW_OP_swap, DW_OP_const4u 0xffffffff, DW_OP_and, DW_OP_plus, decl = {RSI + 0xc, [RDI + 96]} ==> {[RDI + 96], RSI + 0xc} ==> [RDI + 96] + (RSI + 0xc) & 0xffffffff ; [RDI + 96] is __vmctx->memory ; RSI is the linear memory frame pointer (WASM level frame base, sans offset)
Here's annotated disassembly:
0x262b49d1520 <+0>: push rbp 0x262b49d1521 <+1>: mov rbp, rsp 0x262b49d1524 <+4>: mov r10, qword ptr [rdi + 0x8] 0x262b49d1528 <+8>: mov r10, qword ptr [r10 + 0x10] 0x262b49d152c <+12>: add r10, 0x30 0x262b49d1530 <+16>: cmp r10, rsp 0x262b49d1533 <+19>: ja 0x262b49d15ae ; <+142> at generic.cpp:27:5 0x262b49d1539 <+25>: sub rsp, 0x20 0x262b49d153d <+29>: mov qword ptr [rsp], rbx 0x262b49d1541 <+33>: mov qword ptr [rsp + 0x8], r12 0x262b49d1546 <+38>: mov qword ptr [rsp + 0x10], r13 0x262b49d154b <+43>: mov qword ptr [rsp + 0x18], r15 0x262b49d1550 <+48>: mov r15d, dword ptr [rdi + 0xa0] ; r15 = __vmctx->__stack_pointer 0x262b49d1557 <+55>: mov rsi, r15 ; rsi = r15 (__stack_pointer) 0x262b49d155a <+58>: sub esi, 0x10 ; rsi -= 16 0x262b49d155d <+61>: mov dword ptr [rdi + 0xa0], esi ; __vmctx->__stack_pointer = rsi 0x262b49d1563 <+67>: mov r12, qword ptr [rdi + 0x60] 0x262b49d1567 <+71>: mov dword ptr [r12 + rsi + 0xc], edx 0x262b49d156c <+76>: mov rbx, rdi 0x262b49d156f <+79>: mov r13, rdx 0x262b49d1572 <+82>: mov rsi, rbx ; rsi = __vmctx -> 0x262b49d1575 <+85>: call 0x262b49d1180 ; debug_break at generic.h:9:28 0x262b49d157a <+90>: mov rdx, r13 0x262b49d157d <+93>: mov edx, edx 0x262b49d157f <+95>: mov ecx, dword ptr [r12 + rdx] 0x262b49d1583 <+99>: mov rdx, qword ptr [r12 + rdx + 0x8] 0x262b49d1588 <+104>: mov dword ptr [rbx + 0xa0], r15d ; __vmctx->__stack_pointer = r15 0x262b49d158f <+111>: lea eax, [rcx + rdx] 0x262b49d1592 <+114>: mov rbx, qword ptr [rsp] 0x262b49d1596 <+118>: mov r12, qword ptr [rsp + 0x8] 0x262b49d159b <+123>: mov r13, qword ptr [rsp + 0x10] 0x262b49d15a0 <+128>: mov r15, qword ptr [rsp + 0x18] 0x262b49d15a5 <+133>: add rsp, 0x20 0x262b49d15a9 <+137>: mov rsp, rbp 0x262b49d15ac <+140>: pop rbp 0x262b49d15ad <+141>: ret 0x262b49d15ae <+142>: ud2
So the culprit here is the fact
rsi
, which initially does indeed contain the original WASM frame base (local0
), from which everything else gets evaluated, gets overwritten just before the call we've stopped on, and this is either not reflected in the frame info, or is not accounted for correctly.BTW, here we can also see why the WASM frame base is so prone to being "optimized out" even as it is always used at the end of the method. Local zero gets split into two distinct values: one referenced in code and not referenced in DI (
r15
), the other - not referenced in code, but referenced in DI (rsi
).
alexcrichton added the wasmtime:debugging label to Issue #9900.
SingleAccretion commented on issue #9900:
Upgrading the logging for location descriptor building, here's what we see:
Building ranges for values: L#0 : %rdx@[49..86) %r13@[86..94) L#1 : %r15@[49..105) L#3 : %rsi@[59..72) L#4 : %rdx@[94..100) L#5 : %rcx@[96..112) L#6 : %rcx@[96..112) L#8 : %rdx@[100..112) VMCTX: %rdi@[49..86) %rbx@[86..115)
L3
here is our frame base (DW_OP_WASM_location 0x0 0x3
), and it is correctly reported to live for only these three instructions:0x262b49d155d <+61>: mov dword ptr [rdi + 0xa0], esi ; __vmctx->__stack_pointer = rsi 0x262b49d1563 <+67>: mov r12, qword ptr [rdi + 0x60] 0x262b49d1567 <+71>: mov dword ptr [r12 + rsi + 0xc], edx
This implies a bug in range intersection.
SingleAccretion commented on issue #9900:
The complete range info looks like this:
Building ranges for values in scope: [0..144) L#0 : %rdx@[49..86) %r13@[86..94) L#1 : %r15@[49..105) L#3 : %rsi@[59..72) L#4 : %rdx@[94..100) L#5 : %rcx@[96..112) L#6 : %rcx@[96..112) L#8 : %rdx@[100..112) VMCTX : %rdi@[49..86) %rbx@[86..115) Intersecting with L#3 Intersecting with VMCTX Built ranges: [VMCTX:%rdi, L#3:%rsi]@[59..72)
So we end up with a correct (single) range. That leads me to believe the bug is here:
We can only optimize to a single expression when the range covers the entire scope, which I suspect it almost never will.
SingleAccretion edited a comment on issue #9900:
The complete range info looks like this:
Building ranges for values in scope: [0..144) L#0 : %rdx@[49..86) %r13@[86..94) L#1 : %r15@[49..105) L#3 : %rsi@[59..72) L#4 : %rdx@[94..100) L#5 : %rcx@[96..112) L#6 : %rcx@[96..112) L#8 : %rdx@[100..112) VMCTX : %rdi@[49..86) %rbx@[86..115) Intersecting with L#3 Intersecting with VMCTX Built ranges: [VMCTX:%rdi, L#3:%rsi]@[59..72)
So we end up with a correct (single) range. That leads me to believe the bug is here:
We can only optimize to a single expression when the range covers the entire scope, which I suspect it almost never will.
SingleAccretion edited a comment on issue #9900:
The complete range info looks like this:
Building ranges for values in scope: [0..144) L#0 : %rdx@[49..86) %r13@[86..94) L#1 : %r15@[49..105) L#3 : %rsi@[59..72) L#4 : %rdx@[94..100) L#5 : %rcx@[96..112) L#6 : %rcx@[96..112) L#8 : %rdx@[100..112) VMCTX : %rdi@[49..86) %rbx@[86..115) Intersecting with L#3 Intersecting with VMCTX Built ranges: [VMCTX:%rdi, L#3:%rsi]@[59..72)
So we end up with a correct (single) range. That leads me to believe the bug is here:
We can only optimize to a single expression when the range covers the entire scope, which I suspect it almost never will.
Last updated: Jan 24 2025 at 00:11 UTC