bnjbvr opened issue #3256:
Our internal wasmtime testing on aarch64-darwin shows that something broke around July 17th: https://buildkite.com/embark-studios/wasmtime-aarch64-apple-darwin/builds?branch=main&page=5
When running `cargo test -p wasmtime-cli --test all`, I see a segfault. LLDB shows libunwind in the backtrace, then the `backtrace` crate, then `wasmtime::trap::Trap::new` as the first wasmtime function. It seems that it happens when we're trying to create a backtrace after resuming a fiber from another thread.
Unfortunately, I've been trying to chase down a commit that would have introduced this failure, with no success: I can go back to wasmtime as of June 1st (the last time I remember manually running tests on this machine) and still reproduce that crash. So it could be that the system-distributed version of libunwind changed after a macOS upgrade, and that it no longer works with our unwinding methods.
I'll try to investigate a bit more and report what I find, including whether there are workarounds.
bnjbvr labeled issue #3256.
alexcrichton commented on issue #3256:
Digging through those logs, it looks like the revision isn't printed when wasmtime is checked out (would y'all be up for changing it to print this?). Assuming the dates roughly line up with GitHub's dates, I think the range for the regression is probably somewhere in here --
All of that looks pretty benign though. https://github.com/bytecodealliance/wasmtime/pull/3082 and https://github.com/bytecodealliance/wasmtime/pull/3074 are arm64-specific but unrelated to unwinding. https://github.com/bytecodealliance/wasmtime/pull/3088 is related to async and stack checks, which makes it suspicious, but I also have a hard time seeing how that would cause something like this...
bnjbvr commented on issue #3256:
> would y'all be up for changing it to print this?
Good idea, added this :+1:
> All of that looks pretty benign though.
Yeah, I've tried to git bisect, but even the oldest commit I tried was failing, and that's one I wrote and tested manually at the time (I remember all the tests passing then). Hence my suspicion of a system-wide change. Will try to investigate a bit more this week.
Last updated: Nov 22 2024 at 16:03 UTC