Stream: git-wasmtime

Topic: wasmtime / issue #3256 aarch64-macos: crash when unwinding


Wasmtime GitHub notifications bot (Aug 27 2021 at 10:10):

bnjbvr opened issue #3256:

Our internal wasmtime testing on aarch64-darwin shows that something broke around July 17th: https://buildkite.com/embark-studios/wasmtime-aarch64-apple-darwin/builds?branch=main&page=5

When running cargo test -p wasmtime-cli --test all, I see a segfault. LLDB shows libunwind in the backtrace, then the backtrace crate, then wasmtime::trap::Trap::new as the first wasmtime function. It seems to happen when we're trying to create a backtrace after resuming a fiber from another thread.

Unfortunately, I've been trying to chase down the commit that introduced this failure, with no success: I can go back to wasmtime as of June 1st (the last time I remember manually running tests on this machine) and still reproduce the crash. So it could be that the system-distributed version of libunwind changed after a macOS upgrade, and that it no longer works with our unwinding methods.

I'll try to investigate a bit more and report what I find, including any workarounds.

Wasmtime GitHub notifications bot (Aug 27 2021 at 10:10):

bnjbvr labeled issue #3256:

Wasmtime GitHub notifications bot (Aug 27 2021 at 15:51):

alexcrichton commented on issue #3256:

Digging through those logs, it looks like the revision isn't printed when wasmtime is checked out (would y'all be up for changing it to print this?). Assuming the dates roughly line up with the GitHub dates I'm seeing, I think the range for the regression is probably somewhere in here:

https://github.com/bytecodealliance/wasmtime/compare/73fd702bb7cabdff156eaf4ef8db5b92b098583e...f628d06118ecde14e429fe1f4538831a51a1e75a

All of that looks pretty benign though. https://github.com/bytecodealliance/wasmtime/pull/3082 and https://github.com/bytecodealliance/wasmtime/pull/3074 are arm64-specific but unrelated to unwinding. https://github.com/bytecodealliance/wasmtime/pull/3088 is related to async and stack checks, which makes it suspicious, but I also have a hard time seeing how that would cause something like this...

Wasmtime GitHub notifications bot (Aug 30 2021 at 09:04):

bnjbvr commented on issue #3256:

> would y'all be up for changing it to print this?

Good idea, added this :+1:

> All of that looks pretty benign though.

Yeah, I've tried to git bisect, but even the oldest commit I tried was failing: one I wrote and tested manually at the time, and for which I remember all the tests passing. Hence my suspicion of a system-wide change. Will try to investigate a bit more this week.


Last updated: Nov 22 2024 at 16:03 UTC