Stream: git-wasmtime

Topic: wasmtime / issue #6810 aarch64: Fix `return_call`'s inter...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 05 2023 at 19:44):

github-actions[bot] commented on issue #6810:

Subscribe to Label Action

cc @cfallin, @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 15:57):

afonso360 commented on issue #6810:

We are going to have some issues with our test runner. We currently use cranelift-native to do feature detection before running a test invocation, so that we don't natively try to run a test with an extension that the host does not support. We currently only ever enable sign_return_address on MacOS, otherwise that feature is never enabled by cranelift-native, which means that our test runner will always skip these tests even if QEMU signals support for them.

I'm also not sure we can enable them in cranelift native, since I think that would automatically enable return address signing for anyone who just uses all native features, and that might not be desirable.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 15:58):

afonso360 edited a comment on issue #6810:

We are going to have some issues with our test runner. We currently use cranelift-native to do feature detection before running a test invocation, so that we don't natively try to run a test with an extension that the host does not support.

We currently only ever enable sign_return_address on MacOS, that feature is otherwise never enabled by cranelift-native, which means that our test runner will always skip these tests even if QEMU signals support for them.

I'm also not sure we can enable them in cranelift native (i.e. alongside paca), since I think that would automatically enable return address signing for anyone who just uses all native features, and that might not be desirable.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:05):

alexcrichton commented on issue #6810:

Heh I was just digging into this a bit more and reached the same conclusion! Apparently with QEMU 8.0.2 (what I happen to have installed locally) it will, by default, enable the necessary bits for authentication so the retaa instruction, for example, works with QEMU 8.0.2 out-of-the box. The same binary fails on one piece of native hardware I can run on which doesn't have support (presumably I suppose).

I can confirm though that if I force sign_return_address then the runtests as-is will crash. With this PR, however, they pass. In that sense I don't think a new runtest is necessarily needed since the wasm suite and runtests should cover everything already.

Otherwise though I don't feel like I know enough about aarch64 pointer authentication bits to know what to do here. Given that pointer authentication, as we're using it, is purely a function-local decision I don't know why we would ever want to disable sign_return_address. I'm also not sure why it matters which key (A or B) is used.

I'm also not sure we can enable them in cranelift native (i.e. alongside paca), since I think that would automatically enable return address signing for anyone who just uses all native features, and that might not be desirable.

Can you expand on this @afonso360? Do you know why this might be bad and/or not desirable?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:10):

cfallin commented on issue #6810:

Given that pointer authentication, as we're using it, is purely a function-local decision I don't know why we would ever want to disable sign_return_address. I'm also not sure why it matters which key (A or B) is used.

I guess it matters for unwinding? This is why e.g. we're forced to use the feature, and the B key specifically, on macOS, at least if I understand correctly. I'd be very curious to know if aarch64 Linux lets us do signing locally though; that would be a nice defense-in-depth thing...

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:14):

afonso360 commented on issue #6810:

I had the idea that there were some issues back when this was originally implemented, but it looks like its unwinding related as @cfallin mentioned.

Here's the original discussion: https://github.com/bytecodealliance/wasmtime/pull/3606#discussion_r936966019

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:14):

afonso360 edited a comment on issue #6810:

I got the impressions that there were some issues back when this was originally implemented, but it looks like its unwinding related as @cfallin mentioned.

Here's the original discussion: https://github.com/bytecodealliance/wasmtime/pull/3606#discussion_r936966019

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:15):

afonso360 edited a comment on issue #6810:

Can you expand on this @afonso360? Do you know why this might be bad and/or not desirable?

I got the impressions that there were some issues back when this was originally implemented, but it looks like its unwinding related as @cfallin mentioned.

Here's the original discussion: https://github.com/bytecodealliance/wasmtime/pull/3606#discussion_r936966019

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:15):

afonso360 edited a comment on issue #6810:

Can you expand on this @afonso360? Do you know why this might be bad and/or not desirable?

I got the impression that there were some issues back when this was originally implemented, but it looks like its unwinding related as @cfallin mentioned.

Here's the original discussion: https://github.com/bytecodealliance/wasmtime/pull/3606#discussion_r936966019

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 16:28):

afonso360 edited a comment on issue #6810:

Can you expand on this @afonso360? Do you know why this might be bad and/or not desirable?

I got the impression that there were some issues back when this was originally implemented, but it looks like its unwinding related as @cfallin mentioned (They might no longer be relevant, that was just the impression I got).

Here's the original discussion: https://github.com/bytecodealliance/wasmtime/pull/3606#discussion_r936966019

view this post on Zulip Wasmtime GitHub notifications bot (Aug 08 2023 at 17:47):

alexcrichton commented on issue #6810:

Ah ok yeah that all makes sense, and this is sort of coming back to me. So I think that functionally we could enable sign_return_address unconditionally for normal execution and wasm backtraces should work due to the use of our own custom frame-pointer unwinder which bakes in xpaclri to handle encrypted return addresses. The remaining issues I think are (a) what does std::backtrace::Backtrace::new() do when there's wasm on the stack and (b) is it possible to get a backtrace through wasm in a debugger. In theory both of these are functions of dwarf unwinding information we emit, which in theory is correct, but that may require newer debuggers/systems to handle that libgcc bug mentioned.

Testing this locally looks like the B key can't be used on Linux with Wasmtime since we haven't implemented the dwarf stuff to indicate that. If I force-enable sign_return_address then std::backtrace::Backtrace::new() segfaults with wasm on the stack. Additionally in gdb the stack trace (of that segfault) stops at the wasm on the stack. The segfault is in libgcc_s.so which may mean I'm running into this. I think I've then confirmed that where if I run in a recent container (ubuntu:23.10) then the backtrace looks correct. Running in a slightly older contains (ubuntu:22.10) I hit the same segfault.

So that's a long way of saying:

We don't use docker for tests in CI, just for release builds, so my guess is that the patched toolchain is so new that it's not available on GitHub actions by default (they're using ubuntu 22.04). This means that sign_return_address, if enabled, I believe would fail on GitHub actions if we acquire a native backtrace in tests (which we may not do, I forget). Given all this I'm tempted to say we shouldn't do anything else here at this time. In a few years we should be able to flip the sign_return_address config option by default and have everything "just work" though.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 09 2023 at 17:45):

alexcrichton commented on issue #6810:

The rabbit hole continues to go deep on this one.

I've attempted to follow-through on suggestions from today's Cranelift meeting by enabling sign_return_address for a single test case (or a few) throughout our runtest test suite. Turns out tail-call-conv.clif already does this! Turns out that not only passes in CI but additionally passes locally on my macbook. This appears to be due to:

On main if I specify sign_return_address_with_bkey then tests segfault as-is on macOS. That means that the test suite already has coverage of the necessary bits that's being fixed here. To help catch this in the future on CI I've added special logic to cranelift-filetest to specifically ignore the sign_return_address feature. This way we should now start running sign_return_address code, on Linux, in CI. This I can confirm segfaults qemu on main because QEMU supports has_pauth.

All-in-all I believe that this should now be run on CI.


Last updated: Dec 23 2024 at 12:05 UTC