Stream: git-wasmtime

Topic: wasmtime / issue #3360 Use `__builtin_setjmp` instead of ...


view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 17:54):

bjorn3 commented on issue #3360:

An important caveat is that GCC arranges to save and restore only those registers known to the specific architecture variant being compiled for. This can make __builtin_setjmp and __builtin_longjmp more efficient than their library counterparts in some cases, but it can also cause incorrect and mysterious behavior when mixing with code that uses the full register set.

This has me worried. What are the exact conditions in which this causes incorrect behavior?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 18:51):

sunfishcode commented on issue #3360:

This has me worried. What are the exact conditions in which this causes incorrect behavior?

Suppose one of the hardware architectures we support added a new register, and decides to have a convention that it's compiler-managed, and callee-saved. The host compiler doesn't use the new register, because it's targeting the generic architecture family, but Cranelift does use the new register, because cranelift-native.

Then we could get into a situation like this:

// compiled by cranelift
void outer(void) {
    write_the_new_register(42);
    catcher();
    assert(read_the_new_register() == 42);
}
// compiled by the host compiler
void catcher(void) {
    if (__builtin_setjmp(buf) == 0) {
        inner();
    }
}
// compiled by cranelift
void inner(void) {
    auto save = read_the_new_register();
    clobber_the_new_register();
    if (some condition) {
        jumper();
    }
    write_the_new_register(save);
}
// compiled by the host compiler
void jumper(void) {
    __builtin_longjmp(buf, 1);
}

In practice, this should be rare, because the major architectures don't add new registers very often, and when they do, it's usually not compiler-managed, and when it is, it's usually call-clobbered. New compiler-managed callee-saved state also breaks many existing libc-implemented setjmp functions in the same way, so it's not often done.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 18:52):

froydnj commented on issue #3360:

An important caveat is that GCC arranges to save and restore only those registers known to the specific architecture variant being compiled for. This can make __builtin_setjmp and __builtin_longjmp more efficient than their library counterparts in some cases, but it can also cause incorrect and mysterious behavior when mixing with code that uses the full register set.

This has me worried. What are the exact conditions in which this causes incorrect behavior?

It doesn't require mixing and matching with the full register set. You run into problems like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98997 for instance (clang has the same problem).

This particular case is not a problem for wasmtime, because all access to __builtin_{set,long}jmp should be within wasmtime itself (and you shouldn't be able to mix-and-match wasmtime runtimes...)

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 19:03):

bjorn3 commented on issue #3360:

In practice, this should be rare, because the major architectures don't add new registers very often

ymm/zmm for avx/avx512 are by default unknown to the compiled wasmtime as only SSE2 is enabled by default for most x86_64 targets.

and when it is, it's usually call-clobbered.

I see. The x86_64 SystemV does indeed list xmm, ymm and zmm as being clobbered.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 19:04):

sunfishcode commented on issue #3360:

Thanks for that link! Yes, in our case, we know that __builtin_setjmp and __builtin_longjmp are compiled by the same compiler with the same flags—they're in the same source file, so we're safe from that particular hazard at least.

We do have a hazard though, in the case of a mismatch between Cranelift's and the host compiler's callee-saved registers. I've added a comment about this to the PR; I'm open to ideas for what else we can do here.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 19:08):

sunfishcode commented on issue #3360:

For completeness, a form of this hazard is already present in Wasmtime; if someone statically links Wasmtime, the statically-linked-in sigsetjmp will hard-code the registers to save, and if Cranelift at runtime detects and uses more registers and relies on them being callee-saved, it'll also trigger UB.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 16 2021 at 20:43):

alexcrichton commented on issue #3360:

We may need to disable this for AArch64 with clang, locally when I try to compile it spits out:

foo.c:34:7: error: __builtin_setjmp is not supported for the current target
  if (platform_setjmp(buf) != 0) {
      ^~~~~~~~~~~~~~~~~~~~
foo.c:23:30: note: expanded from macro 'platform_setjmp'
#define platform_setjmp(buf) __builtin_setjmp(buf)
                             ^~~~~~~~~~~~~~~~~~~~~
foo.c:44:3: error: __builtin_longjmp is not supported for the current target
  platform_longjmp(*buf, 1);
  ^~~~~~~~~~~~~~~~~~~~~~~~~
foo.c:24:36: note: expanded from macro 'platform_longjmp'
#define platform_longjmp(buf, arg) __builtin_longjmp(buf, arg)
                                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.

Poking around in clang it looks like this is gated on hasSjLjLowering which is true for x86 and arm but not for AArch64

view this post on Zulip Wasmtime GitHub notifications bot (Sep 17 2021 at 22:08):

sunfishcode commented on issue #3360:

I fixed the code to compile on clang on aarch64, where __builtin_setmp isn't supported. If there are other architectures/compilers that don't support it, we can add them as needed.

Talking about this with @cfallin offline, adding new callee-saved registers to cranelift amounts to adding a new ABI, and there are lots of considerations, including calls to and from host code, the register save/restore code in wasmtime-fiber, the __builtin_setjmp issue here, trampolines, and possibly other things, so we'll need to carefully think through the design at that time. I've added a comment to the code added in this PR, which seems to be the main useful thing we can do for now.


Last updated: Jan 24 2025 at 00:11 UTC