Stream: git-wasmtime

Topic: wasmtime / issue #6785 macos: "failed to set thread excep...


view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 02:43):

thibaultcha opened issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 and 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). Not forking the master process (daemon off) seems to be working fine, even in forked (child) worker processes.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 02:43):

thibaultcha added the bug label to Issue #6785.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 03:02):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 and 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store. Not forking the master process (daemon off) seems to be working fine, even in forked (child) worker processes; instances are create and behave as expected.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 03:03):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store. Not forking the master process (daemon off) seems to be working fine, even in forked (child) worker processes; instances are create and behave as expected.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 15:41):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store. Not forking the master process (daemon off) seems to be working fine, even in forked (child) worker processes; instances are create and behave as expected.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 16:04):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store. Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes. The assertion failure specifically occurs when the master process forks itself into a background process with daemon on.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 16:05):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 16:08):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 16:09):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 30 2023 at 16:09):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably threads that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 15:25):

thibaultcha edited issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably state that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 16:18):

thibaultcha commented on issue #6785:

@casimiro has a good example and complete stacktrace of the assertion in #6788.

One thing we have tried to mitigate this is to destroy the engine before forking, in the hope that it would remove all secondary state, so that the forked process could call engine_new again and start from a clean slate. Alas, it does not make the assertion pass either.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 17:22):

peterhuene commented on issue #6785:

The error code is MACH_SEND_INVALID_RIGHT (the message body specified a port right which the caller didn't possess).

Wasmtime allocates a Mach port name (WASMTIME_PORT), for receiving exceptions from the kernel, upon the very first construction of an Engine.

The assertion failure is from the per-thread initialization that occurs to setup the thread exception port from threads that are running wasm code.

Without a fork, this works because all threads in a process see the same Mach port namespace. However, Mach port namespaces aren't inherited in a child process.

So I suspect we're going to run into an issue whenever an Engine is created prior to a fork(); I'm not sure exactly what our policy is on supporting fork as there's lots of complex, OS-specific initialization in the engine, but this code is clearly not intended to be fork-aware.

Note that there is a compile-time feature (posix-signals-on-macos) which will forego the Mach exception ports integration in favor of traditional signal handlers; it might be worth seeing if you can build the Wasmtime C API with this feature flag and see if it helps with this issue.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 17:36):

thibaultcha commented on issue #6785:

Thanks @peterhuene; ok so just like we thought, our hypothesis was the same. Although we think that destroying the engine prior to the fork should remedy to this, it feels like an issue in Wasmtime itself that it doesn't.

I'd say it's also a bummer that this cannot be disabled at runtime (in engine configuration) as now our users will have to compile Wasmtime themselves so they can disable this feature, they cannot just download one of the releases... But at least it is a path forward.

Thanks again!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 17:39):

thibaultcha edited a comment on issue #6785:

Thanks @peterhuene; ok so just like we thought, our hypothesis was the same. Although we think that destroying the engine prior to the fork should remedy this, it feels like an issue in Wasmtime itself that it doesn't.

I'd say it's also a bummer that this cannot be disabled at runtime (in engine configuration) as now our users will have to compile Wasmtime themselves so they can disable this feature, they cannot just download one of the releases... But at least it is a path forward.

Thanks again!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 17:56):

peterhuene commented on issue #6785:

I am not aware of a reason we couldn't "uninstall" the trap handlers or reset otherwise global state upon the destruction of the last Engine (i.e. to restore things the way they were prior to the construction of the first Engine).

I think it simply hasn't been a priority for us to do so as generally the majority use pattern for Wasmtime thus far has been to create a single Engine for the process and forego fork in favor of event queues.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:00):

peterhuene edited a comment on issue #6785:

I am not aware of a reason we couldn't "uninstall" the trap handlers or reset otherwise global state upon the destruction of the last Engine (i.e. to restore things the way they were prior to the construction of the first Engine).

I think it simply hasn't been a priority for us to do so as generally the majority use pattern for Wasmtime thus far has been to create a single Engine for the process and forego fork in favor of single-process event queues.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:01):

thibaultcha commented on issue #6785:

the majority use pattern for Wasmtime thus far has been to create a single Engine for the process and forego fork in favor of event queues.

Note that all Nginx is doing is forking into a background process which isn't so uncommon, I suspect more embeddings of Wasmtime might run into this issue in the long run.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:04):

peterhuene commented on issue #6785:

Is there a call to create the engine from within the initial process when the feature to fork the daemon is on?

If not, I can't really explain why that particular feature would be tripping things up if all it is doing is immediately forking.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:11):

thibaultcha commented on issue #6785:

Is there a call to create the engine from within the initial process when the feature to fork the daemon is on?

Yes there is or else we would have moved it already. We must open an engine and validate the .wasm bytecode being loaded before the fork() in order to potentially exit() if it is invalid. If we wait until after the fork() to do so, it is too late to ask Nginx to not start (i.e. go into background & fork() into worker processes).

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:15):

thibaultcha edited a comment on issue #6785:

Is there a call to create the engine from within the initial process when the feature to fork the daemon is on?

Yes there is or else we would have moved it already. We must open an engine and validate the .wasm bytecode being loaded before the fork() in order to potentially exit() if it is invalid. If we wait until after the fork() to do so, it is too late to ask Nginx to not start (i.e. go into background & fork() into worker processes).

It's ok though, for now we are looking into posix-signals-on-macos as you pointed out.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:15):

peterhuene commented on issue #6785:

I see, thank you for the clarification.

I suspect even in the daemon off case, we've initialized the per-thread exception port successfully as it's before the fork() of the worker process, but the worker probably inherited the fact it was initialized so doesn't attempt to do so again; thus I would expect trap handling to be broken if the wasm traps in the worker process.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:16):

peterhuene edited a comment on issue #6785:

I see, thank you for the clarification.

I suspect even in the daemon off case, we've initialized the per-thread exception port successfully as it's before the fork() of the worker process, but the worker probably inherited the fact it was initialized so doesn't attempt to do so again (and hence bypasses the assertion failure); thus I would expect trap handling to be broken if the wasm traps in the worker process.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:24):

thibaultcha edited a comment on issue #6785:

Is there a call to create the engine from within the initial process when the feature to fork the daemon is on?

Yes there is or else we would have moved it already. We must open an engine and validate the .wasm bytecode being loaded before the fork() in order to potentially exit() if it is invalid. If we wait until after the fork() to do so, it is too late to ask Nginx not to start (i.e. go into background & fork() into worker processes).

It's ok though, for now we are looking into posix-signals-on-macos as you pointed out.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:32):

zhongweiy commented on issue #6785:

For the posix-signals-on-macos feature, I find some error like:

cargo build --release -p wasmtime-c-api --lib --features posix-signals-on-macos
error: none of the selected packages contains these features: posix-signals-on-macos

And it can pass without wasmtime-c-api:
cargo build --release --lib --features posix-signals-on-macos

It seems the wasmtime-c-api package does not have this feature. Is it a bug?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:34):

peterhuene commented on issue #6785:

Yeah, it appears to be missing from the wasmtime-c-api crate, just needs to forward the feature to the wasmtime crate:

crates/c-api/Cargo.toml:

[features]
...
posix-signals-on-macos = ["wasmtime/posix-signals-on-macos"]

view this post on Zulip Wasmtime GitHub notifications bot (Jul 31 2023 at 18:36):

bjorn3 commented on issue #6785:

If you don't want to modify the source --features wasmtime/posix-signals-on-macos may also work.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2023 at 00:40):

thibaultcha commented on issue #6785:

Thanks all for the tips;

With macOS x86, traps are properly handled with posix-signals-on-macos, but on macOS ARM64, we get this exception on a trap with Wasmtime v11.0.1:

thread '<unnamed>' panicked at 'misaligned pointer dereference: address must be a multiple of 0x10 but is 0x105327c18', crates/runtime/src/traphandlers/unix.rs:217:22
stack backtrace:
   0: rust_begin_unwind
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:67:14
   2: core::panicking::panic_misaligned_pointer_dereference
             at /rustc/8ede3aae28fe6e4d52b38157d7bfe0d3bceef225/library/core/src/panicking.rs:174:5
   3: wasmtime_runtime::traphandlers::unix::get_pc_and_fp
             at ./work/runtimes/libwasmtime/repos/wasmtime/crates/runtime/src/traphandlers/unix.rs:217:22
   4: wasmtime_runtime::traphandlers::unix::trap_handler::{{closure}}
             at ./work/runtimes/libwasmtime/repos/wasmtime/crates/runtime/src/traphandlers/unix.rs:94:24
   5: wasmtime_runtime::traphandlers::tls::with
             at ./work/runtimes/libwasmtime/repos/wasmtime/crates/runtime/src/traphandlers.rs:747:18
   6: wasmtime_runtime::traphandlers::unix::trap_handler
             at ./work/runtimes/libwasmtime/repos/wasmtime/crates/runtime/src/traphandlers/unix.rs:79:19
   7: __platform_memmove

This seems to be coming from Wasmtime's interaction with the libc binding...

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2023 at 08:19):

alexcrichton commented on issue #6785:

That's a bug! I've posted a temporary fix for that at https://github.com/bytecodealliance/wasmtime/pull/6793

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2023 at 16:37):

fitzgen closed issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably state that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2023 at 16:47):

alexcrichton commented on issue #6785:

Oops didn't mean to close this

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2023 at 16:47):

alexcrichton reopened issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably state that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 04 2023 at 12:49):

alexcrichton commented on issue #6785:

I investigated this a bit recently and I'm not actually certain that we'll want to fix this. I think that the best solution here might be to recommend that macOS users who want to leverage fork should use signals via the Cargo feature rather than the default mach ports. If this is common then one option we could support is to make this a runtime configuration option instead of a compile-time configuration option too.

Otherwise though the issues on macOS that I found difficult to handle are:

I suppose that can all be mostly summarized as I think the only possibly-feasible thing to do here is to shut everything down when an Engine is destroyed, but I'm not personally all that comfortable doing that as it doesn't align with other platforms and I'm also not certain how robust we can make it.

Otherwise though @thibaultcha I'd ask you if having a custom build is difficult for y'all? If so we can move the signals-vs-ports to a runtime option instead. Additionally one other mitigation we could implement is to use pthread_atfork to force any wasm execution in the child to panic immediately rather than continue in a broken state, effectively having a more precise way of warning developers that "hey ports aren't compatible with fork please use this other thing to use signals instead"

view this post on Zulip Wasmtime GitHub notifications bot (Aug 04 2023 at 16:09):

thibaultcha commented on issue #6785:

Hi @alexcrichton, thanks for having a look at this too! It is ok for us to have a custom build, although a runtime configuration option would be even better since it means users can download any of the upstream Wasmtime releases which imho is ideal for a friction-less experience. How much work do you foresee that be? Perhaps it could be something one of us could contribute (a difficulty is that none of us on my team has access to a Macbook, but maybe we can write the patch off a box if it isn't too big).

Although, I do hope relying on this feature in the long run won't be a hassle for us (e.g. if Wasmtime on macos focuses more efforts & fixes on trap handling through ports at the detriment of signal dispositions); in which case having tighter control over the port handlers might be preferable, since we could shut it down before fork() and ensure all of our processes rely on the same mechanisms Wasmtime invests into...

view this post on Zulip Wasmtime GitHub notifications bot (Aug 04 2023 at 23:54):

alexcrichton commented on issue #6785:

Ok sounds good! Not too much work but no worries I've done some initial bits in https://github.com/bytecodealliance/wasmtime/pull/6807 now. If you'd like though some help in implementing a corresponding option in the C API would be appreciated!

As for the long term you should be good. Unix signals are unlikely to be as battle-tested as Mach ports because they're off-by-default for macOS, but they're turn on-by-default for Linux for example which means they're not completely untested. Y'all aren't the first to want to use signals instead of mach ports as well so it's something that I'd at least personally like to see continued support for in Wasmtime. Basically that's to say that, yes, you may run into future issues. I think we'll be interested in fixing them promptly, though, if they arise.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 31 2023 at 18:41):

alexcrichton commented on issue #6785:

I think everything is now handled here with a combo of what I mentioned above:

I'm going to close this now but @thibaultcha if there's anything else please comment here!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 31 2023 at 18:41):

alexcrichton closed issue #6785:

Hello,

Avid Wasmtime users in our Nginx embedding, we are facing an assertion failure when running it with Wasmtime on macos (x86 and arm64):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `268435466`,
 right: `0`: failed to set thread exception port', crates/runtime/src/traphandlers/macos.rs:474:9

This assertion systematically fails with Wasmtime 8.0.1 (and probably earlier) to 11.0.1 when Nginx is configured with daemon on (which ends here in Nginx forking into a background process). It seems it occurs in our call to wasmtime_linker_instantiate after having initialized an engine and a single store.

Not forking the master process (daemon off) seems to be working fine, even in forked worker processes (managed by the master process in foreground); instances are created and behave as expected in the Nginx worker processes.
The assertion failure specifically occurs once the master process has forked itself into a background process with daemon on.

In summary:

In the past we used to have a CI/CD pipeline with macos targets and Wasmtime used to work fine; but this CI/CD pipeline was taken down, and this bit us in the last few days. It seems like older macos work on Wasmtime has something to do with it, but I know nothing of the macos system interface...

So far the root cause of this assertion failure has eluded us; probably state that aren't being carried over to the forked process or something like this. Could someone shed some light on what may be at cause here?

Thanks!


Last updated: Nov 22 2024 at 16:03 UTC