wasmtime / PR #12375 RR: Config knobs and validation · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / PR #12375 RR: Config knobs and validation

Wasmtime GitHub notifications bot (Jan 20 2026 at 17:15):

arjunr2 opened PR #12375 from arjunr2:main to bytecodealliance:main:

Initial PR for knobs for Config for record/replay feature

Wasmtime GitHub notifications bot (Jan 20 2026 at 17:15):

arjunr2 requested alexcrichton for a review on PR #12375.

Wasmtime GitHub notifications bot (Jan 20 2026 at 17:15):

arjunr2 requested wasmtime-core-reviewers for a review on PR #12375.

Wasmtime GitHub notifications bot (Jan 20 2026 at 17:15):

arjunr2 requested wasmtime-default-reviewers for a review on PR #12375.

Wasmtime GitHub notifications bot (Jan 20 2026 at 19:03):

github-actions[bot] added the label wasmtime:config on PR #12375.

Wasmtime GitHub notifications bot (Jan 20 2026 at 19:03):

github-actions[bot] added the label wasmtime:api on PR #12375.

Wasmtime GitHub notifications bot (Jan 20 2026 at 20:06):

github-actions[bot] commented on PR #12375:

Label Messager: wasmtime:config

It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:

[ ] If you added a new Config method, you wrote extensive documentation for
it.

<details>

Our documentation should be of the following form:

```text
Short, simple summary sentence.

More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.

Is this method fallible? If so, when can it return an error?

Can this method panic? If so, when does it panic?

Example

Optional example here.
```

</details>

[ ] If you added a new Config method, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.

<details>

For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.

Often, all that is required of you is to ensure that there is a knob for this
configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one
of its nested structs).

Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See [our docs on fuzzing][fuzzing-docs] for more details.

</details>

[ ] If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.

[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html

<details>

To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.

To add new label messages or remove existing label messages, edit the
<code>.github/label-messager.json</code> configuration file.

Learn more.

</details>

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton created PR review comment:

Personally I'm a bit wary of having this configuration method/convenience. We've got some documented known sources of non-determinisim and these two options are only a subset of the possible sources of non-determinism. Given that, and coupled with my thoughts about validation of a config, I think this would be served well as documentation of the Config::rr method but would prefer to avoid having an explicit method like this.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton created PR review comment:

Rust-idiom-wise we try to prefer exhaustive matches where possible unless it's unnecessarily onerous. In this case this'd be a good place to change to RRConfig::None => {}

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton created PR review comment:

Personally I'd say it's fine to. unconditionally define these methods and have them just return false when rr-the-feature is disabled.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton created PR review comment:

We've got more documentation online about this, but in the interest of shuffling things around and minimizing the impact of this one thought I might have here is:

Conditionally define the Recording and Replaying variants of RRConfig-the-enum

Mark RRConfig as #[non_exhaustive]

Unconditionally store RRConfig in various places

That would mean that when the Cargo features is disabled RRConfig is a zero-size-type with no runtime cost. It would reduce the conditional imports of RRConfig as well, too.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:17):

alexcrichton created PR review comment:

What would you think of ignoring the determinism options here instead of validating/enabling them? Historically as various config options have been tied together it's caused problems and made configuration more confusing due to trying to understand how everything interacts with each other. In some sense producing a recording is entirely orthogonal to deterministic simd/nans. In another sense I also understand how such a recording runs the risk of not being too useful.

For the engine-level configuration I'd argue, however, that this is best kept as a separate concern where we'd document in the RR configuration options that users probably also want to turn on deterministic things, but it wouldn't be a requirement.

Such a change would also have the nice benefit of keeping validate as &self vs &mut self changed in this PR. That's been an intentional design so far where Config is intended to not need any sort of post-processing. If post-processing is necessary we try to defer it to "store the result of the computation in the Engine" so Config continues to reflect the source of truth of configuration specified.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:19):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:19):

alexcrichton created PR review comment:

Oops forgot from the previous review -- mind adding handling of these options in this crate such that if the option is specified an error is returned? A future PR would then change it to calling a Config method or such as appropriate, but it'll avoid a state where we have these options but they don't do anything

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:26):

cfallin created PR review comment:

I think I'd gently push back against this: determinism is a fundamental part of record/replay, not an optional add-on, and without it the replay may not only be incorrect but be incorrect in interesting ways that violate internal invariants (finding the wrong kind of event as we read the trace alongside canonical ABI steps, necessitating some sort of fallback that, I don't know, returns an error? aborts halfway through allocating something? forces a trap halfway through the marshalling code?). I'd personally see this in the same light as, e.g., the need for bounds checks when memory is configured a certain way: it's Just How We Compile Things.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:26):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:59):

arjunr2 submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 19:59):

arjunr2 created PR review comment:

I can't comment much on the intended direction of Config, but I agree that determinism is a fundamental part of RR, and feels like something that should be implicitly enforced whenever that option is specified.

Wasmtime GitHub notifications bot (Jan 22 2026 at 20:16):

arjunr2 submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 20:16):

arjunr2 created PR review comment:

We would still need the None variant conditionally right?

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:13):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:13):

alexcrichton created PR review comment:

To be clear I'm not saying that things should be deterministic, I'm saying that I think it would make sense to avoid unconditionally coupling them here. The two options handled here, NaN bits and relaxed-simd, are pretty low down on the list of "things that practically cause nondeterminism" in wasm with resource exhaustion (stacks or memory) being much higher on that list. RR cannot control resource exhaustion during replay in the sense that it can't necessarily predict the stack consumption (maybe memory? that seems relatively advanced)

Basically I would expect that divergence of a replay from the original recording is something that's going to need to be handled no matter what. I think it'd be reasonable, for example, for the CLI to automatically set these options but at the wasmtime::Config layer we've generally had bad experiences tying all these together.

Put another way, I would be surprised if we could actually achieve absolutely perfect determinism during a record and replay. Inevitably it seems like we'll forget events, have bugs that prevent this, etc. Assuming perfect determinism to me sounds like it's going to introduce more subtle bugs than not and be a pretty steep uphill battle

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:14):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:14):

alexcrichton created PR review comment:

Assuming you mean unconditionally and that's a typo, true! RRConfig::None would always be present so RR-ignoring code could handle that as usual without #[cfg]

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:35):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 22 2026 at 22:35):

cfallin created PR review comment:

Right, I think it's an interesting challenge to think about permitting partial divergence and recovering. However I also think, having seen and thought through a lot of the challenges here, that is enormously complex and opens a huge new Pandora's box of issues. For example, with reversible debugging, which builds on top of replay, the whole algorithm depends on determinism; we'll back ourselves into fundamentally unsolvable corners if we don't have it.

See also e.g. how rr (the Mozilla project) panics with internal asserts if trace replay mismatches. I think that's the only really reasonable way to go here: we'll have asserts when we have mismatches. In other words, yep there may be bugs; let's treat them as bugs and catch and fix them.

Resource exhaustion is of course a different category: early termination because a memory.grow failed on replay is reasonable to propagate through and we already have the error paths for that. The kind of nondeterminism that is impossible to deal with is the kind that keeps running but with a poisoned machine state.

Wasmtime GitHub notifications bot (Jan 23 2026 at 02:29):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 23 2026 at 02:29):

alexcrichton created PR review comment:

Oh I definitely don't meant to give the impression that we should somehow recover a diverging replay. I would also expect a panic, error, or something like that. My assumption is that such infrastructure is a given, and with that whether or not NaN and relaxed-simd is deterministic is not a critical property requires for correctness, but instead a strong recommendation and what tooling should enable by default.

Wasmtime GitHub notifications bot (Jan 23 2026 at 02:35):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 23 2026 at 02:35):

alexcrichton created PR review comment:

Another possible option here, which we've done elsewhere, is to change defaults depending on configuration options. For example the default set of enabled wasm features is different if you use Winch than if you use Cranelift.

One way to slice this problem, without mutating Config, would be to keep the validation that determinism isn't explicitly disabled and then update the read of these configuration values to take into account the rr configuration. That would retain the fact that validate doesn't mutate Config, but the configuration for reading "is relaxed simd determininstic" would look like "was it set or is rr enabled" or something like that.

Wasmtime GitHub notifications bot (Jan 23 2026 at 19:14):

arjunr2 created PR review comment:

I think that could work too.

Wasmtime GitHub notifications bot (Jan 23 2026 at 19:14):

arjunr2 submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2026 at 15:44):

arjunr2 created PR review comment:

Is there any consensus then on the approach for this? Thinking about it a bit more, updating the value on reads could be misleading and requires all future uses of this to ensure it checks this edge case. Perhaps that's ok if it's stated explicitly in the code documentation somewhere?

Wasmtime GitHub notifications bot (Jan 28 2026 at 15:44):

arjunr2 submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2026 at 20:35):

alexcrichton submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2026 at 20:35):

alexcrichton created PR review comment:

I'd say my personal requirement is that fn validate should stay at &self instead of changing to &mut self. How exactly that plays out for this can be workable in a few ways (e.g. decouple these options, change those reading the options, change the source-of-truth for the options, etc).

Wasmtime GitHub notifications bot (Jan 28 2026 at 21:53):

cfallin submitted PR review.

Wasmtime GitHub notifications bot (Jan 28 2026 at 21:53):

cfallin created PR review comment:

Ah, I missed that this was happening in validate and making it a &mut self method -- I definitely agree that that's incorrect / violates the intended meaning of validate.

In earlier review I had pointed to this bit of logic for guest debugging (idea courtesy of Alex) where we change the default knob settings (prior to processing user overrides) based on other knobs; all of this is happening in a way that doesn't actually mutate the Config, just changes the tunables.

It seems that the determinism settings are not on Tunables so maybe that's not literally applicable here but in principle we should either do that, or do what validate says on the tin and simply reject an invalid config, not silently mutate.

Wasmtime GitHub notifications bot (Feb 02 2026 at 01:18):

arjunr2 updated PR #12375.

Wasmtime GitHub notifications bot (Feb 02 2026 at 01:21):

arjunr2 commented on PR #12375:

Ok I've addressed everything from the prior review. In particular, for validate right now, I have disabled the implicit enforcement of determinism, requiring users to explicitly provide it. It will reject invalid configs that don't meet this with an error.

Wasmtime GitHub notifications bot (Feb 02 2026 at 01:22):

arjunr2 edited a comment on PR #12375:

Ok I've addressed everything from the prior review. In particular, for validate right now, I have disabled the implicit enforcement of determinism, requiring users to explicitly provide it. It will reject invalid configs that don't meet this with an error. Seems like it should be ok to place it on the user to set appropriate sister settings for the RR.

Wasmtime GitHub notifications bot (Feb 02 2026 at 01:53):

alexcrichton submitted PR review:

Thanks! I'll flag for merge after a rebase (merge conflicts currently)

Wasmtime GitHub notifications bot (Feb 02 2026 at 04:17):

arjunr2 updated PR #12375.

Wasmtime GitHub notifications bot (Feb 02 2026 at 04:23):

alexcrichton has enabled auto merge for PR #12375.

Wasmtime GitHub notifications bot (Feb 02 2026 at 04:30):

alexcrichton added PR #12375 RR: Config knobs and validation to the merge queue.

Wasmtime GitHub notifications bot (Feb 02 2026 at 04:56):

alexcrichton merged PR #12375.

Wasmtime GitHub notifications bot (Feb 02 2026 at 04:57):

alexcrichton removed PR #12375 RR: Config knobs and validation from the merge queue.

Last updated: Feb 24 2026 at 04:36 UTC