arjunr2 opened PR #12375 from arjunr2:main to bytecodealliance:main:
Initial PR for knobs for
Configfor record/replay feature
arjunr2 requested alexcrichton for a review on PR #12375.
arjunr2 requested wasmtime-core-reviewers for a review on PR #12375.
arjunr2 requested wasmtime-default-reviewers for a review on PR #12375.
github-actions[bot] added the label wasmtime:config on PR #12375.
github-actions[bot] added the label wasmtime:api on PR #12375.
github-actions[bot] commented on PR #12375:
Label Messager: wasmtime:config
It looks like you are changing Wasmtime's configuration options. Make sure to
complete this check list:
[ ] If you added a new
Configmethod, you wrote extensive documentation for
it.<details>
Our documentation should be of the following form:
```text
Short, simple summary sentence.More details. These details can be multiple paragraphs. There should be
information about not just the method, but its parameters and results as
well.Is this method fallible? If so, when can it return an error?
Can this method panic? If so, when does it panic?
Example
Optional example here.
```</details>
[ ] If you added a new
Configmethod, or modified an existing one, you
ensured that this configuration is exercised by the fuzz targets.<details>
For example, if you expose a new strategy for allocating the next instance
slot inside the pooling allocator, you should ensure that at least one of our
fuzz targets exercises that new strategy.Often, all that is required of you is to ensure that there is a knob for this
configuration option in [wasmtime_fuzzing::Config][fuzzing-config] (or one
of its nestedstructs).Rarely, this may require authoring a new fuzz target to specifically test this
configuration. See [our docs on fuzzing][fuzzing-docs] for more details.</details>
[ ] If you are enabling a configuration option by default, make sure that it
has been fuzzed for at least two weeks before turning it on by default.[fuzzing-config]: https://github.com/bytecodealliance/wasmtime/blob/ca0e8d0a1d8cefc0496dba2f77a670571d8fdcab/crates/fuzzing/src/generators.rs#L182-L194
[fuzzing-docs]: https://docs.wasmtime.dev/contributing-fuzzing.html
<details>
To modify this label's message, edit the <code>.github/label-messager/wasmtime-config.md</code> file.
To add new label messages or remove existing label messages, edit the
<code>.github/label-messager.json</code> configuration file.</details>
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Personally I'm a bit wary of having this configuration method/convenience. We've got some documented known sources of non-determinisim and these two options are only a subset of the possible sources of non-determinism. Given that, and coupled with my thoughts about validation of a config, I think this would be served well as documentation of the
Config::rrmethod but would prefer to avoid having an explicit method like this.
alexcrichton created PR review comment:
Rust-idiom-wise we try to prefer exhaustive matches where possible unless it's unnecessarily onerous. In this case this'd be a good place to change to
RRConfig::None => {}
alexcrichton created PR review comment:
Personally I'd say it's fine to. unconditionally define these methods and have them just return
falsewhenrr-the-feature is disabled.
alexcrichton created PR review comment:
We've got more documentation online about this, but in the interest of shuffling things around and minimizing the impact of this one thought I might have here is:
- Conditionally define the
RecordingandReplayingvariants ofRRConfig-the-enum- Mark
RRConfigas#[non_exhaustive]- Unconditionally store
RRConfigin various placesThat would mean that when the Cargo features is disabled
RRConfigis a zero-size-type with no runtime cost. It would reduce the conditional imports ofRRConfigas well, too.
alexcrichton created PR review comment:
What would you think of ignoring the determinism options here instead of validating/enabling them? Historically as various config options have been tied together it's caused problems and made configuration more confusing due to trying to understand how everything interacts with each other. In some sense producing a recording is entirely orthogonal to deterministic simd/nans. In another sense I also understand how such a recording runs the risk of not being too useful.
For the engine-level configuration I'd argue, however, that this is best kept as a separate concern where we'd document in the RR configuration options that users probably also want to turn on deterministic things, but it wouldn't be a requirement.
Such a change would also have the nice benefit of keeping
validateas&selfvs&mut selfchanged in this PR. That's been an intentional design so far whereConfigis intended to not need any sort of post-processing. If post-processing is necessary we try to defer it to "store the result of the computation in theEngine" soConfigcontinues to reflect the source of truth of configuration specified.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Oops forgot from the previous review -- mind adding handling of these options in this crate such that if the option is specified an error is returned? A future PR would then change it to calling a
Configmethod or such as appropriate, but it'll avoid a state where we have these options but they don't do anything
cfallin created PR review comment:
I think I'd gently push back against this: determinism is a fundamental part of record/replay, not an optional add-on, and without it the replay may not only be incorrect but be incorrect in interesting ways that violate internal invariants (finding the wrong kind of event as we read the trace alongside canonical ABI steps, necessitating some sort of fallback that, I don't know, returns an error? aborts halfway through allocating something? forces a trap halfway through the marshalling code?). I'd personally see this in the same light as, e.g., the need for bounds checks when memory is configured a certain way: it's Just How We Compile Things.
cfallin submitted PR review.
arjunr2 submitted PR review.
arjunr2 created PR review comment:
I can't comment much on the intended direction of
Config, but I agree that determinism is a fundamental part of RR, and feels like something that should be implicitly enforced whenever that option is specified.
arjunr2 submitted PR review.
arjunr2 created PR review comment:
We would still need the
Nonevariant conditionally right?
alexcrichton submitted PR review.
alexcrichton created PR review comment:
To be clear I'm not saying that things should be deterministic, I'm saying that I think it would make sense to avoid unconditionally coupling them here. The two options handled here, NaN bits and relaxed-simd, are pretty low down on the list of "things that practically cause nondeterminism" in wasm with resource exhaustion (stacks or memory) being much higher on that list. RR cannot control resource exhaustion during replay in the sense that it can't necessarily predict the stack consumption (maybe memory? that seems relatively advanced)
Basically I would expect that divergence of a replay from the original recording is something that's going to need to be handled no matter what. I think it'd be reasonable, for example, for the CLI to automatically set these options but at the
wasmtime::Configlayer we've generally had bad experiences tying all these together.Put another way, I would be surprised if we could actually achieve absolutely perfect determinism during a record and replay. Inevitably it seems like we'll forget events, have bugs that prevent this, etc. Assuming perfect determinism to me sounds like it's going to introduce more subtle bugs than not and be a pretty steep uphill battle
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Assuming you mean unconditionally and that's a typo, true!
RRConfig::Nonewould always be present so RR-ignoring code could handle that as usual without#[cfg]
cfallin submitted PR review.
cfallin created PR review comment:
Right, I think it's an interesting challenge to think about permitting partial divergence and recovering. However I also think, having seen and thought through a lot of the challenges here, that is enormously complex and opens a huge new Pandora's box of issues. For example, with reversible debugging, which builds on top of replay, the whole algorithm depends on determinism; we'll back ourselves into fundamentally unsolvable corners if we don't have it.
See also e.g. how rr (the Mozilla project) panics with internal asserts if trace replay mismatches. I think that's the only really reasonable way to go here: we'll have asserts when we have mismatches. In other words, yep there may be bugs; let's treat them as bugs and catch and fix them.
Resource exhaustion is of course a different category: early termination because a memory.grow failed on replay is reasonable to propagate through and we already have the error paths for that. The kind of nondeterminism that is impossible to deal with is the kind that keeps running but with a poisoned machine state.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Oh I definitely don't meant to give the impression that we should somehow recover a diverging replay. I would also expect a panic, error, or something like that. My assumption is that such infrastructure is a given, and with that whether or not NaN and relaxed-simd is deterministic is not a critical property requires for correctness, but instead a strong recommendation and what tooling should enable by default.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Another possible option here, which we've done elsewhere, is to change defaults depending on configuration options. For example the default set of enabled wasm features is different if you use Winch than if you use Cranelift.
One way to slice this problem, without mutating
Config, would be to keep the validation that determinism isn't explicitly disabled and then update the read of these configuration values to take into account the rr configuration. That would retain the fact thatvalidatedoesn't mutateConfig, but the configuration for reading "is relaxed simd determininstic" would look like "was it set or is rr enabled" or something like that.
arjunr2 created PR review comment:
I think that could work too.
arjunr2 submitted PR review.
arjunr2 created PR review comment:
Is there any consensus then on the approach for this? Thinking about it a bit more, updating the value on reads could be misleading and requires all future uses of this to ensure it checks this edge case. Perhaps that's ok if it's stated explicitly in the code documentation somewhere?
arjunr2 submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I'd say my personal requirement is that
fn validateshould stay at&selfinstead of changing to&mut self. How exactly that plays out for this can be workable in a few ways (e.g. decouple these options, change those reading the options, change the source-of-truth for the options, etc).
cfallin submitted PR review.
cfallin created PR review comment:
Ah, I missed that this was happening in
validateand making it a&mut selfmethod -- I definitely agree that that's incorrect / violates the intended meaning ofvalidate.In earlier review I had pointed to this bit of logic for guest debugging (idea courtesy of Alex) where we change the default knob settings (prior to processing user overrides) based on other knobs; all of this is happening in a way that doesn't actually mutate the
Config, just changes the tunables.It seems that the determinism settings are not on
Tunablesso maybe that's not literally applicable here but in principle we should either do that, or do whatvalidatesays on the tin and simply reject an invalid config, not silently mutate.
Last updated: Jan 29 2026 at 13:25 UTC