lann opened issue #5802:
Feature
Add a new `Engine::serialization_compatibility_hash` method that returns a hash of the "compiler info" appended to precompiled Wasm binaries.
Benefit
This would allow precompiled Wasm from multiple configurations/versions of Wasmtime to be safely and efficiently stored together. This may still be a little pessimistic compared with Wasmtime's actual compatibility checks but would be better than the conservative approach of only reusing precompiled binaries on a single host.
Implementation
Extract the following section from `append_compiler_info` and reuse it to compute a sha-256 digest of the "compiler info": https://github.com/bytecodealliance/wasmtime/blob/c3c16eb207eccd895f5fbbc4b771bd74ea36d071/crates/wasmtime/src/engine/serialization.rs#L50-L64
Alternatives
- There is some overlap between the features this would be used for and Wasmtime's cache system, but the cache system is currently somewhat opaque and difficult to reason about wrt shared storage and performance characteristics.
- This could be done today without changes to Wasmtime by precompiling a dummy module and hashing its `.wasmtime.engine` section, which seems fragile.
- https://github.com/bytecodealliance/wasmtime/issues/3900 suggests splitting out compatibility data into a separate structure, which appears to require a larger effort.
lann edited issue #5802:
Feature
Add a new `Engine::serialization_compatibility_hash` method that returns a hash of the "compiler info" appended to precompiled Wasm binaries.
Benefit
This would allow precompiled Wasm from multiple configurations/versions of Wasmtime to be safely and efficiently stored together by, for example, keying its storage on `(Hash(wasm), serialization_compat_hash)`. This may still be a little pessimistic compared with Wasmtime's actual compatibility checks but would be better than the conservative approach of only reusing precompiled binaries on a single host.
Implementation
Extract the following section from `append_compiler_info` and reuse it to compute a sha-256 digest of the "compiler info": https://github.com/bytecodealliance/wasmtime/blob/c3c16eb207eccd895f5fbbc4b771bd74ea36d071/crates/wasmtime/src/engine/serialization.rs#L50-L64
Alternatives
- There is some overlap between the features this would be used for and Wasmtime's cache system, but the cache system is currently somewhat opaque and difficult to reason about wrt shared storage and performance characteristics.
- This could be done today without changes to Wasmtime by precompiling a dummy module and hashing its `.wasmtime.engine` section, which seems fragile.
- https://github.com/bytecodealliance/wasmtime/issues/3900 suggests splitting out compatibility data into a separate structure, which appears to require a larger effort.
lann edited issue #5802:
Feature
Add a new `Engine::serialization_compatibility_hash` method that returns a hash of the "compiler info" appended to precompiled Wasm binaries.
Benefit
This would allow precompiled Wasm from multiple configurations/versions of Wasmtime to be safely and efficiently stored together by, for example, keying its storage on `(Hash(wasm), serialization_compat_hash)`. This may still be a little pessimistic compared with Wasmtime's actual compatibility checks but would be better than the conservative approach of only reusing precompiled binaries on a single host.
Implementation
Extract the following section from `append_compiler_info` and reuse it to compute a sha-256 digest of the "compiler info": https://github.com/bytecodealliance/wasmtime/blob/c3c16eb207eccd895f5fbbc4b771bd74ea36d071/crates/wasmtime/src/engine/serialization.rs#L50-L64
Alternatives
- There is some overlap between the features this would be used for and Wasmtime's cache system, but the cache system is currently somewhat opaque and difficult to reason about wrt shared storage and the cache "worker"'s performance characteristics.
- This could be done today without changes to Wasmtime by precompiling a dummy module and hashing its `.wasmtime.engine` section, which seems fragile.
- https://github.com/bytecodealliance/wasmtime/issues/3900 suggests splitting out compatibility data into a separate structure, which appears to require a larger effort.
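The keying scheme proposed in the issue could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ArtifactStore` and `digest` are hypothetical names, the compatibility hash is modeled as a plain `u64`, and `DefaultHasher` stands in for a stable digest such as SHA-256 only so that the example needs nothing beyond the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in digest. A real shared store would want a stable digest such as
/// SHA-256; `DefaultHasher` is used here only to keep the sketch std-only.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical store of precompiled artifacts keyed on
/// (digest of the Wasm bytes, engine compatibility hash).
#[derive(Default)]
struct ArtifactStore {
    entries: HashMap<(u64, u64), Vec<u8>>,
}

impl ArtifactStore {
    fn insert(&mut self, wasm: &[u8], compat_hash: u64, artifact: Vec<u8>) {
        self.entries.insert((digest(wasm), compat_hash), artifact);
    }

    fn get(&self, wasm: &[u8], compat_hash: u64) -> Option<&[u8]> {
        self.entries
            .get(&(digest(wasm), compat_hash))
            .map(|artifact| artifact.as_slice())
    }
}

fn main() {
    let wasm = b"\0asm\x01\0\0\0";
    // Hypothetical compatibility hashes for two engine configurations.
    let engine_a = digest(b"wasmtime 6.0.0, x86_64");
    let engine_b = digest(b"wasmtime 7.0.0, x86_64");

    let mut store = ArtifactStore::default();
    store.insert(wasm, engine_a, b"artifact for engine A".to_vec());

    // Same module, same engine configuration: hit.
    assert!(store.get(wasm, engine_a).is_some());
    // Same module, different engine configuration: miss, so recompile.
    assert!(store.get(wasm, engine_b).is_none());
}
```

Because the engine component is part of the key, artifacts produced by different Wasmtime versions or configurations can share one store without being confused with each other.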
alexcrichton commented on issue #5802:
One alternative idea for this would be to implement `Hash` and `Eq` for the `Config` type perhaps. That's more-or-less what the blob here is but would be a bit less specific to the current serialization strategy.
lann commented on issue #5802:
> implement Hash and Eq for the Config
That seems like it could work, though it would be a little more pessimistic compatibility-wise for settings that don't affect compilation or if a rustc update changes `Hash` (rare I assume).
One notable oddity is that we'd need to special-case `Config::module_version`'s `WasmtimeVersion` to mix in `CARGO_PKG_VERSION` somewhere.
Would this be documented behavior of `impl Hash for Config`? Is it reasonable to assume that `Config` will continue to cover everything used in compatibility checks?
alexcrichton commented on issue #5802:
That's true, yeah. I'm mostly thinking that I'd prefer to not bake in sha256 or something like that but to expose this through `Hash` and `Eq` if possible (in case anyone wants strict-equality as well instead of just hash-equality). One possibility would be something like `Engine::cache_key(&self) -> impl Hash + Eq` instead of returning the sha256 hash itself as well. We could then change the `serialization.rs` to use this directly to ensure nothing regresses as well perhaps.
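As a rough illustration of the `impl Hash + Eq` shape suggested here, the sketch below uses hypothetical stand-ins: this `Engine` and `CompatInfo` are not Wasmtime's types, and the fields are invented. It only shows how one opaque key can serve both hash-equality and strict equality.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the compatibility-relevant engine settings.
#[derive(Clone, Hash, PartialEq, Eq)]
struct CompatInfo {
    wasmtime_version: String,
    target: String,
    settings: Vec<(String, String)>,
}

/// Hypothetical stand-in for `Engine`.
struct Engine {
    compat: CompatInfo,
}

impl Engine {
    /// Returns an opaque key: callers can feed it to any `Hasher`
    /// or compare two keys for strict equality.
    fn cache_key(&self) -> impl Hash + Eq {
        self.compat.clone()
    }
}

/// Callers pick the hasher; nothing like sha256 is baked into the API.
fn hash_of(key: &impl Hash) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Helper that builds an engine with made-up settings.
fn engine(version: &str, opt_level: &str) -> Engine {
    Engine {
        compat: CompatInfo {
            wasmtime_version: version.to_string(),
            target: "x86_64-unknown-linux-gnu".to_string(),
            settings: vec![("opt_level".to_string(), opt_level.to_string())],
        },
    }
}

fn main() {
    let a = engine("6.0.0", "speed");
    let b = engine("6.0.0", "speed");
    let c = engine("7.0.0", "speed");

    // Identical configurations compare equal and hash identically.
    assert!(a.cache_key() == b.cache_key());
    assert_eq!(hash_of(&a.cache_key()), hash_of(&b.cache_key()));
    // A different version yields a different key under strict equality.
    assert!(a.cache_key() != c.cache_key());
}
```

Storing the version as a field of the hashed structure is one way to cover the "mix in the package version" concern raised earlier in the thread.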
lann commented on issue #5802:
What about `Engine::cache_key(&self) -> impl AsRef<[u8]>`?
lann edited a comment on issue #5802:
The semantics of `Hash` just don't seem quite right for this use. You can get a `u64` out of it for indexing but you wouldn't really use this key in a HashMap.
What about `Engine::cache_key(&self) -> impl AsRef<[u8]>`?
lann edited a comment on issue #5802:
The semantics of `Hash` just don't seem quite right for this. You can get a `u64` out of it for indexing but you wouldn't really use this key in a HashMap.
What about `Engine::cache_key(&self) -> impl AsRef<[u8]>`?
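A byte-oriented key along these lines might be used as sketched below. The `Engine` here is a hypothetical stand-in that simply holds some "compiler info" bytes, and `hex` and `storage_path` are illustrative helpers, not part of any real API. With long keys, a caller would probably digest the bytes before using them in a path.

```rust
/// Hypothetical stand-in for `Engine`, holding serialized "compiler info" bytes.
struct Engine {
    compiler_info: Vec<u8>,
}

impl Engine {
    /// Returns the key as opaque bytes; callers choose how to digest or encode it.
    fn cache_key(&self) -> impl AsRef<[u8]> {
        self.compiler_info.clone()
    }
}

/// Lowercase hex encoding, two characters per byte.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

/// Hypothetical storage layout: `<engine key>/<wasm digest>.cwasm`.
fn storage_path(wasm_digest: &str, engine: &Engine) -> String {
    format!("{}/{}.cwasm", hex(engine.cache_key().as_ref()), wasm_digest)
}

fn main() {
    let engine = Engine {
        compiler_info: vec![0x00, 0xab, 0xff],
    };
    // prints "00abff/deadbeef.cwasm"
    println!("{}", storage_path("deadbeef", &engine));
}
```

Unlike a `Hash`-based key, raw bytes can be used directly for on-disk or object-store naming without going through a `Hasher`.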
alexcrichton commented on issue #5802:
I suppose that's probably the only reasonable thing to implement in the short term. I'm hesitant to have something that feels so specific here, but if it solves a use case for y'all and it's already used elsewhere it shouldn't really increase maintenance burden at all.
lann commented on issue #5802:
Yeah, I actually think https://github.com/bytecodealliance/wasmtime/issues/3900 would be a more generally-useful approach but I don't think I have the context/time to tackle it myself. I'm not actually opposed to the "build a dummy module and read out the compiler info" hack as a short term workaround but I wouldn't want to rely on it for very long.
lann edited a comment on issue #5802:
Yeah, I actually think https://github.com/bytecodealliance/wasmtime/issues/3900 (if exposed publicly) would be a more generally-useful approach but I don't think I have the context/time to tackle it myself. I'm not actually opposed to the "build a dummy module and read out the compiler info" hack as a short term workaround but I wouldn't want to rely on it for very long.
alexcrichton commented on issue #5802:
Nah it's much easier to "just" add a method to `Engine` to expose that section rather than harvesting it yourself, so I think that's ok to add.
jameysharp closed issue #5802.
Last updated: Dec 23 2024 at 13:07 UTC