peterhuene opened PR #2791 from compile-command to main:
This PR adds a
compilecommand to the Wasmtime CLI.The command can be used to Ahead-Of-Time (AOT) compile WebAssembly modules.
With the
all-archfeature enabled, AOT compilation can be performed for
non-native architectures (i.e. cross-compilation).The
Module::compilemethod has been added to perform AOT compilation.A few of the CLI flags relating to "on by default" Wasm features have been
changed to be "--disable-XYZ" flags.A simple example of using the
wasmtime compilecommand:$ wasmtime compile input.wasm $ wasmtime input.cwasm
peterhuene updated PR #2791 from compile-command to main.
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
*entries
bjorn3 submitted PR Review.
bjorn3 created PR Review Comment:
const COMPILED_MODULE_HEADER: &[u8] = b"\0wasmtimeaot";
peterhuene updated PR #2791 from compile-command to main.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Are you sure I didn't mean uncastrated male horse? :scream:
Good eye! I selected the wrong word in the spell checker drop-down menu it seems.
peterhuene updated PR #2791 from compile-command to main.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
I think that this can be done with
cranelift_other_flag(flag, "true"), right? (not sure if we needed an extra method for this)
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
FWIW this method can probably be removed since we can just hash the return value of
enabled_isa_flagsI think?
alexcrichton created PR Review Comment:
I've seen this in a few places, could this perhaps be refactored to a common function in
src/lib.rs?
alexcrichton created PR Review Comment:
We may actually want to avoid this to avoid duplicating the logic below, and instead we could initialize
retlikewasm_backtrace_detailsis initialized perhaps?
alexcrichton created PR Review Comment:
Instead of being a separate constructor could this be a configuration method?
We could document
Config::newas matching whatever the native machine is and then a.target(..)method which resets that internally?
alexcrichton created PR Review Comment:
FWIW I was just thinking about this recently and how we might have to do this. It seems kinda bad that we're guaranteed to break anyone using these flags as soon as things stabilize, and it feels kinda odd having a mishmash of enable/disable.
To help reduce the clutter in the CLI options as well as to make this a bit more future proof, what do you think of something like:
$ wasmtime --wasm-features=+simd $ wasmtime --wasm-features=-bulk-memory $ wasmtime --wasm-features=+multi-value,-bulk-memory,+threads $ wasmtime --wasm-features=+all(similar to LLVM features and
-Ctarget-featurein Rust)This also seems fine for me to file a follow-up too
alexcrichton created PR Review Comment:
This sort of makes me think that we should lean towards
-Cflags which affect codegen. In any case though this might be good to lift intoCommonOptionsperhaps?
alexcrichton created PR Review Comment:
Could this method perhaps be written as
Module::new(engine, bytes)?.serialize_into(output)?I'd ideally prefer to avoid having this method since it seems like it should be a trivial combination of a constructor plus a
serializeinvocation, and splitting it up shouldn't be all that much less ergonomic. (I'd imagine it's fine to add the compiled module header to serialized modules too and we could even support them inModule::newto automaticallydeserializeas well)
alexcrichton created PR Review Comment:
I think it'd be best if we didn't have to list out flags like this. I find it pretty unwieldy for the CLI since
wasmtime --helpbecomes super long in not always so interesting ways.I thought that this could be done with
--cranelift-extra-flag has_znver1=true, but that's naturally a whole lot more wordy than--znver1. Could we perhaps take some liberties and make this a bit more compact though? Similar to what I'm thinking below with wasm features we could do something like:
- Backends have the ability to return all possible flags (removing the need to keep the backend in sync with this file)
- We have a general
--codegen-features=...or something similar to Rust's-Cflag where we do-Ccranelift=...- The CLI would parse everything received and then match that against what the backend says is available, returning errors for anything enabled that the backend doesn't support
- We could implement something like
--codegen-features=helpto print out a helpful listing of all codegen features (similar to-Ctarget-feature=help)I think that'd help reduce the need to keep this file in sync with the cranelift backends and also clean up the CLI UI perhaps?
alexcrichton created PR Review Comment:
FWIW using
wat::parse_filewas done before because it automatically gives better error messages if the input file is*.wat. We can give rustc-style errors with filenames/line numbers and snippets outlined.
alexcrichton created PR Review Comment:
FWIW for this to be a bit more robust I think we'd want to move this outside of the bincode-encoded payload. Bincode may change formats (perhaps?) or we could change the structure here which means that if you're mismatching versions you're unlikely to get the error about different versions, that's only if the binary formats happen to align well.
alexcrichton created PR Review Comment:
With these various
check_*methods we can probably remove theHashimplementation forCompilersince IIRC it's only really used for this?
alexcrichton created PR Review Comment:
I always get this confused with the
isa_flagsbelow. Do you think we could use different names for these flags? IIRC there's like a two-level table of flags for cranelift ISAs, the "this applies to all targets" table and the "this applies to only one target" table. I think this is checking the target-independent table, right? And thecheck_isa_flagsbelow is checking the target-specific table?I basically found it a bit confusing to have this and
check_isa_flagsbelow.
alexcrichton created PR Review Comment:
Do you think it's worth ignoring this? Out of the codegen options this is definitely one where it doesn't actually affect the output in a super meaningful way. An unoptimized
Enginecould easily take an optimized precompiled wasm module
alexcrichton created PR Review Comment:
I wondering, instead of hashing all flags could we alter cranelift to return an iterator over enabled configuration settings and their values? That way we could store the full map here and have better error messages perhaps
alexcrichton created PR Review Comment:
Similar to above using a destructuring statement to exhaustively list all fields I think might work well here too?
alexcrichton created PR Review Comment:
Could this use named fields along the lines of:
let Tunables { static_memory_bound, /* ... */ } = self.tunables; // .. check all the fieldsso when we add a field to
Tunableswe're sure to add a case here too?
alexcrichton created PR Review Comment:
(ideally the same could be done for
opt_leveltoo)
alexcrichton created PR Review Comment:
Also I've personally never been really that sold on the
impl Writeoutput for this sort of use case. It's hard for me to imagine where the output is so large it must be streamed rather than being held resident in memory.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Unfortunately not;
Configurable::setandConfigurable::enablehave different functionality relating to presets: the former will error for a preset whereas the latter applies it.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
That sounds good :+1: I'll make that change.
peterhuene created PR Review Comment:
This is still used for the compiler's hash implementation to generate the unique-for-this-system hash used by the code cache.
Given that implementation is based on
Hash, I felt it easier to just keep as-is than implement something else that could do the less-specific checks that deserializing the AOT artifact now does.
peterhuene submitted PR Review.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
:+1: I'll make this change to keep the two in sync.
peterhuene created PR Review Comment:
I'll extract this out.
peterhuene submitted PR Review.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
That makes sense to me. I'll move this out and validate it from there.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Agreed, there's no reason this can't be done. The intention behind this was to keep the previous requirement of having the common flags match 1:1, but loosen the requirement behind the ISA flags as they are (currently) just CPU features and having the feature present on the host running the code (e.g. "I have AVX") while not targeted by AOT compilation (i.e. "I don't use AVX") should be allowed.
I can remove this check and allow the optimization level to differ (although I think it's a bit weird to say "I want to run this module as 'none' when it was already compiled optimized").
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Agreed, it is confusing.
flags_hashhere is the hash of the common/shared Cranelift flags and theisa_flagsare those enabled by the target ISA (currently these are 1:1 with CPU features).I can see about a better name.
peterhuene created PR Review Comment:
Excellent suggestion, I'll add that and we can then remove the assert on hashes as that was the intention for at least catching this case at runtime.
peterhuene submitted PR Review.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Definitely will add :+1:
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I'll see what I can do to improve this.
peterhuene created PR Review Comment:
It's still used for the code cache, which remains
Hash-based.
peterhuene submitted PR Review.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Do you know how it does that as I'm looking at the implementation of
_parse_fileinwatand it seems to just read the file to aVec<u8>and callparse_bytes, effectively what we're doing here?
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I like that idea.
I propose that we keep the
--enable-XXXas deprecated, undocumented flags for now for backcompat and to clean up--help, but perhaps break the--enable-XXX falseoptions that only exist for the on-by-default ones by turning them back to bools; those users we'd steer to--wasm-features=-XXXimmediately.Re the name: might
wasm-featuresbe slightly redundant given this is _wasm_time? Perhaps justfeatures?
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
There's one big reason we _don't_ call
Module::newfromModule::compile: this explicitly does not want to callCompiledModule::from_artifacts_listas that will register the JIT'd code for the platform.The compilation might be for a target completely foreign to the host, so creating a
Module, even a short-lived one that gets immediately serialized and dropped, doesn't make much sense.In fact, given that
Configcan now have a foreign target, I propose thatModule's constructors should be verifying that the given config's target matches the host before even attempting JIT compilation. I'll add that.Re:
impl WriteI don't see much point in buffering unbounded data entirely in memory to immediately write it to disk and discard it like we'd do for the compile command. I think, from an API-design perspective, thatimpl Writeoffers maximum flexibility for users of the API to do as they see fit: they can always just pass aVec<u8>and be done.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Lifting it into
CommonOptionsmeanswasmtime run --interuptable foo.cwasmwould be possible, which to me doesn't make nearly as much sense as the--wasm-timeoutoption specific to the run command that inherently enables interruption in a useful way.I think
wasmtime compile --interruptable foo.wasmand thenwasmtime run --wasm-timeout 60s foo.cwasmwould make sense to users (the latter errors if--interruptablewasn't given to the compile command).That said, though, I could be convinced that
wasmtime run --interruptable foo.cwasmis a way to run a AOT module that was compiled with interruption support without specifying a timeout, similar to--wasm-timeout <some-reaaaaaally-long-timeout>.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Oh whoops, I missed the
set_pathcall in the error handling. I'll change this back.
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
As I'd like to avoid opening the file twice (once to check if the AOT header is present before handing it off to
wat::parse_filewhich opens the file in turn), do you think we could add awat::Error::set_pathmethod that sets the path if the underlying error kind isWast?
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I like the idea of a
--codegen-featuresoption. This will require a bit of refactoring in the generated code for Cranelift's settings as the metadata about the settings isn't suitable for describing the settings to users this way.I'll see what I can make happen.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I've opened https://github.com/bytecodealliance/wasm-tools/pull/252 that would allow us to set the path in the error without having to resort to opening the file twice.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I've added a
--wasm-featuresoption with the latest commit.
peterhuene updated PR #2791 from compile-command to main.
alexcrichton created PR Review Comment:
:+1:
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Perhaps we could just outright remove this method now? With
Config::targetit doesn't seem too useful any more
alexcrichton created PR Review Comment:
Could this method (the one below that I can't actually comment on) be renamed to
cranelift_flag_setto mirror thecranelift_flag_enable?
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Yeah that was kind of my thinking where you could evaluate the overhead of interruptability with
wasmtime run --interruptable, but I don't think it's super important so seems fine to stay here for now.
alexcrichton created PR Review Comment:
Nah seems fine to just leave this until it trips someone up.
alexcrichton created PR Review Comment:
To bikeshed a bit, could this perhaps be called
cranelift_enableand the above method is calledcranelift_set? (I figure that if cranelift wants two different operations we could probably mostly just try to mirror them here well enough
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Bah ok, well in that case we should probably lean into this naming in all the various places, either
setorenableI think to be consistent about what maps to what
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Ah ok seems fine to leave for now in that case. I think this means, though, that we're ripe to refactor that part now too and use the same serialization/deserialization system at some point, but that's fine to happen later.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Ah I see, those are good points. I think though I'd ideally still like to push on a small set of constructors for
Moduleinstead of a standalonecompilefunction. Could we perhaps defer the registration of JIT'd code for the platform to instantiation? I'm imagining we could have aMutexof some kind which mutates the state ofModulethe first time it's instantiated or something like that.Personally I think it'd be a bummer if, when using cross-compilation, you're forced to never touch
Module::newand friends and you instead have to useModule::compile, but there's no real fundamental reason to do so per se.Another possible alternative is that if the module looks cross-compiled we just never do the registration bits (and somehow we should error-out in
Instantiateif the target actually mismatches the current host...), but I think I'd prefer to have the defer for now to avoid registering jit code on hosts where you don't actually do any instantiation.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I'm still inclined to believe that constructing a
Moduleimplies it's usable (i.e.instantiatable) for the current host. I think the constructors should error if 1) the module is AOT-compiled for a different target or 2) the associatedConfighas a target other than host and thus JIT doesn't make sense.I'm also inclined to believe that it's not desirable to produce a
Modulefor AOT compilation (even internally) as the point of AOT compilation is getting an artifact that can be loaded _at a later time_ and likely from a different host. I think that anyone making use ofModule::compilewould not be usingModule::newon the artifact from the same host as doing AOT compilation and loading it on the same host isn't gaining anything over the code cache.It's not just code registration that we're skipping in
Module::compilecurrently, but all the work done to get the module ready to be executed (i.e. "linking it" - copying the text sections into executable memory, applying relocations, etc.), none of which is required for AOT compilation.To me, at least, AOT compilation is a pretty explicit action that's independent of constructing a
Modulethat can be executed. PerhapsModule::compileisn't the best place for this, but I am strongly in favor of now allowing aModuleto construct if the target isn't for host.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
My main worry is that having
Module::{new, compile}could possibly be confusing for users because they both look like constructors and most might even lean towards "compile" since wasm is generally understood to be compiled to native code. Perhaps though we can solve all this by tweaking things a bit API-wise.In a hypothetical future version of the
wasmtimecrate you'll be able to jettison the runtime entirely and instead only have the cranelift compiler (and vice versa, you could only have the runtime without the compiler). In this hypothetical world ifModuleis a "ready to instantiate module" then we wouldn't want to expose it from the version of the crate that is missing the runtime. That would putModule::compilein a weird place since it needs to exist but the type it's attached to doesn't want to exist.Perhaps we could solve this by moving the method to
Engine::compile? That way it's theoretically far away enough api-wise to not confuse users, we'll still have anEngineeven with all modes of the wasmtime crate (whether you're removing the runtime or not), and we can clearly document that it's only intended for ahead-of-time-compiled cases.I think though that I would still want to tweak things a bit ideally:
- I think the format of the data from
Module::compileandModule::serializeshould be the same- We could probably remove
Module::deserializeand just document more sources of binary data forModule::newand/orfrom_binary- Personally I would still like to remove the
impl Writein favor of just returning aVec<u8>. I think that working with aVec<u8>is generally more convenient and if we really need it for performance we could always add something likecompile_intoorserialize_into.How does that all sound?
peterhuene created PR Review Comment:
I'm on-board with everything you just proposed; I agree that moving
compileout ofModuleinto (perhaps)Engine::compile_modulewould make it clear that this is not useful for constructing aModuleitself.I also am not opposed to using
Vec<u8>instead ofimpl Write; agree that it's easy enough to enhance if necessary in the future.
peterhuene submitted PR Review.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Oh actually a nice name might be
precompile_moduleor something like that to emphasize it a bit more?
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Even better!
peterhuene updated PR #2791 from compile-command to main.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Okay, so I kept the
--cranelift-setand--cranelift-enableflags, removed the various Cranelift flags fromwasmtime compile, and added awasmtime settingscommand that prints out the available settings for the target triple, e.g.:$ wasmtime settings Cranelift settings for target 'x86_64-apple-darwin': Boolean settings: has_sse3 Has support for SSE3. has_ssse3 Has support for SSSE3. has_sse41 Has support for SSE4.1. has_sse42 Has support for SSE4.2. has_avx Has support for AVX. has_avx2 Has support for AVX2. has_avx512dq Has support for AVX512DQ. has_avx512vl Has support for AVX512VL. has_avx512f Has support for AVX512F. has_popcnt Has support for POPCNT. has_bmi1 Has support for BMI1. has_bmi2 Has support for BMI2. has_lzcnt Has support for LZCNT. Presets: baseline A baseline preset with no extensions enabled. nehalem Nehalem microarchitecture. haswell Haswell microarchitecture. broadwell Broadwell microarchitecture. skylake Skylake microarchitecture. cannonlake Canon Lake microarchitecture. icelake Ice Lake microarchitecture. znver1 Zen (first generation) microarchitecture. Settings enabled for this host: - has_sse3 - has_ssse3 - has_sse41 - has_sse42 - has_avx - has_avx2 - has_popcnt - has_bmi1 - has_bmi2 - has_lzcntI've pushed up the commit, so let me know if this is inline with what you were thinking.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
The command intentionally doesn't present any of the "shared" settings as those really shouldn't be mucked around with by users, but we allow them for
--cranelift-set/--cranelift-enablefor users that know what they're doing.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Technically I think we might want to check a number of things here in the limit, e.g. if you enable avx512 or something like that but you don't have it we should probably bail out here at some point. I'm fine with that being a FIXME though to basically check things other than just the target.
alexcrichton created PR Review Comment:
Reading this again, is there perhaps a method on a
target_lexiconfor checking this? For example if the arch/os match you could I suppose in theory have a glibc/musl mismatch which would prevent running things. That's never an issue for us in wasmtime but this seems like an abstract thingtarget_lexiconmight handle for us.
alexcrichton created PR Review Comment:
Are these groups still applicable? (I won't pretend to know what clap is doing here)
alexcrichton created PR Review Comment:
Could the documentation be expanded here to mention that the intent of this method is that you'd take these bytes to
Module::newat some other time, perhaps on a different machine or in a different process?
alexcrichton created PR Review Comment:
That looks fantastic to me, thanks for doing that!
alexcrichton submitted PR Review.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
There's nothing I can see in
target_lexicon(re: a "compatibility check") that would be useful here. Did you happen to spot something that might be?
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Good eye, these should be removed.
peterhuene updated PR #2791 from compile-command to main.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I've removed
flags_hashin favor of storing all of the shared and ISA-specific settings in the compiled module.
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
I've done just that; we now store all the shared and ISA-specific settings in the compiled module and can now print out which of the shared flags is missing or different.
peterhuene updated PR #2791 from compile-command to main.
peterhuene updated PR #2791 from compile-command to main.
alexcrichton submitted PR Review.
alexcrichton created PR Review Comment:
Nah this is more of just a random musing.
alexcrichton submitted PR Review.
cfallin submitted PR Review.
cfallin created PR Review Comment:
One possible backwards-compatibility consideration: it looks like this will result in no encoding of default options in the saved compiled module, but if we later change a default in an incompatible way, would we want to catch that?
(It's entirely possible that this is made irrelevant by some other versioning -- do you encode e.g. a git hash or wasmtime version in the
.cwasmfile too?)
peterhuene submitted PR Review.
peterhuene created PR Review Comment:
Agreed that there are way too many ways incompatibilities can be introduced, so to combat that we do indeed record the version in the cwasm and must be an identical match for now.
peterhuene edited PR Review Comment.
peterhuene edited PR Review Comment.
peterhuene merged PR #2791.
Last updated: Dec 06 2025 at 06:05 UTC