JamesMcGuigan opened issue #4224:
I am trying to run linux wasmtime on Xeon E5430 CPUs (2007) without SSE4 (Streaming SIMD Extensions 4) support
I am getting the following error:
    $ wasmtime gcd.wasm --invoke gcd 42 36
    Error: failed to run main module `gcd.wasm`
    Caused by:
        0: compilation settings are not compatible with the native host
        1: compilation setting "has_sse42" is enabled but not available on the host
Test Case
WASM code is taken directly from the manual, and is 28 lines of handcoded WAT implementing the GCD (greatest common divisor) algorithm, compiled using wat2wasm
- https://docs.wasmtime.dev/lang-bash.html
- https://github.com/JamesMcGuigan/ecosystem-research/blob/04a8056151602f6de71feb6ce12f77f0ddd3cb8e/wasm/wat/gcd/gcd.wat
Steps to Reproduce
    git clone https://github.com/JamesMcGuigan/ecosystem-research/
    cd ./ecosystem-research/wasm/wat/gcd
    git checkout 04a8056151602f6de71feb6ce12f77f0ddd3cb8e
    wat2wasm gcd.wat
    wasmtime gcd.wasm --invoke gcd 42 36
Expected Results
WASM file can be validated through a custom nodejs runtime
    $ node ./gcd.js 42 36
    GCD of 42 and 36 is 6
Actual Results
    $ wasmtime gcd.wasm --invoke gcd 42 36
    Error: failed to run main module `gcd.wasm`
    Caused by:
        0: compilation settings are not compatible with the native host
        1: compilation setting "has_sse42" is enabled but not available on the host
Versions and Environment
I have tried both the prebuilt binary version of wasmtime
    sudo pacman -S wasmtime
    wasmtime 0.36.0
I have also tried installing from source:
    cd /usr/local/src/
    git clone https://github.com/bytecodealliance/wasmtime.git
    cd ./wasmtime/
    git submodule update --init
    cargo clean
    cargo build --release
/usr/local/src/wasmtime/target/release/wasmtime --version # wasmtime-cli 0.38.0
I then even attempted applying the following diff (which didn't help)
    diff --git a/crates/wasmtime/src/engine.rs b/crates/wasmtime/src/engine.rs
    index fbef95923..7df50ec5f 100644
    --- a/crates/wasmtime/src/engine.rs
    +++ b/crates/wasmtime/src/engine.rs
    @@ -446,8 +446,8 @@ impl Engine {
                 enabled = match flag {
                     "has_sse3" => Some(std::is_x86_feature_detected!("sse3")),
                     "has_ssse3" => Some(std::is_x86_feature_detected!("ssse3")),
    -                "has_sse41" => Some(std::is_x86_feature_detected!("sse4.1")),
    -                "has_sse42" => Some(std::is_x86_feature_detected!("sse4.2")),
    +                // "has_sse41" => Some(std::is_x86_feature_detected!("sse4.1")),
    +                // "has_sse42" => Some(std::is_x86_feature_detected!("sse4.2")),
                     "has_popcnt" => Some(std::is_x86_feature_detected!("popcnt")),
                     "has_avx" => Some(std::is_x86_feature_detected!("avx")),
                     "has_avx2" => Some(std::is_x86_feature_detected!("avx2")),
    diff --git a/src/commands/compile.rs b/src/commands/compile.rs
    index 08c3b06fb..c4d4350dd 100644
    --- a/src/commands/compile.rs
    +++ b/src/commands/compile.rs
    @@ -143,14 +143,14 @@ mod test {
             let command = CompileCommand::try_parse_from(vec![
                 "compile",
                 "--disable-logging",
    -            "--cranelift-enable",
    -            "has_sse3",
    -            "--cranelift-enable",
    -            "has_ssse3",
    -            "--cranelift-enable",
    -            "has_sse41",
    -            "--cranelift-enable",
    -            "has_sse42",
    +            //"--cranelift-enable",
    +            //"has_sse3",
    +            //"--cranelift-enable",
    +            //"has_ssse3",
    +            //"--cranelift-enable",
    +            //"has_sse41",
    +            // "--cranelift-enable",
    +            // "has_sse42",
                 "--cranelift-enable",
                 "has_avx",
                 "--cranelift-enable",
Operating system: Manjaro / Arch Linux
    cat /etc/lsb-release
    DISTRIB_ID=ManjaroLinux
    DISTRIB_RELEASE=21.2.6
    DISTRIB_CODENAME=Qonos
    DISTRIB_DESCRIPTION="Manjaro Linux"
Architecture:
    cat /proc/cpuinfo
    vendor_id : GenuineIntel
    cpu family : 6
    model : 23
    model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
    stepping : 10
    microcode : 0xa0b
    cpu MHz : 2660.057
    cache size : 6144 KB
    physical id : 1
    siblings : 4
    core id : 3
    cpu cores : 4
    apicid : 7
    initial apicid : 7
    fpu : yes
    fpu_exception : yes
    cpuid level : 13
    wp : yes
    flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 xsave lahf_lm pti dtherm
    bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
    bogomips : 5322.88
    clflush size : 64
    cache_alignment : 64
    address sizes : 38 bits physical, 48 bits virtual
    power management:
JamesMcGuigan labeled issue #4224:
alexcrichton commented on issue #4224:
Thanks for the report! By default Wasmtime enables features that are stable and merged into the WebAssembly specification, which at this time notably includes the SIMD proposal for the v128 type. This means that by default Wasmtime is enabling SIMD support in Cranelift. Cranelift on x86_64 currently requires SSE4.2 to be available to implement the SIMD proposal, which is the source of the error here. The system you're running on does not have SSE4.2 support (which the error message alludes to but probably could be clearer) but Wasmtime requires it to be enabled.
You should be able to work around this by passing --wasm-features=-simd which should disable the SIMD proposal. When disabled Cranelift shouldn't require SSE4.2. Note though that this isn't a super-well-tested path so I'm not 100% sure that would work.
The OP of this issue is about has_sse41 which isn't related to the SSE4.2 requirement, though. The diff you mentioned you applied would cause this error message and that diff isn't doing what you otherwise think it might be doing. There's unfortunately no easy way to patch Cranelift for this.
If you're interested, though, there are probably only a handful of SIMD instructions that require SSE4.2. Cranelift has support for generating different sequences of code depending on the active CPU features, so the real fix for this issue would be to add support to Cranelift for non-SSE4.2 CPUs by adding instruction lowerings using SSE4.1 and prior instead of only having a lowering for SSE4.2 (as we do today).
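As a minimal sketch of what "available on the host" means here, the following standalone Rust program queries the same std::is_x86_feature_detected! macro that appears in the diff above. The function name detect_sse_features and the pairing with Cranelift-style flag names are assumptions made for illustration; this is not Wasmtime's own code.

```rust
/// Report which SSE-family features the current CPU advertises, as
/// (Cranelift-style flag name, detected) pairs.
/// The function name and the returned shape are illustrative assumptions.
#[cfg(target_arch = "x86_64")]
fn detect_sse_features() -> Vec<(&'static str, bool)> {
    vec![
        ("has_sse3", std::is_x86_feature_detected!("sse3")),
        ("has_ssse3", std::is_x86_feature_detected!("ssse3")),
        ("has_sse41", std::is_x86_feature_detected!("sse4.1")),
        ("has_sse42", std::is_x86_feature_detected!("sse4.2")),
    ]
}

/// On other architectures there is nothing to report.
#[cfg(not(target_arch = "x86_64"))]
fn detect_sse_features() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    for (flag, present) in detect_sse_features() {
        println!("{} = {}", flag, present);
    }
}
```

On a CPU like the one in this report, one would expect has_sse41 to print as true and has_sse42 as false, matching the sse4_1 entry in the cpuinfo flags.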
JamesMcGuigan commented on issue #4224:
I have tried both
    $ cargo clean; cargo build --release --wasm-features=-simd
    error: Found argument '--wasm-features' which wasn't expected, or isn't valid in this context
and
    $ /usr/local/src/wasmtime/target/release/wasmtime --wasm-features=-simd gcd.wasm --invoke gcd 4 2
    Caused by:
        0: compilation settings are not compatible with the native host
        1: compilation setting "has_sse42" is enabled, but not available on the host
How do I pass the --wasm-features=-simd flag?
alexcrichton commented on issue #4224:
The first invocation:
    cargo build --release --wasm-features=-simd
is passing the flag to Cargo and Cargo doesn't understand this flag. It's actually for the wasmtime executable. The second invocation:
    /usr/local/src/wasmtime/target/release/wasmtime --wasm-features=-simd gcd.wasm --invoke gcd 4 2
is the way to pass the flag. If that doesn't work, though, then alas: while this is the way I believe it's supposed to work, I think you're the first to test this out, so it's probably not working as intended. That's a bug to fix internally I believe.
cfallin commented on issue #4224:
@JamesMcGuigan thank you for this report, and I'll say at a high level that the goal of Cranelift's x86-64 backend is to work and support Wasm-SIMD on any x86-64 machine, hence require only SSE2. We aren't quite there yet; #3810 is the tracking issue for this goal.
Echoing @alexcrichton , if we can't run with SIMD deselected on your system then that's a bug. @abrown would you be willing to take a look at this? Otherwise I can try to get to it soon (sometime next week).
JamesMcGuigan edited issue #4224:
abrown commented on issue #4224:
After reading through this issue, I think that this could take some time to resolve properly: 1) identify all SSE4.2+ lowerings (these are not currently conditioned by any has_... predicates), 2) add has_* to all SSE2+ x64 lowerings, 3) figure out SSE2 lowerings for all of these cases (which could be quite a few?). Seems like a lot of work for little pay-off.
Since @JamesMcGuigan isn't really trying to run SIMD code, perhaps an alternative would be to start with steps 1) and 2) above and then remove the code that requires Cranelift to compile Wasm SIMD code with SSE4.2. (I assume that initially we would need to actually fail in an ISLE if-let, not just "not match", if the SSE feature is not present.) This way, Wasm SIMD can remain a default feature and users without SSE2+ can still compile most programs. If they run across a program that does use an SSE4.2 instruction, e.g., then they would see a targeted error message (e.g., "v128.splat requires has_sse42 to compile"), leaving them no worse off than they are now. @cfallin, @alexcrichton: thoughts on this approach?
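The "fail with a targeted error" idea can be sketched in plain Rust as follows. Everything here is hypothetical: required_feature and lower are made-up names, the real lowerings are written in ISLE rather than as a Rust match, and the v128.splat entry simply reuses the example from the comment without checking which instructions actually need SSE4.2.

```rust
/// Hypothetical table: the CPU feature an operation's only lowering needs.
/// `None` means the SSE2 baseline is assumed to be enough.
fn required_feature(op: &str) -> Option<&'static str> {
    match op {
        // Taken from the example in the comment; not a verified requirement.
        "v128.splat" => Some("has_sse42"),
        _ => None,
    }
}

/// Lower one operation, failing with a targeted message when the feature
/// its lowering depends on is not enabled.
fn lower(op: &str, enabled: &[&str]) -> Result<String, String> {
    match required_feature(op) {
        Some(feature) if !enabled.contains(&feature) => {
            Err(format!("{} requires {} to compile", op, feature))
        }
        _ => Ok(format!("lowered {}", op)),
    }
}

fn main() {
    // Feature set resembling the reporter's CPU: SSE4.1 but not SSE4.2.
    let enabled = ["has_sse3", "has_ssse3", "has_sse41"];
    for op in ["i32.add", "v128.splat"].iter() {
        match lower(op, &enabled) {
            Ok(msg) => println!("{}", msg),
            Err(err) => println!("error: {}", err),
        }
    }
}
```

With this shape, a module that never touches SIMD compiles on an SSE2-only feature set, and only the specific operation that needs more reports an error.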
alexcrichton commented on issue #4224:
I think that would work but would need more infrastructure to plumb it all the way through because currently Wasmtime will iterate over enabled features in Cranelift and pessimistically assume that every feature was used for some instruction lowering, testing that each feature is available on the host. If we switched lowerings to fail if the required feature wasn't enabled then we'd also need to track which features were actually used and have that be an additional output of compilation.
That being said though I think the easiest way to handle the immediate issue here would be to fix disabling the simd feature in wasmtime. Doing so should ideally disable this block which should disable the need for feature sets above sse2. Currently, though, we may erroneously fail to translate --wasm-features=-simd into disabling the "simd" feature in Cranelift.
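The pessimistic check described above can be sketched as a loop over every enabled flag. This is an illustration only: check_host_compat is a hypothetical name, the host is simulated by a closure, and the message text is copied from the error output quoted earlier in this thread.

```rust
/// Pessimistic compatibility check: every enabled compilation flag must be
/// available on the host, whether or not any generated code used it.
/// `host_has` stands in for real CPU feature detection.
fn check_host_compat(
    enabled: &[&str],
    host_has: impl Fn(&str) -> bool,
) -> Result<(), String> {
    for &flag in enabled {
        if !host_has(flag) {
            return Err(format!(
                "compilation setting \"{}\" is enabled, but not available on the host",
                flag
            ));
        }
    }
    Ok(())
}

fn main() {
    // Simulated host resembling the CPU in this issue: no SSE4.2.
    let host_has = |flag: &str| flag != "has_sse42";
    let enabled = ["has_sse3", "has_ssse3", "has_sse41", "has_sse42"];
    match check_host_compat(&enabled, host_has) {
        Ok(()) => println!("compatible"),
        Err(err) => println!("Error: {}", err),
    }
}
```

Because the loop looks at enabled flags rather than used ones, even a scalar-only module such as gcd.wasm is rejected; tracking used features would need the extra compilation output mentioned above.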
alexcrichton edited a comment on issue #4224:
cfallin commented on issue #4224:
Hmm, so I'll put up a PR in a second that at least gets further -- when emulating an SSE2-machine (by deleting the feature macro tests in cranelift-native), I found that the changes in #3816 to turn on SSE3, SSSE3, SSE4.1 and SSE4.2 in the default Flags so that a default-constructed compiler backend can work with default Wasmtime options were having an unintended effect: cranelift-native wouldn't actually produce flags with those off at all (because the logic was of the form "if feature present, set flag bit" starting with a default that's not all-zeroes).
However it seems we have some other issues in this SSE2-only world -- e.g.:
    [cfallin@xap]~/work/wasmtime% target/release/wasmtime run --wasm-features=-simd ../wasm-tests/spidermonkey.wasm
    simd = false
    sse3 = false
    sse42 = false
    thread 'main' panicked at 'not implemented: cannot generate relocation against libcall CeilF64', crates/cranelift/src/obj.rs:154:21
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
so I think to safely say that we support such platforms, even without SIMD, we'd want to add a CI target (can we use qemu to emulate x86-64 on x86-64, but with only SSE2?) and test it.
Finally, on the topic of "should we support SIMD on SSE2-only", I agree it's some amount of work, but I do think it's valuable. I don't want to fall into the trap of "most users have SSE x so we require it / make life without it increasingly painful" -- the point of Wasm is to be universal and work on all platforms, and that ethos I think should extend to having fallbacks for older CPU versions as well. It's certainly lower priority than a lot of the other backend work we have on our TODO lists (e.g. ISLE migration and the like), but we absolutely should do this audit at some point.
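The flag-detection problem described at the start of this comment ("if feature present, set flag bit" on top of a default that is not all-zeroes) can be reproduced in a few lines. This is a simplified model, not the cranelift-native source and not the actual change in the PR: Flags, Host, detect_set_only and detect_assign are made-up names.

```rust
/// Simplified stand-in for the compiler's feature flags.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Flags {
    has_sse41: bool,
    has_sse42: bool,
}

impl Default for Flags {
    /// Models the situation described: SSE features are on by default.
    fn default() -> Self {
        Flags { has_sse41: true, has_sse42: true }
    }
}

/// Simulated result of CPU feature detection.
struct Host {
    sse41: bool,
    sse42: bool,
}

/// Problematic shape: bits are only ever set, never cleared, so a default
/// of `true` survives on a host that lacks the feature.
fn detect_set_only(host: &Host) -> Flags {
    let mut flags = Flags::default();
    if host.sse41 {
        flags.has_sse41 = true;
    }
    if host.sse42 {
        flags.has_sse42 = true;
    }
    flags
}

/// One possible correction: assign every bit from the detection result.
fn detect_assign(host: &Host) -> Flags {
    Flags {
        has_sse41: host.sse41,
        has_sse42: host.sse42,
    }
}

fn main() {
    // Host resembling the reporter's CPU: SSE4.1 present, SSE4.2 absent.
    let host = Host { sse41: true, sse42: false };
    println!("set-only: {:?}", detect_set_only(&host));
    println!("assign:   {:?}", detect_assign(&host));
}
```

In the set-only version has_sse42 stays true for this host, which is the kind of mismatch that leads to the "enabled but not available on the host" error.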
cfallin commented on issue #4224:
OK so I just spoke with Andrew for a bit and this is actually a really interesting and difficult design problem where we're stuck between multiple somewhat-mutually-incompatible constraints.
The ground truths:
- Wasm-SIMD is part of baseline Wasm now (i.e., the spec is merged), so it should be on by default in Wasmtime.
- Our current x86-64 backend supports Wasm-SIMD with SSE4.2, but not with only SSE2.
- Supporting it with only SSE2 would be great, but a huge amount of work and hard to motivate/justify given all the other bits of work. We should do it someday (but when and by whom?).
- The Wasm-SIMD folks mostly assumed a baseline of SSE4.2 on x86-64, so polyfilling back to SSE2 is not necessarily trivial.
- Wasmtime should support running on as many machines as possible, including SSE2-only.
These are clearly in conflict; the options (at least) are:
- Turn off SIMD by default on x86-64. Terrible idea, let's not do this!
- Current status quo: "default" compiler that the API will give you requires SSE4.2, and so works with default Wasmtime. With fix in #4231, cranelift-native on an SSE2-only system will give you the right flags and Wasmtime will refuse to construct a backend unless you explicitly turn off SIMD in the settings.
- Use CPUID feature detection unconditionally (cranelift-native) for default Flags in:
  - wasmtime-cli; this is fine, AND
  - the public Wasmtime and Cranelift APIs; much more arguable! This breaks determinism: one should expect that programmatically invoking Wasmtime to compile .cwasms should not depend on the host flags unless one goes out of the way to make it so.
Basically the question is: does a default Wasmtime (i) assume SSE4.2 and compile artifacts for it unconditionally, unless otherwise configured, (ii) grow some new nondeterminism and do that only on some hosts, or (iii) not support SIMD on x86-64 unless explicitly enabled. The fourth option (iv) support SIMD on SSE2 is difficult because the Wasm-SIMD design mostly assumes an SSE4.2 baseline, as per @abrown. The only tenable option seems (i) but it's kind of ugly.
@abrown I probably missed some things, anything else to note here?
alexcrichton commented on issue #4224:
However it seems we have some other issues in this SSE2-only world -- e.g.:
I believe this used to work historically but in some recent refactoring I couldn't figure out why we still had all these libcalls and things to handle so I ended up removing support for libcalls from Wasmtime. It wouldn't be too hard to add back support for libcalls to Wasmtime I think, we'd just have to think somewhat carefully about how to integrate it into compiled images.
Otherwise though I suspect that Cranelift supports everything necessary for non-SIMD wasm with just SSE2, it's the Wasmtime bits which may need some work.
For what to do about the current situation, when you mention breaking determinism I think we already do exactly that today. By default Engine::new() will use the native host with all the natively detected features, which means that wasmtime compile could vary its output between hosts. This can be fixed by passing the --target flag, however.
I would also personally advocate that it's not the correct default to disable all options and optionally enable things at runtime. Even if Cranelift and Wasmtime hypothetically supported the entire wasm spec with just SSE2 that means that to get good performance you'd have to manually opt-in to what's present on any CPU for the last decade. In my opinion it should be easy to get the native host's performance and even "target x86_64 please" in my opinion should do roughly what we do right now which is to assume SSE4.2. We should of course have knobs to still work with SSE2 but having to reach for specific support to run code on old CPUs does not seem unreasonable to me.
cfallin commented on issue #4224:
Ah, I hadn't realized that Engine::new() uses cranelift-native under the hood; if we're already in that universe, then I think this is actually a somewhat easier problem to solve, indeed. Basically we need to fix the flags detection (PR above) and then we're almost there. The SSE4.2-on-by-default-in-Flags bit is then mostly there to support tests, which don't use cranelift-native otherwise. (Maybe we could fix this too, as a shorter path to removing the hack than full SIMD-in-SSE2.)
Followup thoughts:
- I wonder if it would make sense to have an extension to --target, or maybe an --arch flag or something, for wasmtime compile? Basically I want the equivalent of clang -march=native or clang -march=skylake; right now we have an N-dimensional vector of feature flags (--cranelift-set has_sse3=true --cranelift-set has_ssse3=true --cranelift-set has_sse41=true ...).
- If we have that, does it make sense to make a plain --target x86_64 mean "x86_64 with SSE4.2", and require something like --target x86_64:sse2 or --target x86_64 --arch baseline or something to generate a fully-compatible .cwasm that uses only SSE2?
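To make the second follow-up thought concrete, here is a sketch of how such target presets could map to feature flags. It is purely hypothetical: the x86_64:sse2 spelling is only floated in the comment above and is not an existing wasmtime option, and flags_for_target is a made-up helper.

```rust
/// Hypothetical mapping from a target spec to the Cranelift-style feature
/// flags it would enable. Neither the spec syntax nor this function exists
/// in Wasmtime; both only illustrate the proposal.
fn flags_for_target(spec: &str) -> Option<Vec<&'static str>> {
    match spec {
        // Plain x86_64 would mean "x86_64 with SSE4.2".
        "x86_64" => Some(vec!["has_sse3", "has_ssse3", "has_sse41", "has_sse42"]),
        // Explicit baseline: nothing beyond SSE2 is assumed.
        "x86_64:sse2" => Some(Vec::new()),
        _ => None,
    }
}

fn main() {
    for &spec in ["x86_64", "x86_64:sse2"].iter() {
        match flags_for_target(spec) {
            Some(flags) => println!("{} -> {:?}", spec, flags),
            None => println!("{} -> unknown target", spec),
        }
    }
}
```

A preset like this would replace the long list of per-feature --cranelift-set arguments with a single name, much as -march does for clang.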
alexcrichton commented on issue #4224:
Oh sorry one thing I also forgot to mention was that we theoretically get a good deal of coverage for target features with this fuzz configuration which attempts to flip various flags available to Cranelift to force it to generate instructions without a particular codegen feature. This testing relies on strict assertions in Cranelift, though, that when an instruction is emitted the corresponding feature is enabled.
For flags the main precedent I know of is C/Rust compilers where it basically boils down to the --target having a default set of features enabled and then -Ctarget-feature=+foo flags are used to enable/disable individual features. Individual features then understand hierarchies so I think if you enable SSE4.2 you automatically get SSE4.1.
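The hierarchy idea can be sketched as a small implication chain. The chain below (sse4.2 implies sse4.1, then ssse3, sse3, sse2) is an assumption modelled on how the SSE generations build on one another; implied_by and expand are illustrative names, not a compiler API.

```rust
/// Assumed implication chain between SSE generations: enabling a feature
/// also enables the one it directly builds on.
fn implied_by(feature: &str) -> Option<&'static str> {
    match feature {
        "sse4.2" => Some("sse4.1"),
        "sse4.1" => Some("ssse3"),
        "ssse3" => Some("sse3"),
        "sse3" => Some("sse2"),
        _ => None,
    }
}

/// Expand one requested feature into itself plus everything it implies.
fn expand(feature: &'static str) -> Vec<&'static str> {
    let mut out = vec![feature];
    let mut current = feature;
    while let Some(next) = implied_by(current) {
        out.push(next);
        current = next;
    }
    out
}

fn main() {
    println!("{:?}", expand("sse4.2"));
}
```

Under this model a user only names the newest feature they want, and older ones follow automatically.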
JamesMcGuigan commented on issue #4224:
@cfallin Just tested out your patch in tags/dev after commit 5033f9994b1100d46265014c6243c07cd395789a
    $ git fetch --all --tags --prune
    $ git co tags/dev
    HEAD is now at 823817595 Fix some typos in the isle language reference (#4248)
    $ cargo clean; cargo build --release
    $ /usr/local/src/wasmtime/target/release/wasmtime gcd.wasm --invoke gcd 42 24
    Error: Unsupported feature: SIMD support requires SSE3, SSSE3, SSE4.1, and SSE4.2 on x86_64.
    $ /usr/local/src/wasmtime/target/release/wasmtime --wasm-features=-simd gcd.wasm --invoke gcd 42 24
    warning: using `--invoke` with a function that takes arguments is experimental and may break in the future
    warning: using `--invoke` with a function that returns values is experimental and may break in the future
    6
Still need to figure out how to disable the --invoke warnings, but otherwise wasmtime now works as expected on my Xeon processors for this very simple gcd.wat test case. Thank you.
cfallin commented on issue #4224:
Great, thanks for confirming @JamesMcGuigan ! Since the immediate issue here has been solved, I'm going to go ahead and close this issue, but please feel free to file any others you encounter.
cfallin closed issue #4224:
Last updated: Nov 22 2024 at 17:03 UTC