cfallin labeled issue #1183:
Recent RaptorCS hardware makes PowerPC a modern desktop platform which runs Firefox, currently without a Javascript JIT and without WebAssembly support at all.
I read a plan where Cranelift would be used within Firefox for WebAssembly, it would be nice to have JIT support there.
Free IBM POWER9 hardware available from OSUOSL: https://osuosl.org/services/powerdev/request_hosting/ - for Free/Libre and Open Source Software development only. (This project is eligible)
rjzak commented on issue #1183:
@programmerjake @leo-lb is there any progress or updates regarding this? I recently got a Talos II system and would like to help.
ecnelises commented on issue #1183:
Hi @programmerjake is there any update on this? If you have some initial work, it would be great to publish it and we can improve by the review.
Otherwise, I'd like to do some work to set up initial support. (to be clear: just personal hobby project, maybe not in very near future)
programmerjake commented on issue #1183:
I haven't started working on this yet, we're currently focusing on getting Libre-SOC's ISA extensions into the next version of PowerISA so I don't plan on working on this in the next few months. When we do start adding compiler support, we're working on LLVM/GCC first, so Cranelift will have to wait. I did do some experimentation with writing a register allocator (no, the current register allocator is insufficient for SVP64, though it will probably work fine for scalar PowerISA) so we can put our SVP64 vector extension into Cranelift, but that's currently on indefinite hold.
If you do want to implement PowerISA support in Cranelift, please be aware that POWER9/10 are not the only modern cpus that people will want to run, so don't necessarily assume VMX/VSX is supported (Libre-SOC cpus will be supporting SVP64 instead, and the PowerPC Open Hardware Notebook doesn't support VMX/VSX in Little-Endian mode) and especially don't confuse ISA v3.0/v3.1 support as being equivalent to POWER9/10 support, those are separate things (e.g. Libre-SOC cpus will support ISA v3.1 or probably even v3.2 but won't have the same feature set as whatever POWER cpu is released next). Thanks!
ecnelises commented on issue #1183:
> so don't necessarily assume VMX/VSX is supported (Libre-SOC cpus will be supporting SVP64 instead, and the PowerPC Open Hardware Notebook doesn't support VMX/VSX in Little-Endian mode) and especially don't confuse ISA v3.0/v3.1 support as being equivalent to POWER9/10 support

My understanding, correct me if I'm wrong:
- SVP64 is an extension to the POWER ISA, which includes the scalar part from standard POWER and implements vectors using scalar registers, totally incompatible with Altivec/VSX. (This also means the regalloc would be quite different.)
- LLVM already differentiates those features. For example, `power9-vector` controls new VSX instructions introduced in POWER9, while `isa-v30-instructions` controls other scalar instructions introduced in POWER9. As far as I know, the Linux kernel is built with Altivec disabled.
- So, I expect Cranelift to have the ability to control target features. If so, something like `target-cpu=xxx` or `target-features=xxx` can switch Altivec/VSX on or off, with both enabled by default for a standard `powerpc64le-unknown-linux-gnu` target. (Not sure how SVP64 will define its triple.)
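To make the separation between ISA level and vector facilities concrete, here is a minimal sketch in plain Rust. Everything in it is an assumption for illustration: `IsaLevel`, `PpcFeatures`, the CPU preset names, and the `+feat`/`-feat` override syntax are hypothetical, and none of this is Cranelift's actual settings API.

```rust
/// Scalar Power ISA level, kept independent of any CPU model name.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum IsaLevel {
    V207,
    V30,
    V31,
}

/// Hypothetical feature set: vector facilities are separate flags and are
/// never implied by the ISA level alone.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PpcFeatures {
    pub isa: IsaLevel,
    pub altivec: bool,
    pub vsx: bool,
}

impl PpcFeatures {
    /// Presets for a few CPU names (illustrative subset; names are assumptions).
    pub fn for_cpu(name: &str) -> Option<Self> {
        match name {
            "power9" => Some(Self { isa: IsaLevel::V30, altivec: true, vsx: true }),
            "power10" => Some(Self { isa: IsaLevel::V31, altivec: true, vsx: true }),
            // A scalar-only v3.0 core without Altivec/VSX.
            "generic-v3.0-scalar" => {
                Some(Self { isa: IsaLevel::V30, altivec: false, vsx: false })
            }
            _ => None,
        }
    }

    /// Apply comma-separated overrides such as "-altivec,-vsx".
    pub fn apply(mut self, spec: &str) -> Result<Self, String> {
        for item in spec.split(',').map(str::trim).filter(|s| !s.is_empty()) {
            let (on, name) = if let Some(n) = item.strip_prefix('+') {
                (true, n)
            } else if let Some(n) = item.strip_prefix('-') {
                (false, n)
            } else {
                return Err(format!("expected '+' or '-' prefix in '{}'", item));
            };
            match name {
                "altivec" => self.altivec = on,
                "vsx" => self.vsx = on,
                other => return Err(format!("unknown feature '{}'", other)),
            }
        }
        Ok(self)
    }

    /// True if scalar instructions from `level` may be used.
    pub fn has_isa(&self, level: IsaLevel) -> bool {
        self.isa >= level
    }
}
```

With this shape, `PpcFeatures::for_cpu("power9")` followed by `.apply("-altivec,-vsx")` still reports ISA v3.0 scalar support while both vector flags are off, which is the case a lowering rule would need to check separately.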
programmerjake commented on issue #1183:
> so don't necessarily assume VMX/VSX is supported (Libre-SOC cpus will be supporting SVP64 instead, and the PowerPC Open Hardware Notebook doesn't support VMX/VSX in Little-Endian mode) and especially don't confuse ISA v3.0/v3.1 support as being equivalent to POWER9/10 support
>
> My understanding, correct me if I'm wrong:
> - SVP64 is an extension to POWER ISA, which includes scalar part from standard POWER, and implemented vectors using scalar registers, totally incompatible with Altivec/VSX. (this also means the regalloc would be quite different)

It is not totally incompatible (a cpu can implement both), but it is totally independent of VSX/VMX. What happens when you try to use SVP64 prefixing on a VSX instruction is not currently defined beyond being an illegal instruction trap.

> - LLVM already differentiates those features. For example, `power9-vector` controls new VSX instructions introduced in POWER9, while `isa-v30-instructions` controls other scalar instructions introduced in POWER9. As I know, Linux kernel was built with Altivec disabled.

Yes, that's good. The reason I was mentioning this is that a lot of recent PowerPC software uses `#ifdef POWER9` (e.g. glibc iirc) to enable v3.0 instructions, not realizing that non-POWER9/10 cpus can also use them.

> - So, I expect Cranelift to have ability to control target features. If so, something like `target-cpu=xxx` or `target-features=xxx` can switch Altivec/VSX on or off, but enabled by default for a standard `powerpc64le-unknown-linux-gnu` target. (not sure how SVP64 will define its triple)

iirc the plan is tentatively to use something like `powerpc64lesffs-unknown-linux-gnu`, since SFFS is the name the PowerISA uses for that conformance class. The SVP64 extension(s) are enabled separately, like how AVX512F is on x86.
ecnelises commented on issue #1183:
Thanks for explanation.
> the reason I was mentioning this is because a lot of recent PowerPC software uses `#ifdef POWER9` (e.g. glibc iirc) to enable v3.0 instructions, not realizing that non-POWER9/10 cpus can also use them.

Clang does not have any macros indicating POWER ISA version now. A workaround is to check `defined(_ARCH_PWR9) && !defined(__ALTIVEC__)`.
programmerjake commented on issue #1183:
> it is not totally incompatible (a cpu can implement both), but is totally independent of VSX/VMX, with what happens when you try to use SVP64 prefixing on a VSX instruction not being currently defined beyond just being an illegal instruction trap.

note that a future version of SVP64 may define what happens when combining SVP64 prefixing with VSX/VMX -- it's probably relatively straightforward but we have no funding for that work.
matevy commented on issue #1183:
Hi,
I am thinking of supporting ppc64le with resources I have on hand. I am just curious what is the estimated effort to support a new architecture? Just to see if it is anywhere I could fit it into our schedule...
ecnelises commented on issue #1183:
I think it depends on the scope you expect: ABI (ELFv1? ELFv2? XCOFF?), bitwidth (32-bit? 64-bit?), endianness (little? big?), ISA version (PPC? Altivec? VSX? Power10?). Practically a simple initial support can only contain support for PPC scalar instructions, targeting 64-bit only and using ELFv2 ABI for PowerPC little-endian on Linux.
Initial commit for s390 is a nice overview for what needs to be done to enable support for an arch: https://github.com/bytecodealliance/wasmtime/commit/89b5fc776debe07dd25535dc9bba48f236f2163f
matevy commented on issue #1183:
Hi,
Simple initial support is what I am after. I do not need performance, just to have it running on this architecture.
Yes, PPC scalar, 64-bit, LE, Linux, ELFv2 ABI seems perfect. I will have a look at that commit, thanks.
rjzak commented on issue #1183:
@matevy I'm excited about this! I can help with testing at least, and some dev maybe. I don't have much assembly experience, but I have a Power9 desktop.
programmerjake commented on issue #1183:
one note (iirc i said this before but imho is worth repeating), please don't assume that POWER9/10 are the only CPUs with v3.0/3.1 support or that v3.0/3.1 support implies having VSX or AltiVec or decimal floating point or etc., there are other less well known CPUs that are aiming for v3.0 and later v3.1 compliance, such as Microwatt and Libre-SOC's CPUs
programmerjake edited a comment on issue #1183:
one note (iirc i said this before but imho is worth repeating), please don't assume that POWER9/10 are the only CPUs with v3.0/3.1 support or that v3.0/3.1 support implies having VSX or AltiVec or decimal floating point or etc., there are other less well known CPUs that are aiming for v3.0 and later v3.1 compliance without AltiVec or VSX, such as Microwatt and Libre-SOC's CPUs
cfallin commented on issue #1183:
For a few more pieces of anecdata: basic RISC-V (integer/FP only, no SIMD, not too many opts) took about three months in review and was ~21k lines of code. Back in 2020 I built the aarch64 backend to a similar level in about that time as well. For the level of completeness that we hope for with a good backend, SIMD, reasonable optimizations, reasonable confidence in correctness, etc., I'd expect around six compiler-engineer-months of fulltime work and the backend would likely be around ~30kLoC.
I'll note our tiers as well: a new backend would arrive at tier 3, which allows for some amount of "work-in-progress" or incompleteness in-tree but does expect ongoing involvement and responsiveness to issues. (E.g., tail-call support recently had architecture-specific work for each backend and some of our maintainers pitched in.) Partly this is inspired by our experience with a partial arm32 backend, which we had to delete because no one was around to maintain or finish it. In other words, we want to make sure that the ongoing maintenance is properly factored in.
All that is not to discourage anything -- we would absolutely love to have this! -- just to ensure that the proper level of serious commitment is there, as it's not a small undertaking.
matevy commented on issue #1183:
Hi,
Many thanks for the additional effort estimate. It is exactly what I needed, and I am well aware of what happens if the effort is underestimated. That is why I asked; I will check internally what our possibilities are.
One question though: is it possible to use QEMU to test? I quickly tried, but none of the supported ports passed all tests.
Regards,
Matevz
jameysharp commented on issue #1183:
The tests are routinely run under qemu in CI and should pass on all supported platforms, but due to how qemu handles memory allocation some configurations don't work, so we have an environment variable to disable those configurations that we use when testing under qemu. If you set `WASMTIME_TEST_NO_HOG_MEMORY=1` the tests should pass.
alexcrichton added the wasmtime:platform-support label to Issue #1183.
ecnelises commented on issue #1183:
Hi @matevy is there some progress on this?
If so, feel free to share the implementation to get public comments. (the work may be incomplete as of now but that's totally fair)
If not, I'd like to continue my investigation. (https://github.com/ecnelises/wasmtime/tree/ppc64/cranelift/) Just some early copy-n-rename, not real efforts till now, so do not consider it as duplicated work :)
rjzak commented on issue #1183:
@ecnelises this looks like you've made some good progress https://github.com/bytecodealliance/wasmtime/compare/main...ecnelises:wasmtime:ppc64! Is this in a testable state, and does it support Little Endian?
cfallin commented on issue #1183:
It looks like more or less a full copy of the aarch64 backend, with find-replace ("copy-n-rename" as @ecnelises noted above); I was curious if there were any e.g. instruction emission definitions for PPC but I didn't see anything in the diff that looked PPC-specific?
to both @ecnelises and @rjzak -- there's enough complexity in the fully-developed backends now that I think I might suggest a "mostly clean slate" approach instead; I'd probably develop things by:
- deleting all lowering rules in lower.isle, and all inst helpers in inst.isle, and all auxiliary struct/enum definitions for inst arguments etc, and strip out all ABI code so it doesn't reference the now-deleted instructions (i.e., get down to a true skeleton, rather than a full copy);
- define the registers and the MachineEnv in regs.rs;
- define the basic instructions you know you'll need in the MInst enum in inst.isle, and write emission code and emission tests for them (things like move, load, store, add, jump);
- write lowering rules that use these, and write some basic CLIF compile tests;
- fill in ABI code with the appropriate register choices and stack-frame details;
- at this point, get some really basic ISA runtests working (return x+y, fibonacci, if/else, that sort of thing);
- then burn down the list of Wasm spec tests and existing CLIF runtests until you've got everything implemented, adding instructions as you need them. (The instruction emission code doesn't need to know the whole ISA, just the subset you use)
That's probably easier than starting from a full copy of a production backend and trying to make a pass over it, or similar. We're happy to answer questions on Zulip as you have them!
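As a rough illustration of the "define a few basic instructions, then write emission code and emission tests" step, a standalone sketch might look like the following. The `Gpr` and `MInst` names are hypothetical, and this is plain Rust rather than Cranelift's ISLE-generated instruction enum or its real emission machinery. It assumes a little-endian 64-bit target and covers only `add`, `addi` and `blr`.

```rust
/// A general-purpose register number in the range 0..=31.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Gpr(pub u8);

/// Hypothetical minimal instruction subset for a first skeleton backend.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MInst {
    /// `add rt, ra, rb` (XO-form, primary opcode 31, extended opcode 266).
    Add { rt: Gpr, ra: Gpr, rb: Gpr },
    /// `addi rt, ra, imm` (D-form, primary opcode 14).
    /// With `ra` = r0 the base reads as zero, which gives `li rt, imm`.
    Addi { rt: Gpr, ra: Gpr, imm: i16 },
    /// `blr`: return through the link register.
    Blr,
}

impl MInst {
    /// Encode the instruction as a 32-bit word.
    pub fn encode(&self) -> u32 {
        // Validate and widen a register number for use in a bit field.
        fn r(g: Gpr) -> u32 {
            assert!(g.0 < 32, "register number out of range");
            g.0 as u32
        }
        match *self {
            MInst::Add { rt, ra, rb } => {
                (31 << 26) | (r(rt) << 21) | (r(ra) << 16) | (r(rb) << 11) | (266 << 1)
            }
            MInst::Addi { rt, ra, imm } => {
                // The immediate occupies the low 16 bits in two's complement.
                (14 << 26) | (r(rt) << 21) | (r(ra) << 16) | (imm as u16 as u32)
            }
            MInst::Blr => 0x4E80_0020,
        }
    }

    /// Append the encoded word to `sink` in little-endian byte order.
    pub fn emit(&self, sink: &mut Vec<u8>) {
        sink.extend_from_slice(&self.encode().to_le_bytes());
    }
}
```

An emission test in this style compares `encode()` against words produced by a known-good assembler, for example checking that `MInst::Add { rt: Gpr(3), ra: Gpr(3), rb: Gpr(4) }` encodes to `0x7C632214`. Keeping the enum this small at first means each new lowering rule pulls in only the instructions it actually needs.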
Last updated: Nov 22 2024 at 16:03 UTC