Stream: git-wasmtime

Topic: wasmtime / issue #4460 cranelift: Implement scalar `fma` ...


view this post on Zulip Wasmtime GitHub notifications bot (Jul 16 2022 at 16:19):

github-actions[bot] commented on issue #4460:

Subscribe to Label Action

cc @cfallin, @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:module", "isle"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (Jul 21 2022 at 00:14):

jameysharp commented on issue #4460:

I keep trying to review this but finding too many things I don't understand about Cranelift yet. So I'll ask some questions instead. I think there are three main things happening in this PR:

  1. You've moved make_libcall_sig from isa::x64::lower to ir::libcall and renamed it to Libcall::signature. This seems related to a nearby comment in the existing code: "TODO avoid recreating signatures for every single Libcall function." But in doing so you lose access to the ir::Inst that the libcall came from, so you have to list the input and output types for every libcall in the signature method.

  2. You've plumbed a target_lexicon::Triple through a bunch of functions. I think this is because libcall_3_ret_1 needs it to call emit_vm_call but only has access to the x64 IsleContext, so you have to thread the triple through to lower_common in order to stash it there.

  3. You're exposing x64's emit_vm_call to ISLE via a new libcall_3_ret_1 helper, and using that to write lowering rules for fma.

Does that cover everything you did in this PR?

I would find this easier to review without change (1), or if you pass the ir::Inst to Libcall::signature so its implementation is identical to make_libcall_sig. But I'm guessing the way you have it now is better because it means this function can be reused by other target backends, right?

I'm curious if the target triple is reachable through one of the other fields that's already in the IsleContext, or if there's a way to get the right libcall calling convention in emit_vm_call without using the triple. I'm guessing no, or you probably would have used it, but it may be worth double-checking. If it's possible to get rid of the tedious changes for (2), that would make this patch a lot more readable.

Step (3) seems like it's simple enough that it can't be wrong, but I haven't read up on the details of ISLE yet.

The other thing I'd like to know is whether anybody is using the fma CLIF instruction. It looks to me like wasmtime does not use it, so I assume it's here for use by one of the other frontends, but it would be nice to verify that.

I think @cfallin or @fitzgen will need to review this but I hope my notes can at least help guide that review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 21 2022 at 07:46):

afonso360 commented on issue #4460:

I keep trying to review this but finding too many things I don't understand about Cranelift yet.

Thanks for reviewing either way!

You've moved make_libcall_sig from isa::x64::lower to ir::libcall and renamed it to Libcall::signature. This seems related to a nearby comment in the existing code: "TODO avoid recreating signatures for every single Libcall function." But in doing so you lose access to the ir::Inst that the libcall came from, so you have to list the input and output types for every libcall in the signature method.

I think that comment still applies, we are still creating a new signature every time we lower fma even if we could cache (or precompute) that, since the signature isn't going to change.

Losing the Inst is a bit more deliberate, the previous code was essentially just copying the Inst signature into a LibCall signature, which doesen't seem very correct to me. I've centralized the signatures in LibCall since they should be pretty much equal across arches, but this may be wrong as well.

I think using Inst can also lead to a situation where we put a libcall in a rule, and use it in a different instruction lowering and then copy the wrong signature. An example would be calling nearest in the div instruction lowering and use the signature from div when emitting the libcall for nearest

I'm curious if the target triple is reachable through one of the other fields that's already in the IsleContext, or if there's a way to get the right libcall calling convention in emit_vm_call without using the triple. I'm guessing no, or you probably would have used it, but it may be worth double-checking. If it's possible to get rid of the tedious changes for (2), that would make this patch a lot more readable.

Yeah, I couldn't find it, but would appreciate a double check on that.

Does that cover everything you did in this PR?

Yeah pretty much.

The other thing I'd like to know is whether anybody is using the fma CLIF instruction. It looks to me like wasmtime does not use it, so I assume it's here for use by one of the other frontends, but it would be nice to verify that.

I don't think so, since it is not implemented right now, but at least rustc_codegen_cranelift side steps the issue by calling the libcall directly both for SIMD and the scalar version of this instruction.

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 01 2022 at 20:56):

afonso360 commented on issue #4460:

@cfallin Would it be possible to review this? The Fma op isn't that important (we have a better implementation on #4539), but the libcall mechanism is something that we need to lower other i128 ops and this is sort of blocking that.

Changes since @jameysharp 's review:

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:34):

afonso360 commented on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for FMA. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports these relocations, so we should also be good there.

And #4453 allows us to test them with runtests.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:36):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for FMA. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports these relocations, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:37):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for FMA. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports these relocations, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests. This does cause a panic if no relocations are performed which is the state of the current runtest suite (and why we don't add tests in this PR)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:38):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for fma. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports these relocations, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests. This does cause a panic if no relocations are performed which is the state of the current runtest suite (and why we don't enable the fma tests in this PR)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:38):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for fma. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports these relocations, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests. This does cause a panic if no relocations are performed which is the state of the current runtest suite (and why we don't enable the fma tests in this PR).

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:43):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for fma. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports handling relocations for these functions, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests. This does cause a panic if no relocations are performed which is the state of the current runtest suite (and why we don't enable the fma tests in this PR).

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:44):

afonso360 edited a comment on issue #4460:

I don't see an actual libcall implementation here -- I guess it will result in a panic if used today in e.g. the Wasmtime embedding (but that's fine since the cranelift-wasm frontend won't use the opcode)? Or is it already implemented in cranelift-jit?

Sorry, I don't quite understand understand.

We implement a libcall lowering for fma. We already have libcall's in the x86 backend, but not in isle yet (see ceil lowering without SSE42).

So this is already reachable from wasmtime (from the ceil instruction). cranelift-jit already supports handling relocations for these functions, so we should also be good there. And I assume that those instructions already work in wasmtime (without SSE42).

And #4453 allows us to test them with runtests. This does cause a Illegal Instruction if no relocations are performed which is the state of the current runtest suite (and why we don't enable the fma tests in this PR).

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:47):

cfallin commented on issue #4460:

I guess I meant specifically for fma -- this PR adds two new libcalls, FmaF32 and FmaF64, and I didn't see implementations of them. I was wondering if these implementations were coming later, or in general what the plan for them is? (Or perhaps I've missed them somewhere, but grepping for FmaF32 in the existing tree isn't showing anything.)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:52):

afonso360 commented on issue #4460:

We don't implement them, in cranelift-jit we expect them to be provided when linking. We translate the FmaF{32,64} enum to the libcall name (fmaf and fma) in cranelift-module and from there the linker finds those functions for us. We search libc / msvcrt / whatever native runtime by default in cranelift-jit.

I don't know how this works in Wasmtime, if we always need to provide those functions or if we also get them from the system.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:53):

afonso360 edited a comment on issue #4460:

We don't implement them. In cranelift-jit we expect them to be provided when linking. We translate the FmaF{32,64} enum to the libcall name (fmaf and fma) in cranelift-module and from there the linker finds those functions for us. We search libc / msvcrt / whatever native runtime by default in cranelift-jit.

I don't know how this works in Wasmtime, if we always need to provide those functions or if we also get them from the system.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 02 2022 at 17:55):

cfallin commented on issue #4460:

OK, that seems fine to me; just want to confirm that I wasn't missing something! It is at least a loud failure if someone tries to use the lowering in their own embedding and finds that the libcall isn't provided/implemented, so this shouldn't be a problem.


Last updated: Jan 24 2025 at 00:11 UTC