github-actions[bot] commented on issue #3783:
Subscribe to Label Action
cc @cfallin, @fitzgen
<details>
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
uweigand commented on issue #3783:
So I think a
lower_zero
,lower_one
, maybe a speciallower_two
for instructions that statically produce exactly two results (e.g.iadd_ifcout
-alikes) andlower_variable
for calls (even if they produce 0, 1, or 2 results, this is known only at compile time) would make sense. Thoughts?I guess what's not quite clear about this to me is just where that distinction would be introduced. If we have different lowering functions with different return types, we won't be able to handle all of them with a single
lower_common
handler, as the latter depends on the particular return type.So should we have a
lower_common
that takes four different lower routines, and then internally decides which of them to use (based on the opcode, I guess?), and then handles each of them separately according to their return value?Or should
lower_common
be modified to handle different return types, based on some template magic I guess, and then the various back-ends could use different instantiations, and chose by themselves which version to use (again based on the opcode, but now probably in the already-existing switch statements in the back-ends?)?
cfallin commented on issue #3783:
I guess what's not quite clear about this to me is just where that distinction would be introduced. If we have different lowering functions with different return types, we won't be able to handle all of them with a single
lower_common
handler, as the latter depends on the particular return type.So should we have a
lower_common
that takes four different lower routines, and then internally decides which of them to use (based on the opcode, I guess?), and then handles each of them separately according to their return value?Or should
lower_common
be modified to handle different return types, based on some template magic I guess, and then the various back-ends could use different instantiations, and chose by themselves which version to use (again based on the opcode, but now probably in the already-existing switch statements in the back-ends?)?This is a good question; the general principle would be to dispatch somewhere before we enter ISLE code so we end up with the type-safety once we get to specific rules. My first instinct is to have something like
trait IsleEntrypoints { fn lower_zero(...) -> Option<()>; fn lower_one(...) -> Option<ValueRegs>; fn lower_two(...) -> Option<(ValueRegs, ValueRegs)>; fn lower_variable(...) -> Option<SmallVec<[ValueRegs; 4]>>; }
to replace the
IF
closure we currently parameterizelower_common
on, and then/// Arity (number of result values) for an instruction. Note that we make special distinctions /// for: /// - zero-result instructions: these are used for their side-effects only; /// - one-result instructions: the most common kind, computing one new value; /// - two-result instructions: some instructions produce two fixed outputs, like addend and carry; /// - variable-result instructions: instructions like calls that produce some number of results different /// for each instance of the instruction. Note that even a call with 0/1/2 results, dynamically, is still /// `Variable`. enum InstArity { Zero, One, Two, Variable, } impl InstData { // ... fn arity(&self) -> InstArity { ... } }
Now that I've sketched that out, though, I'm wavering a bit on whether this is really better than just a
SmallVec
at thelower_common
level (as long as theSmallVec
is properly sized; 2ValueRegs
is probably optimal, since multivalue calls are rare and a too-large inline vec will cause measurable memcpy cost?), and then let each backend dispatch into specialized entry points as desired. @fitzgen any thoughts here?
fitzgen commented on issue #3783:
Don't we already know the number of expected results, and therefore don't need the
InstArity
that you've sketched?
cfallin commented on issue #3783:
The additional information needed is whether the number of outputs is fixed or variable; e.g. for
call
it may happen to have one result in this (and most) instances of the instruction, but it still needs the variable path.The idea with the
Arity
enum was to allow a clean one-to-onematch
that dispatches into the trait but I suppose we could also have something likehas_variable_results()
. (Curiously, we do haveinst_variable_args
that allows this fixed/variable distinction on the inputs side, but nothing equivalent for results.)Diving further into this rabbithole though I'm starting to think that just dispatching in the platform-specific glue into its ISLE toplevel is maybe a better answer: match on op, do a special entrypoint for calls (calls are special anyway in lots of ways), nothing else should be variable so then invoke
lower_{zero,one,two}
based on actual result count as you suggest; and return aSmallVec
tolower_common
...
fitzgen commented on issue #3783:
Diving further into this rabbithole though I'm starting to think that just dispatching in the platform-specific glue into its ISLE toplevel is maybe a better answer: match on op, do a special entrypoint for calls (calls are special anyway in lots of ways), nothing else should be variable so then invoke
lower_{zero,one,two}
based on actual result count as you suggest; and return aSmallVec
tolower_common
...SGTM :+1:
uweigand commented on issue #3783:
Diving further into this rabbithole though I'm starting to think that just dispatching in the platform-specific glue into its ISLE toplevel is maybe a better answer: match on op, do a special entrypoint for calls (calls are special anyway in lots of ways), nothing else should be variable so then invoke
lower_{zero,one,two}
based on actual result count as you suggest; and return aSmallVec
tolower_common
...If you always return a
SmallVec
tolower_common
, then what's even the benefit of multiple lower constructors? Can't we then just go back to my approach (except withSmallVec
instead ofVec
) plus (the upcoming) default conversions?
cfallin commented on issue #3783:
If you always return a
SmallVec
tolower_common
, then what's even the benefit of multiple lower constructors? Can't we then just go back to my approach (except withSmallVec
instead ofVec
) plus (the upcoming) default conversions?The advantage of at least having the distinction at the ISLE layer is type safety, mainly -- we can at least encode the restriction that, say,
iadd
should result in just one Value in the lowering rules, even if the toplevel driver is generic. It also keeps the rules in a form that is easier to invoke in a more efficient way if we do want/need to do something better than the generic approach in the future.I'm not completely tied to this of course, and the way we're thinking of doing this now it would be a per-backend decision, so IMHO at least it'sOK to do the fully generic thing in s390x if you feel really strongly here.
cfallin edited a comment on issue #3783:
If you always return a
SmallVec
tolower_common
, then what's even the benefit of multiple lower constructors? Can't we then just go back to my approach (except withSmallVec
instead ofVec
) plus (the upcoming) default conversions?The advantage of at least having the distinction at the ISLE layer is type safety, mainly -- we can at least encode the restriction that, say,
iadd
should result in just one Value in the lowering rules, even if the toplevel driver is generic. It also keeps the rules in a form that is easier to invoke in a more efficient way if we do want/need to do something better than the generic approach in the future.I'm not completely tied to this of course, and the way we're thinking of doing this now it would be a per-backend decision, so IMHO at least it's OK to do the fully generic thing in s390x if you feel really strongly here.
uweigand commented on issue #3783:
The advantage of at least having the distinction at the ISLE layer is type safety, mainly -- we can at least encode the restriction that, say,
iadd
should result in just one Value in the lowering rules, even if the toplevel driver is generic.But
lower_common
would continue to verify that it got the correct number of outputs, so I'm sure I see what additional benefit this provides? If you have multiple constructors, you instead have the potential source of error of calling the wrong constructor for some opcode ...It also keeps the rules in a form that is easier to invoke in a more efficient way if we do want/need to do something better than the generic approach in the future.
I'm not completely tied to this of course, and the way we're thinking of doing this now it would be a per-backend decision, so IMHO at least it's OK to do the fully generic thing in s390x if you feel really strongly here.
I don't really feel strongly either ... I guess once the default conversion patch is in, I can update this PR to use those (and switch to a
SmallVec
), then we can see how it looks.
uweigand edited a comment on issue #3783:
The advantage of at least having the distinction at the ISLE layer is type safety, mainly -- we can at least encode the restriction that, say,
iadd
should result in just one Value in the lowering rules, even if the toplevel driver is generic.But
lower_common
would continue to verify that it got the correct number of outputs, so I'm not sure I see what additional benefit this provides? If you have multiple constructors, you instead have the potential source of error of calling the wrong constructor for some opcode ...It also keeps the rules in a form that is easier to invoke in a more efficient way if we do want/need to do something better than the generic approach in the future.
I'm not completely tied to this of course, and the way we're thinking of doing this now it would be a per-backend decision, so IMHO at least it's OK to do the fully generic thing in s390x if you feel really strongly here.
I don't really feel strongly either ... I guess once the default conversion patch is in, I can update this PR to use those (and switch to a
SmallVec
), then we can see how it looks.
uweigand commented on issue #3783:
I guess once the default conversion patch is in, I can update this PR to use those (and switch to a
SmallVec
), then we can see how it looks.OK, just pushed an updated version. The changes to the platform lower.isle files actually look very reasonable to me now, the default conversions can indeed handle nearly everything. The only nontrival changes are about
iadd_ifcout
(which is an improvement anyway), and replacingvalue_regs_none
byside_effect
(only used by s390x, and I think also an improvement).What do you think?
Last updated: Dec 23 2024 at 13:07 UTC