wasmtime / issue #3807 WIP: implicit conversions in ISLE. · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / issue #3807 WIP: implicit conversions in ISLE.

Wasmtime GitHub notifications bot (Feb 16 2022 at 08:29):

Ah, one other high-level thing: it may be useful in the future to have a mode in which we dump the sema tree, with conversions included, back to ISLE source with all the conversions made explicit. Call this "core ISLE" or a sort of MIR, I suppose. This would primarily be useful for tools that consume ISLE (e.g. for verification), so they don't need an independent implementation of the conversion-insertion logic.

Possibly it's better for such tools to just consume the sema nodes directly though -- cc @avanhatt for thoughts on this?

Wasmtime GitHub notifications bot (Feb 16 2022 at 08:42):

github-actions[bot] commented on issue #3807:

Subscribe to Label Action

cc @cfallin, @fitzgen

<details>
This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle

fitzgen: isle

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

Wasmtime GitHub notifications bot (Feb 16 2022 at 14:47):

avanhatt commented on issue #3807:

@cfallin thanks for the heads up! The verification engine is currently consuming sema nodes if I understand the distinction correctly (isle::sema::{Rule, TermEnv, TypeEnv, Pattern, ...}, code is here). I think seeing an example of how this would affect an existing lowering rule would help me understand the verification implications; even if that actual change doesn't happen in this PR.

One thing I'm not quite understanding from this PR yet - in the example converter (convert T U t_to_u), is the assumption that this requires that the external Rust type T implements Into<U>? Or is there some other way of defining the Rust implementation of the converter?

Also, can multiple converters be chained? That is, if I have a_to_b and b_to_c, can I use an a as a c without a direct converter?

Wasmtime GitHub notifications bot (Feb 16 2022 at 18:23):

cfallin commented on issue #3807:

@cfallin thanks for the heads up! The verification engine is currently consuming sema nodes if I understand the distinction correctly (isle::sema::{Rule, TermEnv, TypeEnv, Pattern, ...}, code is here). I think seeing an example of how this would affect an existing lowering rule would help me understand the verification implications; even if that actual change doesn't happen in this PR.

OK, cool, I think everything should be fine then; you should be able to follow the type errors once updated past this PR with the new nodes (sema::Pattern::Convert, sema::Expr::Convert) and handle them like other external calls.

Propagation of this into the lowering rules will definitely come next! Just wanted to get the compiler implementation out for thoughts/comments first :-)

One thing I'm not quite understanding from this PR yet - in the example converter (convert T U t_to_u), is the assumption that this requires that the external Rust type T implements Into<U>? Or is there some other way of defining the Rust implementation of the converter?

Everything should be explicit I think (no Into required) -- t_to_u can implement arbitrary logic, like other external extractors or constructors. Importantly it has access to the context too, so it can e.g. automatically fill-in calls to def_inst or the like in instruction patterns.

Also, can multiple converters be chained? That is, if I have a_to_b and b_to_c, can I use an a as a c without a direct converter?

I thought about this a bit but right now I want to see how far I can get without it. If we do need this, then we could compute the transitive closure of all conversions before consulting the conversions table during typechecking, or we could ask the user to define A->C explicitly; I think I'd slightly prefer the latter as it simplifies codegen.

Wasmtime GitHub notifications bot (Feb 16 2022 at 19:36):

fitzgen commented on issue #3807:

Also, can multiple converters be chained? That is, if I have a_to_b and b_to_c, can I use an a as a c without a direct converter?

I thought about this a bit but right now I want to see how far I can get without it. If we do need this, then we could compute the transitive closure of all conversions before consulting the conversions table during typechecking, or we could ask the user to define A->C explicitly; I think I'd slightly prefer the latter as it simplifies codegen.

Yeah I would definitely prefer not to handle transitive conversions, if we can.

Wasmtime GitHub notifications bot (Feb 16 2022 at 20:23):

fitzgen commented on issue #3807:

I opted to add conversions as their own IR node type, rather than try lowering these to extractors/constructors to share that code, as the operations are just a tiny bit different (not associated with a term, for example)

It seems that they are associated with a term: the term specified in the conversion declaration.

Also, is this separating conversions for extractors vs constructors, like we talked about in the issue?

Wasmtime GitHub notifications bot (Feb 16 2022 at 20:35):

cfallin commented on issue #3807:

I opted to add conversions as their own IR node type, rather than try lowering these to extractors/constructors to share that code, as the operations are just a tiny bit different (not associated with a term, for example)

It seems that they _are_ associated with a term: the term specified in the conversion declaration.

I'm not sure I follow, do you mean the Rust function name? E.g. in specifying (convert T U t_to_u) there are two types, and a Rust function, but no term involved.

Also, is this separating conversions for extractors vs constructors, like we talked about in the issue?

I actually started down that path but found there was a lot of unnecessary duplication: it's simpler just to have one mechanism for going from T to U I think. Or said another way, once we have t_to_u and the system knows about it, there's no difference in behavior whether we use that to take a T and apply a pattern for a U on the extractor side, or evaluate an expression for a T and convert it to a final U result on the constructor side. But possibly I'm missing some reason why we'd want to disallow a particular conversion on one side or the other; do you have an example in mind?

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:00):

fitzgen commented on issue #3807:

But possibly I'm missing some reason why we'd want to disallow a particular conversion on one side or the other; do you have an example in mind?

In an extractor, it maybe makes sense to implicitly convert from a Value to an Inst via the def_inst extractor. But I don't think I would want to do that same implicit conversion in constructors. Hard to put into words why I'm reacting that way. It just feels wrong/unnecessary/I wouldn't use it that way, and so I'd rather not have any kind of potential footgun there.

FWIW, 99% of the sites where I personally wish we had implicit conversions are constructors.

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:06):

fitzgen commented on issue #3807:

To clarify a little bit on the semantics/lookup I'm imagining for implicit conversions:

;; `convert_T_U`'s constructor is used when implicitly converting
;; T to U in a constructor context.
;;
;; `convert_T_U`'s extractor is used when implicitly converting U
;; to T in an extractor context.
(convert T U convert_T_U)

(decl convert_T_U (T) U)

;; convert T to U in constructors.
(rule (convert_T_U t) ...)

;; convert U to T in extractors.
(extractor (convert_T_U t) ...)

Does that make sense?

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:07):

fitzgen edited a comment on issue #3807:

To clarify a little bit on the semantics/lookup I'm imagining for implicit conversions:
;; `convert_T_U`'s constructor is used when implicitly converting
;; T to U in a constructor context.
;;
;; `convert_T_U`'s extractor is used when implicitly converting U
;; to T in an extractor context.
(convert T U convert_T_U)

(decl convert_T_U (T) U)

;; convert T to U in constructors.
(rule (convert_T_U t) ...)

;; convert U to T in extractors.
(extractor (convert_T_U t) ...)
So type checking would have to not only check whether the appropriate (convert ...) exists, but also whether the convert's associated term has an extractor/constructor depending on the context.

Does that make sense?

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:23):

cfallin commented on issue #3807:

But possibly I'm missing some reason why we'd want to disallow a particular conversion on one side or the other; do you have an example in mind?

In an extractor, it maybe makes sense to implicitly convert from a Value to an Inst via the def_inst extractor. But I don't think I would want to do that same implicit conversion in constructors. Hard to put into words why I'm reacting that way. It just feels wrong/unnecessary/I wouldn't use it that way, and so I'd rather not have any kind of potential footgun there.

FWIW, 99% of the sites where I personally wish we had implicit conversions are constructors.

This is a really interesting difference -- I found that my intuition was actually pushing the other way! Basically it feels like a feature of the low-level dataflow in the language to me; typechecking can insert little converters to make things work out where there are gaps between the type, and this is below the level of concern of extractors/constructors. I think most the other comments above flow from that difference as well.

In general as well, I want to try to avoid distinctions between extractors and constructors where possible; if we instead see the forms on LHS and RHS as living in the same value-space, with similar semantics, it opens the door later to optimizations like chaining multiple rewrite steps into one. In that sense, it feels like an odd semantic gap for a conversion from A to B to be possible on the LHS, but not on the RHS ("one rule's LHS is another rule's RHS" in some sense, once we start to chain steps).

In the case of Value to Inst, you do raise a good point, the meaning of that is sort of odd if it's invoke on the RHS. Perhaps the right way to resolve this though is to say that fallible conversions can only appear in the LHS. def_inst is fallible because we can't always peek back at the defining instruction for a value; so the Value to Inst conversion is fallible, and wouldn't make sense on the RHS. Does that seem reasonable?

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:24):

cfallin edited a comment on issue #3807:

But possibly I'm missing some reason why we'd want to disallow a particular conversion on one side or the other; do you have an example in mind?

In an extractor, it maybe makes sense to implicitly convert from a Value to an Inst via the def_inst extractor. But I don't think I would want to do that same implicit conversion in constructors. Hard to put into words why I'm reacting that way. It just feels wrong/unnecessary/I wouldn't use it that way, and so I'd rather not have any kind of potential footgun there.

FWIW, 99% of the sites where I personally wish we had implicit conversions are constructors.

This is a really interesting difference -- I found that my intuition was actually pushing the other way! Basically it feels like a feature of the low-level dataflow in the language to me; typechecking can insert little converters to make things work out where there are gaps between the types, and this is below the level of concern of extractors/constructors. I think most the other comments above flow from that difference as well.

In general as well, I want to try to avoid distinctions between extractors and constructors where possible; if we instead see the forms on LHS and RHS as living in the same value-space, with similar semantics, it opens the door later to optimizations like chaining multiple rewrite steps into one. In that sense, it feels like an odd semantic gap for a conversion from A to B to be possible on the LHS, but not on the RHS ("one rule's LHS is another rule's RHS" in some sense, once we start to chain steps).

In the case of Value to Inst, you do raise a good point, the meaning of that is sort of odd if it's invoke on the RHS. Perhaps the right way to resolve this though is to say that fallible conversions can only appear in the LHS. def_inst is fallible because we can't always peek back at the defining instruction for a value; so the Value to Inst conversion is fallible, and wouldn't make sense on the RHS. Does that seem reasonable?

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:26):

cfallin commented on issue #3807:

Ah, but re: explicit calls to external Rust functions vs. just inserting terms, I do agree completely -- delegation to the outside should be an orthogonal thing, and lowering to "core ISLE" that really is just etors/ctors makes sense as a separate desugaring step. I'll start to rework things in that direction.

Wasmtime GitHub notifications bot (Feb 16 2022 at 21:26):

cfallin edited a comment on issue #3807:

But possibly I'm missing some reason why we'd want to disallow a particular conversion on one side or the other; do you have an example in mind?

In an extractor, it maybe makes sense to implicitly convert from a Value to an Inst via the def_inst extractor. But I don't think I would want to do that same implicit conversion in constructors. Hard to put into words why I'm reacting that way. It just feels wrong/unnecessary/I wouldn't use it that way, and so I'd rather not have any kind of potential footgun there.

FWIW, 99% of the sites where I personally wish we had implicit conversions are constructors.

This is a really interesting difference -- I found that my intuition was actually pushing the other way! Basically it feels like a feature of the low-level dataflow in the language to me; typechecking can insert little converters to make things work out where there are gaps between the types, and this is below the level of concern of extractors/constructors. I think most the other comments above flow from that difference as well.

In general as well, I want to try to avoid distinctions between extractors and constructors where possible; if we instead see the forms on LHS and RHS as living in the same value-space, with similar semantics, it opens the door later to optimizations like chaining multiple rewrite steps into one. In that sense, it feels like an odd semantic gap for a conversion from A to B to be possible on the LHS, but not on the RHS ("one rule's LHS is another rule's RHS" in some sense, once we start to chain steps).

In the case of Value to Inst, you do raise a good point, the meaning of that is sort of odd if it's invoked on the RHS. Perhaps the right way to resolve this though is to say that fallible conversions can only appear in the LHS. def_inst is fallible because we can't always peek back at the defining instruction for a value; so the Value to Inst conversion is fallible, and wouldn't make sense on the RHS. Does that seem reasonable?

Wasmtime GitHub notifications bot (Feb 17 2022 at 07:50):

cfallin commented on issue #3807:

OK, so I've reworked this as per the above and after talking with @fitzgen a bit more today -- it's much cleaner now, I think, in that conversions are immediately lowered to ordinary ctor/etor invocations. I added some test infrastructure as well. (The fail-tests don't assert on the actual failing error message yet; maybe we could do something with filetest; but I'll leave that to a future PR.)

I'm still planning to try using this for a bit before merging the compiler bits; I'll switch this out of "Draft" once I've convinced myself it's a reasonable and usable feature.

Wasmtime GitHub notifications bot (Feb 22 2022 at 21:57):

cfallin commented on issue #3807:

OK, so I think this is ready for review! cc @fitzgen

I took this for a spin in this branch (x64/lower.isle), which builds on the ISLE compiler changes here. I'm quite excited by how clean the rules turn out to be -- for example, the basic lowering for integer add is literally
;; Add two registers.
(rule (lower (has_type (fits_in_64 ty)
                       (iadd x y)))
      (add ty x y))
where the (put_value_in_reg ...) and (value_reg ...) are inserted implicitly. All of the conversions between Gpr/Xmm newtypes, the various argument forms (Reg, RegMem, RegMemImm, likewise for Xmm/Gpr), etc. all became automatic as well.

This PR has just the changes to the DSL compiler, and documentation changes, as well as the new ISLE testing infrastructure. I'll do a followup PR with the branch linked above containing a refactor to actually define and use some converters.

Last updated: Apr 17 2025 at 12:05 UTC