Stream: git-wasmtime

Topic: wasmtime / issue #8599 ISLE: eliminate distinction betwee...


view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 01:02):

jameysharp opened issue #8599:

Feature

ISLE terms should just be functions, from any number of arguments to any number of results. Internal terms should all be defined using rule. Terms with exactly one argument can be used in pattern matches like extractors are today; all terms can be used on the right-hand side of a rule or if-let, like constructors are today. Possibly a form of type-directed overloading should be added to the language to allow the current pattern of using the same term name for both constructor and extractor.

Benefit

Implementation

I'd start by unifying constructors and extractors in the trie-again representation and ensuring that the generated code doesn't change.

Then maybe introduce something like (overload iadd extract_iadd construct_iadd) to declare that when the term iadd is used, whichever of extract_iadd or construct_iadd has the right type for that context should be used instead. I'm less sure about this.

Next, extend constructor terms to allow multiple results. A pattern binding the result of such a term needs some extra syntactic care to handle tuples. Semantically, though, we can approximate this today by wrapping up multiple results in an enum with a single variant, so I think we should just make the language support it directly.

Finally, remove the restriction that patterns can only use extractors and expressions can only use constructors, and replace (extractor ...) with (rule ...). Optionally, we could add an inline flag on terms and use it for all current extractors to ensure we still generate the same code, since internal extractors are currently inlined automatically. It is difficult to define inlining for all current combinations of ISLE features, though.

Alternatives

There are so many alternatives. Let's talk about it…

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 01:43):

cfallin commented on issue #8599:

I definitely think this would be a welcome change! As you know I've been saying similar things for a while; my take is that extractors can be lowered somehow to a sugar on constructors and if-lets, now that we have them. This kind of fulfills the original dream (that I had had, anyway!) of a simple core Prolog-like semantics of matching steps in the backtracking phase (originally LHS) prior to committing.

My take on a core calculus would be:

Then we can accept the existing language as-is; and as a next step, devise additional surface syntax for or-patterns and other things as needed.

Concretely I imagine the existing surface syntax example (slight liberties assumed in unifying inst-helper signatures to include ty everywhere)

(decl inst_data (InstructionData) Inst)
(extern extractor inst_data inst_data)
(decl iadd (Type Value Value) Inst)
(extractor (iadd ty x y) (inst_data (InstructionData.BinaryOp (Opcode.Iadd) ty x y)))

(rule (lower (iadd ty x y))
  (do_the_isa_thing x y))

would desugar into core syntax something like:

(decl inst_data (InstructionData) Inst)
(decl inst_data_dual (Inst) InstructionData)
(dual inst_data inst_data_dual)
(extern inst_data_dual inst_data)  ;; note that extern no longer specifies etor/ctor
(decl iadd (Type Value Value) Inst)
(decl iadd_dual (Inst) (Type Value Value)
(dual iadd iadd_dual)

(rule (iadd_dual inst)  ;; all rules have simple bindings only in argument positions
  (if-let instdata (inst_data_dual inst)) ;; inst_data_dual used here because original pattern used `inst_data` and in etor position we rewrite by looking up the dual
  (if-let (InstructionData.BinaryOp (Opcode.Iadd) ty x y) instdata)
  (tuple ty x y))

(rule (lower inst)
  (if-let (tuple ty x y) (iadd_dual inst))
  (do_the_isa_thing ty x y))

It looks a little verbose but it has really nice properties: simple args only on toplevel rules; only "structural destructuring" (as it were?) of tuples and true enums in left-hand sides of if-lets; all invocations to other rules that can fail in right-hand sides of if-lets; and after the prelude of if-lets, a sequence of constructor calls as normal. And finally and most importantly: all forms in a rule that are invocations are invocations to a "constructor", so we can stop calling them that and just call them terms :-)

(To be maximally clear, I see my sketch here as adding onto your points above; strong +1 to tuple-returns, and the implementation details in the trie-again IR; this is mostly sketching my ideas in the place where you noted "less sure about [overload]".)

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 20:14):

jameysharp commented on issue #8599:

Ah, you're filling in two things I was unsure of: term overloading and a syntax for terms with multiple return values.

Regarding dual, that works for me. I gather it's purely syntactic: If a term is used in a pattern, then its dual is looked up instead. I find some appeal in type-directed overloading instead of syntax-directed, especially in conjunction with the existing support for implicit conversions, but this does the job and I can run with it.

I'd like to tweak it a little bit though: If a term currently has only an extractor, like inst_data or the lowering version of iadd, then I don't want to have a dual declaration for it. I think we can just allow terms that have only one parameter to be used in patterns. If a dual declaration exists for that term, then rewrite to it, but otherwise use the original term itself. Then the main sugar here is that rewriting an extractor-only term into the core version of decl has the parameters swapped with the results.

I think the "core" form should be a subset of the full language, so that we can use it directly when we want to. That might mean picking a different name for the core version of decl so the existing version's syntax can coexist with it. (Bikeshedding the name: how about term instead of decl?) We might choose to incrementally migrate existing ISLE source to parts of the core syntax; I'd certainly find term/dual less confusing than the current syntax for extractors, which always feel "backwards" to me.

As for tuple syntax: I had noticed that we need some sort of special form for matching tuples, like your suggested (tuple ...). I like your suggestion better than my idea, which was that you'd use the term name again: (if-let (iadd_dual ty x y) (iadd_dual inst)). And syntactically I was thinking we could get away without a special form for constructing tuples, because we could allow multiple expressions on the right-hand side of a rule, but I like the consistency of your approach. In particular, explicitly constructing a tuple as an expression means it can be on the right-hand side of a let expression, and that allows us to use the same impure binding in multiple tuple fields, which we could not do with if-let.

Also while we're at it I'd like to make our Unit type and unit external constructor into simply 0-tuples, () and (tuple) respectively.

I'm intrigued that you want the core syntax to prohibit function calls in patterns. We have to do that internally when building up the "serialized" representation just before generating Rust source, but the "trie-again" representation doesn't make that distinction, and it's not too hard to build that representation from our current surface syntax. So I'm not sure what that restriction on the core syntax would buy us.

By the way, we absolutely need some kind of inlining for some "extractor" terms or our pattern-matching efficiency will be completely shot. We have to be able to see that iadd and imul both match different enum variants on the result of the same inst_data call or we'll test all variants in some sequence on every instruction. We might gain something by inlining some existing constructors that are used in if-let right now too.

To summarize, I think a reasonable synthesis of our proposals is:

Introducing term alone would allow us to convert a variety of simple extractor predicates from Rust to ISLE; more if we also add tuple. Adding inline as well and implementing it when constructing the trie-again representation would be semantically visible because it would make the overlap check more precise and allow us to remove a bunch of rule priorities. Lots of incremental benefits from doing different parts of this work.

view this post on Zulip Wasmtime GitHub notifications bot (May 12 2024 at 22:13):

jameysharp added the isle label to Issue #8599.


Last updated: Dec 23 2024 at 13:07 UTC