jameysharp requested wasmtime-compiler-reviewers for a review on PR #8223.
jameysharp opened PR #8223 from jameysharp:no-print-aliases
to bytecodealliance:main
:
This is a follow-up to #8214. That PR made the CLIF printer resolve aliases before printing any value, which means now the aliases are never referenced.
I expected plenty of tests to change due to removing the aliases from the printed representation of CLIF functions, but there was one surprise: in tests/disas/icall-loop.wat, this didn't just remove the aliases, but also changed the value numbering.
As it turns out, in that one function, the last value in the function that Wasmtime generated was an alias (v28 -> v3). The disas test harness has Wasmtime print the CLIF it generated, then the harness re-parses that CLIF, and passes the re-parsed CLIF to Cranelift's optimization passes. Since after this change the aliases are not printed, re-parsing the printed form doesn't round-trip perfectly, and specifically the last allocated value number changes. As a result, when legalization and optimization introduce new instructions, they get different result value numbers.
My conclusion is that this is fine. The disas tests don't need to produce exactly the same value numbers that the full Wasmtime to Cranelift pipeline would produce. This process does produce the same instructions in the same order, and I think that's all that matters.
<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
jameysharp requested cfallin for a review on PR #8223.
jameysharp requested alexcrichton for a review on PR #8223.
jameysharp requested wasmtime-core-reviewers for a review on PR #8223.
If you do a debug print of some
Value
you have somewhere in your code which happens to be an alias and then try to cross-reference it with the printed clif ir, you wouldn't be able to find the value anymore with this PR, right? Would it instead be possible to have places that print clif ir explicitly mutate the clif ir to resolve all aliases? (likely with a helper function for this onfunc.dfg
) That way you have control over if aliases are printed or not.
cfallin commented on PR #8223:
Perhaps we can have a
verbose
mode to printing (a flag carried on the writer object, not a Cranelift setting, to be clear) that prints all aliases still, but in a dump at thebikeshed_choice({beginning, end})
of the function body?I like the cleanup here, but at the same time I'm also sympathetic to the view that this is "changing the defined values" visible to the reader, differently than the earlier use-aliases-in-args change; I can also easily imagine how this would become confusing when debugging. I think the above would largely address that: if one is println-debugging and wants to dump the function body before some algorithm starts to traverse values, one can dump it "verbosely" to include the aliases and such.
jameysharp commented on PR #8223:
That's true, although you could also call
dfg.resolve_aliases
on values you want to display in your debug-print statements, which I would think would be more useful. Do you feel strongly that it's important to be able to find all Values represented in the CLIF serialization? I could be persuaded but I don't yet quite understand why it would be worth keeping.The egraph pass already has the side effect of resolving all aliases, so I also considered having the printer only print aliases which are actually used somewhere. But since #8214 makes the printer never show any alias uses, I thought that would be a waste of time.
cfallin commented on PR #8223:
I guess it comes down to: are the aliases semantically part of the CLIF still, or not? I think we should decide that, and then make the text representation follow. I'm concerned with not just debug printing but also "machine-readable bugs", so to speak: some other logic might reference a value somewhere along with text-serialized CLIF, and if the value has disappeared from the CLIF (or been replaced by some other new operator reusing the value number) then that's a subtle and hard-to-debug issue.
If aliases are semantically part of CLIF, then any random
Value
one comes across anywhere should be reasonable to print, and look for in the CLIF; if they aren't, thenresolve_aliases
is a more or less mandatory step to take before exposing a value "externally".I think given the above bug potential, I am actually leaning toward "aliases are still part of CLIF"; that means we should print them, maybe even unconditionally, to avoid confusion about "variants"; and if we want to make them not part of CLIF, we should find ways to "hide" unsanitized (unresolved-aliases) Values. Separately, for a goal of "print clean-looking CLIF", we could do what @bjorn3 mentions and run a mutation (resolve all aliases in the instruction args, and delete alias definitions) before printing CLIF.
are the aliases semantically part of the CLIF still, or not?
IMO yes as you can change a value back from an alias to a regular instruction or change it to be an alias for a different value. If changing a value to be an alias for another value was a permanent change where modifying the value would actually modify the aliased value instead, only then would I consider aliases to not be semantically part of clif ir as the alias and it's target are indistinguishable modulo the
Eq
trait impl (it would likely need to be removed in that case as unlikePartialEq
,Eq
requires a value to be equal to itself).
cfallin commented on PR #8223:
Right, so given that, the line that makes sense to me is: altering values we print in arguments is fine because chasing aliases to the root is semantically equivalent; and also that's a very useful thing when reading the code; but removing the definitions of the aliases is removing another "output", or meaningful element of the code.
(I could also see an argument for removing the implicit alias-chasing in printing if we instead have a "cleanup alias resolution pass" before printing output that also deletes defs; that's fine because we're sequestering all the mutation away into a thing that does exactly the mutation it says on the tin, so to speak. But the above middle ground seems like a good compromise too, to me.)
jameysharp commented on PR #8223:
Alright, I'm convinced enough that aliases are part of the meaning of CLIF.
But in that case I think we should revert #8214 and fully go the direction that @bjorn3 suggested instead, where if someone wants aliases resolved they should do it before printing.
In addition, I want the aliases deleted once all their uses have been resolved, as Chris suggests. It looks like the way to do that is to update the alias' type to
INVALID
and its referent toValue::reserved_value()
. I believe that will make the printer skip those values, without needing to introduce a "verbose" flag on printing.And I think I would like this alias resolution pass to happen in all the filetest-style harnesses, but that could be a separate discussion. I could go either way on whether Wasmtime's
--emit-clif
option should do alias resolution or not, but as that frontend rarely introduces aliases I guess it makes sense to just go ahead and print them, so that the round-trip done by the disas test harness is lossless.
cfallin commented on PR #8223:
That sounds like a good approach to me!
jameysharp closed without merge PR #8223.
jameysharp commented on PR #8223:
The new helper for rewriting away aliases is in #8238 and the rest of the relevant changes are coming after that. I'm dropping this PR in favor of that.
Last updated: Jan 24 2025 at 00:11 UTC