cfallin commented on Issue #2527:
@bjorn3 thanks for the PR; I think this is a reasonable thing to have and the automatic derivation certainly is easier to maintain than the separate "serializable subset" approach of
cranelift-serde
.That said, with any persistent format we should worry about backward/forward-compatibility to some degree. IIUC, there's no versioning of any sort with stock serde traits and bincode/json/whatever backends. This could lead to pernicious bugs (or at least footguns) where we have a blob of serialized IR data that became invalid after adding an instruction or instruction format, but we don't catch that.
It's also somewhat unfortunate that with all the automation and effort saved by serde, we still have a manual parser!
So I wonder about another possibility: could we generalize the parser to accept a binary format that's syntactically identical to textual CLIF, but with interned tokens/identifiers, so we could share the parsing code, get the backward/forward compatibility for free, and keep the same control / ability to customize parsing that we have today? This way we make use of the thinking we've already done about the "external view" of CLIF, without the potential unintended consequences of exposing internal data structures.
Depending on the use-case, I'm not totally opposed to
serde
-all-the-things, by the way; if we put a version number on it (and are careful to bump that when appropriate) and declare it to be "for use only by tools that know what they're doing" (e.g. have a fallback path if the version number doesn't match), then maybe it's still OK. Just want to make sure we've thought about these issues first.
bjorn3 commented on Issue #2527:
I can add a version number. I think it can directly use the cranelift version specified in
Cargo.toml
.
bjorn3 commented on Issue #2527:
I have added a version marker.
cfallin commented on Issue #2527:
@bjorn3 thanks for the update -- we're almost there I think (thanks for the patience!).
Could you update to include the git hash of the build as a version ID rather than the crate version? It looks like this StackOverflow answer shows a reasonable way to do so in
build.rs
. (We already have abuild.rs
for the generated IR data structures so this shouldn't be too much more overhead, I think.)
bjorn3 commented on Issue #2527:
That won't work when published to crates.io, as there is no git repo then.
cfallin commented on Issue #2527:
Ah, that's a good point. I suppose we could add some logic to check for that? (Either check return value of the
git
invocation, or check for presence of.git
directory first.) Then we could use<crate-version>-<git-hash>
as an ID string, where<git-hash>
may be empty.The reason I worry about this is that in
cg_clif
, IIRC, Cranelift is pulled in as a git dependency, so we can't necessarily rely on crate version number to disambiguate breaking changes to the IR. I don't want to have to think about backward compatibility-via-JSON when making individual changes to CLIF instructions; that potentially causes either very confusing errors, or silent breakage. Since it's a potential correctness issue, I want to Do It Right :-)
bjorn3 commented on Issue #2527:
I added the git rev. When either git is not found or it isn't inside a git, it should omit the git rev.
cfallin commented on Issue #2527:
@bjorn3 I think that with a rebase on current
main
this should be good to merge.
bjorn3 commented on Issue #2527:
Rebased and fixed
ensure_deterministic_build.sh
.
cfallin commented on Issue #2527:
Thanks!
Last updated: Nov 22 2024 at 17:03 UTC