cfallin opened PR #5072 from egraph-misc-opts
to main
:
These optimizations together are worth about a 2% compile-time reduction, as measured on one more with spidermonkey.wasm as an input, using
hyperfine
onwasmtime compile
.The changes included are:
- Some better pre-allocation (blockparams and side-effects concatenated list vecs);
- Avoiding the indirection of storing list-of-types for every Pure and Inst node, when almost all nodes produce only a single result; instead, store arity and single type if it exists, and allow result projection nodes to fill in types otherwise;
- Pack the
MemoryState
enum into oneu32
(this together with the above removal of the type slice allowsNode
to shrink from 48 bytes to 32 bytes);- always-inline an accessor (
entry
onCtxHash
) that wasn't (always(inline)
appears to be load-bearing, rather than justinline
);- Split the update-analysis path into two hotpaths, one for the union case and one for the new-node case (and the former can avoid recomputing for the contained node when replacing a node with node-and-child eclass entry).
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
cfallin requested elliottt for a review on PR #5072.
cfallin requested fitzgen for a review on PR #5072.
cfallin updated PR #5072 from egraph-misc-opts
to main
.
cfallin edited PR #5072 from egraph-misc-opts
to main
:
These optimizations together are worth about a 2% compile-time reduction, as measured on one core with spidermonkey.wasm as an input, using
hyperfine
onwasmtime compile
.The changes included are:
- Some better pre-allocation (blockparams and side-effects concatenated list vecs);
- Avoiding the indirection of storing list-of-types for every Pure and Inst node, when almost all nodes produce only a single result; instead, store arity and single type if it exists, and allow result projection nodes to fill in types otherwise;
- Pack the
MemoryState
enum into oneu32
(this together with the above removal of the type slice allowsNode
to shrink from 48 bytes to 32 bytes);- always-inline an accessor (
entry
onCtxHash
) that wasn't (always(inline)
appears to be load-bearing, rather than justinline
);- Split the update-analysis path into two hotpaths, one for the union case and one for the new-node case (and the former can avoid recomputing for the contained node when replacing a node with node-and-child eclass entry).
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
[ ] This has been discussed in issue #..., or if not, please tell us why
here.[ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.[ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
fitzgen submitted PR review.
fitzgen created PR review comment:
arity: { debug_assert!(results.len() <= (u16::MAX as usize)); results.len() as u16 },
fitzgen submitted PR review.
fitzgen created PR review comment:
Ditto
fitzgen created PR review comment:
Can't just be
#[inline]
? Needs to be inlined in non-optimized builds too?
fitzgen created PR review comment:
Ditto
cfallin created PR review comment:
I'd prefer not to affect non-optimized builds, but for whatever reason the
inline(always)
seemed to be necessary to get inlining to happen in optimized builds, unfortunately...(I'd really love to gain more insight into the heuristics at play here!)
cfallin submitted PR review.
cfallin updated PR #5072 from egraph-misc-opts
to main
.
cfallin submitted PR review.
cfallin created PR review comment:
Fixed! Actually used
u16::try_from(...).expect(...)
here but result is the same. And, factored out above if/else chain to use below as well.
cfallin updated PR #5072 from egraph-misc-opts
to main
.
cfallin has enabled auto merge for PR #5072.
cfallin has disabled auto merge for PR #5072.
cfallin updated PR #5072 from egraph-misc-opts
to main
.
fitzgen submitted PR review.
cfallin merged PR #5072.
abrown submitted PR review.
abrown created PR review comment:
Shouldn't this be
self.0 & 3 > 0
?
cfallin submitted PR review.
cfallin created PR review comment:
It's an unpack method just for the tag=1 case so I think this is correct?
abrown submitted PR review.
abrown created PR review comment:
Ah, ok... It seemed like it could have been for all three :+1:
Last updated: Jan 24 2025 at 00:11 UTC