Stream: git-wasmtime

Topic: wasmtime / PR #1729 Fix MachBuffer branch optimization.


view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 01:21):

cfallin requested bnjbvr and julian-seward1 for a review on PR #1729.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 01:21):

cfallin opened PR #1729 from machinst-branch-opt to master:

This patch fixes a subtle bug that occurred in the MachBuffer branch
optimization: in tracking labels at the current buffer tail using a
sorted-by-offset array, the code did not update this array properly when
redirecting labels. As a result, the dead-branch removal was unsafe,
because not every label pointing to a branch is guaranteed to be
redirected properly first.

Discovered while doing performance testing: bz2 silently took a wrong
branch and exited compression early. (Eek!)

To address this problem, this patch adopts a slightly simpler data
structure: we only track the labels at the current buffer tail, and
at the start of each branch, and we're careful to update these
appropriately to maintain the invariants. I'm pretty confident that this
is correct now, but we should (still) fuzz it a bunch, because wrong
control flow scares me a nonzero amount. I should probably also actually
write out a formal proof that these data-structure updates are correct.
The optimizations are important for performance (removing useless empty
blocks, and taking advantage of any fallthrough opportunities at all),
so I don't think we would want to drop them entirely.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 01:21):

cfallin requested bnjbvr and julian-seward1 for a review on PR #1729.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 created PR Review Comment:

What does "complete" mean here?

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 created PR Review Comment:

Seems like significant added expense on a hot path -- avoidable?

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 created PR Review Comment:

Ditto re "complete"

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 04:50):

julian-seward1 created PR Review Comment:

This doesn't entail any kind of fixpointing behaviour, right?

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:35):

cfallin updated PR #1729 from machinst-branch-opt to master:

This patch fixes a subtle bug that occurred in the MachBuffer branch
optimization: in tracking labels at the current buffer tail using a
sorted-by-offset array, the code did not update this array properly when
redirecting labels. As a result, the dead-branch removal was unsafe,
because not every label pointing to a branch is guaranteed to be
redirected properly first.

Discovered while doing performance testing: bz2 silently took a wrong
branch and exited compression early. (Eek!)

To address this problem, this patch adopts a slightly simpler data
structure: we only track the labels at the current buffer tail, and
at the start of each branch, and we're careful to update these
appropriately to maintain the invariants. I'm pretty confident that this
is correct now, but we should (still) fuzz it a bunch, because wrong
control flow scares me a nonzero amount. I should probably also actually
write out a formal proof that these data-structure updates are correct.
The optimizations are important for performance (removing useless empty
blocks, and taking advantage of any fallthrough opportunities at all),
so I don't think we would want to drop them entirely.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:36):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:36):

cfallin created PR Review Comment:

Clarified -- just means that the vector must contain all labels that are bound to this offset (the buffer tail); missing a label would result in that label not being adjusted properly when it needs to be.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:37):

cfallin created PR Review Comment:

Good point! Now using a lazy-clear scheme: we store the offset at which the label vector applies; if cur_offset() moves, the vector is conceptually empty, and is made so next time we need to read it.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:37):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:38):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:38):

cfallin created PR Review Comment:

Not in the usual sense, no. On each iteration of the loop we munch a branch (or redirect its labels elsewhere); we don't ever revisit a branch.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:38):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 19:38):

cfallin created PR Review Comment:

Fixed.

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 21:14):

cfallin updated PR #1729 from machinst-branch-opt to master:

This patch fixes a subtle bug that occurred in the MachBuffer branch
optimization: in tracking labels at the current buffer tail using a
sorted-by-offset array, the code did not update this array properly when
redirecting labels. As a result, the dead-branch removal was unsafe,
because not every label pointing to a branch is guaranteed to be
redirected properly first.

Discovered while doing performance testing: bz2 silently took a wrong
branch and exited compression early. (Eek!)

To address this problem, this patch adopts a slightly simpler data
structure: we only track the labels at the current buffer tail, and
at the start of each branch, and we're careful to update these
appropriately to maintain the invariants. I'm pretty confident that this
is correct now, but we should (still) fuzz it a bunch, because wrong
control flow scares me a nonzero amount. I should probably also actually
write out a formal proof that these data-structure updates are correct.
The optimizations are important for performance (removing useless empty
blocks, and taking advantage of any fallthrough opportunities at all),
so I don't think we would want to drop them entirely.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 20 2020 at 21:43):

cfallin merged PR #1729.


Last updated: Oct 23 2024 at 20:03 UTC