adambratschikaye commented on issue #4924:
Looks like @fitzgen makes the most sense for review? I don't seem to be able to assign a reviewer though.
adambratschikaye commented on issue #4924:
Thanks @abrown and @fitzgen!
jameysharp commented on issue #4924:
This PR apparently can cause excessive memory consumption: see #4931 for a fuzz input causing
use_var_nonlocal
to attempt to allocate 3GiB for theself.calls
vector. I suspect this is effectively an infinite loop, but the control flow is obscured by the state machine and heap-based recursion inSSABuilder
. I'm very much in favor of removing quadratic behavior but I think we may need to revert this PR and think harder about what the invariants are supposed to be in theSSABuilder
. Right now I find that whole source file confusing.I haven't confirmed the root cause, but here's what I believe is happening. There's no guarantee that the blocks which
can_optimize_var_lookup
examines are sealed, which means that it's possible to add more predecessors to them afterward. So let's say it's called once on a block which doesn't have any predecessors yet, and a result oftrue
is saved in the memo-table. Then exactly one predecessor is added that causes it to be part of a cycle, and then this function is called again. Now the un-memoized version would change to returningfalse
, but the memoized version still returnstrue
.This wouldn't be as visible if not for another issue:
use_var_nonlocal
callscan_optimize_var_lookup
every time, but only uses its result if the current block is sealed. Without memoizing, this is a waste of time for unsealed blocks but has no effect on correctness. If this were fixed, the above issue would be less likely to happen because we'd only update the memo-table if at least the initial block were sealed. However, I don't think that would fix the bug.Also worth noting: I think this is hard to trigger from Wasmtime because if I recall correctly cranelift-wasm tries to seal blocks as soon as possible. That said, I bet there exists some wasm input that would still hit this bug. By contrast, the Cranelift fuzzers don't seal blocks until the whole function is built, so it's much easier to hit this bug there.
I think we might be able to fix this by keeping track of which blocks aren't currently part of a cycle, but doing it proactively in
declare_block_predecessor
instead of memoizingcan_optimize_var_lookup
.
adambratschikaye commented on issue #4924:
Oh shoot. I didn't realize the graph was being modified in between these calls to
can_optimize_var_lookup
and it definitely doesn't make sense to memoize if that's the case. I can look into thedeclare_block_predecessor
suggestion.
jameysharp commented on issue #4924:
I have a patch mostly put together for that idea, but it wasn't passing the test suite yet when I had to stop for the day yesterday. I feel like it should work out though. We'll see!
fitzgen commented on issue #4924:
Reverting in https://github.com/bytecodealliance/wasmtime/pull/4937 until we figure out a proper fix for the original performance bug.
Last updated: Dec 23 2024 at 12:05 UTC