Stream: git-wasmtime

Topic: wasmtime / issue #4979 Enable the regalloc checker in tes...


view this post on Zulip Wasmtime GitHub notifications bot (Sep 29 2022 at 13:10):

afonso360 opened issue #4979:

:wave: Hey,

This was brought up yesterday in the cranelift meeting, and I'm filing this mostly so we don't forget.

Feature

Enable the cranelift regalloc_checker flag in our test suites.

The concrete testsuites I'm thinking are the filetests testsuite for cranelift, and the spec_testsuite for wasm. However if we can get others then that would be great too.

Something that was brought up at the meeting was the performance of regalloc_checker, we should probably make sure that the testsuites don't become too slow.

Benefit

This would have caught #4969 earlier and potentially other regalloc issues in the future.

Implementation

I'm most familiar with filetests and at least in the compile and run tests we can "clone" the flags requested by the tests and force enable regalloc_checker. I'm not too sure about spec_testsuite but maybe something similar would work too?

Alternatives

Make it a default flag

@cfallin mentioned this yesterday with the additional note that the regalloc checker was never built for speed so it would probably have a somewhat large downside. This matches what I've seen when trying to fuzz it.

Run it only during fuzzing

I played around with this in the cranelift-fuzzgen fuzzer, and the preliminary results aren't great.

In the sample run that I usually do (~90k inputs), with regalloc checker we run at 10 execs / sec and without we get 26 execs / sec. The other alternative that I've been thinking is running with regalloc checker only x% of the time although that is still a WIP.

cc: @cfallin @jameysharp @uweigand

view this post on Zulip Wasmtime GitHub notifications bot (Sep 29 2022 at 13:14):

afonso360 edited issue #4979:

:wave: Hey,

This was brought up yesterday in the cranelift meeting, and I'm filing this mostly so we don't forget.

Feature

Enable the cranelift regalloc_checker flag in our test suites.

The concrete testsuites I'm thinking are the filetests testsuite for cranelift, and the spec_testsuite for wasm. However if we can get others then that would be great too.

Benefit

This would have caught #4969 earlier and potentially other regalloc issues in the future.

Implementation

I'm most familiar with filetests and at least in the compile and run tests we can "clone" the flags requested by the tests and force enable regalloc_checker. I'm not too sure about spec_testsuite but maybe something similar would work too?

Something that was brought up at the meeting was the performance of regalloc_checker, we should probably make sure that the testsuites don't become too slow.

Alternatives

Make it a default flag

@cfallin mentioned this yesterday with the additional note that the regalloc checker was never built for speed so it would probably have a somewhat large downside. This matches what I've seen when trying to fuzz it.

Run it only during fuzzing

I played around with this in the cranelift-fuzzgen fuzzer, and the preliminary results aren't great.

In the sample run that I usually do (~90k inputs), with regalloc checker we run at 10 execs / sec and without we get 26 execs / sec. The other alternative that I've been thinking is running with regalloc checker only x% of the time although that is still a WIP.

cc: @cfallin @jameysharp @uweigand

view this post on Zulip Wasmtime GitHub notifications bot (Sep 30 2022 at 16:58):

cfallin commented on issue #4979:

This is definitely interesting to consider!

One thing that has occurred to me since our discussion is that turning the checker on in tests may have somewhat less value than in fuzzing (as long as fuzzing has sufficient coverage to touch every Wasm or CLIF opcode that the tests do). The reason is that regalloc errors tend to become more interesting as multiple instructions, or nontrivial control flow, interact. If we trivially miss an instruction input or output constraint, that will immediately cause tests to fail. But if, for example, we accidentally clobber some other register, we're more likely to see a failure if there is enough register pressure in the function that something else is also using that register.

As well, enabling the checker for the main testsuites increases the critical path length for CI, while fuzzing is a separate workflow off to the side.

So I suspect that "turn it on sometimes for some compilation fuzz targets" is our best option. I'm interested in what others think of this!


Last updated: Nov 22 2024 at 17:03 UTC