cranelift / Issue #1305 Recent regression with cg_clif · git-cranelift

Stream: git-cranelift

Topic: cranelift / Issue #1305 Recent regression with cg_clif

GitHub (Dec 20 2019 at 20:58):

What are the steps to reproduce the issue? Can you include a CLIF test case, ideally reduced with the bugpoint clif-util command?
https://travis-ci.org/bjorn3/rustc_codegen_cranelift/jobs/627882870

What do you expect to happen? What does actually happen? Does it panic, and if so, with which assertion?
SIGSEGV in coretests test.

Which Cranelift version / commit hash / branch are you using?
e0d317249194f51a80c97a856f96eea1560c5435

Regression range: https://github.com/bytecodealliance/cranelift/compare/ec787eb281bb2e18e191508c17abe694e91f0677...e0d317249194f51a80c97a856f96eea1560c5435

Likely caused by #1298 (cc @sstangl)

GitHub (Dec 20 2019 at 20:58):

bjorn3 labeled Issue #1305.

GitHub (Dec 20 2019 at 21:02):

sstangl commented on Issue #1305:

Is there a way this can be reproduced? Even without a CLIF file, the change could be reversed piecewise by changing the i32_i64 function to i32_i64_explicit_rex, at least narrowing down which recipe is likely responsible.

GitHub (Dec 20 2019 at 21:10):

bjorn3 commented on Issue #1305:

Just follow the build instructions of https://github.com/bjorn3/rustc_codegen_cranelift. If you remove https://github.com/bjorn3/rustc_codegen_cranelift/blob/master/test.sh#L69-L78, it will be quite a bit faster to test.

> Even without a CLIF file, the change could be reversed piecewise by changing the i32_i64 function to i32_i64_explicit_rex, at least narrowing down which recipe is likely responsible.

Will try tomorrow.

GitHub (Dec 20 2019 at 21:22):

sstangl commented on Issue #1305:

Hum, I get this:

[codegen mono items] end time: 483.21918ms
error: could not compile core.

Caused by:
process didn't exit successfully: rustc --edition=2018 --crate-name core sysroot_src/src/libcore/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C debuginfo=2 -C metadata=53bf39f7e0485e6c -C extra-filename=-53bf39f7e0485e6c --out-dir /home/sstangl/dev/rustc_codegen_cranelift/build_sysroot/target/x86_64-unknown-linux-gnu/debug/deps --target x86_64-unknown-linux-gnu -C incremental=/home/sstangl/dev/rustc_codegen_cranelift/build_sysroot/target/x86_64-unknown-linux-gnu/debug/incremental -L dependency=/home/sstangl/dev/rustc_codegen_cranelift/build_sysroot/target/x86_64-unknown-linux-gnu/debug/deps -L dependency=/home/sstangl/dev/rustc_codegen_cranelift/build_sysroot/target/debug/deps -Cpanic=abort -Cdebuginfo=2 -Zpanic-abort-tests -Zcodegen-backend=/home/sstangl/dev/rustc_codegen_cranelift/target/debug/librustc_codegen_cranelift.so --sysroot /home/sstangl/dev/rustc_codegen_cranelift/build_sysroot/sysroot -Z force-unstable-if-unmarked (signal: 4, SIGILL: illegal instruction)

Which, if I'm reading that right, is rustc crashing?

GitHub (Dec 20 2019 at 21:23):

sstangl commented on Issue #1305:

Do I need to be on a specific version of rustc to compile this? The build instructions didn't mention it.

GitHub (Dec 20 2019 at 21:50):

bjorn3 commented on Issue #1305:

The travis build is using latest nightly.

> Which, if I'm reading that right, is rustc crashing?

Seems so, the travis build fails with:

error: test failed, to rerun pass '--lib'

Caused by:

process didn't exit successfully: /Users/travis/build/bjorn3/rustc_codegen_cranelift/build_sysroot/sysroot_src/src/libcore/tests/target/x86_64-apple-darwin/debug/deps/coretests-a2d3b493e4d8f1a3 (signal: 11, SIGSEGV: invalid memory reference)

What is the output above the error for you?

GitHub (Dec 21 2019 at 10:45):

bjorn3 commented on Issue #1305:

The pc ends up in the middle of an instruction. A breakpoint at the start of the function doesn't get triggered.

GitHub (Dec 22 2019 at 06:10):

sstangl closed Issue #1305.

GitHub (Dec 22 2019 at 16:27):

bjorn3 commented on Issue #1305:

I can confirm that this issue is fixed: https://travis-ci.org/bjorn3/rustc_codegen_cranelift/jobs/628398038

GitHub (Jan 06 2020 at 11:14):

bnjbvr commented on Issue #1305:

> Just follow the build instructions

Sorry @bjorn3, but this won't scale if we have to do this for every project embedding Cranelift. Please provide a CLIF test case next time, so we can include it in Cranelift's regression test suite.

GitHub (Jan 06 2020 at 11:40):

bjorn3 commented on Issue #1305:

I understand that it would be much nicer to just provide some CLIF IR that shows the bug. However, this was a runtime error and I didn't know where the problem was, so I would have had to provide the CLIF IR of every compiled function. Because I use cranelift_module, that would still miss information necessary to actually compile and run it.

GitHub (Jan 06 2020 at 12:26):

bnjbvr commented on Issue #1305:

I understand the difficulty, but for us it's even more work to try to understand your code base, figure out how it uses Cranelift, etc.

Since you have a test case with a segfault at a precise instruction pointer value, could you please:

1. add debug prints to show the effective code ranges for each compiled function
2. run in a debugger (gdb, or even better rr in this kind of situation) and note down the value of the instruction pointer when the crash happens
3. find which function is affected by this issue
4. get the CLIF for this particular function?

And then we can figure it out from there. Having a test case is a great help in fixing a regression effectively.

I think that having item (1) in cranelift-module in general would make sense to make debugging easier (behind an environment option / compile option / controlled by a parameter or whatnot).
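
The first and third requested steps above (printing each function's code range and finding the function that contains the crashing instruction pointer) could be sketched roughly as follows. This is a minimal illustration, not cranelift-module's API: `CodeRange`, `find_function`, and the addresses are hypothetical, and it assumes the embedder records each function's start address and code size once the function has been finalized.

```rust
/// Hypothetical record of where one compiled function landed in memory.
#[derive(Debug, Clone, PartialEq)]
struct CodeRange {
    name: String,
    start: usize, // address of the first byte of the function
    size: usize,  // length of the function's machine code in bytes
}

/// Returns the function whose code range contains `ip`, if any.
fn find_function(ranges: &[CodeRange], ip: usize) -> Option<&CodeRange> {
    ranges.iter().find(|r| ip >= r.start && ip - r.start < r.size)
}

fn main() {
    // Made-up ranges; a real embedder would collect these while compiling.
    let ranges = vec![
        CodeRange { name: "foo".to_string(), start: 0x1000, size: 0x40 },
        CodeRange { name: "bar".to_string(), start: 0x1040, size: 0x20 },
    ];

    // Debug-print the effective code range of each compiled function.
    for r in &ranges {
        eprintln!("{:#x}..{:#x} {}", r.start, r.start + r.size, r.name);
    }

    // Map the instruction pointer noted in the debugger back to a function.
    let ip = 0x1044;
    match find_function(&ranges, ip) {
        Some(r) => println!("ip {:#x} is in {} (+{:#x})", ip, r.name, ip - r.start),
        None => println!("ip {:#x} is outside all known code ranges", ip),
    }
}
```

With the function name in hand, only that function's CLIF would need to be dumped for a test case.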

GitHub (Jan 06 2020 at 12:51):

bjorn3 commented on Issue #1305:

I completely agree that it would be nicer to have a test case rather than a big program to compile and run. I try to create a test case when possible. In this particular case, however, that was not really possible, as there was a jump somewhere into the middle of an instruction, after which the process continued running. I did try to reduce the test case, but even slight changes made it stop crashing, crash at a different place, or even jump to an unmapped page. This made it hard to find the responsible instruction or even function. rr would have made it easy, but it doesn't work on macOS, which I was using that day.

> find which function is affected by this issue

The debugger showed me which function was running when it finally crashed (both faerie and object emit symbols for every function, even if private), but at that point the ip was already pointing in the middle of an instruction.
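
To illustrate what "pointing in the middle of an instruction" means in practice: given the start offsets of a function's instructions (for example, copied by hand from a disassembly listing), one can check whether an offset within the function is a valid instruction boundary. The offsets and the `locate` helper below are hypothetical and only for illustration.

```rust
/// Where an offset falls relative to a function's instruction boundaries.
#[derive(Debug, PartialEq)]
enum Location {
    /// The offset is the first byte of an instruction.
    Boundary,
    /// The offset is inside the instruction starting at `inst_start`.
    Inside { inst_start: usize },
    /// The offset precedes the first known instruction.
    Before,
}

/// Classifies `offset` against `starts`, which must be sorted ascending.
/// Offsets past the last start are attributed to the last instruction,
/// because instruction lengths are not known here.
fn locate(starts: &[usize], offset: usize) -> Location {
    match starts.binary_search(&offset) {
        Ok(_) => Location::Boundary,
        Err(0) => Location::Before,
        Err(i) => Location::Inside { inst_start: starts[i - 1] },
    }
}

fn main() {
    // Made-up instruction start offsets within one function.
    let starts = [0, 1, 4, 9, 11];
    println!("{:?}", locate(&starts, 4));
    println!("{:?}", locate(&starts, 6));
}
```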

GitHub (Jan 06 2020 at 13:38):

bnjbvr commented on Issue #1305:

> The debugger showed me which function was running when it finally crashed (both faerie and object emit symbols for every function, even if private), but at that point the ip was already pointing in the middle of an instruction.

I see. Note that the aforementioned rr is a record and replay debugger, so it allows you to run your code backwards: from the point where it crashed, you can do rsi (reverse step instruction) to see what the previous instruction was. It's a bit of work to install it first, but the payoff is great in situations like this, and I am confident it would help in your Rustc endeavors as well.

Last updated: Apr 14 2025 at 18:05 UTC