Hello everyone, I'm new to the server.
I'm doing a bachelor thesis next spring, so I'm looking for a topic right now. My compiler professor was positive about the idea of contributing to cranelift. Next step would be to find something concrete to work on. I'm excited about the idea of cranelift being used as a backend for rustc debug builds. That being said, some of you probably have a much better idea of what could be a suitable area to work on and fits in the scope of a bachelor thesis. The thesis will most likely be done by two people. I still need to find a partner, but I'm optimistic about that once I have concrete ideas to offer.
I'm obviously also talking to other professors about other topics, so nothing is set in stone.
Best regards, Remo
Hi, author of the cranelift backend for rustc and a fellow student here.
Hi @Remo Senekowitsch :wavy_dash:
One idea we've been kicking around for a while but haven't had a chance to invest time in yet is a "chaos mode" for Cranelift that would help us uncover latent bugs: https://github.com/bytecodealliance/wasmtime/issues/4134
Maybe this (or part of it, since the scope could almost endlessly expand) is interesting to you?
Thank you, this sounds very interesting! I'll rephrase the basic idea to make sure I understood correctly:
Centrally and deterministically from a seed, it should be possible to fuzz test different possible decisions the compiler can make. So that doesn't mean errors and invalid states / decisions are inserted or forced, in order to test if the compiler can recover from errors. Rather, it would be used to test situations where it is thought the compiler should behave equally, independent of a chosen option. This would make it easier to discover and locate bugs where that assumption is broken.
we would use this mode when fuzzing but nothing about the mode should be tied to fuzzing
compilation should still be deterministic, given the same chaos mode seed
So that doesn't mean errors and invalid states / decisions are inserted or forced, in order to test if the compiler can recover from errors. Rather, it would be used to test situations where it is thought the compiler should behave equally, independent of a chosen option.
exactly.
for example, when there is too much register pressure, we need to spill a value that is currently in a register to the stack to make room for the current value that must be in a register right now. which register do we spill? our register allocator has heuristics for choosing one right now, but what if we chose a different, random one? that is still valid, despite probably not ideal from a code quality point of view, and it definitely shouldn't trigger a bug in instruction encoding or change the semantics of the compiled program
That would be fantastic if you wanted to tackle this idea, @Remo Senekowitsch ! Just chiming in to say I'd be happy to mentor this; it's an idea we've had floating around for a few years and I've wanted to see it implemented as I think it has significant value in ensuring correctness
One nice aspect here from the point of view of a time-scoped project is that it's fairly modular and adjustable in effort: there's a framework, then there are a bunch of specific ideas (configurable spills, explicitly clobbering undefined state, tweaking isel, changing inst/block order, changing optimization pass order/applicability, ...) that can make the project smaller or larger as your time allows
That sounds amazing, having a mentor from the project would be immensely helpful! And I agree it's a perfect fit as the scope is quite flexible. I'll run this by my professor and get back to you when I have news.
Hello, little status update from my side. My professor has agreed to do it! And I think he would still be ok if I did it alone, in which case, I would want to do it alone. But my school is pushing for people to do it in groups. I've been talking to two other students who have experience with Rust, love it and would be interested in this specific idea as well. I think it's likely they'll agree to do it with me in a group of three. (Maybe the professor would ask to expand the scope in that case.)
The decision is usually finalized in December, so that would be the latest you hear from me again. Likely sooner, if there are relevant news for you in the meantime. I'm excited!
Have a good day :smile:
exciting!
That's super-exciting! Happy to meet, help brainstorm/write out details, etc whenever you need; just let us know
Chris Fallin said:
That's super-exciting! Happy to meet, help brainstorm/write out details, etc whenever you need; just let us know
I'd love to take you up on this offer :smile: I'm thinking it would be best to have a meeting with:
to discuss some details. I will see them on Monday and ask them what exactly they would like to discuss so the meeting can be efficient. I'll get back to you when I know more.
And I have two small questions to get a better idea of the work we'd be doing:
The way I understand it currently, I'm leaning toward conditional compilation. It would be pretty simple to do, easy to turn chaos mode on, and still provide the necessary flexibility of running arbitrary code per decision, so each perturbation can make use of the control plane differently according to need. pseudo code:
mod chaos_engine {
fn get_chaotic_decision() {}
}
mod some_heuristic {
fn make_decision() {
#[cfg(not(feature = "chaos_mode"))]
let decision = get_mormal_decision();
#[cfg(feature = "chaos_mode")]
let decision = chaos_engine::get_chaotic_decision();
}
}
There are a lot of heuristics in regalloc2 chaos mode can influence I think. You could also randomize basic block order (https://github.com/bytecodealliance/wasmtime/blob/main/cranelift/codegen/src/machinst/blockorder.rs), not optimize branches away (https://github.com/bytecodealliance/wasmtime/blob/434fbf2b27e900533e438088ac843215f2ec19d3/cranelift/codegen/src/machinst/buffer.rs#L775) and other things of that kind.
I think chaos mode should be behind a feature flag, but also make it possible to set a fixed seed to allow reproducing bugs uncovered by chaos mode.
Thanks! I was thinking a command line argument or environment variable for the seed should be fine?
That would work as interface when using wasmtime, but I don't think cranelift itself should set it. Maybe there could be a cranelift flag with the seed?
Hello :smile: Just letting you know that the paperwork is signed! :tada: I'm looking forward to working on this, starting in mid February.
Exciting!
Hello!
Semester has started and today was the first day we were working on the bachelor thesis. Our first goal is to get our feet wet in the code base, such that we can figure out how our testing stuff is going to fit into the existing framework.
We're having trouble getting the fuzz tests to run (same problem on two machines, fedora and pop os).
It seems to us that crates/fuzzing/wasm-spec-interpreter
is not building correctly. In that directory, we found some more specific instructions not contained in the regular documentation, like the book chapter 9.4. Specifically we installed ocaml
and libgmp
(gmp-static
in my case, running fedora). However, we still get the following error output (e.g. by running cargo fuzz run api_calls
):
I don't really understand most of that. I'm not experienced with make
or OCaml
. What caught my eye was this:
make[1]: ocamlbuild: No such file or directory
And we found this in the Makefile:
# A space-separated list of paths that the linker will use to search for libgmp.
# Override with `make LIBGMP_PATHS=...`.
LIBGMP_PATHS ?= /usr/lib /usr/lib/x86_64-linux-gnu /lib64
Feeling lucky, I added /usr/lib64
to this list, which seems to be where fedora put libgmp
in my case. But sadly, that wasn't it. I might be on the wrong track.
Maybe someone can help us figure this out?
Building with the spec interpreter requires OCaml to be installed locally; @Andrew Brown might have more specific suggestions (perhaps we can improve the docs too). But actually most of my fuzzing work I do with the spec-interpreter disabled, as I also have occasional troubles with the ocaml build :-)
You can do this by removing the fuzz-spec-interpreter
feature from the default
feature list in fuzz/Cargo.toml
, IIRC...
That seems to work! Thanks a lot :-)
Another option is to build the fuzzers with --no-default-features
which will disable the build of the ocaml interpreter, which I frequently do myself when I haven't installed ocaml already on a machine
If you do want to end up fuzzing with the spec interpreter, you probably want to install OPAM and then run something like opam install ocamlbuild
to install the binary you're missing. I suspect there's a Fedora package that also provides it.
FYI I filed an issue for one chaos thing we could do: https://github.com/bytecodealliance/wasmtime/issues/6009
Thanks! I linked it to the original issue and I'm adding it to our notes.
I've got fuzz testing with randomly skipped branch optimization working :tada: here's the draft PR
I missed a couple spots, gonna fix CI when I get a minute
This is fantastic work -- I plan to review it today. Thanks so much!
Small question - what is the general policy / convention on rebasing and rewriting history? Is it ok if the commits in a PR are a little messy, and at the end the commits are rewritten to be neat, logical, atomic? Or is the history left as is for already reviewed commits? I can see that the history is linear, so rebasing is required, but maybe there's an understanding to not go overboard with it.
It's usually better during codereview to push new commits rather than force-push over the existing history, because the latter makes it harder to see "what haven't I reviewed yet" and makes comments go stale
we squash-on-merge so a PR will be one commit once it's done
sometimes it's good if there's a lot of work-in-progress history to rebase once, at the end, before we squash-on-merge so we get a nicer concatenated commit message, but that's not strictly necessary
to add on to what Chris said: in general it is nice to have logical commits (when that doesn't ruin history for follow up reviews after new commits addressing previous reviews' feedback) so when first opening a PR, it may make sense to clean things up a bit, or squash to a single commit if the history is messy. that also makes it easier for reviewers.
I have a little proof of concept working for the fuel parameter. I'm not yet sure where this parameter should be coming from though. The settings::Flags
has been suggested, so I tried to figure out how these are set. It seems that can be done via wasmtime CLI arguments, but also comments inside a WAT file? There might be more, I didn't check all the references exhaustively.
During fuzz testing, we're not using the wasmtime CLI and also not manipulating WAT files directly. Is there another suitable way to inject settings flags that I'm missing?
We could also just read the fuel parameter from an environment variable. I'm thinking the binary search for bug sources could be handled by a little CLI tool that manages this environment variable. What do you think?
Settings are part of the compilation fuzz targets; there should be plumbing where they are injected into the backend construction (the wasmtime::Config
has Cranelift settings inside of it, for example). I like the idea of an "optimize up to this much fuel" setting, right alongside opt_level
and friends.
Definitely we do not want to rely on an environment variable -- that's a Unix antipattern of unmaintainable impure dependencies that is just a recipe for bugs :-)
It was suggested that one of the perturbations we could do is to randomize the basic block order in machinst/blockorder.rs
. We have a branch that's pretty much ready to go, we just aren't sure if we're randomizing the order of the correct thing. We now noticed the file has completely changed since the suggestion was made in the first place. Does it still make sense to do this and if so, which order exactly should we randomize?
The big refactor was made here.
@Trevor Elliott you committed this, maybe you can help out :smile:
for context, I've opened a draft pull request here with what we have so far.
As long as the entry block comes first, I believe we should generate correct code no matter what order the rest of the blocks are emitted in. You do have to make sure that the critical-edge-splitting blocks are inserted somewhere, but you could permute the blocks either before or after that happens. "After" is sort of preferable since that's the complete set of basic blocks, but if it's easier to do "before" then I think that would be fine.
We could even support putting the entry block elsewhere if we detect that case in VCode::emit and generate a jump as the first instruction! but that's easily done later, no need to build that now
So that would be right between steps 2 and 3 in BlockLoweringOrder::new
, something like this?
// Step 2: walk the postorder from the domtree in reverse to produce our desired node
// lowering order, identifying critical edges to split along the way.
// ...
// do not shuffle the entry block for now
// v
ctrl_plane.shuffle(&mut lowered_order[1..]);
// Step 3: build the successor tables given the lowering order. We can't perform this step
// during the creation of `lowering_order`, as we need `lb_to_bindex` to be fully populated
// first.
I don't think that'll quite work, because the lb_to_bindex
map would need to be updated as well, and I'm not entirely sure about the "Mutate the successor to be a critical edge" step either.
Ah, I see. I'll try to pass a closure with some cleanup code to ctrl_plane.shuffle
as well
I think you might be better off building a vector of indices (e.g. (1..lowered_order.len()).collect()
) and shuffling that, then indirecting through that table whenever accessing lowered_order
. Or something along those lines.
Wouldn't that introduce some overhead even when chaos mode is disabled? I had the idea of the closure because I'm always thinking everything we do must happen inside a control plane method, so conditional compilation can get rid of it entirely.
Yeah, it's true that I'd rather not have the indirection overhead the rest of the time… Hmm…
Here's an idea that might both simplify your work and possibly provide a performance improvement during normal compilation at the same time. We can wait to build lb_to_bindex
until after lowered_order
is fully built, with:
let lb_to_bindex = FxHashMap::from_iter(
lowered_order.iter()
.enumerate()
.map(|(i, &lb)| (lb, i))
);
That could hurt performance due to worse cache locality, but could be better by pre-allocating memory for all the elements up front. I couldn't be sure without measuring it, but I'd guess lowered_order
easily fits in L1 and the cost of memory allocation is the bigger effect. At any rate it makes the loop building lowered_order
a little easier to understand.
Given that change, I think you should be safe to shuffle lowered_order
immediately before building lb_to_bindex
.
I'm very pleased that this turned out to be an unambiguous performance improvement, as well as making the code simpler and making it possible for you to make your changes. :grinning_face_with_smiling_eyes:
Fantastic! Thanks for the help :smiley:
The issue I filed about having an owned variant of arbitrary::Unstructured
: https://github.com/rust-fuzz/arbitrary/issues/147
I'm trying to include the control plane in CLIF. There's a small problem: The control_plane_fuel is never included in the printed output of cargo-fuzz. I think the input to the fuzz target is generated again when a panic is encountered, but that generation does not get passed the same CLI args as before (--fuel=XX
in my case). I think this might be impossible to fix, or am I missing something?
Here is where I'm reading the CLI arg for fuzzing, which works as expected.
Here is where I'm reading the CLI arg for printing after a panic was encountered. That doesn't work.
Printing of control plane data and parsing of both control plane data and the fuel limit are working.
Maybe an automated tooling for bisection with the fuel limit could make this less of a problem.
I think a meta-comment here might actually be useful: the work to figure out fuel is a lot of fiddly design, and it's not totally clear to me what the right approach is (still), and it's not as useful in terms of testing coverage as other kinds of randomized decisions; IMHO it'd be fine to drop the further work on it in favor of, e.g., the pseudo-insts to clobber values, or other chaos injected into lowering
that's not to say the work linked so far is not useful! I agree that a bisection tool will be nice to have. Just thinking in terms of limited time remaining
I may be missing some context here, but could the struct TestCase
have fuel: Option<u32>
as a field? The fuzz input would then be used to initialize this field (probably not with the full range of u32
but some smaller upper bound instead)
That would work, but the fuel limit is not supposed to come from the data generated by the fuzzer, its purpose is to be set manually.
For a given input by the fuzzer (which triggered some bug), the fuel limit can be set manually on top of it to figure out which mutation introduced by the control plane actually triggered the bug.
Chris Fallin said:
it'd be fine to drop the further work on it in favor of, e.g., the pseudo-insts to clobber values, or other chaos injected into lowering
Yeah, that makes sense. I'll focus on this for now. Thanks!
Hm I'm not sure I understand, if a fuzz input fails then reduction should auto reduce the fuel value and otherwise it can be manually adjusted in the text form?
this is exactly the unclear nature of the idea that I want to avoid :-) FWIW, the clearest kind of "fuel" idea I can imagine is actually unrelated to the chaos-mode features altogether, and instead controls the ordinary optimizations; it just rides on the same control plane. A kind of fuel that regulates how much of the chaos we use is potentially also useful, but much fuzzier
we've gone back and forth on this a lot; I feel time is more productively spent actually building out other kinds of chaos injection
Sorry but I'm still confused, right now this is parsing env::args
which no other fuzzer is doing so it's actively going out of its way to create a nonstandard method of generating fuzz input. The alternative, using fuel: Option<u32>
as a struct field and generating from the fuzzer input, I don't know why it has a downside?
I don't mean to radically change designs after the fact, but I'm also not sure why this would be a big change
I would prefer we actually delete the approach that reads env::args
as well; I'm not disagreeing with you at all
I felt pretty uncomfortable with it during code review (sorry @Remo Senekowitsch ) and in the end was ok with landing it to make forward progress
but I really don't like the idea of fuel controlling chaos at all; it's one fuzz input that masks another ("please ignore other parts of this struct")
my "time is more productively spent" is trying to redirect away from the idea of fuel at all
Oh well I don't know whether to say we should have fuel or not in the fuzz input, but I can say that in our experience with libfuzzer it's generally really quite good at probing all reachable paths, so although fuel can mask optimizations and such by disabling them fuzzing still has a quite good chance of reaching there when libfuzzer automatically increases the fuel or changes the input
Right; the same "good at probing all paths" though can equally just trim the sequence of decisions given in the input (and in a sense that's a more direct/efficient encoding of the "DNA")
as a meta-comment though in the context of this topic (bachelors' thesis) I again want to redirect energy away from any more work on fuel, precisely because it is not totally settled, and toward other kinds of chaos
Hello everyone :wave:
We've had a busy week getting our thesis across the finish line, we handed it in on Friday. Here's a little excerpt from the preface, translated from German:
Our gratitude goes to:
Chris Fallin, developer of Cranelift, for his instructive and patient mentoring.
He always gently guided our efforts on the right track through his experience.(our teachers)
The other developers of Cranelift, for their generous help.
These include, among others: Alex Crichton, Trevor Elliott, Nick Fitzgerald and Jamey Sharp.
So yeah, thanks again for everything :pray:
@Chris Fallin @Alex Crichton @Trevor Elliott @fitzgen (he/him) @Jamey Sharp
@Chris Fallin I'm happy to do any cleanup work you think makes sense to wrap things up. Maybe add some documentation for things that currently live in our heads only? Or improve the state of how the fuel parameter is handled. I think you'd like to have it moved out of the fuzz-target entirely and only have it part of clif-util(?)
That's great to hear, and I'm happy the project went as well as it did! Will the thesis be anywhere public that we can read it?
Re: cleanup, I think you all actually did a fairly good job of upstreaming, so I'm happy to see it stay as-is if you'd prefer that. We could work out a better way to do fuel -- doing as you say and controlling it only from the CLI, for reduction, is one option -- but there are still some unresolved design details there I think and I don't want to burden you more than needed. On the other hand, docs of any sort are always welcome -- please do feel free to summarize the system in a new markdown file and put it in cranelift/docs/
if you'd like (and after writing the thesis I suppose you have some good succinct ways of describing the system anyway!). Thanks!
The school might publish it later. But we're happy to provide a copy right now if your interested. Would you like us to run the source through a translator? The main text is only in German right now.
I'll read through those docs another time and see if I can add something of value.
sure, if you're willing to share! the original is fine, we can use a translator as necessary or puzzle it out otherwise for those who can't read German :-)
Alright, let me know if the link doesn't work for some reason:
https://drive.google.com/file/d/1Lrnn7OLbHd7GQXd3J8Pgr8g4IbNnoUJX/view
Last updated: Dec 23 2024 at 12:05 UTC