Stream: cranelift

Topic: How to inspect generated optimized assembly


Setzer22 (Apr 19 2023 at 19:41):

I'd like to better understand what optimizations cranelift is capable of doing. I've figured out how to print the unoptimized IR of a function via Function::display, but I can't seem to find a way to inspect the generated optimized assembly. I'm assuming optimization happens while lowering IR to machine code and thus there's no way to inspect optimized IR? If that was possible, it would be even better :)

Jamey Sharp (Apr 19 2023 at 19:48):

There are several ways you can do this. It sounds like you specifically want to print the optimized code that results from some frontend you're writing, yeah? (If you're using Wasmtime rather than your own Cranelift frontend, try wasmtime explore.)

If you want to see the result of optimization on the Cranelift intermediate representation (CLIF), you can call cranelift_codegen::Context::for_function to get a compilation context, then call the optimize method on that, and finally look at the func field in the context for the optimized function. (See cranelift/filetests/src/test_optimize.rs for an example.)

If you want to see the generated assembly instead, see the "precise output" mode of cranelift/filetests/src/test_compile.rs for an example of how to do that.

bjorn3 (Apr 19 2023 at 20:03):

Part of the optimizations happen on clif ir, while other parts happen during lowering or code emission.

Setzer22 (Apr 19 2023 at 20:10):

> It sounds like you specifically want to print the optimized code that results from some frontend you're writing, yeah?

Yup, that's correct :smile:

> If you want to see the result of optimization on the Cranelift intermediate representation (CLIF), you can call cranelift_codegen::Context::for_function

Oops, looks like I was already looking at optimized output. I'm still not fully familiar with the API. I was using cranelift-jit, and apparently it was calling ctx.compile() for me, which optimizes the code (since I was setting opt_level to speed in the flags). :sweat_smile:

Jamey Sharp (Apr 19 2023 at 20:11):

Oh yeah, that's a good point bjorn3! All the optimizations that can move code around or make other big changes work at the level of the IR, but when lowering to machine code we can sometimes select a target instruction that implements several IR instructions at once, and after that there are some local cleanups to eliminate branch instructions. For the branch optimizations see Chris' excellent blog post: https://cfallin.org/blog/2021/01/22/cranelift-isel-2/
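
To make "select a target instruction that implements several IR instructions at once" concrete, here is a minimal toy sketch in plain Rust. It is not Cranelift's actual lowering code: the `Ir` and `Mach` types, the `MulAdd` instruction, and the `lower` function are all hypothetical names invented for illustration. The sketch walks the instructions backwards and folds a multiply into the add that consumes it, but only when the multiply has no other users.

```rust
// Toy IR in SSA style: instruction i defines value i, and operands are
// indices of earlier instructions. All names here are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ir {
    Param(u32),        // function parameter, by position
    Mul(usize, usize), // product of two earlier values
    Add(usize, usize), // sum of two earlier values
}

// Toy "machine" instructions; MulAdd computes dst = a * b + c in one step.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mach {
    Arg { dst: usize, index: u32 },
    Mul { dst: usize, a: usize, b: usize },
    Add { dst: usize, a: usize, b: usize },
    MulAdd { dst: usize, a: usize, b: usize, c: usize },
}

// Lower toy IR to toy machine code, fusing `mul` into `add` where possible.
fn lower(ir: &[Ir]) -> Vec<Mach> {
    // Count how many times each value is used as an operand.
    let mut uses = vec![0usize; ir.len()];
    for inst in ir {
        if let Ir::Mul(a, b) | Ir::Add(a, b) = *inst {
            uses[a] += 1;
            uses[b] += 1;
        }
    }

    // Walk backwards so a consumer is seen before its operands; a fused
    // multiply is marked and skipped when the walk reaches it.
    let mut fused = vec![false; ir.len()];
    let mut out = Vec::new();
    for (dst, inst) in ir.iter().enumerate().rev() {
        if fused[dst] {
            continue;
        }
        match *inst {
            Ir::Param(index) => out.push(Mach::Arg { dst, index }),
            Ir::Mul(a, b) => out.push(Mach::Mul { dst, a, b }),
            Ir::Add(x, y) => {
                // Try each operand as the multiply; it must have exactly one use.
                let pick = [(x, y), (y, x)].iter().find_map(|&(m, other)| match ir[m] {
                    Ir::Mul(a, b) if uses[m] == 1 => Some((m, a, b, other)),
                    _ => None,
                });
                match pick {
                    Some((m, a, b, c)) => {
                        fused[m] = true;
                        out.push(Mach::MulAdd { dst, a, b, c });
                    }
                    None => out.push(Mach::Add { dst, a: x, b: y }),
                }
            }
        }
    }
    out.reverse();
    out
}
```

For example, calling `lower` on `[Param(0), Param(1), Param(2), Mul(0, 1), Add(3, 2)]` produces three `Arg` instructions followed by a single `MulAdd`, with no separate `Mul`. If the multiply result is also used elsewhere, the sketch keeps `Mul` and `Add` separate, since fusing would duplicate the multiplication.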

bjorn3 (Apr 19 2023 at 20:16):

After calling compile, the context will contain the compiled code and, if requested, the vcode (an arch-specific IR that looks a lot like assembly, but doesn't have jump threading applied and contains block labels).

Setzer22 (Apr 19 2023 at 20:27):

I'd like to have a look at the vcode, but I'm not sure how to request / access it? I tried grepping the codebase but didn't get far. I don't think there's anything in Context that refers to vcode :sweat_smile:

Setzer22 (Apr 19 2023 at 20:33):

Ahh, never mind, figured it out :smile: You have to call ctx.set_disasm(true), and then look for the vcode inside the CompiledCode (obtainable via ctx.compiled_code()).


Last updated: Oct 23 2024 at 20:03 UTC