Stream: cranelift

Topic: ✔isle question


view this post on Zulip CompilerSmith (Apr 07 2024 at 19:03):

The ISLE DSL is very nice and I like the egraph integration support. I see there is support for mid-level IR optimizations. My question is: would it be possible for someone to implement obfuscation using rewrite rules, much like LLVM IR obfuscation passes?

I would really love to generate large mixed-boolean-arithmetic expressions with the rewrite rules. However, I don't want static rewrite rules; I'd like to be able to generate unique sequences on the fly. Any suggestions or tips would be amazing.
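For readers unfamiliar with the idea, here is a minimal sketch of seeded mixed-boolean-arithmetic rewriting on a toy expression tree. Everything in it (`Expr`, `Op`, `Rng`, `rewrite_root`, `obfuscate`) is hypothetical and unrelated to Cranelift or ISLE; it only illustrates how well-known identities such as `a + b == (a | b) + (a & b)` can be applied recursively, with a seed choosing among alternatives so that each run yields a different but equivalent expression.

```rust
// Toy expression IR over wrapping 32-bit integers (hypothetical; not Cranelift IR).
#[derive(Clone, Copy, Debug)]
enum Op { Add, Sub, Xor, And, Or }

#[derive(Clone, Debug)]
enum Expr {
    Var(usize),
    Bin(Op, Box<Expr>, Box<Expr>),
}

fn bin(op: Op, a: Expr, b: Expr) -> Expr {
    Expr::Bin(op, Box::new(a), Box::new(b))
}

// Reference evaluator, used to check that rewrites preserve semantics.
fn eval(e: &Expr, vars: &[u32]) -> u32 {
    match e {
        Expr::Var(i) => vars[*i],
        Expr::Bin(op, a, b) => {
            let (a, b) = (eval(a, vars), eval(b, vars));
            match op {
                Op::Add => a.wrapping_add(b),
                Op::Sub => a.wrapping_sub(b),
                Op::Xor => a ^ b,
                Op::And => a & b,
                Op::Or => a | b,
            }
        }
    }
}

// Minimal xorshift64 PRNG so the sketch needs no crates; the seed must be nonzero.
struct Rng(u64);

impl Rng {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

// Replace one binary node by an equivalent MBA form; `pick` selects among alternatives.
fn rewrite_root(op: Op, a: &Expr, b: &Expr, pick: u64) -> Expr {
    let t = |op: Op| bin(op, a.clone(), b.clone());
    match (op, pick % 2) {
        // a + b == (a ^ b) + ((a & b) + (a & b))
        (Op::Add, 0) => bin(Op::Add, t(Op::Xor), bin(Op::Add, t(Op::And), t(Op::And))),
        // a + b == (a | b) + (a & b)
        (Op::Add, _) => bin(Op::Add, t(Op::Or), t(Op::And)),
        // a ^ b == (a | b) - (a & b)
        (Op::Xor, _) => bin(Op::Sub, t(Op::Or), t(Op::And)),
        // a | b == (a ^ b) + (a & b)
        (Op::Or, _) => bin(Op::Add, t(Op::Xor), t(Op::And)),
        // a & b == (a + b) - (a | b)
        (Op::And, _) => bin(Op::Sub, t(Op::Add), t(Op::Or)),
        // Subtraction is left unchanged in this sketch.
        (Op::Sub, _) => t(Op::Sub),
    }
}

// Rewrite the root, then recurse into the operands of the result, up to `depth` levels.
fn obfuscate(e: &Expr, rng: &mut Rng, depth: u32) -> Expr {
    match e {
        Expr::Bin(op, a, b) if depth > 0 => match rewrite_root(*op, a, b, rng.next_u64()) {
            Expr::Bin(op, a, b) => bin(
                op,
                obfuscate(&a, rng, depth - 1),
                obfuscate(&b, rng, depth - 1),
            ),
            leaf => leaf,
        },
        _ => e.clone(),
    }
}
```

Calling `obfuscate(&expr, &mut Rng(seed), 3)` with different seeds produces differently shaped trees that evaluate to the same value. The depth bound matters: the rules feed each other (`&` expands into `+` and `|`, which expand back into `&`), so unbounded application would never terminate, and the tree grows quickly with each level.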

view this post on Zulip fitzgen (he/him) (Apr 08 2024 at 18:54):

we don't do it with ISLE, but this is essentially what wasm-mutate does in its "peephole" pass:

CLI and Rust libraries for low-level manipulation of WebAssembly modules - bytecodealliance/wasm-tools

view this post on Zulip CompilerSmith (Apr 08 2024 at 22:21):

fitzgen (he/him) said:

we don't do it with ISLE, but this is essentially what wasm-mutate does in its "peephole" pass:


So the mutation happens before any Cranelift IR is created?

view this post on Zulip fitzgen (he/him) (Apr 08 2024 at 22:46):

wasm-mutate is a wasm-to-wasm tool, cranelift isn't involved unless you happen to take the resulting wasm and run it through wasmtime

view this post on Zulip CompilerSmith (Apr 08 2024 at 23:30):

fitzgen (he/him) said:

wasm-mutate is a wasm-to-wasm tool, cranelift isn't involved unless you happen to take the resulting wasm and run it through wasmtime

Ah, I see. We are interested in doing term rewriting on the Cranelift IR directly. We are doing binary rewriting of PE files and x86 instructions and have our own little IR at the moment; however, Cranelift IR looks very attractive to us for obfuscation purposes.

view this post on Zulip Chris Fallin (Apr 10 2024 at 20:58):

Generating new ISLE rules on the fly isn't really possible; the rules are compiled to Rust ahead of time (and in a way that requires seeing all rules together, to properly combine them) and then compiled into Cranelift. You could potentially build a large pool of potential rewrites statically and turn them on/off with "predicate guards" (if-let clauses on the ISLE rules calling to an "is this rule enabled" helper in Rust)
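A rough model of the "static pool plus guards" idea, written as plain Rust rather than ISLE. The names `RuleConfig`, `rule_enabled`, and `rewrite_add` are hypothetical and are not Cranelift APIs; in a real setup the guard would be an if-let clause on each ISLE rule calling into a Rust helper, as described above. The point of the sketch is only that every rule exists at build time and a runtime bitmask decides which ones may fire.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Var(&'static str),
    Add(Box<Expr>, Box<Expr>),
    Xor(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Runtime configuration: bit `i` set means rule `i` is allowed to fire.
struct RuleConfig {
    enabled: u64,
}

impl RuleConfig {
    // Hypothetical "is this rule enabled" helper. Returning Option<()> mirrors
    // a guard that either succeeds (rule may fire) or fails (rule is skipped).
    fn rule_enabled(&self, id: u32) -> Option<()> {
        (id < 64 && (self.enabled & (1u64 << id)) != 0).then_some(())
    }
}

// The statically known pool of rewrites for `a + b`. The first enabled rule wins;
// None means no rule fired and the expression is left alone.
fn rewrite_add(cfg: &RuleConfig, a: &Expr, b: &Expr) -> Option<Expr> {
    let bx = |e: &Expr| Box::new(e.clone());
    // Rule 0: a + b => (a | b) + (a & b)
    if cfg.rule_enabled(0).is_some() {
        return Some(Expr::Add(
            Box::new(Expr::Or(bx(a), bx(b))),
            Box::new(Expr::And(bx(a), bx(b))),
        ));
    }
    // Rule 1: a + b => (a ^ b) + ((a & b) + (a & b))
    if cfg.rule_enabled(1).is_some() {
        let and = Expr::And(bx(a), bx(b));
        return Some(Expr::Add(
            Box::new(Expr::Xor(bx(a), bx(b))),
            Box::new(Expr::Add(Box::new(and.clone()), Box::new(and))),
        ));
    }
    None
}
```

With `RuleConfig { enabled: 0b10 }` only rule 1 can fire, and with `enabled: 0` the expression is returned untouched. Variety is limited to whatever combinations of the precompiled rules the mask allows, which is the constraint the message above points out.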

view this post on Zulip Chris Fallin (Apr 10 2024 at 20:59):

However, I'd also slightly discourage using Cranelift for this: you're going to be fighting the optimizer pretty hard to get less-optimal code in. Rewrites are fundamentally tied to the egraph pass that also does GVN, and the cost function / extraction will push away from any "suboptimal" expression you introduce. The eclass scope logic (widest scope wins) also means that, for example, once an expression is rewritten to a constant, no other options can be chosen.
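To make the extraction point concrete, here is a toy sketch of cost-based selection among equivalent expressions. The `Node` type, the `cost` weights, and `extract` are assumptions made for illustration; this is not Cranelift's actual cost model, and it does not model GVN or the scope logic. It only shows why a deliberately larger equivalent form tends to lose once a cheaper member of the same class exists.

```rust
#[derive(Debug, PartialEq)]
enum Node {
    Const(u32),
    Var(&'static str),
    Bin(&'static str, Box<Node>, Box<Node>),
}

fn bin(op: &'static str, a: Node, b: Node) -> Node {
    Node::Bin(op, Box::new(a), Box::new(b))
}

// Made-up cost function: constants are free, variables cost 1,
// and each operator costs 1 plus the cost of its operands.
fn cost(n: &Node) -> u32 {
    match n {
        Node::Const(_) => 0,
        Node::Var(_) => 1,
        Node::Bin(_, a, b) => 1 + cost(a) + cost(b),
    }
}

// An "eclass" here is just a slice of expressions known to be equivalent.
// Extraction returns the cheapest member (the first one on ties).
fn extract(eclass: &[Node]) -> Option<&Node> {
    eclass.iter().min_by_key(|n| cost(n))
}
```

Given a class containing both `x + y` and `(x | y) + (x & y)`, `extract` returns `x + y`; given `2 + 3` and the constant `5`, it returns the constant. An obfuscating rewrite added to such a class is simply never selected.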

view this post on Zulip Chris Fallin (Apr 10 2024 at 21:00):

this seems like the sort of thing where a custom IR is exactly what you want

view this post on Zulip CompilerSmith (Apr 10 2024 at 21:34):

Chris Fallin said:

Generating new ISLE rules on the fly isn't really possible; the rules are compiled to Rust ahead of time (and in a way that requires seeing all rules together, to properly combine them) and then compiled into Cranelift. You could potentially build a large pool of potential rewrites statically and turn them on/off with "predicate guards" (if-let clauses on the ISLE rules calling to an "is this rule enabled" helper in Rust)

Ah, I see; that will cause us problems. Even if we generated millions of rewrite rules ahead of time, we wouldn't be able to recursively apply them, right? Or is that determined by the cost function?

I think we will probably keep the IR and term rewrite system we currently have and just use regalloc2 with our instruction selector.

view this post on Zulip CompilerSmith (Apr 10 2024 at 21:49):

Our goal here is to generate code that the optimizer will not be able to reduce.

view this post on Zulip CompilerSmith (Apr 10 2024 at 21:54):

The fact that an expression, once reduced to a constant, can't be rewritten anymore makes total sense for optimization purposes, but it will cause us problems: we want to obliterate constant values so that the value is (potentially) never represented in memory or a register at runtime.

view this post on Zulip CompilerSmith (Apr 10 2024 at 21:57):

We plan to rewrite expressions that operate on constant values such that the actual constant never materializes in a register or memory. This isn't possible for all constants (you can imagine passing one as an argument to a function); however, lots of constants can be "dematerialized" like this.

This is just one of the many things we want to do. I think we will probably just use our term rewrite system though.
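One simple way to picture "dematerializing" a constant is to split it into two shares and fold the operation over the shares, so that only the shares appear as immediates. The sketch below is an assumption about what such a rewrite could look like, not the poster's actual scheme; the helper names (`split_xor`, `split_add`, `xor_k`, `add_k`, `eq_k`) are hypothetical.

```rust
// Split k into shares (a, b) with a ^ b == k. `mask` would be chosen at random
// by the obfuscator; only a and b are ever embedded in the emitted code.
fn split_xor(k: u32, mask: u32) -> (u32, u32) {
    (mask, k ^ mask)
}

// Split k into shares (a, b) with a + b == k (wrapping arithmetic).
fn split_add(k: u32, mask: u32) -> (u32, u32) {
    (mask, k.wrapping_sub(mask))
}

// Computes x ^ k as (x ^ a) ^ b; no intermediate step combines a and b into k.
fn xor_k(x: u32, (a, b): (u32, u32)) -> u32 {
    (x ^ a) ^ b
}

// Computes x + k as (x + a) + b, using additive shares.
fn add_k(x: u32, (a, b): (u32, u32)) -> u32 {
    x.wrapping_add(a).wrapping_add(b)
}

// Tests x == k as (x - a) == b, using additive shares: x == a + b iff x - a == b.
fn eq_k(x: u32, (a, b): (u32, u32)) -> bool {
    x.wrapping_sub(a) == b
}
```

For example, `xor_k(x, split_xor(k, mask))` equals `x ^ k` for every `x` and `mask`. An optimizer that reassociates constants would fold the two shares straight back into `k`, which is exactly the kind of reduction the obfuscator has to prevent, for instance by hiding the shares behind further rewrites.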


Last updated: Oct 23 2024 at 20:03 UTC