The ISLE DSL is very nice and I like the egraph integration support. I see there is support for mid-level IR optimizations. My question is: would it be possible for someone to implement obfuscation using rewrite rules, much like LLVM IR obfuscation passes?
I would really love to generate large mixed-boolean-arithmetic expressions with the rewrite rules. However, I don't want static rewrite rules; I'd like to be able to generate unique sequences on the fly. Any suggestions or tips would be amazing.
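For readers unfamiliar with the idea, here is a minimal Rust sketch of generating mixed-boolean-arithmetic (MBA) expressions on the fly. It is an illustration only: the `Expr` type, the `Rng` generator, and the `rewrite_once` / `obfuscate` functions are hypothetical names invented here, not part of Cranelift or ISLE. It relies on three well-known identities over wrapping integers: `x + y == (x ^ y) + 2 * (x & y)`, `x + y == (x | y) + (x & y)`, and `x ^ y == (x | y) - (x & y)`.

```rust
// Toy expression IR over wrapping 32-bit integers (hypothetical, not Cranelift IR).
#[derive(Clone, Debug)]
enum Expr {
    Var(usize),
    Const(u32),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Xor(Box<Expr>, Box<Expr>),
}
use Expr::*;

fn bx(e: Expr) -> Box<Expr> {
    Box::new(e)
}

// Reference interpreter, used to check that rewrites preserve semantics.
fn eval(e: &Expr, vars: &[u32]) -> u32 {
    match e {
        Var(i) => vars[*i],
        Const(c) => *c,
        Add(l, r) => eval(l, vars).wrapping_add(eval(r, vars)),
        Sub(l, r) => eval(l, vars).wrapping_sub(eval(r, vars)),
        Mul(l, r) => eval(l, vars).wrapping_mul(eval(r, vars)),
        And(l, r) => eval(l, vars) & eval(r, vars),
        Or(l, r) => eval(l, vars) | eval(r, vars),
        Xor(l, r) => eval(l, vars) ^ eval(r, vars),
    }
}

// Minimal xorshift64 generator so the sketch needs no external crates.
// The seed must be nonzero.
struct Rng(u64);

impl Rng {
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

// Rewrite every Add/Xor node once, bottom-up, picking an identity at random.
fn rewrite_once(e: &Expr, rng: &mut Rng) -> Expr {
    match e {
        Var(_) | Const(_) => e.clone(),
        Add(l, r) => {
            let (x, y) = (rewrite_once(l, rng), rewrite_once(r, rng));
            if rng.next() & 1 == 0 {
                // x + y == (x ^ y) + 2 * (x & y)
                let twice_and = Mul(bx(Const(2)), bx(And(bx(x.clone()), bx(y.clone()))));
                Add(bx(Xor(bx(x), bx(y))), bx(twice_and))
            } else {
                // x + y == (x | y) + (x & y)
                Add(bx(Or(bx(x.clone()), bx(y.clone()))), bx(And(bx(x), bx(y))))
            }
        }
        Xor(l, r) => {
            // x ^ y == (x | y) - (x & y)
            let (x, y) = (rewrite_once(l, rng), rewrite_once(r, rng));
            Sub(bx(Or(bx(x.clone()), bx(y.clone()))), bx(And(bx(x), bx(y))))
        }
        Sub(l, r) => Sub(bx(rewrite_once(l, rng)), bx(rewrite_once(r, rng))),
        Mul(l, r) => Mul(bx(rewrite_once(l, rng)), bx(rewrite_once(r, rng))),
        And(l, r) => And(bx(rewrite_once(l, rng)), bx(rewrite_once(r, rng))),
        Or(l, r) => Or(bx(rewrite_once(l, rng)), bx(rewrite_once(r, rng))),
    }
}

// Apply `rounds` passes; each pass rewrites the output of the previous one,
// so the expression grows quickly and differs per seed.
fn obfuscate(e: &Expr, rng: &mut Rng, rounds: u32) -> Expr {
    let mut out = e.clone();
    for _ in 0..rounds {
        out = rewrite_once(&out, rng);
    }
    out
}
```

Calling `obfuscate(&expr, &mut Rng(seed), 3)` with different nonzero seeds yields different but equivalent trees; `eval` can be used to spot-check that equivalence on sample inputs.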
we don't do it with ISLE, but this is essentially what wasm-mutate does in its "peephole" pass:
fitzgen (he/him) said:
we don't do it with ISLE, but this is essentially what wasm-mutate does in its "peephole" pass:
So the mutation happens before any Cranelift IR is created?
wasm-mutate is a wasm-to-wasm tool, cranelift isn't involved unless you happen to take the resulting wasm and run it through wasmtime
fitzgen (he/him) said:
wasm-mutate is a wasm-to-wasm tool, cranelift isn't involved unless you happen to take the resulting wasm and run it through wasmtime
Ah I see, we are interested in doing term rewriting on Cranelift IR directly. We are doing binary rewriting of PE files and x86 instructions and have our own little IR at the moment; however, Cranelift IR looks very attractive to us for obfuscation purposes.
Generating new ISLE rules on the fly isn't really possible; the rules are compiled to Rust ahead of time (and in a way that requires seeing all rules together, to properly combine them) and then compiled into Cranelift. You could potentially build a large pool of potential rewrites statically and turn them on/off with "predicate guards" (if-let clauses on the ISLE rules calling to an "is this rule enabled" helper in Rust)
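As a rough illustration of the "static pool plus predicate guards" idea, the following Rust sketch models it outside of ISLE. Everything here is an assumption made for the example: `Rule`, `POOL`, `Guards`, and `is_rule_enabled` are hypothetical names, and the real mechanism would be ISLE rules with if-let clauses calling into a Rust helper. The helper returns `Option<()>` so that `None` stands for "this rule does not match".

```rust
// Hypothetical model of a fixed pool of rewrites that are compiled in ahead
// of time and toggled at run time. None of these names come from Cranelift.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Add,
    Xor,
}

// One precompiled rewrite: the operator it matches and an equivalent computation.
struct Rule {
    id: u32,
    op: Op,
    apply: fn(u32, u32) -> u32,
}

// x + y == (x ^ y) + 2 * (x & y)
fn add_via_xor(x: u32, y: u32) -> u32 {
    (x ^ y).wrapping_add((x & y).wrapping_mul(2))
}

// x + y == (x | y) + (x & y)
fn add_via_or(x: u32, y: u32) -> u32 {
    (x | y).wrapping_add(x & y)
}

// x ^ y == (x | y) - (x & y)
fn xor_via_or(x: u32, y: u32) -> u32 {
    (x | y).wrapping_sub(x & y)
}

// The whole pool is known statically, mirroring the constraint that all
// rules must be visible together at build time.
static POOL: [Rule; 3] = [
    Rule { id: 0, op: Op::Add, apply: add_via_xor },
    Rule { id: 1, op: Op::Add, apply: add_via_or },
    Rule { id: 2, op: Op::Xor, apply: xor_via_or },
];

// Run-time switchboard: bit `id` set means rule `id` may fire.
struct Guards {
    enabled: u64,
}

impl Guards {
    // Stand-in for the "is this rule enabled" helper: Some(()) lets the
    // guarded rule match, None makes it fall through.
    fn is_rule_enabled(&self, id: u32) -> Option<()> {
        if id < 64 && (self.enabled >> id) & 1 == 1 {
            Some(())
        } else {
            None
        }
    }

    // Pick the first rule in the pool that matches `op` and passes its guard.
    fn select(&self, op: Op) -> Option<&'static Rule> {
        POOL.iter()
            .find(|r| r.op == op && self.is_rule_enabled(r.id).is_some())
    }
}
```

With `Guards { enabled: 0b010 }` only rule 1 can fire, so `select(Op::Add)` returns the `add_via_or` rule and `select(Op::Xor)` returns `None`. Varying the bitmask per build or per function gives different outputs from the same static pool, but never a rewrite that was not compiled in.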
however I'd also slightly discourage using Cranelift for this: you're going to be fighting the optimizer pretty hard to get less-optimal code in. Rewrites are fundamentally tied to the egraph pass that also does GVN, and the cost function / extraction will push away from any "suboptimal" expression you introduce. The eclass scope logic (widest scope wins) also means that, for example, once an expression is rewritten to a constant, no other options can be chosen
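To make the extraction point concrete, here is a toy Rust model of cost-based extraction from a set of equivalent expressions. It is a sketch under stated assumptions: `Term`, `cost`, and `extract` are hypothetical, and the cost model (constants are free, every variable or operator costs 1) is invented for the example and is not Cranelift's actual cost function.

```rust
// Toy term language; operator names are plain strings for brevity.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Var(&'static str),
    Const(u32),
    Bin(&'static str, Box<Term>, Box<Term>),
}

fn bin(op: &'static str, l: Term, r: Term) -> Term {
    Term::Bin(op, Box::new(l), Box::new(r))
}

// Assumed cost model: constants cost 0, variables 1, each operator adds 1.
fn cost(t: &Term) -> u32 {
    match t {
        Term::Const(_) => 0,
        Term::Var(_) => 1,
        Term::Bin(_, l, r) => 1 + cost(l) + cost(r),
    }
}

// An "eclass" here is just a slice of terms known to be equivalent.
// Extraction returns the cheapest member, so a larger obfuscated form
// sitting in the same class is never chosen.
fn extract(eclass: &[Term]) -> Option<&Term> {
    eclass.iter().min_by_key(|t| cost(t))
}
```

Under this model, a class containing both `x + y` (cost 3) and `(x ^ y) + 2 * (x & y)` (cost 8) always extracts `x + y`, and a class containing `3 + 5` and the constant `8` always extracts the constant.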
this seems like the sort of thing where a custom IR is exactly what you want
Chris Fallin said:
Generating new ISLE rules on the fly isn't really possible; the rules are compiled to Rust ahead of time (and in a way that requires seeing all rules together, to properly combine them) and then compiled into Cranelift. You could potentially build a large pool of potential rewrites statically and turn them on/off with "predicate guards" (if-let clauses on the ISLE rules calling to an "is this rule enabled" helper in Rust)
Ah, I see, that will cause us problems. Even if we generated millions of rewrite rules ahead of time, we wouldn't be able to recursively apply them, right? Or is that determined by the cost function?
I think we will probably keep the IR and term rewrite system we currently have and just use regalloc2 with our instruction selector.
Our goal here is to generate code that the optimizer will not be able to reduce.
The fact that an expression can't be rewritten anymore once it is reduced to a constant makes total sense for optimization purposes, but it will cause us problems: we want to obliterate constant values so that the value is (potentially) never represented in memory or a register at runtime.
We plan to rewrite the semantics of expressions that operate on constant values such that the actual constant never materializes in a register or memory. This isn't possible for all constants (imagine passing one as an argument to a function); however, lots of constants can be "dematerialized" like this.
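One simple way to picture this kind of "dematerialization" is an equality check against a secret constant `k`. The Rust sketch below is an assumption-laden illustration, not the poster's actual scheme: `HiddenEq` and `hide_constant` are hypothetical names. At build time it picks an odd multiplier and a mask, then stores only `(k * mul) ^ mask`. Because multiplication by an odd number is a bijection modulo 2^32 and XOR with a fixed mask is also a bijection, the transformed comparison holds exactly when `x == k`, yet `k` itself is never computed at run time.

```rust
// Parameters emitted into the program in place of the secret constant.
struct HiddenEq {
    mul: u32,    // odd, hence invertible modulo 2^32
    mask: u32,   // arbitrary XOR mask
    target: u32, // (k * mul) ^ mask, computed ahead of time
}

// "Build time": derive the parameters from the secret `k` and a seed.
// The seed-splitting scheme is an arbitrary choice for this sketch.
fn hide_constant(k: u32, seed: u64) -> HiddenEq {
    let mul = (seed as u32) | 1; // force the low bit so `mul` is odd
    let mask = (seed >> 32) as u32;
    HiddenEq {
        mul,
        mask,
        target: k.wrapping_mul(mul) ^ mask,
    }
}

impl HiddenEq {
    // "Run time": equivalent to `x == k`, but only `mul`, `mask`, and
    // `target` ever appear as values; `k` is not reconstructed.
    fn matches(&self, x: u32) -> bool {
        (x.wrapping_mul(self.mul) ^ self.mask) == self.target
    }
}
```

For example, `hide_constant(0x1234, seed).matches(x)` is true only for `x == 0x1234`. This keeps the literal out of registers and memory, but it is not strong on its own: anyone who recovers `mul`, `mask`, and `target` can invert the transformation, so a real obfuscator would layer further rewrites on top.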
This is just one of the many things we want to do; I think we will probably just use our term rewrite system, though.
Last updated: Nov 22 2024 at 16:03 UTC