Stream: cranelift

Topic: Thesis: Automated Code Optimization with E-Graphs


view this post on Zulip bjorn3 (Jan 03 2022 at 14:42):

This thesis was just linked on hacker news: https://arxiv.org/abs/2112.14714 Would it be possible to apply it to cranelift once the rewrite to isle is done?

view this post on Zulip Chris Fallin (Jan 03 2022 at 17:25):

@bjorn3 there's been a little discussion around e-graphs, e.g. as they're used (via the egg implementation) in wasm-mutate. I think it's a really cool data structure and could be useful; my main concern when I looked at it most recently was whether there was an efficient way to drive rule application beyond "try all rewrite rules in turn"

view this post on Zulip Chris Fallin (Jan 03 2022 at 17:26):

in principle if one could build a framework/compiler that makes use of ISLE-style rewrite rules to generate an efficient rewrite pass, I think it could be useful

view this post on Zulip Chris Fallin (Jan 03 2022 at 17:27):

in particular there's a nice alignment between ISLE's idea of an input graph with multiple possible matches per value (via programmable extractors) and an e-graph's idea of multiple nodes in an equivalence class

view this post on Zulip bjorn3 (Jan 03 2022 at 17:27):

Maybe it could be used only when runtime performance is more important than compile times like I would guess it is for Fastly?

view this post on Zulip Chris Fallin (Jan 03 2022 at 17:28):

right, it could be a good way to build a more optimizing middle-end

view this post on Zulip McCoy (Jan 03 2022 at 18:01):

in Julia land, I think this sort of rewrite-driven optimization stuff is aimed at middle layer opts -- not necessarily register allocation.

view this post on Zulip McCoy (Jan 03 2022 at 18:04):

I don't quite understand the scope of ISLE tho, so maybe this is right in the ballpark

view this post on Zulip bjorn3 (Jan 03 2022 at 18:06):

ISLE happens just before register allocation. It basically translates from clif ir to machine specific vcode which has a roughly 1-to-1 correspondence with machine instructions. This vcode is then register allocated and finally emitted to raw bytes (and performing jump threading in the process of emitting).

view this post on Zulip bjorn3 (Jan 03 2022 at 18:07):

Most peephole optimizations happen in a single pass during lowering from clif ir to vcode.

view this post on Zulip bjorn3 (Jan 03 2022 at 18:09):

This is the list of all optimizations that are performed: https://github.com/bytecodealliance/wasmtime/blob/3eb82f1d8ee95c3c372d8b119d09c1560fcef307/cranelift/codegen/src/context.rs#L135-L176 All architecture dependent things happen during the backend.compile_function() call at https://github.com/bytecodealliance/wasmtime/blob/3eb82f1d8ee95c3c372d8b119d09c1560fcef307/cranelift/codegen/src/context.rs#L172. That is everything starting from the clif ir to vcode lowering through ISLE.

Standalone JIT-style runtime for WebAssembly, using Cranelift - wasmtime/context.rs at 3eb82f1d8ee95c3c372d8b119d09c1560fcef307 · bytecodealliance/wasmtime
Standalone JIT-style runtime for WebAssembly, using Cranelift - wasmtime/context.rs at 3eb82f1d8ee95c3c372d8b119d09c1560fcef307 · bytecodealliance/wasmtime

Last updated: Nov 22 2024 at 16:03 UTC