The abstract of https://arxiv.org/abs/2011.13127 seems interesting:
Copy-and-Patch Compilation: A fast compilation algorithm for high-level languages and bytecode
[...]
The generated code runs an order of magnitude faster than interpretation and 14% faster than LLVM -O0. Our WebAssembly compiler generates code 4.9X-6.5X faster than Liftoff, the WebAssembly baseline compiler in Google Chrome. The generated code also outperforms Liftoff's by 39%-63% on the Coremark and PolyBenchC WebAssembly benchmarks.
i came here to share that too! i think cranelift could be useful to generate a stencil library. could be a nice improvement for baseline compilation in wasmtime
It's a really interesting technique, thanks!
is the full text available anywhere?
At the arxiv link above, I think! "PDF" in the right bar
oh, I missed it. thanks
in the same area, there's the classic "multi-level quickening" https://arxiv.org/abs/2109.02958
better link: http://fredrikbk.com/publications/copy-and-patch.pdf
@Alexa VanHattum @Chris Fallin continuing this thread, but not quite a paper, there's been some interesting work verification work on the LLVM side of things for aarch64: https://blog.regehr.org/archives/2265
Super cool, thank you!
https://drops.dagstuhl.de/opus/volltexte/2022/15933/pdf/dagrep_v011_i010_p173_21481.pdf
Secure compilation is an emerging field that puts together advances in security, programming languages, compilers, verification, systems, and hardware architectures in order to devise more secure compilation chains that eliminate many of today’s security vulnerabilities and that allow sound reasoning about security properties in the source language. For a concrete example, all modern languages provide a notion of structured control flow and an invoked procedure is expected to return to the right place. However, today’s compilation chains (compilers, linkers, loaders, runtime systems, hardware) cannot efficiently enforce this abstraction against linked low-level code, which can call and return to arbitrary instructions or smash the stack, blatantly violating the high-level abstraction. Other problems arise because today’s languages fail to specify security policies, such as data confidentiality, and the compilation chains thus fail to enforce them, especially against powerful side-channel attacks. The emerging secure compilation community aims to address such problems by identifying precise security goals and attacker models, designing more secure languages, devising efficient enforcement and mitigation mechanisms, and developing effective verification techniques for secure compilation chains.
This seminar strived to take a broad and inclusive view of secure compilation and to provide a forum for discussion on the topic. The goal was to identify interesting research directions and open challenges by bringing together people working on building secure compilation chains, on designing security enforcement and attack-mitigation mechanisms in both software and hardware, and on developing formal verification techniques for secure compilation.
@Alexa VanHattum https://stefanheule.com/s/projects/strata/ seems relevant for your work on verifying ISLE lowering rules. It has formal semantics for a lot of x86_64 instructions that were semi-automatically discovered. In all cases where the discovered semantics didn't match hand-written semantics, the hand-written semantics were in fact incorrect.
Not exactly a paper and possibly not all that practical, but https://www.mattkeeter.com/blog/2022-10-04-ssra/ shows a register allocator that is supposedly very fast, but does have some limitations.
Interesting write-up for sure; have seen this in LuaJIT before (as they mention) and it's a very effective technique for trace compilers. I don't see any mention of control-flow in the post, and that's one of the confounders for any single-pass algorithm; curious if they have more thoughts on it
It explicitly mentions that it doesn't work with control-flow. Of course you can always spill everything on any kind of control-flow, but that is probably very flow at runtime.
It also produces sub-optimal results if you have fixed-reg constraints since you only have one pass through the code.
But I guess you could remedy that by adding a pre-processing pass that gathers register hints for each vreg.
It's somewhat like what I had in mind for https://github.com/bytecodealliance/regalloc2/issues/81.
https://www.cs.cmu.edu/~dkoes/research/CGO08-NOLTIS.pdf looks interesting. It is a near-optimal instruction selector running in linear time.
(linked from https://lobste.rs/s/p93jau/e_graph_extraction_problem_is_np_complete)
Hydra: Generalizing Peephole Optimizations with Program Synthesis
https://users.cs.utah.edu/~regehr/generalization-oopsla24.pdf
Pretty cool! Bolting this onto souper/souper-harvest would be neat.
Afonso Bordado said:
Hydra: Generalizing Peephole Optimizations with Program Synthesis
https://users.cs.utah.edu/~regehr/generalization-oopsla24.pdf
See also https://github.com/bytecodealliance/wasmtime/issues/5783 -- would be great to land those already-generalized-for-us rules as a first step!
Last updated: Jan 24 2025 at 00:11 UTC