Germ210 opened issue #10280:
I am in the process of designing a custom compiler that initially targets WASM before lowering the code to Cranelift for further optimization and code generation. I have observed that the cranelift-wasm crate, which previously served as a bridge for handling WASM in Cranelift, is being deprecated in favor of integrating its functionality into wasmtime-cranelift. This change prompts me to seek clarification on how I can effectively pass in bytecode that represents valid WebAssembly and retrieve a valid Cranelift intermediate representation (IR). My goal is to manipulate and transform this IR within my compiler pipeline, so I would appreciate detailed guidance on how to achieve this with the new integration. Specifically, I am interested in understanding how to leverage wasmtime-cranelift to seamlessly transition from WASM to Cranelift IR for further manipulation.
cfallin commented on issue #10280:
Hi @Germ210 -- the short answer is that this is not supported, i.e., we intentionally do not provide a public interface for this.
The reason is that there is not a general sense in which one translates Wasm to CLIF-without-additional-context. The CLIF that we internally produce from Wasm in Wasmtime's compiler pipeline is tailored specifically to Wasmtime: it references and directly accesses internal data structures in Wasmtime, it directly calls runtime entry points, it implements custom calling conventions in our host-to-Wasm and Wasm-to-host transitions, it inlines fast-paths that directly implement some engine logic, etc. It is not just compiling Wasm to CLIF; it is compiling Wasm to host-level code that can work with the Wasmtime runtime, and that host-level code happens to be represented as CLIF.
Or, said another way: what would it mean to compile Wasm to CLIF, without Wasmtime? How should we lower a Wasm heap load or store? Or a table access, or an indirect call? CLIF doesn't provide any operations for "sandboxed heaps" or "tables of typed function pointers with runtime type checks"; it only provides machine-level operations. So it only makes sense to translate to CLIF when we can generate CLIF that is specific to Wasmtime's implementations of these entities. That is why the pass is now part of Wasmtime.
It would be helpful to understand your use-case more. How are you post-processing and using this CLIF? Do you link to Wasmtime's runtime in the end, or not? Is your transform aware of Wasmtime's runtime invariants?
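To make the heap-access question concrete, here is a minimal sketch of what a single Wasm `i32.load` has to mean once it is lowered to machine-level operations. It is not Wasmtime's actual lowering: `LinearMemory`, `Trap`, `load_i32`, and `store_i32` are hypothetical names for illustration, and the explicit bounds check is only one of several possible strategies. The point is that the base address and the bound come from runtime state, which a bare Wasm-to-CLIF translation would have no way to know.

```rust
/// Hypothetical stand-in for the runtime-owned state behind a Wasm linear memory.
/// A real engine defines its own layout; this one is an assumption for illustration.
struct LinearMemory {
    bytes: Vec<u8>,
}

/// Hypothetical trap type; Wasm requires out-of-bounds accesses to trap.
#[derive(Debug, PartialEq)]
enum Trap {
    OutOfBounds,
}

impl LinearMemory {
    fn new(size: usize) -> Self {
        Self { bytes: vec![0; size] }
    }

    /// Compute and bounds-check the effective address for a `width`-byte access.
    fn checked_start(&self, addr: u32, offset: u32, width: u64) -> Result<usize, Trap> {
        // Widen to 64 bits so `addr + offset` cannot wrap around.
        let start = addr as u64 + offset as u64;
        if start + width > self.bytes.len() as u64 {
            return Err(Trap::OutOfBounds);
        }
        Ok(start as usize)
    }

    /// Semantics of `i32.load offset=N`: little-endian 4-byte read, or a trap.
    fn load_i32(&self, addr: u32, offset: u32) -> Result<i32, Trap> {
        let start = self.checked_start(addr, offset, 4)?;
        let mut buf = [0u8; 4];
        buf.copy_from_slice(&self.bytes[start..start + 4]);
        Ok(i32::from_le_bytes(buf))
    }

    /// Semantics of `i32.store offset=N`: little-endian 4-byte write, or a trap.
    fn store_i32(&mut self, addr: u32, offset: u32, value: i32) -> Result<(), Trap> {
        let start = self.checked_start(addr, offset, 4)?;
        self.bytes[start..start + 4].copy_from_slice(&value.to_le_bytes());
        Ok(())
    }
}
```

With a one-page (65536-byte) memory, a store followed by a load at the same address round-trips the value, while a load at address 65533 returns `Err(Trap::OutOfBounds)` because the last byte of the access would fall past the end.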
Germ210 commented on issue #10280:
I appreciate the context around the Wasm-to-CLIF translation being tailored specifically for Wasmtime. The use of internal data structures and runtime-specific logic makes sense as a reason for not offering a public interface.
To clarify my use case: I'm looking to explore Wasm-to-CLIF translation in a more general sense for my own purposes. Specifically, I'm considering using the output as a part of a custom pipeline that doesn't directly interact with the Wasmtime runtime but aims to process Wasm in a similar manner. I’m aware of the limitations regarding heap loads, stores, and other Wasmtime-specific entities, and would be happy to explore how those gaps could be addressed. My thought process was to make an optimizing compiler through the pipeline of:
Source -> Lex -> Parse -> Binaryen IR -> (with optional optimization) Wasm -> CLIF -> Native object code/JIT
Regarding your question, I’m not currently planning to link to Wasmtime's runtime directly, but I’d be interested in understanding more about the invariants and how they might influence the way I handle these transformations.
cfallin commented on issue #10280:
I see. Unfortunately I don't think this is just something with "gaps" that could be "addressed". What you are asking is for Wasmtime to expose half of its implementation (the compiler), and then implement your own version of the other half (the runtime) to make the compiler output work. Essentially, you are implementing your own Wasm runtime. Implementing and keeping up-to-date the whole compiled-code-to-runtime interface -- which we do not hold stable and are free to change at any time, even in point releases, and which has many subtle invariants -- is bound to be a massive, error-prone project.
I suspect there is an X-Y problem here. What are you really trying to achieve? I.e., why build native code in this way rather than execute the Wasm via Wasmtime?
Germ210 commented on issue #10280:
You're right that maintaining compatibility with Wasmtime’s internal runtime invariants would be an enormous undertaking, especially given the instability of those internal interfaces. My goal isn’t to replicate Wasmtime’s runtime but rather to explore a more flexible compilation pipeline where I can perform custom optimizations before lowering to native code. I had initially considered leveraging Wasmtime’s CLIF as a potential intermediate step, but I see now that this approach would introduce significant challenges.
Given that, I’d appreciate any recommendations you might have for alternative strategies. Specifically, I’m interested in exploring how to perform custom optimizations on Wasm before lowering to native code. Would you suggest working directly with Cranelift or another toolchain better suited for this use case?
cfallin commented on issue #10280:
If you need to perform optimizations "before lowering to native code", i.e., in a way that remains within the semantics of sandboxed execution, a Wasm-to-Wasm rewrite pass is bound to be the simplest to implement, easiest to debug, and most portable solution. This is in the same domain as Binaryen (i.e., wasm-opt) and many other tools, so it is a fairly well-explored space with decent tooling, fortunately.
Germ210 commented on issue #10280:
Thank you for the clarification and recommendation. A Wasm-to-Wasm rewrite pass does seem like the most practical and maintainable solution, especially considering the available tooling and portability benefits. I appreciate you pointing me in this direction — it’s been very helpful in refining my approach. From there, what do you recommend I do?
cfallin commented on issue #10280:
Well, you'd need to either make use of an existing framework for parsing Wasm into an IR then generating Wasm back out of that IR, or write your own. Binaryen is one classic answer for such a framework; at one point I ended up writing my own framework, waffle; another is orca; or you could use wasmparser and wasm_encoder yourself.
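As a rough illustration of the shape such a rewrite pass takes, the sketch below folds constant arithmetic over a toy instruction list. Everything in it is an assumption made for illustration: `Instr` is a hypothetical four-instruction stand-in for a function body, not the IR of Binaryen, waffle, orca, or wasmparser, and the pass only handles straight-line code with no control flow. In a real pipeline, decoding and re-encoding the binary would be handled by one of the frameworks named above.

```rust
/// Hypothetical toy subset of Wasm instructions, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    I32Const(i32),
    I32Add,
    I32Mul,
    LocalGet(u32),
}

/// Peephole constant folding over straight-line code.
/// When a binary op directly follows two constants in the output so far,
/// the three instructions are replaced by a single constant.
fn fold_constants(body: &[Instr]) -> Vec<Instr> {
    let mut out: Vec<Instr> = Vec::with_capacity(body.len());
    for instr in body {
        // Wasm i32 arithmetic is modulo 2^32, hence the wrapping operations.
        let folded = match (out.as_slice(), instr) {
            ([.., Instr::I32Const(a), Instr::I32Const(b)], Instr::I32Add) => {
                Some(a.wrapping_add(*b))
            }
            ([.., Instr::I32Const(a), Instr::I32Const(b)], Instr::I32Mul) => {
                Some(a.wrapping_mul(*b))
            }
            _ => None,
        };
        match folded {
            Some(value) => {
                // Drop the two constant operands and push the folded result.
                out.truncate(out.len() - 2);
                out.push(Instr::I32Const(value));
            }
            None => out.push(instr.clone()),
        }
    }
    out
}
```

Because folded results are pushed back onto the output, chains collapse in a single pass: `local.get 0; i32.const 2; i32.const 3; i32.add; i32.const 4; i32.mul; i32.add` becomes `local.get 0; i32.const 20; i32.add`. The output is still ordinary sandboxed Wasm, so it can run on any engine.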
Germ210 closed issue #10280.
Germ210 commented on issue #10280:
Thank you, closing this issue.
Last updated: Feb 28 2025 at 02:27 UTC