espoal opened issue #12238:
I'm building a next-generation distributed datastore and I would like to add the possibility of querying it using a relational model. A few weeks ago I had the crazy idea of compiling SQL prepared statements instead of running them through a traditional query planner, before finding out that I'm not the first to use this approach.
Right now I _transliterate_ the query to Rust before compiling it, but maybe there is a smarter approach. My questions are:
- Can Cranelift support this use case or is it too far from its original design?
- Could Cranelift help me skip the _transliteration_ step?
- Are you aware of anyone else using this same approach with Cranelift instead of LLVM?
More details:
Let's say I have a query in the form:

```sql
SELECT ot.customer_ID, ct.customer_name,
       SUM(ot.order_total) AS total_amount,
       UDF.heavy_function(ot.order_total) AS processed_total
FROM orders_table AS ot
JOIN customers_table AS ct ON ct.customer_ID = ot.customer_ID
WHERE ot.customer_ID IN $customers_array
  AND ot.order_date BETWEEN $start_date AND $end_date
GROUP BY ot.customer_ID, ct.customer_name;
```

Right now the compilation looks like:
- The user registers the SQL query before executing it
- I parse the query using an SQL parser
- I extract the parameters; these become part of the signature of the Rust function representing the query
- I _transliterate_ to Rust using some very ugly code (a bunch of `match`, `if`, and generic types; roughly sketched after this list)
- I run some compile-time optimizations (using a cost-based function, like the Volcano query planner)
- I compile the code
- I cache the compiled output
- I notify the user that the query is ready to use
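To make the _transliteration_ step concrete, here is a heavily stripped-down sketch of the kind of Rust I end up generating for the example query. Every name here (`OrderRow`, `ResultRow`, `query_orders_by_customer`, ...) is an illustrative placeholder rather than my actual generated code, and the UDF/GROUP BY handling is simplified:

```rust
use std::collections::{HashMap, HashSet};

// Placeholder row/result types; the real schema types are generated per table.
pub struct OrderRow { pub customer_id: u64, pub order_total: f64, pub order_date: i64 }
pub struct CustomerRow { pub customer_id: u64, pub customer_name: String }
pub struct ResultRow { pub customer_id: u64, pub customer_name: String, pub total_amount: f64, pub processed_total: f64 }

// The extracted parameters ($customers_array, $start_date, $end_date, the UDF)
// become the signature of the generated function.
pub fn query_orders_by_customer(
    orders: &[OrderRow],
    customers: &HashMap<u64, CustomerRow>,
    customers_array: &HashSet<u64>,      // $customers_array
    start_date: i64,                     // $start_date
    end_date: i64,                       // $end_date
    heavy_function: impl Fn(f64) -> f64, // UDF.heavy_function
) -> Vec<ResultRow> {
    // GROUP BY customer_id: accumulate SUM(order_total) and the UDF column.
    let mut groups: HashMap<u64, (f64, f64)> = HashMap::new();
    for ot in orders {
        // WHERE clause, hand-rolled.
        if !customers_array.contains(&ot.customer_id) { continue; }
        if ot.order_date < start_date || ot.order_date > end_date { continue; }
        let entry = groups.entry(ot.customer_id).or_insert((0.0, 0.0));
        entry.0 += ot.order_total;
        entry.1 = heavy_function(ot.order_total); // simplified: keeps the last value per group
    }
    // JOIN customers_table ON customer_id, then project the output columns.
    groups
        .into_iter()
        .filter_map(|(customer_id, (total_amount, processed_total))| {
            customers.get(&customer_id).map(|ct| ResultRow {
                customer_id,
                customer_name: ct.customer_name.clone(),
                total_amount,
                processed_total,
            })
        })
        .collect()
}
```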
To run the query:
- The user calls the query and provides the inputs
- I search for the cached compiled output using a hash of the query
- I launch one thread for each shard
- I reduce the results and present them to the user (roughly as sketched after this list)
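Execution is then basically a cache lookup followed by a fan-out/fan-in. A simplified sketch of that shape, with placeholder types (`CompiledQuery`, `Shard`, `QueryCache`) standing in for my real ones:

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Placeholder types: a compiled query is some callable unit of generated code,
// and a shard owns the rows it is responsible for.
type ShardResult = Vec<u64>;
struct Shard { /* shard-local data */ }
struct CompiledQuery {
    run_on_shard: Box<dyn Fn(&Shard) -> ShardResult + Send + Sync>,
}

struct QueryCache {
    compiled: HashMap<u64, Arc<CompiledQuery>>,
}

impl QueryCache {
    // Look up the cached compiled output by a hash of the (parameterized) query.
    fn get(&self, query_hash: u64) -> Option<Arc<CompiledQuery>> {
        self.compiled.get(&query_hash).cloned()
    }
}

fn execute(query: Arc<CompiledQuery>, shards: Vec<Arc<Shard>>) -> ShardResult {
    // One thread per shard ...
    let handles: Vec<_> = shards
        .into_iter()
        .map(|shard| {
            let q = Arc::clone(&query);
            thread::spawn(move || (q.run_on_shard)(shard.as_ref()))
        })
        .collect();
    // ... then reduce the per-shard results into a single answer for the user.
    let mut out = ShardResult::new();
    for h in handles {
        out.extend(h.join().expect("shard thread panicked"));
    }
    out
}
```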
Ideally I would like to run the query inside a Wasm runtime.
I apologize in advance if this question is off topic for this repo and for my noobishness.
cfallin commented on issue #12238:
@espoal yes, it should be possible to use Cranelift as a JIT-compiler backend for SQL queries -- we are a fully general-purpose compiler and there's nothing too special about SQL once it's query-planned down to loops and loads and compares and the like, IIUC. (We have SIMD operators you'd likely want to use for scans as well.)
That said, one point of clarification: you ask about Cranelift (which is an SSA IR to machine code compiler) but then say
> Ideally I would like to run the query inside a Wasm runtime.
which is a different thing -- to run Wasm (in our universe) one would typically use Wasmtime, which translates Wasm bytecode to Cranelift's IR then invokes Cranelift to compile that to machine code; and the resulting machine code calls into Wasmtime's runtime, so it's not a separable piece.
Either generating machine code and invoking it directly (raw Cranelift JIT backend) or generating Wasm modules on the fly and loading/invoking them in Wasmtime is possible; the latter is harder mainly because of the memory access question (does the Wasm module get the whole database in memory as its heap? compile-time builtins? something else?). Using just Cranelift and generating loads (and stores) to your native data structures would be the conventional and recommended approach.
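To make the first option concrete: the raw-Cranelift flow with `cranelift-jit`/`cranelift-module` looks roughly like the untested sketch below (exact method signatures differ a bit between Cranelift versions, and the emitted body is a trivial stand-in for real query logic):

```rust
use cranelift_codegen::ir::{types, AbiParam, InstBuilder};
use cranelift_frontend::{FunctionBuilder, FunctionBuilderContext};
use cranelift_jit::{JITBuilder, JITModule};
use cranelift_module::{default_libcall_names, Linkage, Module};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // An in-process JIT module targeting the host ISA.
    let mut module = JITModule::new(JITBuilder::new(default_libcall_names())?);
    let mut ctx = module.make_context();
    let mut fb_ctx = FunctionBuilderContext::new();

    // fn(i64) -> i64; a real query compiler would take pointers/lengths of the
    // column data plus the query parameters here.
    ctx.func.signature.params.push(AbiParam::new(types::I64));
    ctx.func.signature.returns.push(AbiParam::new(types::I64));

    {
        let mut b = FunctionBuilder::new(&mut ctx.func, &mut fb_ctx);
        let entry = b.create_block();
        b.append_block_params_for_function_params(entry);
        b.switch_to_block(entry);
        b.seal_block(entry);
        let x = b.block_params(entry)[0];
        let two = b.ins().iconst(types::I64, 2);
        let doubled = b.ins().imul(x, two); // stand-in for the compiled query body
        b.ins().return_(&[doubled]);
        b.finalize();
    }

    // Compile, link, and grab a callable pointer to the generated machine code.
    let id = module.declare_function("query", Linkage::Export, &ctx.func.signature)?;
    module.define_function(id, &mut ctx)?;
    module.clear_context(&mut ctx);
    module.finalize_definitions()?;
    let code = module.get_finalized_function(id);
    let query: extern "C" fn(i64) -> i64 = unsafe { std::mem::transmute(code) };
    assert_eq!(query(21), 42);
    Ok(())
}
```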
One other point: you ask if you could "skip the transliteration step", and I don't think you can no matter what you do, though you may be transliterating ("compiling"!) to a different target. Right now you're basically compiling SQL to Rust; instead you could compile SQL to Cranelift IR (CLIF). It's lower-level, so you'll have to implement e.g. match semantics yourself with raw if-else branches or branch tables, but very possible.
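For instance, a predicate like `ot.order_date BETWEEN $start_date AND $end_date` has to be spelled out as explicit blocks and conditional branches. With `cranelift-frontend` that looks roughly like the sketch below (also untested; block-argument signatures have shifted a bit across Cranelift versions):

```rust
use cranelift_codegen::ir::{condcodes::IntCC, types, InstBuilder, Value};
use cranelift_frontend::FunctionBuilder;

// Lower `date BETWEEN start AND end` into explicit blocks and conditional
// branches, yielding an i64 that is 1 inside the range and 0 outside.
// Assumes the builder is currently positioned in some block; a real compiler
// would thread this through its own block layout.
fn lower_between(b: &mut FunctionBuilder, date: Value, start: Value, end: Value) -> Value {
    let in_range = b.create_block();
    let out_of_range = b.create_block();
    let merge = b.create_block();
    b.append_block_param(merge, types::I64);

    // if date < start, it is out of range; otherwise keep checking.
    let check_upper = b.create_block();
    let lt_start = b.ins().icmp(IntCC::SignedLessThan, date, start);
    b.ins().brif(lt_start, out_of_range, &[], check_upper, &[]);

    b.seal_block(check_upper);
    b.switch_to_block(check_upper);
    // if date > end, it is out of range; otherwise it is in range.
    let gt_end = b.ins().icmp(IntCC::SignedGreaterThan, date, end);
    b.ins().brif(gt_end, out_of_range, &[], in_range, &[]);

    b.seal_block(in_range);
    b.switch_to_block(in_range);
    let one = b.ins().iconst(types::I64, 1);
    b.ins().jump(merge, &[one]);

    b.seal_block(out_of_range);
    b.switch_to_block(out_of_range);
    let zero = b.ins().iconst(types::I64, 0);
    b.ins().jump(merge, &[zero]);

    b.seal_block(merge);
    b.switch_to_block(merge);
    b.block_params(merge)[0]
}
```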
espoal commented on issue #12238:
Dear @cfallin thank you for your polite reply and for engaging with my idea! I appreciate it a lot!
So to repeat what I understood:
1. I would parse the SQL and lower it to an AST
2. I would perform query planning / some optimizations during this first lowering
3. I would need to write a custom frontend that takes the AST and feeds it to Cranelift for lowering into CLIF
4. Linking external code via `cranelift-module` would happen here?
5. I could feed the generated CLIF to Wasmtime, letting the pipeline deal with any further lowering
6. Or I could lower the CLIF to machine code for native execution
The situation gets a bit more complicated because I would like to let the user create UDFs in any language that is supported by Cranelift. This linking would happen at step 4, I guess? If the user writes a Rust/C++ UDF, I would use the respective frontends to lower the code to CLIF, and then link the two representations using `cranelift-module`? Also, you can see why Wasmtime is beneficial for execution, since UDFs are user-generated untrusted code and the sandboxing comes in very useful.

I'm also a bit confused about AOT vs JIT in the Wasmtime pipeline; let me see if I got this right:
- I could do pure JIT, and do everything on the fly
- I could mix AOT/JIT by compiling the CLIF to Wasm and caching the textual representation
- I could do pure AOT by compiling CLIF to WAST and caching the output
With obvious tradeoffs in execution/compilation speed. Is this right?
To further elaborate on my idea: I took inspiration from this presentation, where JIT or AOT is picked based on some heuristics (slow vs. fast queries, one-time vs. reusable queries). I also would like to improve on current compiling-database approaches by putting emphasis on parameterized queries, which should enable better caching. Cranelift seems IDEAL for replicating the current body of academic research without having to write custom LLVM patchsets the way Umbra and HyPer do.
> the latter is harder mainly because of the memory access question (does the Wasm module get the whole database in memory as its heap? compile-time builtins? something else?)
My datastore is architected as a distributed state machine with swizzled pointers, like `project@account/collectionName/tableName/columnName[:special_qualifier]`. At compile time the data requirements are analyzed and cached along with the output. At runtime I would fetch the data before execution, so I could load it into the heap. I was thinking of exposing this via a `load_swz_ptr` syscall, possibly using the WASI specification if it's not overkill, very much like normal `load` operations are used today to load arbitrary records onto the heap, or `read` is used to load an arbitrary opened file.
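If I did go the Wasm route, I imagine registering that host call with Wasmtime's `Linker` roughly as in the sketch below. The `datastore` namespace, the `(path_ptr, path_len) -> handle` signature, and the `HostState` type are all made up for illustration:

```rust
use wasmtime::{Caller, Engine, Extern, Linker};

// Host-side state available to every call (handles into the datastore, etc.).
struct HostState {}

fn register_syscalls(engine: &Engine) -> anyhow::Result<Linker<HostState>> {
    let mut linker: Linker<HostState> = Linker::new(engine);

    // load_swz_ptr(path_ptr, path_len) -> i64
    // The guest writes the swizzled path (e.g. "project@account/coll/table/col")
    // into its linear memory; the host resolves it, makes the data available,
    // and returns a handle (negative values signal errors).
    linker.func_wrap(
        "datastore",
        "load_swz_ptr",
        |mut caller: Caller<'_, HostState>, path_ptr: i32, path_len: i32| -> i64 {
            let Some(Extern::Memory(memory)) = caller.get_export("memory") else {
                return -1; // guest exported no linear memory
            };
            let mut buf = vec![0u8; path_len as usize];
            if memory.read(&caller, path_ptr as usize, &mut buf).is_err() {
                return -2; // out-of-bounds read from guest memory
            }
            let Ok(path) = std::str::from_utf8(&buf) else { return -3 };
            // ... resolve `path` against the datastore and return a real handle ...
            let _ = path;
            0
        },
    )?;
    Ok(linker)
}
```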
The execution runs in stages, so for the query I used as an example I would:
- Detect the indexes (`customer_id` and `order_date`) needed to satisfy the `WHERE` clause, and cache this requirement at compilation time
- At execution time, fetch the live indexes before launching the first stage, one thread per shard
- The output of the first stage would be a list of `order_id`, which again would be fetched before starting the next stage
- The second stage would aggregate the results per shard and would `JOIN` the resulting rows
- The last stage would aggregate all the shards into a single output
> One other point: you ask if you could "skip the transliteration step", and I don't think you can no matter what you do, though you may be transliterating ("compiling"!) to a different target. Right now you're basically compiling SQL to Rust; instead you could compile SQL to Cranelift IR (CLIF). It's lower-level, so you'll have to implement e.g. match semantics yourself with raw if-else branches or branch tables, but very possible.
You are totally right! My background is math, not CS, and I suffer a bit from impostor syndrome. What I meant by skipping _transliteration_ is the idea of using a real frontend to target CLIF, like a true CS PhD would do, instead of the very hacky approach I'm using right now.
cfallin commented on issue #12238:
You're on the right track! A few thoughts:
- One thing to make clear is that the Wasm-based path and the CLIF-based path are more or less disjoint. Wasmtime doesn't accept CLIF; it only accepts Wasm. (Think of CLIF and the Cranelift backend as an internal implementation detail of Wasmtime; in fact it has another compiler backend that doesn't use CLIF/Cranelift at all, which can be chosen at runtime.)
- Given that, if we scope the problem to only compiling SQL, it seems pretty clear to me that compiling directly to CLIF is the most straightforward route. In essence you are implementing your own sandbox by providing only safe capabilities/abstractions at the SQL semantic level. The tradeoff is that your SQL-to-CLIF compiler is part of the trusted base and bugs in that compiler might result in sandbox escapes at the database level. But it avoids the (IMHO fairly major) impedance mismatches inherent in trying to ship data into and back out of the Wasm sandbox in a performant way when doing large scans of in-memory cached data.
- The UDF problem is separate and fairly interesting. In fact I've heard of others using Wasm for exactly this use-case. In theory you could do a native SQL-to-CLIF compiler that, for apply-UDF nodes in your AST, compiles to a call back into host code that then invokes the UDF Wasm module via Wasmtime APIs.
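Concretely (and only as an untested sketch), that could be a host-side trampoline with a C ABI that the generated query code calls; the trampoline then invokes the UDF instance through Wasmtime's typed-function API. The `Udf` wrapper and the exported `heavy_function: (f64) -> f64` signature are assumptions based on the example query:

```rust
use wasmtime::{Engine, Instance, Module, Store, TypedFunc};

// Host-side wrapper around one instantiated UDF module.
struct Udf {
    store: Store<()>,
    heavy_function: TypedFunc<f64, f64>,
}

impl Udf {
    fn load(engine: &Engine, wasm_bytes: &[u8]) -> anyhow::Result<Udf> {
        let module = Module::new(engine, wasm_bytes)?;
        let mut store = Store::new(engine, ());
        // A UDF module with no imports; assumes it exports `heavy_function: (f64) -> f64`.
        let instance = Instance::new(&mut store, &module, &[])?;
        let heavy_function = instance.get_typed_func::<f64, f64>(&mut store, "heavy_function")?;
        Ok(Udf { store, heavy_function })
    }

    fn call(&mut self, x: f64) -> anyhow::Result<f64> {
        self.heavy_function.call(&mut self.store, x)
    }
}

// Trampoline with a C ABI so Cranelift-generated query code can call it directly;
// the JIT'd code would be handed `udf` as an opaque context pointer (e.g. the
// trampoline address registered via `JITBuilder::symbol` and declared as an
// imported function in the query's CLIF).
extern "C" fn call_udf(udf: *mut Udf, x: f64) -> f64 {
    let udf = unsafe { &mut *udf };
    udf.call(x).unwrap_or(f64::NAN) // real code would trap/propagate the error instead
}
```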