zkASM backend · cranelift · Zulip Chat Archive

If you've been keeping a close eye on the pull request list, you may have noticed https://github.com/bytecodealliance/wasmtime/pull/6898. This PR was an accident of sorts, but the experiments with the zkASM backend did continue making progress since. We've made enough progress that the backend is now able to produce valid code for very basic wasm programs (although the code is pretty MVP quality.)

As we're getting closer to dedicating more time to this, I wanted to do a temperature check of sorts. How comfortable you all would be with a backend like this being developed in the upstream repository? Or maybe perhaps there is another, better to develop a target, other than maintaining a fork, that we haven't thought of?

This target would be unique first and foremost because it is not targeting, and is unlikely to ever target, any particular piece of hardware. Instead, the machine executing the produced code is implemented in software. It is also a pretty unique machine in that unlike regular architectures it does not actually execute machine instructions in the traditional sense; it also does not only produce effects of executing a program, but also generates additional piece of data that allows to prove (more quickly than by executing the program itself) that the effects produced are indeed a result of executing a that program. Hence zero-knowledge in zkASM.

From top of my head the caveats are that our current prototype:

Unlike the other targets, outputs is a textual assembly, which is later converted by non-trivial logic to “operations” for the underlying machine.
- Though I don't see why we couldn't introduce a binary object file representation of the same, if necessary.
The machine isn't completely nailed down quite yet and changes to it are being made quite liberally
- For instance, the team that's working on zkASM may change the implementation away from using 256 bit registers towards the more usual 64 bit ones;
- On the other hand there's no need to maintain compatibility with previous versions of the “architecture”.

Why do we want to work upstream so relatively soon? Well, there are a couple of things we hope to achieve by doing this: first most of us working on the backend don't really have any working experience with the cranelift internals, so it would be great to have some guidance, and we felt that having the changes to show up as PRs on the upstream repository would hopefully give us an opportunity to ask for insights more easily (although we don't necessarily expect anybody to dedicate their precious time actually responding;) and more importantly because it would make it easier for us to keep up with the relevant changes in cranelift as they occur upstream.

Any questions? Glad to try to answer or, failing that, get people who know the answers chime in :smile:

zkasm hackaton by nagisa · Pull Request #6898 · bytecodealliance/wasmtime

does not build yet

Introduction to zkASM | Polygon Wiki

The ideas behind zkASM

Alex Crichton (Sep 18 2023 at 14:40):

From a project perspective two things that may interest you are:

Documentation on tier 3 requirements in Wasmtime (and by extension Cranelift) - basically the minimum requirements for landing something in-tree

Alex Crichton (Sep 18 2023 at 14:41):

There's weekly Cranelift meetings which might be a good venue to discuss this with other Cranelift maintainers (if necessary beyond Zulip, which of course also works)

Alex Crichton (Sep 18 2023 at 14:41):

ah, and meetings link

Chris Fallin (Sep 18 2023 at 15:20):

One other thing that would be good to understand (it's a little unclear from the zkASM link you gave): what's the expected use-case?

For "exists as silicon in the physical world" ISAs, this is fairly self-evident, as long as the architecture is still in use: e.g. I don't think there's any question that we'd accept an arm32 backend, or even something a little rarer now like mips32, as long as all the other tier 3 reqs (maintainers, ...) are met. But for more niche virtual ISAs, we'd want to consider the use-cases against the maintenance cost: e.g. are we enabling one very specific user or is this something that could enable a bunch of folks to experiment and make use of Cranelift in new ways.

Another thing to consider with less-common ISAs is what tooling and documentation exists: for all our ports currently, we can use qemu to test, we can build ELF binaries of Wasmtime for linux/<CPU> and test them, we can debug, there are ISA manuals. It'd be good to understand where all of that stands with zkASM as well

Andrei Kashin (akashin) (Sep 18 2023 at 18:16):

Chris Fallin said:

One other thing that would be good to understand (it's a little unclear from the zkASM link you gave): what's the expected use-case?

Great question :)

At a really high-level, WASM -> zkASM compiler paired with zkASM prover (already exists) would allow generating short proofs that a given WASM program produced a given result and the proof could be cheaply verified by anyone.

The main use-case is verifiably outsourcing computation - you can ask someone to run an expensive program for you and give you the result together with a proof of a computation. This way you can be sure that they indeed ran your program and not just gave you some random output. This turns out to be useful in many different contexts, to name a few:

Scaling blockchains (this is the angle that we come from) - blockchain is basically a group of nodes that try to agree the result of same deterministic computation. And currently all the nodes need to re-run the same computation and compare it with results from other nodes. In the model with ZK proofs, it would be enough for a single node to produce a proof and the rest of nodes just need to verify it.
Verifiable compilation provenance - proving that the result of compiling VIM source code with a given GCC compiler produces a given binary. The user will be able to validate the proof by only having VIM git commit hash and md5 sums of GCC and resulting VIM binary and won't need to repeat the compilation.

I wrote a more concrete explanation in: https://hackmd.io/DK6S9vsbQ9GAmBJEOIwh4w?view#Background, let me know if it makes sense!

zkProver Design Approach | Polygon Wiki

A brief introduction to the zkProver's State Machine design approach

Cranelift for ZK WASM - HackMD

Andrei Kashin (akashin) (Sep 19 2023 at 08:34):

Chris Fallin said:

Another thing to consider with less-common ISAs is what tooling and documentation exists: for all our ports currently, we can use qemu to test, we can build ELF binaries of Wasmtime for linux/<CPU> and test them, we can debug, there are ISA manuals. It'd be good to understand where all of that stands with zkASM as well

There is already a decent amount of tooling around ZK ASM:

Text Assembly Parser
Simulators and profilers in JS and C++
ISA Spec document

We will most likely extend this tooling with Rust implementation/bindings to make testing in Rust-based projects simpler.

GitHub - 0xPolygonHermez/zkasmcom: This repo compiles .zkasm to a json ready for the zkExecutor

This repo compiles .zkasm to a json ready for the zkExecutor - GitHub - 0xPolygonHermez/zkasmcom: This repo compiles .zkasm to a json ready for the zkExecutor

GitHub - 0xPolygonHermez/zkevm-proverjs: zkEVM proof generator reference written in Javascript

zkEVM proof generator reference written in Javascript - GitHub - 0xPolygonHermez/zkevm-proverjs: zkEVM proof generator reference written in Javascript

GitHub - 0xPolygonHermez/zkevm-prover: zkEVM prover in C++

zkEVM prover in C++. Contribute to 0xPolygonHermez/zkevm-prover development by creating an account on GitHub.

Andrei Kashin (akashin) (Sep 19 2023 at 08:37):

Alex Crichton said:

There's weekly Cranelift meetings which might be a good venue to discuss this with other Cranelift maintainers (if necessary beyond Zulip, which of course also works)

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

Chris Fallin (Sep 19 2023 at 16:42):

Andrei Kashin (akashin) said:

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

You're always welcome to add agenda items -- please feel free to submit a PR to the bytecodealliance/meetings repo to add an item under cranelift/2023/ in the file for whichever date you prefer.

Andrei Kashin (akashin) (Sep 20 2023 at 09:18):

Chris Fallin said:

Andrei Kashin (akashin) said:

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

You're always welcome to add agenda items -- please feel free to submit a PR to the bytecodealliance/meetings repo to add an item under cranelift/2023/ in the file for whichever date you prefer.

I see! Sent https://github.com/bytecodealliance/meetings/pull/161, let me know if you need any more details there.

Add zkASM to agenda for September 27 by akashin · Pull Request #161 · bytecodealliance/meetings

Let me know if I need to provide a more detailed agenda!

Juan Bono (Sep 20 2023 at 11:06):

Hi @Andrei Kashin (akashin) , excuse me if my question is off topic but what prover are you using?

Really nice initiative btw.

fitzgen (he/him) (Sep 20 2023 at 16:32):

Andrei Kashin (akashin) said:

At a really high-level, WASM -> zkASM compiler paired with zkASM prover (already exists) would allow generating short proofs that a given WASM program produced a given result and the proof could be cheaply verified by anyone.

I think there might be some hidden assumptions here that are going to affect how well you can prove that a compiled Wasm program is computing anything. I want to help uncover them and make a bunch of stuff explicit so that we are all on the same page and you understand what you're asking and what you're getting into.

The biggest thing to note is that cranelift doesn't compile Wasm in isolation:

you can't just take cranelift off the shelf to compile Wasm and get a runnable wasm binary.
cranelift doesn't even know anything about Wasm, it just compiles its own intermediate representation called CLIF.
you can use the cranelift-wasm crate to translate Wasm to CLIF. but CLIF by itself can't implement Wasm's semantics. Wasm translated to CLIF needs access to the external world to, for example, load and store to the native memory regions that implement the Wasm linear memories. How to get those regions and where they are placed in the native address space is something you have to configure in cranelift-wasm and is very specific to the runtime you're using.
that's because you need to pair a Wasm runtime with Cranelift to actually compile and run Wasm, for us that runtime is Wasmtime
Wasmtime doesn't even implement all of these interface-with-the-outside-world bits in inline CLIF itself, a lot of that stuff is in libcalls. for example the memory.grow instruction is implemented in Rust code as a libcall that the compiled Wasm calls out to
so the compiled code is going to have a bunch of stuff that you'd need to just accept as axioms in your proofs, like function calls out to the runtime/host. the compiled wasm binary is not "pure" or standalone. I don't know if that is a deal breaker or not, or if it is even possible to have host calls in zkasm
but this also implies that all of Wasmtime needs to be available in zkasm since the compiled code relies on calling directly to Wasmtime's runtime functionality. getting all of Wasmtime compiled to zkasm on top of the new cranelift backend is going to be a very large amount of work. and I suspect that zkasm will be missing things that Wasmtime assumes and requires, like the ability to map new pages executable. and we would need to analyze the effect that being able to compile all of Wasmtime to zkasm, if it is even possible, would have on Wasmtime's maintainability and look at the balance of the costs and benefits.
in theory you could create a runtime that can run Wasm inside some virtual cpu while leaving the runtime in native. that isn't Wasmtime; Wasmtime is not architected to do that.

I know I am coming across somewhat pessimistic, but I really want folks to understand that this isn't a thing where one could "just" add a new zkasm Cranelift backend to get provable Wasm computations. There is a lot more involved here.

nagisa (Sep 21 2023 at 07:31):

Thank you for taking the time to bring these concerns to our attention!

this also implies that all of Wasmtime needs to be available in zkasm since the compiled code relies on calling directly to Wasmtime's runtime functionality.

As a background, we both come from a team that's been working on a (fork of a) WASM runtime, so we do have the potential complications in the areas you've raised on our mind as well.

zkasm, specifically, does have ways to call out to external code for complicated behaviours and mechanisms to write down proofs for those call outs. As far as I am aware, even more complicated operations that are traditionally regular instructions, such as multiplication and division, are expected to be implemented using this mechanism at this moment in time (although this might change in the future, thus basically shifting the burden of proof to a different place.)

nagisa (Sep 21 2023 at 08:11):

I guess the major thing that remains to be seen is how to best approach providing/linking in proofs for not only the libcalls but also the user provided imports/exports. For our immediate use-case that wasn't a big deal since our interface is locked-in, while for something general purpose like wasmtime proper there would need to be a generic solution that anybody can use. And I have no idea how to best approach this at this moment in time!

Andrei Kashin (akashin) (Sep 21 2023 at 15:58):

Juan Bono said:

Hi Andrei Kashin (akashin) , excuse me if my question is off topic but what prover are you using?

Really nice initiative btw.

Thank you :)
We will be using PIL (Polynomial Identity Langugage) to describe constraints and generate STARKs and Plonky2 as a proving system (based on FRI).
I think we'll try to stay reasonably flexible on this front as new proving systems appear at a fast pace, but for the first version, we'll focus on this proving stack.

Polynomial Identity Language | Polygon Wiki

Polynomial Identity Language (PIL) is a domain-specific language (DSL) created to provide developers with a holistic framework for constructing programs through an easy-to-use interface, and abstracting the complexity of proof/verification mechanisms.

GitHub - mir-protocol/plonky2

Contribute to mir-protocol/plonky2 development by creating an account on GitHub.

nagisa (Sep 27 2023 at 20:24):

So my summary takeaway is:

We will probably continue working in a fork for the time being, but will seriously explore having this backend as a separate crate as it matures;
It is in clift’s interest to have the infrastructure (and, I guess, documentation) necessary for first-class backends of such sort (and we should start by creating an issue for this – I'll make a note to start one tomorrow;)
If we were to come in with a PR upstream, we would still need a good plan on how to handle things like imported wasm memory and functions like memory.grow and make sure it works well and is generally applicable.

nagisa (Oct 02 2023 at 09:14):

Filed the issue at https://github.com/bytecodealliance/wasmtime/issues/7124

Externally maintained cranelift backends · Issue #7124 · bytecodealliance/wasmtime

Feature Today cranelift has a number of backends for traditional computer architectures: x86, aarch64, riscv64 and some others as well. It would be great if we could have many more, however adding ...

Chris Clark (Oct 08 2023 at 19:23):

If this is possible to verify an expected computation, why does it need to be a different isa?

The more I think about the whole purpose of the project, the more I think it just isn't possible. In order to verify a cryptic hash function its input and output, you must run the hash function. Otherwise, it wasn't a very good hash function. If its just using the public key to verify the digest, that operation is cheap.

Moreover, the zkasm is specific to things related to cryptography, not a real isa.

Chris Fallin (Oct 08 2023 at 19:29):

@Chris Clark the "not a real [I might say 'physical'] ISA" bit is the main reason we prefer not to upstream it at the moment, since it's so different from other ISAs with a full OS stack, ABI, linkers and debuggers, and the like. In that sense I'd agree.

However I would also note and defend the idea a bit that zero-knowledge proofs are a pretty deep research area, and there's something novel and useful there. (It's pretty counterintuitive -- it really is "zero knowledge", i.e. you don't know the input, but it's probabilistic.) Not to pass any judgment on whether or how it fits into Cranelift but I want to make sure we are fair to the idea and those who work hard on it :-)

Chris Clark (Oct 08 2023 at 20:19):

My apologies. Countering my statement of it "just isn't possible", is probably that entire area of research. I will leave that up to the experts, and meant no ill-will.

nagisa (Oct 09 2023 at 13:42):

To answer "why does it need to be a different ISA" is probably quite multifaceted. To motivate "why ISA at all" first: Zero Knowledge proofs use some pretty complicated math behind the scenes I am not really able to fully grok despite my involvement in a relevant project. The way I view zkASM is that should be considered largely as a mechanism to simplify expressing the computations to prove in a way that’s more familiar to the broader groups (such as that of Software Engineers). One major benefit, just like with many simplifying abstractions, is that it is now possible to verify that the conversion is correct once and that makes it scalable for the experts in the field to pool their attention to verifying just those few equations. “Why a different ISA” then? I don’t see why we couldn’t be converting x64 or RISC-V down to those complicated math expressions. It is just that making a new one doesn’t cost very much and gives the ones working on the machine some additional flexibility to adapt it to what the underlying maths can express.

Overall I think zkASM is quite similar to all of the traditional [hardware/physical] ISAs, from the perspective of motivation, benefits and tradeoffs.

On the point of it being specific to cryptography – zkASM is still a project in heavy development and it’s direction up to this point was indeed driven largely by crypto field needs. We don’t think there’s a huge reason it wouldn’t be possible to make it more generally applicable than it currently is, and is one of the things that is being worked on right now. In principle the machine is already a linear bounded automaton complete, although lack of some operations can make it quite painful to use in certain contexts.

On the point of hash function computations: I unfortunately am enabled to be ignorant enough of the underlying research to be able to counter this observation. I do know that (some?) hash functions are quite compatible and comparatively easy to integrate into broader zk-proved computations, and that it isn’t necessary to recompute the hash to prove it. I do _suspect_ that it may weaken some probablistic guarantees of the hash functions, but I never thought about this hard enough to actually ask the people in-the-know to verify this. That’s pretty much all I can say about it :shrug:

Maybe Andrei will be able to add more to my response when they’re back in ~month.

ghostway (Dec 07 2023 at 11:55):

This sounds quite interesting, personally. Although I do not possess the skills to do this by myself

Viktar Makouski (Jan 17 2024 at 11:03):

I'm working on zkasm backend and have a question about cranelift filetests. Prerequisite: we don't change many parts of cranelift, we only add new isa and write code to lower clif -> zkasm.

Currently, for testing, in general we use next approach -- we generate some runnable wasm files from wasm spectest, and have rust #[test] function which compiles them. Than, we run generated zkasm by separate script. It was an idea to move our wasm files from separate folder to cranelift filetests. Any thoughts about it, is it worth to do?

Last updated: Apr 07 2025 at 16:03 UTC

Stream: cranelift

Topic: zkASM backend

nagisa (Sep 18 2023 at 08:30):

Alex Crichton (Sep 18 2023 at 14:40):

Alex Crichton (Sep 18 2023 at 14:41):

Alex Crichton (Sep 18 2023 at 14:41):

Chris Fallin (Sep 18 2023 at 15:20):

Andrei Kashin (akashin) (Sep 18 2023 at 18:16):

Andrei Kashin (akashin) (Sep 19 2023 at 08:34):

Andrei Kashin (akashin) (Sep 19 2023 at 08:37):

Chris Fallin (Sep 19 2023 at 16:42):

Andrei Kashin (akashin) (Sep 20 2023 at 09:18):

Juan Bono (Sep 20 2023 at 11:06):

fitzgen (he/him) (Sep 20 2023 at 16:32):

nagisa (Sep 21 2023 at 07:31):

nagisa (Sep 21 2023 at 08:11):

Andrei Kashin (akashin) (Sep 21 2023 at 15:58):

nagisa (Sep 27 2023 at 20:24):

nagisa (Oct 02 2023 at 09:14):

Chris Clark (Oct 08 2023 at 19:23):

Chris Fallin (Oct 08 2023 at 19:29):

Chris Clark (Oct 08 2023 at 20:19):

nagisa (Oct 09 2023 at 13:42):

ghostway (Dec 07 2023 at 11:55):

Viktar Makouski (Jan 17 2024 at 11:03):