Stream: cranelift

Topic: zkASM backend


view this post on Zulip nagisa (Sep 18 2023 at 08:30):

Hey all,

If you've been keeping a close eye on the pull request list, you may have noticed https://github.com/bytecodealliance/wasmtime/pull/6898. This PR was an accident of sorts, but the experiments with the zkASM backend did continue making progress since. We've made enough progress that the backend is now able to produce valid code for very basic wasm programs (although the code is pretty MVP quality.)

As we're getting closer to dedicating more time to this, I wanted to do a temperature check of sorts. How comfortable you all would be with a backend like this being developed in the upstream repository? Or maybe perhaps there is another, better to develop a target, other than maintaining a fork, that we haven't thought of?

This target would be unique first and foremost because it is not targeting, and is unlikely to ever target, any particular piece of hardware. Instead, the machine executing the produced code is implemented in software. It is also a pretty unique machine in that unlike regular architectures it does not actually execute machine instructions in the traditional sense; it also does not only produce effects of executing a program, but also generates additional piece of data that allows to prove (more quickly than by executing the program itself) that the effects produced are indeed a result of executing a that program. Hence zero-knowledge in zkASM.

From top of my head the caveats are that our current prototype:

Why do we want to work upstream so relatively soon? Well, there are a couple of things we hope to achieve by doing this: first most of us working on the backend don't really have any working experience with the cranelift internals, so it would be great to have some guidance, and we felt that having the changes to show up as PRs on the upstream repository would hopefully give us an opportunity to ask for insights more easily (although we don't necessarily expect anybody to dedicate their precious time actually responding;) and more importantly because it would make it easier for us to keep up with the relevant changes in cranelift as they occur upstream.

Any questions? Glad to try to answer or, failing that, get people who know the answers chime in :smile:

does not build yet
The ideas behind zkASM

view this post on Zulip Alex Crichton (Sep 18 2023 at 14:40):

From a project perspective two things that may interest you are:

view this post on Zulip Alex Crichton (Sep 18 2023 at 14:41):

view this post on Zulip Alex Crichton (Sep 18 2023 at 14:41):

ah, and meetings link

view this post on Zulip Chris Fallin (Sep 18 2023 at 15:20):

One other thing that would be good to understand (it's a little unclear from the zkASM link you gave): what's the expected use-case?

For "exists as silicon in the physical world" ISAs, this is fairly self-evident, as long as the architecture is still in use: e.g. I don't think there's any question that we'd accept an arm32 backend, or even something a little rarer now like mips32, as long as all the other tier 3 reqs (maintainers, ...) are met. But for more niche virtual ISAs, we'd want to consider the use-cases against the maintenance cost: e.g. are we enabling one very specific user or is this something that could enable a bunch of folks to experiment and make use of Cranelift in new ways.

Another thing to consider with less-common ISAs is what tooling and documentation exists: for all our ports currently, we can use qemu to test, we can build ELF binaries of Wasmtime for linux/<CPU> and test them, we can debug, there are ISA manuals. It'd be good to understand where all of that stands with zkASM as well

view this post on Zulip Andrei Kashin (akashin) (Sep 18 2023 at 18:16):

Chris Fallin said:

One other thing that would be good to understand (it's a little unclear from the zkASM link you gave): what's the expected use-case?

Great question :)

At a really high-level, WASM -> zkASM compiler paired with zkASM prover (already exists) would allow generating short proofs that a given WASM program produced a given result and the proof could be cheaply verified by anyone.

The main use-case is verifiably outsourcing computation - you can ask someone to run an expensive program for you and give you the result together with a proof of a computation. This way you can be sure that they indeed ran your program and not just gave you some random output. This turns out to be useful in many different contexts, to name a few:

I wrote a more concrete explanation in: https://hackmd.io/DK6S9vsbQ9GAmBJEOIwh4w?view#Background, let me know if it makes sense!

A brief introduction to the zkProver's State Machine design approach
or

view this post on Zulip Andrei Kashin (akashin) (Sep 19 2023 at 08:34):

Chris Fallin said:

Another thing to consider with less-common ISAs is what tooling and documentation exists: for all our ports currently, we can use qemu to test, we can build ELF binaries of Wasmtime for linux/<CPU> and test them, we can debug, there are ISA manuals. It'd be good to understand where all of that stands with zkASM as well

There is already a decent amount of tooling around ZK ASM:

We will most likely extend this tooling with Rust implementation/bindings to make testing in Rust-based projects simpler.

This repo compiles .zkasm to a json ready for the zkExecutor - GitHub - 0xPolygonHermez/zkasmcom: This repo compiles .zkasm to a json ready for the zkExecutor
zkEVM proof generator reference written in Javascript - GitHub - 0xPolygonHermez/zkevm-proverjs: zkEVM proof generator reference written in Javascript
zkEVM prover in C++. Contribute to 0xPolygonHermez/zkevm-prover development by creating an account on GitHub.

view this post on Zulip Andrei Kashin (akashin) (Sep 19 2023 at 08:37):

Alex Crichton said:

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

view this post on Zulip Chris Fallin (Sep 19 2023 at 16:42):

Andrei Kashin (akashin) said:

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

You're always welcome to add agenda items -- please feel free to submit a PR to the bytecodealliance/meetings repo to add an item under cranelift/2023/ in the file for whichever date you prefer.

view this post on Zulip Andrei Kashin (akashin) (Sep 20 2023 at 09:18):

Chris Fallin said:

Andrei Kashin (akashin) said:

Will you be open for us to give a quick (10-15 min) presentation to provide more context on one these meetings (e.g. September 27th) ?

You're always welcome to add agenda items -- please feel free to submit a PR to the bytecodealliance/meetings repo to add an item under cranelift/2023/ in the file for whichever date you prefer.

I see! Sent https://github.com/bytecodealliance/meetings/pull/161, let me know if you need any more details there.

Let me know if I need to provide a more detailed agenda!

view this post on Zulip Juan Bono (Sep 20 2023 at 11:06):

Hi @Andrei Kashin (akashin) , excuse me if my question is off topic but what prover are you using?

Really nice initiative btw.

view this post on Zulip fitzgen (he/him) (Sep 20 2023 at 16:32):

Andrei Kashin (akashin) said:

At a really high-level, WASM -> zkASM compiler paired with zkASM prover (already exists) would allow generating short proofs that a given WASM program produced a given result and the proof could be cheaply verified by anyone.

I think there might be some hidden assumptions here that are going to affect how well you can prove that a compiled Wasm program is computing anything. I want to help uncover them and make a bunch of stuff explicit so that we are all on the same page and you understand what you're asking and what you're getting into.

The biggest thing to note is that cranelift doesn't compile Wasm in isolation:

I know I am coming across somewhat pessimistic, but I really want folks to understand that this isn't a thing where one could "just" add a new zkasm Cranelift backend to get provable Wasm computations. There is a lot more involved here.

view this post on Zulip nagisa (Sep 21 2023 at 07:31):

Thank you for taking the time to bring these concerns to our attention!

this also implies that all of Wasmtime needs to be available in zkasm since the compiled code relies on calling directly to Wasmtime's runtime functionality.

As a background, we both come from a team that's been working on a (fork of a) WASM runtime, so we do have the potential complications in the areas you've raised on our mind as well.

zkasm, specifically, does have ways to call out to external code for complicated behaviours and mechanisms to write down proofs for those call outs. As far as I am aware, even more complicated operations that are traditionally regular instructions, such as multiplication and division, are expected to be implemented using this mechanism at this moment in time (although this might change in the future, thus basically shifting the burden of proof to a different place.)

view this post on Zulip nagisa (Sep 21 2023 at 08:11):

I guess the major thing that remains to be seen is how to best approach providing/linking in proofs for not only the libcalls but also the user provided imports/exports. For our immediate use-case that wasn't a big deal since our interface is locked-in, while for something general purpose like wasmtime proper there would need to be a generic solution that anybody can use. And I have no idea how to best approach this at this moment in time!

view this post on Zulip Andrei Kashin (akashin) (Sep 21 2023 at 15:58):

Juan Bono said:

Hi Andrei Kashin (akashin) , excuse me if my question is off topic but what prover are you using?

Really nice initiative btw.

Thank you :)
We will be using PIL (Polynomial Identity Langugage) to describe constraints and generate STARKs and Plonky2 as a proving system (based on FRI).
I think we'll try to stay reasonably flexible on this front as new proving systems appear at a fast pace, but for the first version, we'll focus on this proving stack.

Polynomial Identity Language (PIL) is a domain-specific language (DSL) created to provide developers with a holistic framework for constructing programs through an easy-to-use interface, and abstracting the complexity of proof/verification mechanisms.
Contribute to mir-protocol/plonky2 development by creating an account on GitHub.

view this post on Zulip nagisa (Sep 27 2023 at 20:24):

So my summary takeaway is:

  1. We will probably continue working in a fork for the time being, but will seriously explore having this backend as a separate crate as it matures;
  2. It is in clift’s interest to have the infrastructure (and, I guess, documentation) necessary for first-class backends of such sort (and we should start by creating an issue for this – I'll make a note to start one tomorrow;)
  3. If we were to come in with a PR upstream, we would still need a good plan on how to handle things like imported wasm memory and functions like memory.grow and make sure it works well and is generally applicable.

view this post on Zulip nagisa (Oct 02 2023 at 09:14):

Filed the issue at https://github.com/bytecodealliance/wasmtime/issues/7124

Feature Today cranelift has a number of backends for traditional computer architectures: x86, aarch64, riscv64 and some others as well. It would be great if we could have many more, however adding ...

view this post on Zulip Chris Clark (Oct 08 2023 at 19:23):

If this is possible to verify an expected computation, why does it need to be a different isa?

The more I think about the whole purpose of the project, the more I think it just isn't possible. In order to verify a cryptic hash function its input and output, you must run the hash function. Otherwise, it wasn't a very good hash function. If its just using the public key to verify the digest, that operation is cheap.

Moreover, the zkasm is specific to things related to cryptography, not a real isa.

view this post on Zulip Chris Fallin (Oct 08 2023 at 19:29):

@Chris Clark the "not a real [I might say 'physical'] ISA" bit is the main reason we prefer not to upstream it at the moment, since it's so different from other ISAs with a full OS stack, ABI, linkers and debuggers, and the like. In that sense I'd agree.

However I would also note and defend the idea a bit that zero-knowledge proofs are a pretty deep research area, and there's something novel and useful there. (It's pretty counterintuitive -- it really is "zero knowledge", i.e. you don't know the input, but it's probabilistic.) Not to pass any judgment on whether or how it fits into Cranelift but I want to make sure we are fair to the idea and those who work hard on it :-)

view this post on Zulip Chris Clark (Oct 08 2023 at 20:19):

My apologies. Countering my statement of it "just isn't possible", is probably that entire area of research. I will leave that up to the experts, and meant no ill-will.

view this post on Zulip nagisa (Oct 09 2023 at 13:42):

To answer "why does it need to be a different ISA" is probably quite multifaceted. To motivate "why ISA at all" first: Zero Knowledge proofs use some pretty complicated math behind the scenes I am not really able to fully grok despite my involvement in a relevant project. The way I view zkASM is that should be considered largely as a mechanism to simplify expressing the computations to prove in a way that’s more familiar to the broader groups (such as that of Software Engineers). One major benefit, just like with many simplifying abstractions, is that it is now possible to verify that the conversion is correct once and that makes it scalable for the experts in the field to pool their attention to verifying just those few equations. “Why a different ISA” then? I don’t see why we couldn’t be converting x64 or RISC-V down to those complicated math expressions. It is just that making a new one doesn’t cost very much and gives the ones working on the machine some additional flexibility to adapt it to what the underlying maths can express.

Overall I think zkASM is quite similar to all of the traditional [hardware/physical] ISAs, from the perspective of motivation, benefits and tradeoffs.

On the point of it being specific to cryptography – zkASM is still a project in heavy development and it’s direction up to this point was indeed driven largely by crypto field needs. We don’t think there’s a huge reason it wouldn’t be possible to make it more generally applicable than it currently is, and is one of the things that is being worked on right now. In principle the machine is already a linear bounded automaton complete, although lack of some operations can make it quite painful to use in certain contexts.

On the point of hash function computations: I unfortunately am enabled to be ignorant enough of the underlying research to be able to counter this observation. I do know that (some?) hash functions are quite compatible and comparatively easy to integrate into broader zk-proved computations, and that it isn’t necessary to recompute the hash to prove it. I do _suspect_ that it may weaken some probablistic guarantees of the hash functions, but I never thought about this hard enough to actually ask the people in-the-know to verify this. That’s pretty much all I can say about it :shrug:

Maybe Andrei will be able to add more to my response when they’re back in ~month.

view this post on Zulip ghostway (Dec 07 2023 at 11:55):

This sounds quite interesting, personally. Although I do not possess the skills to do this by myself

view this post on Zulip Viktar Makouski (Jan 17 2024 at 11:03):

I'm working on zkasm backend and have a question about cranelift filetests. Prerequisite: we don't change many parts of cranelift, we only add new isa and write code to lower clif -> zkasm.

Currently, for testing, in general we use next approach -- we generate some runnable wasm files from wasm spectest, and have rust #[test] function which compiles them. Than, we run generated zkasm by separate script. It was an idea to move our wasm files from separate folder to cranelift filetests. Any thoughts about it, is it worth to do?


Last updated: Nov 22 2024 at 17:03 UTC