Stream: cranelift

Topic: aarch64 fuzzing


view this post on Zulip Sam Parker (Jun 22 2021 at 14:14):

@Chris Fallin I see that you're also interested in some aarch64 fuzzing... Are there any plans to do this publicly? I've just started, today, looking into the world of fuzzing... would we need oss-fuzz to support an aarch64 linux platform?

view this post on Zulip Chris Fallin (Jun 22 2021 at 15:49):

Hi @Sam Parker , it depends on what you mean by "aarch64 fuzzing". You may have seen my comment on a rust-fuzz issue about aarch64 support -- I discovered it basically works, and we (in Bytecode Alliance) have an aarch64 machine we can use for various development purposes, so I was using spare cycles on it to run the regalloc fuzzers. But: this is just running some fuzzers that are platform-independent; the "aarch64" there is arbitrary in the sense that the machine is just running some deterministic computation (i.e., running the fuzzers on aarch64 doesn't somehow give better confidence in Cranelift targeting aarch64).

We could run the fuzz targets that actually generate and execute code, e.g. the differential fuzzer that runs CL output and compares to a wasm interpreter. I haven't tried that yet, but it would be interesting!

The answer to your question also depends on what you mean by "publicly". In the oss-fuzz case, there's a list of email addresses (which can be seen in the public oss-fuzz repo) that get fuzzbug reports; if we were to run aarch64-generated-code fuzzing on our aarch64 machine it would just be accessible to whoever is running it. Do you mean some sort of setup where fuzzbugs are reported publicly? Or just "run fuzzing on the current version of the repo"? In that case, oss-fuzz support would probably be best as it would be automatic, but I don't know what their plans are.

Anyway, this has reminded me that it would be useful to actually run the differential fuzzer on aarch64, so I'll try that on our aarch64 server machine soon -- thanks :-)

view this post on Zulip Afonso Bordado (Jun 22 2021 at 15:54):

We could run the fuzz targets that actually generate and execute code, e.g. the differential fuzzer that runs CL output and compares to a wasm interpreter. I haven't tried that yet, but it would be interesting!

I actually have been thinking about doing something along these lines for a while, but haven't started anything yet. Similar to what we do with wasm-smith but for clif test modules

view this post on Zulip fitzgen (he/him) (Jun 22 2021 at 22:09):

executing arbitrary clif isn't safe (unlike arbitrary wasm) since it can do unchecked memory operations

were you thinking of avoiding memory operations in the generator? always emitting checks along with them? something else?

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:14):

Ah, I missed that bit. One could, I suppose, run the interpreter oracle first, if we had a CLIF interpreter (hey @Andrew Brown that's your cue!), and check that all addresses that are accessed are within bounds of a known heap region. Then if the oracle completes, we know the CLIF is safe to run

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:14):

And, more generally @Afonso Bordado, @Andrew Brown and I have talked about using his prototype clif interpreter stuff for fuzzing, so we should talk more if you're interested in driving efforts on that

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:15):

Well, avoiding memory operations would not be ideal, we would get less coverage.

The plan was to run the interpreter first and if it crashes, that's ok, the compiled code must do the same

We should be able to safely cause SEGVs?

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:16):

Or is there some memory access that can cause actual issues?

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:18):

ah, well, if you actually let the SEGV'ing code execute, that's potentially problematic because the errant access could hit VM data, or fuzzing runtime data, or ...

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:18):

(and yes the point of a fuzzer is to catch stuff like that but IMHO the guarantee of "code that should semantically crash does crash" is much less valuable than the inverse)

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:19):

Ah, right
So, bounds checking on the interpreter first might be the better way to go

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:19):

yup, that would be my instinct anyway
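The approach agreed on above (run an interpreter as the oracle, reject any access outside a known heap region, and only then execute compiled code) can be sketched as follows. This is a toy illustration, not Cranelift code: `Op`, `OracleError`, `interpret`, and `differential_check` are hypothetical names, and the "compiled code" is just a caller-supplied closure.

```rust
// Toy sketch only: none of these names are Cranelift APIs.

/// A tiny stand-in for a stream of memory-touching instructions.
#[derive(Debug, Clone, Copy)]
pub enum Op {
    /// Load one byte from heap offset `addr`.
    Load { addr: usize },
    /// Store `value` at heap offset `addr`.
    Store { addr: usize, value: u8 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum OracleError {
    /// The oracle saw an access outside the known heap region.
    OutOfBounds { addr: usize, heap_len: usize },
}

/// Interpret `ops` against `heap`, bounds-checking every access.
/// Returns the loaded bytes in program order.
pub fn interpret(ops: &[Op], heap: &mut [u8]) -> Result<Vec<u8>, OracleError> {
    let heap_len = heap.len();
    let mut loaded = Vec::new();
    for op in ops {
        match *op {
            Op::Load { addr } => {
                let byte = heap
                    .get(addr)
                    .ok_or(OracleError::OutOfBounds { addr, heap_len })?;
                loaded.push(*byte);
            }
            Op::Store { addr, value } => {
                let slot = heap
                    .get_mut(addr)
                    .ok_or(OracleError::OutOfBounds { addr, heap_len })?;
                *slot = value;
            }
        }
    }
    Ok(loaded)
}

/// Run the oracle first. `run_compiled` (a hypothetical stand-in for
/// executing generated machine code) is only called if the oracle finished
/// without an out-of-bounds access. Returns whether both runs agree on the
/// loaded values and the final heap contents.
pub fn differential_check<F>(
    ops: &[Op],
    heap_len: usize,
    run_compiled: F,
) -> Result<bool, OracleError>
where
    F: FnOnce(&[Op], &mut [u8]) -> Vec<u8>,
{
    let mut oracle_heap = vec![0u8; heap_len];
    let expected = interpret(ops, &mut oracle_heap)?;

    let mut native_heap = vec![0u8; heap_len];
    let actual = run_compiled(ops, &mut native_heap);

    Ok(expected == actual && oracle_heap == native_heap)
}
```

The point of the ordering is that an input the oracle rejects never reaches native execution, so an errant access cannot touch VM or fuzzer runtime data.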

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:20):

Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:20):

cool! thanks for all the aarch64 patches by the way; happy to see that coming together

view this post on Zulip Andrew Brown (Jun 22 2021 at 22:22):

Yeah, I'm interested in talking more about this: the interpreter's heap and stack modifications are currently unimplemented because I never really needed them but I would be happy to look at that again if we need a way to observe the heap and stack.

view this post on Zulip Andrew Brown (Jun 22 2021 at 22:24):

Never mind, the state container that is actually used by the interpreter does have a naive implementation for heap but the stack is unimplemented: see here.

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:29):

Besides random memory accesses, do we have any other concerns that would cause bad behaviour in the fuzzer?

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:30):

We can't do syscalls from clif right?

view this post on Zulip Afonso Bordado (Jun 22 2021 at 22:32):

probably only if we called into libc, and we should probably restrict calls to only functions generated by the fuzzer

view this post on Zulip Andrew Brown (Jun 22 2021 at 22:34):

yeah, libcalls are possible; I was thinking more about traps... @Chris Fallin, any way that breaks when fuzzing CLIF directly?

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:35):

hmm, traps should be caught correctly -- I don't see any issues with e.g. uncontained behavior (the semantics of the trap are that we return control in a defined way to the embedder)

view this post on Zulip Chris Fallin (Jun 22 2021 at 22:35):

and would be useful to fuzz too, e.g. around corner cases with numeric traps or whatever

view this post on Zulip Sam Parker (Jun 23 2021 at 09:05):

To clarify, I meant having an aarch64 GitHub workflow runner and running a fuzzer that tests aarch64 code generation. Comparing against an interpreter sounds cool. I, maybe naively, assumed that this would all be driven by wasm-smith and that would be 'safe'.

view this post on Zulip bjorn3 (Jun 23 2021 at 09:17):

Chris Fallin said:

ah, well, if you actually let the SEGV'ing code execute, that's potentially problematic because the errant access could hit VM data, or fuzzing runtime data, or ...

An interpreter could be used to determine where the code will SEGV and then the VM could ensure that said location is not mapped by eg running the code in a very lightweight process that is able to relocate itself.

view this post on Zulip bjorn3 (Jun 23 2021 at 09:19):

Afonso Bordado said:

Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there

Thanks for this! After the 128-bit int ops are all implemented I think the only other thing necessary would be to implement TLS accesses.

view this post on Zulip Afonso Bordado (Jun 23 2021 at 09:31):

bjorn3 said:

Afonso Bordado said:

Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there

Thanks for this! After the 128-bit int ops are all implemented I think the only other thing necessary would be to implement TLS accesses.

I'm working on that right now!

view this post on Zulip Afonso Bordado (Jun 23 2021 at 09:32):

Although i'm having some difficulties, I'll probably submit a draft PR detailing what I'm struggling with

view this post on Zulip bjorn3 (Jun 23 2021 at 09:37):

I am by no means an arm expert, but I will try to help if I can.

view this post on Zulip Afonso Bordado (Jun 27 2021 at 16:35):

Hey @Sam Parker I couldn't tag you on github, but here is a first draft of the cranelift fuzzer: https://github.com/bytecodealliance/wasmtime/pull/3038

view this post on Zulip Chris Fallin (Jun 28 2021 at 00:13):

@Afonso Bordado thanks for this! I'll start to review in detail tomorrow. One bit of feedback I have right away is that it's probably better to call it something more specific, like "CLIF-level differential fuzzer" or somesuch, as Cranelift is already fuzzed by a number of different fuzz targets

view this post on Zulip Sam Parker (Jul 01 2021 at 14:31):

I have been away for a few days, so great to come back to see the beginnings merged! :) I've continued to look into how aarch64 testing can be done publicly and it seems that qemu + oss-fuzz is unlikely to happen (https://github.com/google/oss-fuzz/issues/1754). Has anyone from the bytecodealliance approached Equinix about hosting some testing infrastructure (https://developer.arm.com/solutions/infrastructure/works-on-arm/equinix)? I'm assuming it's the same service that packet.net provided... and I'm wondering whether it would be feasible to get fuzzers running there, as well as oss-fuzz for x64?

view this post on Zulip Chris Fallin (Jul 01 2021 at 16:14):

@Sam Parker we actually do have an aarch64 box provisioned by works-on-arm for general Bytecode Alliance use! (Anyone else who's at an official BA member company and reading this, we can create accounts on it; just ping me.) I had used it for a bit to do some regalloc2 fuzzing but haven't yet tried to run any of the execution-based fuzzers (the existing wasm-level differential one, or Afonso's new CLIF-level one) on it; I'm happy to kick off some jobs on an ad-hoc basis when we think we have a milestone that needs testing.

It would be a bit more involved to get an oss-fuzz level of infrastructure going (automatically pull & build latest versions, send notifications on fuzz failures, track when they're fixed, etc) -- while it looks like their runner is open-source, it has integrations to Jenkins, GCS, an issue tracker, etc. I do wonder which will happen first: (i) one of us has time to play with this on our aarch64 host, or (ii) Google eventually provides aarch64 runners in oss-fuzz. I suspect maybe (ii) :-) In any case, happy to kick off some jobs in a tmux and let it run when requested.

view this post on Zulip Sam Parker (Jul 01 2021 at 16:36):

Ah, nice! I was wondering whether it would be feasible for a driver script to automatically post a github issue with the attached reproducer... I'm very new to github workflows though and I have no idea how much can be automated. I'm making all this noise because I have an intern starting with me in a couple of weeks and ideally I'd like him to spend all his time on fuzzing, including helping out with the new CLIF differential

view this post on Zulip Chris Fallin (Jul 01 2021 at 16:39):

Ah that's fantastic! This is definitely something that could be prototyped on x86-64 (a script to automatically fuzz and file issues, that is); if there is something working eventually, we could set it up on the ARM box
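One piece of such a script, turning a fuzzer log into the text of an issue, could look roughly like the sketch below. Everything here is an assumption for illustration: the function names are hypothetical, the marker strings in `DEFAULT_MARKERS` are a guess at which lines of a libFuzzer-style log are worth keeping, and actually filing the issue (for example through the GitHub API) is left out.

```rust
// Hypothetical helper for a fuzz-and-file-issues driver; not an existing tool.

/// Substrings assumed to flag the interesting lines of a libFuzzer-style log.
/// Adjust to whatever the fuzzer in use actually prints.
pub const DEFAULT_MARKERS: &[&str] = &[
    "ERROR:",
    "SUMMARY:",
    "Test unit written to",
    "panicked at",
];

/// Keep only the log lines that contain at least one marker, in order.
pub fn interesting_lines<'a>(log: &'a str, markers: &[&str]) -> Vec<&'a str> {
    log.lines()
        .filter(|line| markers.iter().any(|m| line.contains(*m)))
        .collect()
}

/// Build a plain-text issue body naming the fuzz target and commit,
/// followed by the filtered log lines as an indented block.
pub fn issue_body(target: &str, commit: &str, log: &str) -> String {
    let mut body = String::new();
    body.push_str(&format!("Fuzz target: `{}`\nCommit: `{}`\n\n", target, commit));
    body.push_str("Relevant fuzzer output:\n\n");
    for line in interesting_lines(log, DEFAULT_MARKERS) {
        body.push_str("    ");
        body.push_str(line.trim_end());
        body.push('\n');
    }
    body
}
```

A driver would call `issue_body` with the captured output of a failed run and attach the reproducer file separately; keeping the formatting step pure makes it easy to prototype on x86-64 before moving to the ARM box.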

view this post on Zulip Damian Heaton (Jul 14 2021 at 10:54):

Working on a script like Sam described; are there any particular sections from the fuzzer output that would be useful in a GitHub issue when a test fails?


Last updated: Nov 22 2024 at 16:03 UTC