@Chris Fallin I see that you're also interesting in some aarch64 fuzzing... Is there any plans to do this publicly? I've just started, today, looking into the world of fuzzing... would we need oss-fuzz to support an aarch64 linux platform?
Hi @Sam Parker , it depends on what you mean by "aarch64 fuzzing". You may have seen my comment on a rust-fuzz issue about aarch64 support -- I discovered it basically works, and we (in Bytecode Alliance) have an aarch64 machine we can use for various development purposes, so I was using spare cycles on it to run the regalloc fuzzers. But: this is just running some fuzzers that are platform-independent; the "aarch64" there is arbitrary in the sense that the machine is just running some deterministic computation (i.e., running the fuzzers on aarch64 doesn't somehow give better confidence in Cranelift targeting aarch64).
We could run the fuzz targets that actually generate and execute code, e.g. the differential fuzzer that runs CL output and compares to a wasm interpreter. I haven't tried that yet, but it would be interesting!
The answer to your question also depends on what you mean by "publicly". In the oss-fuzz case, there's a list of email addresses (which can be seen in the public oss-fuzz repo) that get fuzzbug reports; if we were to run aarch64-generated-code fuzzing on our aarch64 machine it would just be accessible to whoever is running it. Do you mean some sort of setup where fuzzbugs are reported publicly? Or just "run fuzzing on the current version of the repo"? In that case, oss-fuzz support would probably be best as it would be automatic, but I don't know what their plans are.
Anyway, this has reminded me that it would be useful to actually run the differential fuzzer on aarch64, so I'll try that on our aarch64 server machine soon -- thanks :-)
We could run the fuzz targets that actually generate and execute code, e.g. the differential fuzzer that runs CL output and compares to a wasm interpreter. I haven't tried that yet, but it would be interesting!
I actually have been thinking about doing something along these lines for a while, but haven't started anything yet. Similar to what we do with wasm-smith but for clif test modules
executing arbitrary clif isn't safe (unlike arbitrary wasm) since it can do unchecked memory operations
were you thinking of avoiding memory operations in the generator? always emitting checks along with them? something else?
Ah, I missed that bit. One could I suppose run the interpreter oracle first, if we had a CLIF interpreter (hey @Andrew Brown that's your cue!), and check that all addresses that are accessed are within bounds of a known heap region. Then if the oracle completes, we know the CLIF is safe to run
And, more generally @Afonso Bordado, @Andrew Brown and I have talked about using his prototype clif interpreter stuff for fuzzing, so we should talk more if you're interested in driving efforts on that
Well, avoiding memory operations would not be ideal, we would get less coverage.
The plan was to run the interperter first and if it crashes, thats ok, the compiled code must do the same
We should be able to safeley cause SEGV's?
Or is there some memory access that can cause actual issues?
ah, well, if you actually let the SEGV'ing code execute, that's potentially problematic because the errant access could hit VM data, or fuzzing runtime data, or ...
(and yes the point of a fuzzer is to catch stuff like that but IMHO the guarantee of "code that should semantically crash does crash" is much less valuable than the inverse)
Ah, right
So, bounds checking on the interperter first might be the better way to go
yup, that would be my instinct anyway
Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there
cool! thanks for all the aarch64 patches by the way; happy to see that coming together
Yeah, I'm interested in talking more about this: the interpreter's heap and stack modifications are currently unimplemented because I never really needed them but I would be happy to look at that again if we need a way to observe the heap and stack.
Never mind, the state container that is actually used by the interpreter does have a naive implementation for heap but the stack is unimplemented: see here.
Besides random memory accesses, do we have any other concerns that would cause bad behaviour in the fuzzer?
We can't do syscalls from clif right?
probably only if we called into libc, and we should probably restrict calls to only functions generated by the fuzzer
yeah, libcalls are possible; I was thinking more about traps... @Chris Fallin, any way that breaks when fuzzing CLIF directly?
hmm, traps should be caught correctly -- I don't see any issues with e.g. uncontained behavior (the semantics of the trap are that we return control in a defined way to the embedder)
and would be useful to fuzz too, e.g. around corner cases with numeric traps or whatever
To clarify, by I meant having an aarch64 github workflow runner and running a fuzzer which tested aarch64 code generation. Comparing against an interpreter sounds cool. I, maybe naively, assumed that this would all be driven by wasm-smith and that would be 'safe'.
Chris Fallin said:
ah, well, if you actually let the SEGV'ing code execute, that's potentially problematic because the errant access could hit VM data, or fuzzing runtime data, or ...
An interpreter could be used to determine where the code will SEGV and then the VM could ensure that said location is not mapped by eg running the code in a very lightweight process that is able to relocate itself.
Afonso Bordado said:
Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there
Thanks for this! After the 128bit int opts are all implemented I think the only other thing necessary would be to implement TLS accesses.
bjorn3 said:
Afonso Bordado said:
Yeah, i do have an interest in doing this, i'd like to finish getting cg_clif on arm first, since i'm kinda in the middle of that. Although we're almost there
Thanks for this! After the 128bit int opts are all implemented I think the only other thing necessary would be to implement TLS accesses.
I'm working on that right now!
Although i'm having some difficulties, I'll probably submit a draft PR detailing what I'm struggling with
I am by no means an arm expert, but I will try if I can help.
Hey @Sam Parker I couldn't tag you on github, but here is a first draft of the cranelift fuzzer: https://github.com/bytecodealliance/wasmtime/pull/3038
@Afonso Bordado thanks for this! I'll start to review in detail tomorrow. One bit of feedback I have right away is that it's probably better to call it something more specific, like "CLIF-level differential fuzzer" or somesuch, as Cranelift is already fuzzed by a number of different fuzz targets
I have been away for a few days, so great to come back to see the beginnings merged! :) I've continued to look into how aarch64 testing can be done publicly and it seems that qemu + oss-fuzz is unlikely to happen (https://github.com/google/oss-fuzz/issues/1754). Has anyone from the bytecodealliance approached Equinix about hosting some testing infrastructure (https://developer.arm.com/solutions/infrastructure/works-on-arm/equinix)? I'm assuming it's the same service that packet.net provided... and I'm wondering whether it would be feasible to get fuzzers running there, as well as oss-fuzz for x64?
@Sam Parker we actually do have an aarch64 box provisioned by works-on-arm for general Bytecode Alliance use! (Anyone else who's at an official BA member company and reading this, we can create accounts on it; just ping me.) I had used it for a bit to do some regalloc2 fuzzing but haven't yet tried to run any of the execution-based fuzzers (the existing wasm-level differential one, or Afonso's new CLIF-level one) on it; I'm happy to kick off some jobs on an ad-hoc basis when we think we have a milestone that needs testing.
It would be a bit more involved to get an oss-fuzz level of infrastructure going (automatically pull & build latest versions, send notifications on fuzz failures, track when they're fixed, etc) -- while it looks like their runner is open-source, it has integrations to Jenkins, GCS, an issue tracker, etc. I do wonder which will happen first: (i) one of us has time to play with this on our aarch64 host, or (ii) Google eventually provides aarch64 runners in oss-fuzz. I suspect maybe (ii) :-) In any case, happy to kick off some jobs in a tmux and let it run when requested.
Ah, nice! I was wondering whether it would be feasible for a driver script to automatically post a github issue with the attached reproducer... I'm very new to github workflows though and I have no idea how much can be automated. I'm making all this noise because I have an intern starting with me in a couple of weeks and ideally I'd like him to spend all his time on fuzzing, including helping out with the new CLIF differential
Ah that's fantastic! This is definitely something that could be prototyped on x86-64 (a script to automatically fuzz and file issues, that is); if there is something working eventually, we could set it up on the ARM box
Working on a script like Sam described; are there any particular sections from the fuzzer output that would be useful in a GitHub issue when a test fails?
Last updated: Dec 23 2024 at 13:07 UTC