eggyal opened issue #7630:
I happened to be browsing some code in this repo and came upon on some surprising (mis?)use of
unsafe
in function sigs. So I decided to search the entire codebase forunsafe
and take a deeper look, and these surprises appear to be quite widespread.Has this repo a particularly interesting/unusual policy toward
unsafe
? My understanding is that a fn should beunsafe
if UB could arise from safe code unless the caller upholds some condition.But then we have, for example, the getter methods on
ValRaw
which should surely all beunsafe
as they require the caller to uphold that the instance is of the correct variant for the access; but instead they are exposed as safe methods and just assume that safety condition has been met. It would be trivial for callers to trigger UB purely in safe code because of this.We also have, for example,
ExternRef::to_raw
which is currentlyunsafe
ostensibly because the returned raw pointer "is only safe to pass into a store if a GC doesn't happen between when the value is produce and when it's passed into the store"; however, for that reason, passing such a raw pointer into a store should always be (and, I think, correctly is always) anunsafe
operation and so there's no need for this function (nor indeedStoreOpaque::insert_vmexternref_without_gc
) to beunsafe
.If I haven't misunderstood anything or missed some particular policy of this repo, would you accept PRs that:
- correct all such signatures;
- document the safety requirements of all
unsafe
functions; and- add comments that justify all
unsafe
operations?I propose undertaking this as both an audit of
unsafe
usage and (more personally) as an exercise to become more familiar with the codebase. The downside is that in many cases these will be breaking changes.
pchickey commented on issue #7630:
Hi! Welcome to the Wasmtime repository. I'd like to suggest that, as a newcomer to this project which is maintained by many folks who have written Rust for a very long time, and in some cases even helped shape the language and its implementation, that you reframe this report to first try to assume the choices made are correct, and that you want to understand why those choices were made, rather than assume they are incorrect.
pchickey edited a comment on issue #7630:
Hi! Welcome to the Wasmtime project. I'd like to suggest that, as a newcomer to this project which is maintained by many folks who have written Rust for a very long time, and in some cases even helped shape the language and its implementation, that you reframe this report to first try to assume the choices made are correct, and that you want to understand why those choices were made, rather than assume they are incorrect.
eggyal commented on issue #7630:
Not entirely a newcomer to this project, and have contributed to rustc (including bits of
unsafe
code) for some years now. I did however frame the question with "If I haven't misunderstood anything or missed some particular policy of this repo" because I fully recognise that I may have misunderstood or missed something pertinent and very much welcome being corrected if so. But I also gave two specific examples that appear objectively incorrect, together with my reasoning. Please, tell me where I have it wrong.
alexcrichton commented on issue #7630:
I'd at least personally echo what @pchickey already mentioned in the sense that we're trying to do our due diligence with respect to
unsafe
in Wasmtime where we can. For example we aren't trying to just slapunsafe
around to get things compiling but instead focus on understanding why bits and pieces areunsafe
and what impact they have. That being said it's also important to understand the context of this.Wasmtime is a WebAssembly runtime which executes code generated by Cranelift. Virtually nothing it does is safe. We explicitly chose to write it in Rust due to Rust's safety guarantees, but that wouldn't help us much if we just tagged everything as:
fn safe_wrapper() { // SAFETY: can't prove Cranelift is safe so assume it is unsafe { call_wasm() } }
So on one hand we're trying to balance what we get from Rust vs the reality of doing unsafe things pervasively throughout the runtime. This is one reason, for example, that we require all changes to go through code review by maintainers which have tests and all that jazz. This of course doesn't catch all bugs, however!
Another important piece of context is that from an API perspective Wasmtime puts little effort into auditing anything "beneath" the
wasmtime
crate. Thewasmtime
crate is the public face of Wasmtime and theunsafe
-vs-safe there is carefully scrutinized, but anunsafe
API inwasmtime-runtime
receives little-to-no scrutiny beyond what the author originally chose. For example theinsert_vmexternref_without_gc
API you pointed out doesn't get much scrutiny since it's internal.That's at least my description of the current status quo. I'd echo again what @pchickey said in that it's probably best to approach this with care and understanding to better understand what's going on. I don't doubt that we've made the wrong decision sometimes on what should be
unsafe
or where the qualifier should be mentioned. While we try to do the right thing we're human after all and make mistakes. That being said I do think it's best to assume good faith intentions on all sides, for example we're not trying to be a special project with our own definition ofunsafe
. We're no different than any other Rust project in that regard, ideally we'd have zerounsafe
but that doesn't work so we try the next best thing.
With that background context, I personally think it'd be good to have some audits in this area. That being said though this is not an easy undertaking. Modifying internals of Wasmtime like this often requires in-depth knowledge of how the pieces all fit together. I also personally think it would be useful to clutter most of Wasmtime with unchecked brittle comments that generally all repeat the same thing (e.g. if everything just said
// SAFETY: well cranelift can't be proven safe so here we are
).We've historically tried to keep architecture decisions in Wasmtime documented and that's a good place for decisions along the lines of "well to understand anything you gotta understand this".
So overall I would at least personally say it'd be a good idea to audit things, but I would also request that you don't send a huge PR with comments everywhere that would take quite some time to review. I'd recommend starting on a particular location and working your way out from there. For example
ValRaw
might be a good place to start. I wrote that down myself and made those methods all safe to begin with, and I'm not sure I agree with your characterization that they're obviouslyunsafe
. I think it would be good to dig in a bit more to craft an example of unsafety, perhaps even under MIRI, to better understand. Regardless at least more documentation is warranted if you're left with the impression that it should beunsafe
(whether or not it should be), because naturally even if it's with internals it's still best to be clear.
eggyal commented on issue #7630:
Alex, thank you for such a detailed and thorough response. I will take on board your comments and proceed sensitively and with care.
eggyal closed issue #7630:
I happened to be browsing some code in this repo and came upon on some surprising (mis?)use of
unsafe
in function sigs. So I decided to search the entire codebase forunsafe
and take a deeper look, and these surprises appear to be quite widespread.Has this repo a particularly interesting/unusual policy toward
unsafe
? My understanding is that a fn should beunsafe
if UB could arise from safe code unless the caller upholds some condition.But then we have, for example, the getter methods on
ValRaw
which should surely all beunsafe
as they require the caller to uphold that the instance is of the correct variant for the access; but instead they are exposed as safe methods and just assume that safety condition has been met. It would be trivial for callers to trigger UB purely in safe code because of this.We also have, for example,
ExternRef::to_raw
which is currentlyunsafe
ostensibly because the returned raw pointer "is only safe to pass into a store if a GC doesn't happen between when the value is produce and when it's passed into the store"; however, for that reason, passing such a raw pointer into a store should always be (and, I think, correctly is always) anunsafe
operation and so there's no need for this function (nor indeedStoreOpaque::insert_vmexternref_without_gc
) to beunsafe
.If I haven't misunderstood anything or missed some particular policy of this repo, would you accept PRs that:
- correct all such signatures;
- document the safety requirements of all
unsafe
functions; and- add comments that justify all
unsafe
operations?I propose undertaking this as both an audit of
unsafe
usage and (more personally) as an exercise to become more familiar with the codebase. The downside is that in many cases these will be breaking changes.
Last updated: Jan 24 2025 at 00:11 UTC