Stream: git-wasmtime

Topic: wasmtime / issue #7630 Use of `unsafe`


Wasmtime GitHub notifications bot (Dec 04 2023 at 18:57):

eggyal opened issue #7630:

I happened to be browsing some code in this repo and came upon some surprising (mis?)use of unsafe in function sigs. So I decided to search the entire codebase for unsafe and take a deeper look, and these surprises appear to be quite widespread.

Does this repo have a particularly interesting/unusual policy toward unsafe? My understanding is that a fn should be unsafe if UB could arise from safe code unless the caller upholds some condition.
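For readers less familiar with that convention, a minimal illustrative sketch (hypothetical code, not taken from wasmtime): a function is marked unsafe and documents its precondition when safe callers could otherwise trigger UB, while a checked wrapper can remain a safe fn:

/// # Safety
/// `index` must be less than `bytes.len()`; an out-of-bounds index is
/// undefined behaviour, so safe callers must not be able to reach this
/// without promising that condition holds.
unsafe fn byte_at_unchecked(bytes: &[u8], index: usize) -> u8 {
    // SAFETY: the caller guarantees `index` is in bounds.
    unsafe { *bytes.get_unchecked(index) }
}

// By contrast, a bounds-checked wrapper can be a safe fn because no input
// from safe code can cause UB.
fn byte_at(bytes: &[u8], index: usize) -> Option<u8> {
    bytes.get(index).copied()
}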

But then we have, for example, the getter methods on ValRaw which should surely all be unsafe as they require the caller to uphold that the instance is of the correct variant for the access; but instead they are exposed as safe methods and just assume that safety condition has been met. It would be trivial for callers to trigger UB purely in safe code because of this.

We also have, for example, ExternRef::to_raw, which is currently unsafe ostensibly because the returned raw pointer "is only safe to pass into a store if a GC doesn't happen between when the value is produced and when it's passed into the store"; however, for that reason, passing such a raw pointer into a store should always be (and, I think, correctly is always) an unsafe operation, and so there's no need for this function (nor indeed StoreOpaque::insert_vmexternref_without_gc) to be unsafe.
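To make the shape of that argument concrete, here is a hedged sketch using invented stand-in types (Handle and Store below are hypothetical, not wasmtime's ExternRef or StoreOpaque): the producing method can be safe because the caller's obligation only bites at the consuming method, which is the one marked unsafe and carries the documented contract:

use std::ffi::c_void;

struct Handle(Box<u32>);

impl Handle {
    // Safe: handing out the address cannot, by itself, cause UB.
    fn to_raw(&self) -> *mut c_void {
        &*self.0 as *const u32 as *mut c_void
    }
}

struct Store;

impl Store {
    /// # Safety
    /// `raw` must come from `Handle::to_raw`, and the originating `Handle`
    /// must still be alive; the pointer must not have been invalidated
    /// (e.g. by a GC) between production and this call.
    unsafe fn insert_raw(&mut self, raw: *mut c_void) -> u32 {
        // SAFETY: the caller upholds the contract documented above.
        unsafe { *(raw as *const u32) }
    }
}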

If I haven't misunderstood anything or missed some particular policy of this repo, would you accept PRs that:

  1. correct all such signatures;
  2. document the safety requirements of all unsafe functions; and
  3. add comments that justify all unsafe operations?

I propose undertaking this as both an audit of unsafe usage and (more personally) as an exercise to become more familiar with the codebase. The downside is that in many cases these will be breaking changes.

Wasmtime GitHub notifications bot (Dec 04 2023 at 19:05):

pchickey commented on issue #7630:

Hi! Welcome to the Wasmtime repository. I'd like to suggest that, as a newcomer to this project, which is maintained by many folks who have written Rust for a very long time and in some cases even helped shape the language and its implementation, you reframe this report to first try to assume the choices made are correct, and that you want to understand why those choices were made, rather than assume they are incorrect.

Wasmtime GitHub notifications bot (Dec 04 2023 at 19:11):

eggyal commented on issue #7630:

I'm not entirely a newcomer to this project, and I have contributed to rustc (including bits of unsafe code) for some years now. I did, however, frame the question with "If I haven't misunderstood anything or missed some particular policy of this repo" because I fully recognise that I may have misunderstood or missed something pertinent, and I very much welcome being corrected if so. But I also gave two specific examples that appear objectively incorrect, together with my reasoning. Please, tell me where I have it wrong.

Wasmtime GitHub notifications bot (Dec 04 2023 at 19:40):

alexcrichton commented on issue #7630:

I'd at least personally echo what @pchickey already mentioned in the sense that we're trying to do our due diligence with respect to unsafe in Wasmtime where we can. For example we aren't trying to just slap unsafe around to get things compiling but instead focus on understanding why bits and pieces are unsafe and what impact they have. That being said it's also important to understand the context of this.

Wasmtime is a WebAssembly runtime which executes code generated by Cranelift. Virtually nothing it does is safe. We explicitly chose to write it in Rust due to Rust's safety guarantees, but that wouldn't help us much if we just tagged everything as:

fn safe_wrapper() {
    // SAFETY: can't prove Cranelift is safe so assume it is
    unsafe { call_wasm() }
}

So on one hand we're trying to balance what we get from Rust vs the reality of doing unsafe things pervasively throughout the runtime. This is one reason, for example, that we require all changes to go through code review by maintainers, along with tests and all that jazz. This of course doesn't catch all bugs, however!

Another important piece of context is that from an API perspective Wasmtime puts little effort into auditing anything "beneath" the wasmtime crate. The wasmtime crate is the public face of Wasmtime and the unsafe-vs-safe there is carefully scrutinized, but an unsafe API in wasmtime-runtime receives little-to-no scrutiny beyond what the author originally chose. For example the insert_vmexternref_without_gc API you pointed out doesn't get much scrutiny since it's internal.

That's at least my description of the current status quo. I'd echo again what @pchickey said in that it's probably best to approach this with care, and to try to understand what's going on. I don't doubt that we've sometimes made the wrong decision on what should be unsafe or where the qualifier should be mentioned. While we try to do the right thing, we're human after all and make mistakes. That being said, I do think it's best to assume good-faith intentions on all sides; for example, we're not trying to be a special project with our own definition of unsafe. We're no different than any other Rust project in that regard: ideally we'd have zero unsafe, but that doesn't work, so we try the next best thing.


With that background context, I personally think it'd be good to have some audits in this area. That being said, this is not an easy undertaking. Modifying internals of Wasmtime like this often requires in-depth knowledge of how the pieces all fit together. I also personally think it would not be useful to clutter most of Wasmtime with unchecked, brittle comments that generally all repeat the same thing (e.g. if everything just said // SAFETY: well cranelift can't be proven safe so here we are).

We've historically tried to keep architecture decisions in Wasmtime documented and that's a good place for decisions along the lines of "well to understand anything you gotta understand this".

So overall I would at least personally say it'd be a good idea to audit things, but I would also request that you don't send a huge PR with comments everywhere that would take quite some time to review. I'd recommend starting with a particular location and working your way out from there. For example ValRaw might be a good place to start. I wrote that myself and made those methods all safe to begin with, and I'm not sure I agree with your characterization that they're obviously unsafe. I think it would be good to dig in a bit more and craft an example of unsafety, perhaps even under MIRI, to better understand. Regardless, at least more documentation is warranted if you're left with the impression that it should be unsafe (whether or not it should be), because naturally, even if it's just internals, it's still best to be clear.
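As a hedged illustration of what such a probe might look like (RawSlot below is an invented stand-in, not wasmtime's actual ValRaw, and whether the mismatched read is UB is treated as an open question rather than a claim), one could write a small test that writes one variant, reads another, and run it under Miri, e.g. with cargo +nightly miri test:

// RawSlot is hypothetical; it only mimics the shape of a multi-variant raw
// value slot where every variant is a plain integer.
#[derive(Copy, Clone)]
union RawSlot {
    as_i32: i32,
    as_f32_bits: u32,
}

impl RawSlot {
    fn from_i32(x: i32) -> RawSlot {
        RawSlot { as_i32: x }
    }

    // A safe getter in the questioned style: it assumes the caller wrote the
    // f32 variant.
    fn get_f32(self) -> f32 {
        // SAFETY: all variants are plain, same-sized integers with no padding
        // or validity invariants, so reading either field observes
        // initialized bytes regardless of which was written.
        f32::from_bits(unsafe { self.as_f32_bits })
    }
}

#[test]
fn mismatched_variant_read() {
    // Write as i32, read as f32. If the reasoning in the SAFETY comment is
    // right, this is garbage-in/garbage-out rather than UB; Miri is the tool
    // to confirm or refute that.
    let slot = RawSlot::from_i32(42);
    let _ = slot.get_f32();
}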

Wasmtime GitHub notifications bot (Dec 04 2023 at 20:04):

eggyal commented on issue #7630:

Alex, thank you for such a detailed and thorough response. I will take on board your comments and proceed sensitively and with care.

Wasmtime GitHub notifications bot (Dec 04 2023 at 20:29):

eggyal closed issue #7630.

