wig generated types & safe access to linear memory · wasmtime

Pat Hickey (Dec 17 2019 at 20:39):

As I've been working on expanding the type of code generation that can be done for wasi-common by wig, I've come up against some fundamental problems with how the Rust host code is allowed to access linear memory safely. I believe the current approach is open to UB (undefined behavior), since we can take overlapping mutable slices into memory depending on user input.

Pat Hickey (Dec 17 2019 at 20:40):

I've been trying to design a way to run-time validate that borrows are safe, sort of like RefCell does, and how that design will work with generating type definitions via wig
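To make the idea concrete, here is a minimal sketch of RefCell-style run-time borrow validation over guest memory regions. It is not the design from the gdoc: `Region`, `BorrowTracker`, and `BorrowError` are hypothetical names invented for illustration, and the sketch only tracks ranges rather than handing out real slices.

```rust
/// A half-open byte range [start, start + len) in guest linear memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Region {
    start: u32,
    len: u32,
}

impl Region {
    /// True if the two regions share at least one byte.
    fn overlaps(&self, other: &Region) -> bool {
        if self.len == 0 || other.len == 0 {
            return false; // an empty region contains no bytes
        }
        // Widen to u64 so `start + len` cannot wrap around.
        let (a0, a1) = (self.start as u64, self.start as u64 + self.len as u64);
        let (b0, b1) = (other.start as u64, other.start as u64 + other.len as u64);
        a0 < b1 && b0 < a1
    }
}

#[derive(Debug, PartialEq, Eq)]
enum BorrowError {
    OutOfBounds,
    Conflict,
}

/// Hypothetical tracker: many shared borrows or one mutable borrow per byte.
struct BorrowTracker {
    memory_len: u32,
    shared: Vec<Region>,
    mutable: Vec<Region>,
}

impl BorrowTracker {
    fn new(memory_len: u32) -> Self {
        BorrowTracker { memory_len, shared: Vec::new(), mutable: Vec::new() }
    }

    fn check_bounds(&self, r: Region) -> Result<(), BorrowError> {
        if r.start as u64 + r.len as u64 > self.memory_len as u64 {
            Err(BorrowError::OutOfBounds)
        } else {
            Ok(())
        }
    }

    /// A shared borrow may overlap other shared borrows, but no mutable one.
    fn borrow_shared(&mut self, r: Region) -> Result<(), BorrowError> {
        self.check_bounds(r)?;
        if self.mutable.iter().any(|m| m.overlaps(&r)) {
            return Err(BorrowError::Conflict);
        }
        self.shared.push(r);
        Ok(())
    }

    /// A mutable borrow may not overlap any outstanding borrow.
    fn borrow_mut(&mut self, r: Region) -> Result<(), BorrowError> {
        self.check_bounds(r)?;
        if self.mutable.iter().chain(self.shared.iter()).any(|o| o.overlaps(&r)) {
            return Err(BorrowError::Conflict);
        }
        self.mutable.push(r);
        Ok(())
    }

    /// Forget one mutable borrow of exactly this region, if present.
    fn release_mut(&mut self, r: Region) {
        if let Some(i) = self.mutable.iter().position(|m| *m == r) {
            self.mutable.swap_remove(i);
        }
    }
}
```

With something like this, two guest-supplied (pointer, length) pairs that alias would be rejected at the second `borrow_mut` call instead of producing two overlapping `&mut [u8]`.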

Pat Hickey (Dec 17 2019 at 20:41):

so far I've been working on a design in this gdoc with @Andy Wortman but would appreciate feedback from other wasmtime folks, probably @Alex Crichton would be most relevant? https://docs.google.com/document/d/1oEG49YcM9DFKLenhRkswx8tTRG1Z0Q6fmed8cWH0Jmg/edit?usp=sharing

Pat Hickey (Dec 17 2019 at 20:44):

once I have a bit more confidence in the design I'll post this text in an issue.

Pat Hickey (Dec 18 2019 at 00:57):

Related to the above: I've added code to witx, based on some old stuff I did in lucet-idl, to calculate layout in the wasm32 C ABI. We need this to identify which portions of pointees are themselves pointers, among other reasons (arrays of structs are another good one).

Pat Hickey (Dec 18 2019 at 00:57):

https://github.com/WebAssembly/WASI/pull/181/commits/4261799a9d6b999542a5214b25585d65050bb7fe

Compute layout of datatypes according to C ABI by pchickey · Pull Request #181 · WebAssembly/WASI

Based on #164 This is useful for validating pointers, enums, and flags in host code.

Pat Hickey (Dec 18 2019 at 00:57):

Neither @Andy Wortman nor I am aware of a good source of truth for how LLVM lays out structs etc. in memory, beyond running LLVM itself.

Pat Hickey (Dec 18 2019 at 00:58):

maybe @Dan Gohman has a good idea here.

Dan Gohman (Dec 18 2019 at 00:58):

Outside of LLVM, what we have is https://github.com/WebAssembly/tool-conventions/blob/master/BasicCABI.md#data-representation

WebAssembly/tool-conventions

Conventions supporting interoperatibility between tools working with WebAssembly. - WebAssembly/tool-conventions

Pat Hickey (Dec 18 2019 at 00:59):

for checking the correctness of that code when it was in lucet-idl, I used proptest to generate random schemas, and then I rendered them to C source text, and I rendered all of the layout information I calculated to static asserts using sizeof / offsetof

Pat Hickey (Dec 18 2019 at 00:59):

and ran that until it could run for a good long time without finding a counterexample.
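The rendering half of that check can be sketched as follows. This is not the lucet-idl code: `Field` and `render_c_check` are hypothetical names, the proptest schema generation is left out, and the sketch assumes a C11 compiler (for `_Static_assert`) is run over the output.

```rust
/// One struct member: its name, its C type, and the offset we computed for it.
struct Field<'a> {
    name: &'a str,
    c_type: &'a str,
    offset: usize,
}

/// Render a C translation unit that only compiles if the C compiler
/// agrees with our computed size and field offsets.
fn render_c_check(struct_name: &str, size: usize, fields: &[Field]) -> String {
    let mut out = String::from("#include <stddef.h>\n#include <stdint.h>\n");

    // The struct definition, in declaration order.
    out.push_str(&format!("struct {} {{\n", struct_name));
    for f in fields {
        out.push_str(&format!("    {} {};\n", f.c_type, f.name));
    }
    out.push_str("};\n");

    // One assertion for the total size, then one per field offset.
    out.push_str(&format!(
        "_Static_assert(sizeof(struct {}) == {}, \"size of {}\");\n",
        struct_name, size, struct_name
    ));
    for f in fields {
        out.push_str(&format!(
            "_Static_assert(offsetof(struct {}, {}) == {}, \"offset of {}.{}\");\n",
            struct_name, f.name, f.offset, struct_name, f.name
        ));
    }
    out
}
```

Compiling the rendered text for a wasm32 target would then turn any disagreement between the calculated layout and the compiler's layout into a build error.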

Pat Hickey (Dec 18 2019 at 01:00):

ah, I'm not following this rule:

"Each member is assigned to the lowest available offset with the appropriate alignment"

Pat Hickey (Dec 18 2019 at 01:01):

idk how I never found that via counterexample back in lucet-idl. I guess I wasn't searching the space well enough.

iximeow (Dec 18 2019 at 01:31):

i think in the context of struct field layout, that rule can be disambiguated by.... i wonder if markdown works here:

"If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal."

from 6.5.8 paragraph 5 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf, which i read as, functionally, forbidding reordering struct fields

(i don't see much in the spec here talking about the layout of aggregates at all, otherwise, actually...)

Pat Hickey (Dec 18 2019 at 01:31):

so in that case, i believe my calculation of aggregate layout is correct.
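The reading arrived at here, members placed in declaration order with each one at the next offset that satisfies its alignment, can be sketched as below. (C11 6.7.2.1 also states that struct members have addresses that increase in the order they are declared.) This is not the witx code: `SizeAlign`, `StructLayout`, and `layout_struct` are hypothetical names, and the constants assume the wasm32 sizes and alignments for fixed-width integers and 32-bit pointers.

```rust
/// Size and alignment of a type, in bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct SizeAlign {
    size: usize,
    align: usize,
}

// Assumed wasm32 values: each scalar is aligned to its own size.
const U8: SizeAlign = SizeAlign { size: 1, align: 1 };
const U16: SizeAlign = SizeAlign { size: 2, align: 2 };
const U32: SizeAlign = SizeAlign { size: 4, align: 4 };
const U64: SizeAlign = SizeAlign { size: 8, align: 8 };
const POINTER: SizeAlign = SizeAlign { size: 4, align: 4 };

/// Round `offset` up to the next multiple of `align` (align must be nonzero).
fn align_to(offset: usize, align: usize) -> usize {
    (offset + align - 1) / align * align
}

#[derive(Debug, PartialEq, Eq)]
struct StructLayout {
    offsets: Vec<usize>,
    size: usize,
    align: usize,
}

/// Lay out members in declaration order, never reordering them.
fn layout_struct(members: &[SizeAlign]) -> StructLayout {
    let mut offset = 0;
    let mut align = 1;
    let mut offsets = Vec::with_capacity(members.len());
    for m in members {
        offset = align_to(offset, m.align); // insert padding if needed
        offsets.push(offset);
        offset += m.size;
        align = align.max(m.align); // struct alignment is the strictest member's
    }
    // Trailing padding so that arrays of the struct stay aligned.
    StructLayout { offsets, size: align_to(offset, align), align }
}
```

For example, `layout_struct(&[U8, U32, U16, U64])` places the members at offsets 0, 4, 8, and 16, for a total size of 24 and alignment 8. An implementation that instead filled padding holes with later members would put the `U16` at offset 2, which is the other way the "lowest available offset" wording could be read.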

Pat Hickey (Dec 18 2019 at 01:32):

anyhow, put the gdoc into a github issue: https://github.com/bytecodealliance/wasmtime/issues/734

[wasi-common] Re-design of generated types, for safer use and more complete code generation · Issue #734 · bytecodealliance/wasmtime

wig presently generates wasi-common types from witx descriptions of the standard. This is a great start, but there are a handful of ways in which the design of wasi-common is difficult to automatic...

iximeow (Dec 18 2019 at 01:38):

i want to suggest that we clarify that the offset described in BasicCABI.md grows in the declaration order of an aggregate's fields. but that also seems like a mistake, for the same reason i can't find such a sentence in the C spec: it definitively rules out reordering fields :( does it make sense to say this is essentially an FFI specification and we make some arbitrary choices for the sake of confidence here?

iximeow (Dec 18 2019 at 01:38):

since in theory there's no reason witx must match the real order of bytes anywhere, so long as serializers/deserializers work right. if it happens to be a pointer cast, cool

Dan Gohman (Dec 18 2019 at 21:05):

Are you asking whether we can reorder fields in BasicCABI.md or in witx implementations?

Pat Hickey (Dec 18 2019 at 21:45):

I was thinking about the BasicCABI.md

Pat Hickey (Dec 18 2019 at 21:45):

I'm assuming we won't reorder fields from witx -> C ABI, because doing so would be a surprise.

Dan Gohman (Dec 18 2019 at 21:50):

Unfortunately, C compilers can't reorder struct fields in general, because people use C structs to interface with independently serialized data.

Dan Gohman (Dec 18 2019 at 21:50):

As in, read data from a file into a buffer, cast the buffer's address to a pointer-to-struct, and read the fields.

Pat Hickey (Dec 18 2019 at 21:51):

yep. not sure i agree that it's unfortunate

Dan Gohman (Dec 18 2019 at 21:51):

Rust can reorder fields, because in Rust if you want to do that kind of thing, you have to use repr(C) :-)
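A small sketch of that point, using a hypothetical `Record` type: with `repr(C)` the fields stay in declaration order, so a byte buffer can be reinterpreted as the struct in the way described above. The offsets in the comments assume a target where `u32` is 4-byte aligned and `u16` is 2-byte aligned.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

/// Hypothetical record. `repr(C)` pins fields to declaration order with
/// C-style padding; without it, rustc is free to reorder them.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
struct Record {
    tag: u8,  // offset 0, followed by 3 bytes of padding
    len: u32, // offset 4
    id: u16,  // offset 8, followed by 2 bytes of trailing padding
}

/// Byte offsets of (tag, len, id) within `Record`, measured at run time.
fn record_offsets() -> (usize, usize, usize) {
    let r = Record::default();
    let base = addr_of!(r) as usize;
    (
        addr_of!(r.tag) as usize - base,
        addr_of!(r.len) as usize - base,
        addr_of!(r.id) as usize - base,
    )
}

/// Reinterpret the front of a byte buffer as a `Record`, as one might
/// after reading serialized data from a file.
fn read_record(buf: &[u8]) -> Option<Record> {
    if buf.len() < size_of::<Record>() {
        return None;
    }
    // SAFETY: the buffer holds at least size_of::<Record>() bytes, every
    // bit pattern is valid for these integer fields, and read_unaligned
    // does not require the buffer to be aligned for Record.
    Some(unsafe { std::ptr::read_unaligned(buf.as_ptr() as *const Record) })
}
```

Dropping the `#[repr(C)]` attribute would make `read_record` unreliable, since the compiler could then choose any field order, for instance packing `tag` and `id` together to shrink the struct.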

Last updated: Apr 08 2025 at 08:04 UTC