As I've been working on expanding the kinds of code generation that can be done for wasi-common by `wig`, I've come up against some fundamental problems with how the Rust host code is allowed to access linear memory safely. I believe the current approach is open to UB, since we can take overlapping mutable slices into memory depending on user input.
I've been trying to design a way to validate at run time that borrows are safe, sort of like `RefCell` does, and to work out how that design will fit with generating type definitions via `wig`.
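To make the idea concrete, here is a minimal sketch of a `RefCell`-style run-time check for regions of linear memory. Everything in it (`Region`, `BorrowChecker`, `BorrowError`) is a hypothetical name made up for illustration, not the design from the gdoc and not an existing wasmtime or wig API. It assumes a borrow is identified by its byte range, that shared borrows may overlap each other, and that a mutable borrow may overlap nothing.

```rust
/// A byte range in guest linear memory: [start, start + len).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Region {
    pub start: u32,
    pub len: u32,
}

impl Region {
    /// True when the two ranges share at least one byte.
    pub fn overlaps(&self, other: &Region) -> bool {
        // Widen to u64 so `start + len` cannot wrap around.
        let (a0, a1) = (self.start as u64, self.start as u64 + self.len as u64);
        let (b0, b1) = (other.start as u64, other.start as u64 + other.len as u64);
        self.len > 0 && other.len > 0 && a0 < b1 && b0 < a1
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum BorrowError {
    OutOfBounds,
    Conflict,
}

/// Hypothetical run-time borrow tracker for a single linear memory.
pub struct BorrowChecker {
    memory_len: u32,
    shared: Vec<Region>,
    mutable: Vec<Region>,
}

impl BorrowChecker {
    pub fn new(memory_len: u32) -> Self {
        BorrowChecker { memory_len, shared: Vec::new(), mutable: Vec::new() }
    }

    fn in_bounds(&self, r: &Region) -> bool {
        r.start as u64 + r.len as u64 <= self.memory_len as u64
    }

    /// A shared borrow is allowed unless it overlaps a live mutable borrow.
    pub fn borrow_shared(&mut self, r: Region) -> Result<(), BorrowError> {
        if !self.in_bounds(&r) {
            return Err(BorrowError::OutOfBounds);
        }
        if self.mutable.iter().any(|m| m.overlaps(&r)) {
            return Err(BorrowError::Conflict);
        }
        self.shared.push(r);
        Ok(())
    }

    /// A mutable borrow is allowed only if it overlaps no live borrow at all.
    pub fn borrow_mut(&mut self, r: Region) -> Result<(), BorrowError> {
        if !self.in_bounds(&r) {
            return Err(BorrowError::OutOfBounds);
        }
        if self.mutable.iter().chain(self.shared.iter()).any(|b| b.overlaps(&r)) {
            return Err(BorrowError::Conflict);
        }
        self.mutable.push(r);
        Ok(())
    }

    /// Forget a previously granted mutable borrow (no-op if it is not live).
    pub fn release_mut(&mut self, r: Region) {
        if let Some(i) = self.mutable.iter().position(|b| *b == r) {
            self.mutable.swap_remove(i);
        }
    }
}
```

With a checker like this, a guest that passes two overlapping buffers would get a `Conflict` error back instead of the host ending up with two aliasing `&mut [u8]` slices. A real design would probably also hand out guard objects that release the borrow on drop, which this sketch leaves out.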
so far I've been working on a design in this gdoc with @Andy Wortman but would appreciate feedback from other wasmtime folks, probably @Alex Crichton would be most relevant? https://docs.google.com/document/d/1oEG49YcM9DFKLenhRkswx8tTRG1Z0Q6fmed8cWH0Jmg/edit?usp=sharing
once I have a bit more confidence in the design I'll post this text in an issue.
related to the above: I've added code to witx, based on some old stuff I did in lucet-idl, to calculate layout in the wasm32 C ABI. We need this to identify which portions of pointees are themselves pointers, among other reasons (arrays of structs are another good one)
https://github.com/WebAssembly/WASI/pull/181/commits/4261799a9d6b999542a5214b25585d65050bb7fe
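For readers who want a feel for what such a layout calculation involves, here is a rough sketch. It is not the code from the linked commit: the `Ty` enum is a hypothetical, heavily simplified stand-in for witx datatypes, and the sizes and alignments are the assumed wasm32 values (pointers are 4 bytes, 64-bit integers are 8-byte aligned). Fields are placed in declaration order, each at the next offset that satisfies its alignment.

```rust
/// Hypothetical, simplified stand-in for witx datatypes.
#[derive(Clone, Debug)]
pub enum Ty {
    U8,
    U16,
    U32,
    U64,
    Pointer,
    Struct(Vec<Ty>),
    Array(Box<Ty>, u32),
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SizeAlign {
    pub size: u32,
    pub align: u32,
}

/// Round `offset` up to the next multiple of `align`.
fn align_up(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

/// Size and alignment of a type under the assumed wasm32 C ABI rules.
pub fn size_align(ty: &Ty) -> SizeAlign {
    match ty {
        Ty::U8 => SizeAlign { size: 1, align: 1 },
        Ty::U16 => SizeAlign { size: 2, align: 2 },
        Ty::U32 | Ty::Pointer => SizeAlign { size: 4, align: 4 },
        Ty::U64 => SizeAlign { size: 8, align: 8 },
        Ty::Struct(fields) => struct_layout(fields).1,
        Ty::Array(elem, n) => {
            let e = size_align(elem);
            SizeAlign { size: e.size * n, align: e.align }
        }
    }
}

/// Field offsets (in declaration order) plus the struct's overall size and alignment.
pub fn struct_layout(fields: &[Ty]) -> (Vec<u32>, SizeAlign) {
    let mut offsets = Vec::new();
    let mut end: u32 = 0;
    let mut align: u32 = 1;
    for f in fields {
        let sa = size_align(f);
        let off = align_up(end, sa.align);
        offsets.push(off);
        end = off + sa.size;
        align = align.max(sa.align);
    }
    // Trailing padding so that arrays of this struct keep every element aligned.
    (offsets, SizeAlign { size: align_up(end, align), align })
}

/// Collect the byte offsets, relative to `base`, at which pointers live inside `ty`.
pub fn pointer_offsets(ty: &Ty, base: u32, out: &mut Vec<u32>) {
    match ty {
        Ty::Pointer => out.push(base),
        Ty::Struct(fields) => {
            let (offsets, _) = struct_layout(fields);
            for (f, off) in fields.iter().zip(offsets) {
                pointer_offsets(f, base + off, out);
            }
        }
        Ty::Array(elem, n) => {
            let stride = size_align(elem).size;
            for i in 0..*n {
                pointer_offsets(elem, base + i * stride, out);
            }
        }
        _ => {}
    }
}
```

`pointer_offsets` is the part that matters for the borrow question: once the layout is known, the host can tell which bytes of a pointee are themselves guest pointers that need validating, including inside arrays of structs.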
neither @Andy Wortman nor I are aware of a good source of truth for how llvm lays out structs etc. in memory beyond running llvm itself.
maybe @Dan Gohman has a good idea here.
Outside of LLVM, what we have is https://github.com/WebAssembly/tool-conventions/blob/master/BasicCABI.md#data-representation
for checking the correctness of that code when it was in lucet-idl, I used `proptest` to generate random schemas, rendered them to C source text, rendered all of the layout information I calculated to static asserts using `sizeof` / `offsetof`, and ran that until it could run for a good long time without finding a counterexample.
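A sketch of the rendering half of that approach, for illustration only. `Field` and `render_c_check` are hypothetical names, not the lucet-idl code, and the random schema generation (the `proptest` part) is left out. The function takes a struct description plus the offsets, size, and alignment the layout calculation claims, and emits C11 source that a C compiler targeting wasm32 would reject if any claim were wrong.

```rust
/// One struct field together with the offset our layout code claims for it.
pub struct Field {
    pub name: &'static str,
    pub c_type: &'static str,
    pub offset: u32,
}

/// Render a C struct definition followed by static asserts on its layout.
pub fn render_c_check(struct_name: &str, fields: &[Field], size: u32, align: u32) -> String {
    let mut out = String::from("#include <stddef.h>\n#include <stdint.h>\n\n");

    // The struct definition, fields in declaration order.
    out.push_str(&format!("struct {} {{\n", struct_name));
    for f in fields {
        out.push_str(&format!("    {} {};\n", f.c_type, f.name));
    }
    out.push_str("};\n\n");

    // One compile-time check per claimed field offset.
    for f in fields {
        out.push_str(&format!(
            "_Static_assert(offsetof(struct {s}, {n}) == {o}, \"offset of {n}\");\n",
            s = struct_name,
            n = f.name,
            o = f.offset
        ));
    }

    // Checks for the claimed overall size and alignment.
    out.push_str(&format!(
        "_Static_assert(sizeof(struct {}) == {}, \"size\");\n",
        struct_name, size
    ));
    out.push_str(&format!(
        "_Static_assert(_Alignof(struct {}) == {}, \"align\");\n",
        struct_name, align
    ));
    out
}
```

The generated file has no runtime behaviour at all. Compiling it is the test, so a failing assert points directly at the field whose calculated offset disagrees with the compiler.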
ah, I'm not following this rule: "Each member is assigned to the lowest available offset with the appropriate alignment"
idk how I never found that via counterexample back in lucet-idl. I guess I wasn't searching the space well enough
i think in the context of struct field layout, that rule can be disambiguated by.... i wonder if markdown works here:
If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal.
from 6.5.8 paragraph 5 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
which i read as, functionally, forbidding reordering struct fields
(i don't see much in the spec here talking about the layout of aggregates at all, otherwise, actually...)
so in that case, i believe my calculation of aggregate layout is correct.
anyhow, put the gdoc into a github issue: https://github.com/bytecodealliance/wasmtime/issues/734
i want to suggest that we should clarify that `offset` described in BasicCABI.md grows in order of the fields of an aggregate, but that also seems like a mistake for the same reason i can't find such a sentence in the C spec: it definitively rules out reordering fields :( does it make sense to say this is essentially an FFI specification and we make some arbitrary choices for the sake of confidence here?
since in theory there's no reason witx must match the real order of bytes anywhere, so long as serializers/deserializers work right. if it happens to be a pointer cast, cool
Are you asking whether we can reorder fields in BasicCABI.md or in witx implementations?
I was thinking about the BasicCABI.md
I'm assuming we won't reorder fields from witx -> C ABI, because doing so would be a surprise
Unfortunately, C compilers can't reorder struct fields in general, because people use C structs to interface with independently serialized data.
As in, read data from a file into a buffer, cast the buffer's address to a pointer-to-struct, and read the fields.
yep. not sure i agree that it's unfortunate :)
Rust can reorder fields, because in Rust if you want to do that kind of thing, you have to use `repr(C)` :-)
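To illustrate the buffer-to-struct pattern being described, here is a small sketch in Rust. `Record`, `read_record`, and `field_offsets` are made-up names, and the field types are arbitrary. Because of `#[repr(C)]` the fields stay in declaration order with C-style padding, so bytes produced by some independent serializer can be reinterpreted as the struct. The sketch assumes the bytes are in the host's native endianness.

```rust
use std::mem::size_of;

/// `repr(C)` pins the field order and padding to the C rules.
/// Without it, rustc is free to reorder the fields.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Record {
    pub tag: u8,
    pub len: u32,
    pub id: u64,
}

/// Reinterpret the start of `buf` as a `Record`, or return `None` if it is too short.
pub fn read_record(buf: &[u8]) -> Option<Record> {
    if buf.len() < size_of::<Record>() {
        return None;
    }
    // SAFETY: `buf` holds at least `size_of::<Record>()` initialized bytes, every
    // bit pattern is valid for these integer fields, and `read_unaligned` places
    // no alignment requirement on the source pointer.
    Some(unsafe { std::ptr::read_unaligned(buf.as_ptr() as *const Record) })
}

/// Offsets of `tag`, `len`, and `id` as the compiler actually laid them out.
pub fn field_offsets() -> (usize, usize, usize) {
    let r = Record { tag: 0, len: 0, id: 0 };
    let base = &r as *const Record as usize;
    (
        &r.tag as *const u8 as usize - base,
        &r.len as *const u32 as usize - base,
        &r.id as *const u64 as usize - base,
    )
}
```

Dropping the `#[repr(C)]` attribute would still compile, but the offsets would no longer be guaranteed, which is exactly why a C compiler cannot take the same liberty with plain structs.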
Last updated: Jan 24 2025 at 00:11 UTC