afonso360 opened PR #3187 from interperter-stack-vaddr
to main
:
Hey,
This PR is #3164 with an additional Virtual Addressing scheme that was discussed as a better alternative to returning the real addresses.
The virtual addresses are split into 4 regions (stack, heap, tables and global values), and the address itself is composed of an
entry
field and anoffset
field. In general theentry
field corresponds to the instance of the resource (e.g. table5 is entry 5) and theoffset
field is a byte offset inside that entry.There is one exception to this which is the stack, where due to only having one stack, the whole address is an offset field.
The number of bits in entry vs offset fields is variable with respect to the
region
and the address size (32bits vs 64bits). This is done because with 32 bit addresses we would have to compromise on heap size, or have a small number of global values / tables. With 64 bit addresses we do not have to compromise on this, but we need to support 32 bit addresses.cc: @abrown @cfallin
afonso360 edited PR #3187 from interperter-stack-vaddr
to main
:
Hey,
This PR is #3164 with an additional Virtual Addressing scheme that was discussed as a better alternative to returning the real addresses.
The virtual addresses are split into 4 regions (stack, heap, tables and global values), and the address itself is composed of an
entry
field and anoffset
field. In general theentry
field corresponds to the instance of the resource (e.g. table5 is entry 5) and theoffset
field is a byte offset inside that entry.There is one exception to this which is the stack, where due to only having one stack, the whole address is an offset field.
The number of bits in entry vs offset fields is variable with respect to the region and the address size (32bits vs 64bits). This is done because with 32 bit addresses we would have to compromise on heap size, or have a small number of global values / tables. With 64 bit addresses we do not have to compromise on this, but we need to support 32 bit addresses.
cc: @abrown @cfallin
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
abrown submitted PR review.
abrown submitted PR review.
abrown created PR review comment:
nit: I think it might be helpful here to just document all of the decisions made in this file in a table:
address size address kind region value (2 bits) entry bits (#) offset bits (#) 32 Stack 0b00 0 30 32 Heap 0b01 2 28 ... ... ... ... ... 64 Stack 0b00 0 62 ... ... ... ... ...
abrown created PR review comment:
I should really create a helper functions for this common "just run the interpreter on this CLIF" case. Don't worry about it for this PR but eventually I'll do something like that.
cfallin submitted PR review.
cfallin created PR review comment:
Would it make sense to track the current sum (basically a "frame pointer") in the interpreter state, rather than compute this dynamically? I could see a worst-case quadratic cost here especially if the fuzzer gets too devious with its test-case generation.
cfallin submitted PR review.
cfallin created PR review comment:
Is there a particular reason that we need to pull these specific errors out of the more general "heap out-of-bounds" error below? It seems to me that at least the first two (OOB load, OOB store) are canonical OOB cases, and "invalid address" would make some sense if we had a more general notion of address-space maps in Wasm, but I think the complexity and confusion presented in the API here is not really worth it. @abrown thoughts?
cfallin created PR review comment:
Converting to a raw pointer then accessing it with
unsafe
feels a little bit unnecessary to me -- not that it's hard to see the safety here, since the code is very simple, but it entrenches a pattern that could go wrong later if it gets much more complicated.Instead perhaps could we:
- Get a slice
&mut [u8]
of raw memory bytes from whatever region/object we're accessing (so the underlying interpreter-state for the stack, any wasm memories, etc., is just aVec<u8>
)- Slice that into the number of bytes we need at the offset we need
- Provide accessors on
DataValue
that read and write from&[u8]
/&mut [u8]
?
abrown submitted PR review.
abrown created PR review comment:
I guess that makes sense:
InvalidAddress
is also really just an OOB. Can we get away withStackOutOfBounds
here (to match the other...OutOfBounds
below)?
abrown submitted PR review.
abrown created PR review comment:
I think @afonso360 was just going along with what existed previously in
DataValue
(those unsafe methods are necessary for interacting with trampolines). I agree that it would probably be better to just some safe versions of those functions toDataValue
.
cfallin submitted PR review.
cfallin created PR review comment:
Indeed, it was a reasonable approach given the APIs that existed! The unsafe accessors on
DataValue
can certainly stay for the other use-cases (and perhaps a future refactor can centralize those into e.g.unsafe { std::slice::from_raw_parts(...) }
in the trampoline-adjacent code and then use of safe APIs onDataValue
, but that's a separate conversation!).
afonso360 submitted PR review.
afonso360 created PR review comment:
I didn't use
HeapOutOfBounds
because i thought it would be heap specific, but we can fold all other memory accesses / invalid address into that.We can also specialize a
StackOutOfBounds
forstack_loads
/stores
.
afonso360 submitted PR review.
afonso360 created PR review comment:
Yeah, sounds like a good idea. I can probably fix that when I implement heap access, ill probably have to add a number of tests similar to these ones.
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
afonso360 submitted PR review.
afonso360 created PR review comment:
It looks like
read_value_from
/write_value_to
are only used in the interpreter, so I replaced them withwrite_to_slice
andread_from_slice
. It looks like we don't use them anywhere else, but if we need them we have a version that usesslice::from_raw_parts
in 5c594c4
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
afonso360 submitted PR review.
afonso360 created PR review comment:
Oh, it looks like we duplicate the
write_value_to
in function_runner.rs
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
afonso360 submitted PR review.
afonso360 created PR review comment:
Since we had this duplicated, I thought it would be best to merge it all into DataValue right now before we have some hard to track down memory layout mismatch.
afonso360 edited PR review comment.
afonso360 updated PR #3187 from interperter-stack-vaddr
to main
.
abrown merged PR #3187.
Last updated: Jan 24 2025 at 00:11 UTC