ssunkin-fastly opened PR #6820 from ssunkin-fastly:wasmtime-valgrind-hacking
to bytecodealliance:main
:
This is a draft PR for Valgrind on Wasm for early review by cfallin and ixi.
ssunkin-fastly updated PR #6820.
cfallin submitted PR review:
This is looking really great! I have a whole bunch of minor comments but the overall shape of the code is good and we should be able to refine things fairly quickly, I think.
I know you're working on doc-comments but will note in particular that our CI checks will fail until there's a doc-comment on every public type, enum, function; it might feel a little tedious but it's good practice :-)
It'd be good to file some followup GitHub issues and refer to issue numbers where appropriate in comments as well. Off the top of my head:
- Support for multiple memories
- Support for instrumenting SIMD loads/stores
- Support for handling different memory layouts (assume whole initial memory size is valid, for example)
- Option to print valgrind errors and continue (as the "real Valgrind" does) rather than trap
and anything else on your to-do list you want to capture.
Happy to pair on any of the requests here if unclear!
cfallin submitted PR review:
This is looking really great! I have a whole bunch of minor comments but the overall shape of the code is good and we should be able to refine things fairly quickly, I think.
I know you're working on doc-comments but will note in particular that our CI checks will fail until there's a doc-comment on every public type, enum, function; it might feel a little tedious but it's good practice :-)
It'd be good to file some followup GitHub issues and refer to issue numbers where appropriate in comments as well. Off the top of my head:
- Support for multiple memories
- Support for instrumenting SIMD loads/stores
- Support for handling different memory layouts (assume whole initial memory size is valid, for example)
- Option to print valgrind errors and continue (as the "real Valgrind" does) rather than trap
and anything else on your to-do list you want to capture.
Happy to pair on any of the requests here if unclear!
cfallin created PR review comment:
Let's call this
wasmtime-valgrind
for consistency with the other crates incrates/
; also, the version should match the others (11.0.0
here but likely13
once you merge or rebase to a latestmain
).
cfallin created PR review comment:
tiny style nit but there's a missing newline at the end of this file (GitHub flags it with a symbol) -- could you add one?
cfallin created PR review comment:
stray comment about
Value
can be removed I think?
cfallin created PR review comment:
Let's add a TODO referring to a filed issue to add the load/store hooks for vector operands -- something like
// TODO(#1234): add
before_load/
before_storefor SIMD loads and stores.
where1234
is a GitHub issue number for a new issue you've filed.
cfallin created PR review comment:
stray comment we can remove I think?
cfallin created PR review comment:
Just making a note here of empty doc-comments to fill in before merge (I know you're working on this at the moment).
cfallin created PR review comment:
To make this a bit more robust (someone could define a function named
malloc
with no args), let's check whether there actually is an arg 2. Something like:let func_args = builder.func.dfg.block_params(builder.func.layout.entry_block().unwrap()); let len = if func_args.len() < 3 { return; } else { // If a function named `malloc` has at least one argument, we assume the // first argument is the requested allocation size. func_args[2] };
(also comment about the assumption, something like the above, would be good)
cfallin created PR review comment:
commented-out println and comment can be removed
cfallin created PR review comment:
Let's add a similar early return if
retvals
is empty (someone could definevoid malloc()
if they wanted to!) and a comment about the assumption (that the returned value is a pointer to the allocated memory).
cfallin created PR review comment:
Let's call this
hook_malloc_exit
I think -- it's a little clearer (we're "hooking" or inserting a hook, rather than checking right now at compile time). Likewise below forcheck_free_exit
...
cfallin created PR review comment:
Let's change this to a
panic
and add a helpful message -- "function name not aUserFuncName::User
as expected" or something like that?
cfallin created PR review comment:
Likewise here: check, early return, comment about assumption.
cfallin created PR review comment:
Have we tested what happens if we don't have a name section? We should I think handle this case gracefully (I guess by treating it as if no functions match). You can test this by doing
wasm-strip program.wasm
and then running (wasm-strip
is part of Binaryen I think; happy to help or run this for you if needed).
cfallin created PR review comment:
This pattern occurs several times -- can we wrap it in a helper
current_func_name(&self) -> Option<&str>
?(The
Option
return lets us naturally handle the missing name section too: we can doif self.current_func_name() == Some("malloc") { ... }
and we get no-match forNone
.)
cfallin created PR review comment:
I think we can skip these calls here and below (and then rename the
builder
argument to_builder
so we don't get warnings about unused variables/args).
cfallin created PR review comment:
Let's add a TODO here to support multiple memories; for now we have just one
valgrind_state
and it corresponds to memory 0.
cfallin created PR review comment:
likewise here, add newline at end of file
cfallin created PR review comment:
Let's add a comment here about the assumption we're making -- that global 0 is the auxiliary stack pointer.
cfallin created PR review comment:
Can we add an argument here for the memory index (
MemoryIndex
type), and insert the below logic only if the grow is on memory 0? In practice we won't hit the case today but it's possible to define a Wasm module that has multiple memories, and a grow of some other memory could confuse this logic otherwise.
cfallin created PR review comment:
we can remove this println now I think (also search for printlns elsewhere to be sure)
cfallin created PR review comment:
Let's make this a constant (and then uppercase it to keep rustc happy):
let KIB: usize = 1024;
cfallin created PR review comment:
I think this is left-over from prior to the import of sources into the main repo -- you should be able to edit
.gitmodules
manually to remove the stanza?
cfallin created PR review comment:
Let's re-enable this -- or is it disabled because of some assumption we're making wrt data location?
cfallin created PR review comment:
enormously tiny nitpick but this would read slightly more naturally for me as
... * 64 * KIB
cfallin created PR review comment:
Let's update all the metadata here to match the other crates in
crates/
-- developers as "The Wasmtime Developers", license, current version, etc (and remove the comment from the starter template below).
cfallin created PR review comment:
We can probably remove this TODO?
cfallin created PR review comment:
To reduce ambiguity I think this should be called
stack_end
or something similar? ("Stack size" makes me think of the actual size, so a stack from 1MiB down to 1MiB - 4KiB is 4KiB large; but the "end" is 1MiB)
cfallin created PR review comment:
Let's add a doc-comment at the top of this module like
//! add some docs here //! //! in markdown format, with multiple paragraphs as desired //! //! like this
documenting the assumptions about memory layout here.
cfallin created PR review comment:
We'll want a much longer doc-comment of course, probably with a pointer to documentation for more details on how to use!
cfallin created PR review comment:
I'm somewhat surprised this is needed -- could we document the case where it is if so (example addresses or whatnot)?
ssunkin-fastly submitted PR review.
ssunkin-fastly created PR review comment:
This was disabled because all memory below the start of the stack is just marked as read-write-ok
ssunkin-fastly submitted PR review.
ssunkin-fastly created PR review comment:
ah the reason these were added were to call the builtins so that there wouldn't be unused function warnings
iximeow submitted PR review.
iximeow created PR review comment:
fwiw i think the error here is when
state.allocations.len() == 0
, if the length is one i expect this would choose the one valid index (0
). maybe this comment predates theif state.allocations.is_empty()
check above?
iximeow submitted PR review.
iximeow created PR review comment:
Chris mentioned a longer doc comment on the CLI flag below, but this also seems to have prior art for linking to bytecodealliance.github.io for docs pages. i'm not sure if it makes sense to basically copy the CLI flag's help text here, or maybe have the flag say something a little shorter like "enables memory checking for wasm programs, see the doc comment over on config for more"...?
suppose that also depends on what you've changed through the first iteration of feedback here. i'll keep an eye out!
cfallin requested wasmtime-default-reviewers for a review on PR #6820.
cfallin requested wasmtime-core-reviewers for a review on PR #6820.
ssunkin-fastly updated PR #6820.
cfallin has marked PR #6820 as ready for review.
cfallin requested wasmtime-compiler-reviewers for a review on PR #6820.
cfallin requested fitzgen for a review on PR #6820.
cfallin created PR review comment:
Let's add a brief note that wmemcheck requires a non-stripped wasm (with a name section) so that it can recognize the malloc and free functions, and doesn't work with custom allocators yet.
Also, can we add a reference to Valgrind? Something like "wmemcheck is inspired by, and replicates the basic functionality of, Valgrind's memcheck tool."
cfallin submitted PR review:
Very close now!
cfallin created PR review comment:
Let's write these from the generic perspective of Cranelift, without knowledge of wasmtime or wmemcheck (here we're defining the interface/tools that the higher abstraction layer uses). So something like "Insert code before a function return."
Likewise below: before loads/stores/global updates/memory growths.
cfallin created PR review comment:
No need to refer to the implementation ("including wmemcheck_state") here -- the relevant bit for the CLI user is just that it does memory-usage error checking.
cfallin submitted PR review.
cfallin submitted PR review:
Very close now!
cfallin has enabled auto merge for PR #6820.
cfallin submitted PR review:
LGTM, thanks so much!
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Could this perhaps cause an error to get returned during
Engine::new
ifwmemcheck
wasn't enabled at compile time?
cfallin created PR review comment:
Let's rename this to
wasmtime-wmemcheck
, to stick with the convention that everything incrates/
is named with that prefix.
ssunkin-fastly updated PR #6820.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Could this be added to the overall workspace in this repository? (there's a few other examples of crates that have their own fuzzers which are also part of the main workspace)
alexcrichton created PR review comment:
Instead of making this
pub
can access to the underlying index go through.as_u32()
?
cfallin updated PR #6820.
cfallin submitted PR review.
cfallin created PR review comment:
Yup, added a check to
Config::validate()
!
cfallin submitted PR review.
cfallin created PR review comment:
I've gone ahead and deleted the
fuzz
crate -- I think we had talked about this earlier but missed it, as the fuzz targets are out-of-date anyway.
cfallin submitted PR review.
cfallin created PR review comment:
Fixed.
cfallin edited PR #6820:
This PR introduces
wmemcheck
, a memory checker for a Wasm guest running inside Wasmtime.
wmemcheck
operates by recognizing calls to malloc and free inside the Wasm module, and tracking which bytes are valid. It then instruments loads and stores and flags errors when accesses are invalid. This is inspired by thememcheck
tool in Valgrind, hence the name.The functionality is off by default and is enabled with
wasmtime run --wmemcheck
.
Last updated: Dec 23 2024 at 12:05 UTC