Stream: git-wasmtime

Topic: wasmtime / PR #9535 add experimental support for illumos


view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:38):

sunshowers opened PR #9535 from sunshowers:illumos to bytecodealliance:main:

With this change, the basic wasm tests all pass on illumos. Note the addition of NORESERVE to mmap calls.

However:

While wasmtime appears to be functional on illumos, it is still quite slow, particularly in the wast tests. For example, the following test:

cargo +beta test --test wast -- Cranelift/pooling/tests/spec_testsuite/load.wast

takes 0.07 seconds on Linux, but over 5 seconds on illumos. Some profiling suggests that this is due to lock contention inside the kernel while freeing memory, so I don't think this is a wasmtime issue. I'd like to pull some illumos experts in to do some debugging here as time permits, but I don't think this PR should necessarily be held up on that.

Thanks to iximeow for all the help with this!

(One note is that due to a rustc segfault on illumos, building wasmtime requires Rust 1.83 or higher. I did my building and testing with +beta.)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:38):

sunshowers requested wasmtime-core-reviewers for a review on PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:38):

sunshowers requested pchickey for a review on PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:40):

sunshowers submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:40):

sunshowers created PR review comment:

Having spent some time digging into this, I believe that we should also pass in NORESERVE on Linux here as well. On Linux this is a bit hidden with overcommit, but e.g. Chromium passes in NORESERVE: https://chromium.googlesource.com/v8/v8/+/chromium/2840/src/base/platform/platform-linux.cc#234

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:41):

sunshowers edited PR review comment.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:43):

sunshowers commented on PR #9535:

re slowness, for future reference, with 997Hz sampling:

user stacks: https://gist.github.com/sunshowers/b69b7bd2e671d9c23355d5e952636c5e

kernel stacks: https://gist.github.com/sunshowers/fa822f161e54d57a8103f6736656fbe8

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 20:44):

sunshowers edited PR #9535:

With this change, the basic wasm tests all pass on illumos. Note the addition of NORESERVE to mmap calls.

However:

While wasmtime appears to be functional on illumos, it is still quite slow, particularly in the wast tests. For example, the following test:

cargo +beta test --test wast -- Cranelift/pooling/tests/spec_testsuite/load.wast

takes 0.07 seconds on Linux, but over 5 seconds on illumos. Some profiling suggests that this is due to lock contention inside the kernel while freeing memory, so I don't think this is a wasmtime issue. I'd like to pull some illumos experts in to do some debugging here as time permits, but I don't think this PR should necessarily be held up on that.

Thanks to iximeow for all the help with this!

(One note is that due to a rustc segfault on illumos, building wasmtime requires Rust 1.83 or higher. I did my building and testing with +beta.)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:02):

alexcrichton created PR review comment:

Question for you (as I've not had much experience historically with MMAP_NORESERVE):

In above situations MMAP_NORESERVE is paired with ProtFlags::empty() which makes sense to me because that's just a VM reservation as opposed to actually needing readable/writable memory at any one point in time. Here, though, there are read/write flags (same as remap_as_zeros_at below). Does that means that we should leave off MMAP_NORESERVE here?

This function is for example used for erasing wasm memory that was already accessible to the guest itself so in some sense we in theory already have pages reserved for it, so I think there's a case to be made for leaving off the flag here and below. That being said I'm new to this flag, so I'm curious what you think too!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:02):

alexcrichton submitted PR review:

This is great, thank you for your time in doing this!

One high-level question for you: how easy is it to compile for illumos from Linux? For example we'd ideally couple this with a check in our CI that illumos continues to build (e.g. at least via cargo check) but our CI only runs Linux/macOS/Windows right now so we'd have to cross-compile. If it's easy to cross-compile I think it'd be quite reasonable to have a check added to CI that this passes. For example this is our FreeBSD check (and reading that again we should probably apply the same trick you've done here with handling vtune/ittapi by default, but that's ok to leave to a future PR)

takes 0.07 seconds on Linux, but over 5 seconds on illumos

Whoa! It looks like the pooling allocator is the part that's slow here and that, by default, has a large number of virtual memory mappings associated with it. For example it'll allocate terabytes of virtual memory and then within that giant chunk it'll slice up roughly 10_000 linear memories (each with guard regions between them). These are prepared with a MemoryImageSlot each.

My guess is that the way things are managed is tuned to "this is acceptable due to some fast path in Linux we're hidding" which we didn't really design for and just happened to run across. I think it'd be reasonable to solve this along a number of lines such as:

Basically I think it'd be reasonable to handle this in Wasmtime with code changes too. I'm certainly no expert with illumos though so I'm also happy to defer to those with more knowledge of it.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:02):

alexcrichton created PR review comment:

Sounds reasonable to me!

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:02):

alexcrichton created PR review comment:

I think for this one in specific we'll want to leave off NORESERVE since this is creating a mapping that is entirely readable/writable and aren't doing the same trick with address-space reservation like the reserve call below.

(unless in your testing you found that this was required, in which case I'd be curious to dig in!)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:07):

sunshowers commented on PR #9535:

Change default settings in Wasmtime for illumos (e.g. different defaults for the pooling allocator or something like that)

Interesting -- how would I go about doing this? I was floundering around the illumos code haha :)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:13):

sunshowers submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:13):

sunshowers created PR review comment:

I was wondering about this and the other cases with non-PROT_NONE permissions! From reading the code it wasn't quite clear to me whether this is memory that guests would certainly have accessed and therefore have physical reservations for already, or memory that guests could merely _theoretically_ access.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:13):

sunshowers edited a comment on PR #9535:

Change default settings in Wasmtime for illumos (e.g. different defaults for the pooling allocator or something like that)

Interesting -- how would I go about doing this? I was floundering around the wasmtime code haha :)

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:14):

sunshowers submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:14):

sunshowers created PR review comment:

@iximeow actually suggested I do this, so I'm wondering if they have thoughts

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:16):

sunshowers commented on PR #9535:

One high-level question for you: how easy is it to compile for illumos from Linux? For example we'd ideally couple this with a check in our CI that illumos continues to build (e.g. at least via cargo check) but our CI only runs Linux/macOS/Windows right now so we'd have to cross-compile. If it's easy to cross-compile I think it'd be quite reasonable to have a check added to CI that this passes.

cargo check is definitely feasible. cargo build would require the linker etc, and is likely easier with cross.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:20):

iximeow submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:20):

iximeow created PR review comment:

i may have misremembered where decommit_pages gets used! my memory was that this is used when cleaning up after an instance exits, where a new module that could have a smaller max heap size. so my thinking was that even though the pages had been accessible before, we don't yet know if that will still be the case on the instance's next invocation.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:25):

sunshowers edited a comment on PR #9535:

One high-level question for you: how easy is it to compile for illumos from Linux? For example we'd ideally couple this with a check in our CI that illumos continues to build (e.g. at least via cargo check) but our CI only runs Linux/macOS/Windows right now so we'd have to cross-compile. If it's easy to cross-compile I think it'd be quite reasonable to have a check added to CI that this passes.

cargo check is definitely feasible. cargo build would require the linker etc, and is likely easier with cross.

edit: ahh spoke too soon. cargo check on illumos currently fails with "error occurred: Failed to find tool. Is gar installed?" (full output).

cross build --target x86_64-unknown-illumos appears to work though.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 21:26):

sunshowers edited a comment on PR #9535:

One high-level question for you: how easy is it to compile for illumos from Linux? For example we'd ideally couple this with a check in our CI that illumos continues to build (e.g. at least via cargo check) but our CI only runs Linux/macOS/Windows right now so we'd have to cross-compile. If it's easy to cross-compile I think it'd be quite reasonable to have a check added to CI that this passes.

cargo check is definitely feasible. cargo build would require the linker etc, and is likely easier with cross.

edit: ahh spoke too soon. cargo check on illumos currently fails with "error occurred: Failed to find tool. Is gar installed?" (full output).

gar is what GNU ar is called on illumos. One could mangle paths and such but it's easier to just use cross build --target x86_64-unknown-illumos, which works great.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:18):

alexcrichton submitted PR review:

but it's easier to just use cross build --target x86_64-unknown-illumos

oh nice! If you'd like to add that here as well feel free, otherwise I'm happy to try to finagle this in a follow-up too


If you want to setup the NORESERVE flag for linux builds as well I think this might be good to go after that?

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:18):

alexcrichton created PR review comment:

Heh this function definitely has a storied history and has gone through many iterations. I had to reread up on the callsites to refresh myself on what exactly is happening here. @iximeow you're right that this is used after an instance is deallocated, but this is separate from resizing memory. Here decommit_pages is only used to purge backing memory for ranges that were possibly-in-use previously and reset them to their original contents. That's used for both linear memories and tables then. Later other functions like hide_existing_mapping are use to shrink memory or expose_existing_mapping are used to grow memory.

So put another way: this function is used on memory which was theoretically accessible to wasm but it may or may not have touched because it may not have touched every page in its linear memory. Some of it may be reused for the next instance without further syscalls. Some of it may be hidden via mprotect and then re-exposed via mprotect to use in wasm later on too.

I suspect Wasmtime's not built currently to work well with precise overcommit because we'd have to hook into expose_existing_mapping as well for example which currently is just a bland mprotect. Otherwise I don't think it's unreasonable to use NORESERVE here.

Overall though it seems like the main use case for NORESERVE is in the reserve function above where the allocations are often 6G+. Everywhere else should be relatively small allocations/deallocations related to what wasm is about to use or previously using.

In any case happy to leave this in here and we can always frob it later if necessary.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

alexcrichton created PR review comment:

on second read, mind moving this up to the first case? I think this is the same as that up there and should only need an update to the #[cfg] to include illumos too

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

sunshowers requested wasmtime-default-reviewers for a review on PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

sunshowers updated PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

sunshowers submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:30):

sunshowers created PR review comment:

Ah good catch.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:32):

sunshowers updated PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:37):

sunshowers commented on PR #9535:

Ugh, I guess I need to clean the target dir for illumos. Silly glibc.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:47):

sunshowers updated PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:48):

sunshowers updated PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:50):

sunshowers updated PR #9535.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:54):

sunshowers commented on PR #9535:

All right, CI updated and I think this is good to go.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:55):

sunshowers submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Oct 31 2024 at 22:55):

sunshowers created PR review comment:

I left this in based on the discussion below -- do you feel like that's okay or should I test without the flag?

view this post on Zulip Wasmtime GitHub notifications bot (Nov 01 2024 at 01:16):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 01 2024 at 01:16):

alexcrichton created PR review comment:

Nah yeah let's leave it here and see how it goes, it doesn't seem unreasonable given the parts below

view this post on Zulip Wasmtime GitHub notifications bot (Nov 01 2024 at 01:16):

alexcrichton submitted PR review.

view this post on Zulip Wasmtime GitHub notifications bot (Nov 01 2024 at 01:35):

alexcrichton merged PR #9535.


Last updated: Dec 23 2024 at 12:05 UTC