Stream: wasmtime

Topic: ✔ Issue with pooling allocator


view this post on Zulip Taylor Thomas (May 02 2024 at 19:15):

Hey all! I'm having a problem with the pooling allocator for wasmtime. What it boils down to is that we can never get the pooling allocator to work on a device that has less than 2GB-ish of memory. I tried using the defaults and lowering the defaults even further for all the options, and every time it looked like it tried to allocate the same amount of memory, returning the error:

Error: failed to initialize host

Caused by:
    0: failed to build runtime
    1: failed to construct engine
    2: failed to create stack pool mapping
    3: mmap failed to allocate 0x7d3e8000 bytes
    4: Cannot allocate memory (os error 12)

So no matter what I did it tried to allocate 0x7d3e8000 bytes. So my question is: Do any of the options actually help tweak this? And as a follow up, should we just be using the dynamic allocator on small systems? I keep trying to dig into the code, but it is a bit too low level for me and hard to follow what the actual impact on allocated memory is.

Just to be clear, here is what I've tried:

  1. Using PoolingAllocationConfig::default()
  2. Lowering the default numbers (see below for snippet)
  3. Lowering the static memory guard size to 1GB

pooling_config
    .total_component_instances(500)
    .total_memories(500)
    .total_tables(500)
    .linear_memory_keep_resident((10 * MB) as usize)
    .table_keep_resident((10 * MB) as usize);

Anyone have any ideas here?

view this post on Zulip Taylor Thomas (May 02 2024 at 19:17):

Also, it has failed in both the TablePool allocation and the StackPool allocation, depending on the settings

view this post on Zulip Chris Fallin (May 02 2024 at 19:26):

assuming this is a Linux machine -- have you tried tweaking the VM overcommit settings?

view this post on Zulip Chris Fallin (May 02 2024 at 19:27):

in general we're pretty free about mmap'ing large regions and they'll be only sparsely populated; the actual RSS should be close to the sum of all instances' heaps, tables, vmcontexts

view this post on Zulip Taylor Thomas (May 02 2024 at 19:29):

Yeah the failures have been on linux. Let me try tweaking the overcommit settings

view this post on Zulip Taylor Thomas (May 02 2024 at 19:32):

That did it

view this post on Zulip Taylor Thomas (May 02 2024 at 19:35):

Setting it to 1 that is

view this post on Zulip Taylor Thomas (May 02 2024 at 19:35):

The default heuristic didn't work
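
The setting discussed above is Linux's `vm.overcommit_memory` sysctl, whose documented values are 0 (heuristic), 1 (always allow) and 2 (strict accounting). A minimal sketch of checking it from Rust follows; the type and function names are hypothetical and not part of wasmtime.

```rust
use std::fs;

/// The three documented values of Linux's `vm.overcommit_memory` sysctl.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum OvercommitMode {
    /// 0: the kernel applies a heuristic and refuses obvious overcommits.
    Heuristic,
    /// 1: the kernel always allows the allocation.
    Always,
    /// 2: strict accounting; allocations beyond the commit limit fail.
    Never,
}

/// Parse the text stored in `/proc/sys/vm/overcommit_memory`.
/// Returns `None` for anything other than "0", "1" or "2".
pub fn parse_overcommit_mode(raw: &str) -> Option<OvercommitMode> {
    match raw.trim() {
        "0" => Some(OvercommitMode::Heuristic),
        "1" => Some(OvercommitMode::Always),
        "2" => Some(OvercommitMode::Never),
        _ => None,
    }
}

/// Read the current mode. Returns `None` if the file is missing
/// (for example on a non-Linux host) or has unexpected contents.
pub fn current_overcommit_mode() -> Option<OvercommitMode> {
    let raw = fs::read_to_string("/proc/sys/vm/overcommit_memory").ok()?;
    parse_overcommit_mode(&raw)
}
```

A host could call `current_overcommit_mode()` before building the engine and log a hint when the result is not `Some(OvercommitMode::Always)`, instead of surfacing only the raw mmap failure.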

view this post on Zulip Chris Fallin (May 02 2024 at 19:41):

it's worth doing the math on maximum actual RSS too to make sure the default heuristic wasn't "onto something" (i.e. was actually reasonable)

view this post on Zulip Chris Fallin (May 02 2024 at 19:41):

ballpark math can be as simple as number-of-slots * max-heap-size
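
The ballpark above (number-of-slots * max-heap-size) can be sketched as a small helper. The function names and the numbers in the usage note are illustrative assumptions, and the estimate ignores tables, vmcontexts and stacks.

```rust
pub const MIB: u64 = 1024 * 1024;
pub const GIB: u64 = 1024 * MIB;

/// Worst-case resident heap bytes if every slot grows its memory to the maximum.
/// Returns `None` if the multiplication overflows `u64`.
pub fn worst_case_heap_bytes(slots: u64, max_heap_bytes: u64) -> Option<u64> {
    slots.checked_mul(max_heap_bytes)
}

/// True if the worst case fits in the given amount of physical memory.
/// An overflowing worst case is treated as not fitting.
pub fn fits_in_physical_memory(slots: u64, max_heap_bytes: u64, physical_bytes: u64) -> bool {
    match worst_case_heap_bytes(slots, max_heap_bytes) {
        Some(total) => total <= physical_bytes,
        None => false,
    }
}
```

For instance, `fits_in_physical_memory(500, 4 * MIB, 2 * GIB)` holds (2000 MiB against 2048 MiB), while two slots that can each grow to 2 GiB do not fit in 2 GiB of RAM.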

view this post on Zulip Lann Martin (May 02 2024 at 19:41):

Yeah default overcommit (on _some_ distros) is to only allow overcommit up to actual physical ram size

view this post on Zulip Taylor Thomas (May 02 2024 at 21:29):

So the follow up here is why it had issues even when I lowered all of the max number of components and memories. Am I not tweaking it right?

view this post on Zulip Peter Huene (May 02 2024 at 21:42):

the default guard region size is 2GiB, so each linear memory slot the pooling allocator creates will reserve 6GiB of address space (the 4GiB static memory maximum plus the 2GiB guard)
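
The per-slot figure above adds up across the whole pool. Below is a rough sketch of that arithmetic under a simplified model (one static memory region plus one guard region per slot); the real allocator layout may differ, and the function name is hypothetical.

```rust
/// Virtual address space reserved for the memory pool under a simplified
/// model: each slot reserves the static memory maximum plus one guard region.
/// Returns `None` on `u64` overflow.
pub fn reserved_address_space(
    total_memories: u64,
    static_max_bytes: u64,
    guard_bytes: u64,
) -> Option<u64> {
    let per_slot = static_max_bytes.checked_add(guard_bytes)?;
    total_memories.checked_mul(per_slot)
}
```

With the 4GiB and 2GiB figures from this thread, `reserved_address_space(500, 4 << 30, 2 << 30)` comes to 3000 GiB of reserved address space for 500 memories. This is a reservation, not resident memory, which is why the overcommit policy decides whether the mmap succeeds.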

view this post on Zulip Taylor Thomas (May 02 2024 at 21:44):

Yeah, I knew that the guard region was configurable but that didn't help either without overcommit being changed

view this post on Zulip Taylor Thomas (May 02 2024 at 21:46):

Would the proper thing be to lower the static_memory_maximum_size to be something like 2GB instead of 4GB?

view this post on Zulip Taylor Thomas (May 02 2024 at 21:58):

Well, that seemed to do it for me. Lowered memory size to 2GB and guard to 1GB

view this post on Zulip Chris Fallin (May 02 2024 at 21:58):

I'm curious about your other knobs, most particularly the number of slots in the pooling allocator -- if an individual module has a max memory size of 2GiB, and you have more than one slot in use, you'll exceed your system's 2GiB physical memory

view this post on Zulip Chris Fallin (May 02 2024 at 21:59):

(overcommit / optimistic underprovisioning is of course a thing, but at least in some contexts one wants to size for worst-case instead)

view this post on Zulip Taylor Thomas (May 02 2024 at 22:00):

By slots do you mean "number of components" or something else? I figured that with overprovisioning here, it doesn't actually consume all that memory until it is actually used (since it is just in the virtual address space)

view this post on Zulip Peter Huene (May 02 2024 at 22:06):

related, i'm not sure if a guard page size > 0 makes sense for a static memory size < 4GB anyway as bounds checks will always be emitted (i think the pooling allocator might still reserve those pages even though they'll never be hit by an out-of-bounds memory access)

view this post on Zulip Peter Huene (May 02 2024 at 22:07):

and by slots, I think it would be total_memories in this case (there's also total_stacks to think about for async)
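
The stack pool is one plausible explanation for why the failing mmap size in the original error never changed: the snippet earlier in the thread does not touch total_stacks. As a hedged sketch, the arithmetic below assumes 1000 stacks, a 2 MiB stack each, one guard page per stack and 4 KiB pages. These values are assumptions, chosen because they reproduce 0x7d3e8000 exactly; the thread does not confirm them.

```rust
/// Assumed page size in bytes (4 KiB); not confirmed by the thread.
pub const PAGE_SIZE: u64 = 4096;

/// Bytes mapped for a stack pool where every stack is followed by
/// `guard_pages` guard pages. Simplified model, hypothetical helper.
pub fn stack_pool_bytes(total_stacks: u64, stack_size_bytes: u64, guard_pages: u64) -> u64 {
    total_stacks * (stack_size_bytes + guard_pages * PAGE_SIZE)
}
```

Under those assumptions, `stack_pool_bytes(1000, 2 << 20, 1)` equals 0x7d3e8000 (2,101,248,000 bytes, a little under 2 GiB), which would also fit the observation that devices below roughly 2GB of memory fail under the default overcommit heuristic.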

view this post on Zulip Taylor Thomas (May 02 2024 at 22:08):

Oh yep you're right:

For 32-bit wasm memories a 4GB static memory is required to even start removing bounds checks.

view this post on Zulip Peter Huene (May 02 2024 at 22:12):

but to Chris' point, I'd expect any static memory size < 4GB in a pooling allocator on a device with constrained memory to be closer to a value that one would expect to be able to support 500 concurrently used linear memories for; like 4 MB or something

view this post on Zulip Taylor Thomas (May 02 2024 at 22:13):

Yep, this is all starting to make some sense. Might document this or write a blog post so there are some more examples out there

view this post on Zulip Taylor Thomas (May 02 2024 at 22:51):

Thanks all for the help here. I think I have a semi-decent grasp on things now

view this post on Zulip Notification Bot (May 02 2024 at 22:51):

Taylor Thomas has marked this topic as resolved.


Last updated: Jan 24 2025 at 00:11 UTC