✔ Issue with pooling allocator · wasmtime

Hey all! I'm having a problem the pooling allocator for wasmtime. What it boils down to is that we can never get the pooling allocator to work on a device that has less than 2GB-ish of memory. I tried using the defaults, lowering the defaults even further for all the options and every time it looked like it tried to allocate the same amount of memory, returning the error:

Error: failed to initialize host

Caused by:
    0: failed to build runtime
    1: failed to construct engine
    2: failed to create stack pool mapping
    3: mmap failed to allocate 0x7d3e8000 bytes
    4: Cannot allocate memory (os error 12)

So no matter what I did it tried to allocate 0x7d3e8000 bytes. So my question is: Do any of the options actually help tweak this? And as a follow up, should we just be using the dynamic allocator on small systems? I keep trying to dig into the code, but it is a bit too low level for me and hard to follow what the actual impact on allocated memory is.

pooling_config
            .total_component_instances(500)
            .total_memories(500)
            .total_tables(500)
            .linear_memory_keep_resident((10 * MB) as usize)
            .table_keep_resident((10 * MB) as usize);

Taylor Thomas (May 02 2024 at 19:17):

Also, it has failed in both the TablePool allocation and the StackPool allocation, depending on the settings

Chris Fallin (May 02 2024 at 19:26):

assuming this is a Linux machine -- have you tried tweaking the VM overcommit settings?

Chris Fallin (May 02 2024 at 19:27):

in general we're pretty free about mmap'ing large regions and they'll be only sparsely populated; the actual RSS should be close to the sum of all instances' heaps, tables, vmcontexts

Taylor Thomas (May 02 2024 at 19:29):

Yeah the failures have been on linux. Let me try tweaking the overcommit settings

Taylor Thomas (May 02 2024 at 19:32):

Taylor Thomas (May 02 2024 at 19:35):

Chris Fallin (May 02 2024 at 19:41):

it's worth doing the math on maximum actual RSS too to make sure the default heuristic wasn't "onto something" (i.e. was actually reasonable)

Chris Fallin (May 02 2024 at 19:41):

Lann Martin (May 02 2024 at 19:41):

Yeah default overcommit (on _some_ distros) is to only allow overcommit up to actual physical ram size

Taylor Thomas (May 02 2024 at 21:29):

So the follow up here is why it had issues even when I lowered all of the max number of components and memories. Am I not tweaking it right?

Peter Huene (May 02 2024 at 21:42):

the default guard region size is 2GiB, so each linear memory slot the pooling allocator creates will reserve 6GiB of address space

Taylor Thomas (May 02 2024 at 21:44):

Yeah, I knew that the guard region was configurable but that didn't help either without overcommit being changed

Taylor Thomas (May 02 2024 at 21:46):

Would the proper thing be to lower the static_memory_maximum_size to be something like 2GB instead of 4GB?

Taylor Thomas (May 02 2024 at 21:58):

Chris Fallin (May 02 2024 at 21:58):

I'm curious about your other knobs, most particularly the number of slots in the pooling allocator -- if an individual module has a max memory size of 2GiB, and you have more than one slot in use, you'll exceed your system's 2GiB physical memory

Chris Fallin (May 02 2024 at 21:59):

(overcommit / optimistic underprovisioning is of course a thing, but at least in some contexts one wants to size for worst-case instead)

Taylor Thomas (May 02 2024 at 22:00):

By slots do you mean "number of components" or something else? I figured with an overprovisioning here, I'd assume that it doesn't actually consume all that memory until it is actually used (since it is just in the virtual address space)

Peter Huene (May 02 2024 at 22:06):

related, i'm not sure if a guard page size > 0 make sense for a static memory size < 4GB anyway as bounds checks will always be emitted (i think the pooling allocator might still reserve those pages even though they'll never be hit by an out-of-bounds memory access)

Peter Huene (May 02 2024 at 22:07):

and by slots, I think it would be total_memories in this case (there's also total_stacks to think about for async)

Taylor Thomas (May 02 2024 at 22:08):

Peter Huene (May 02 2024 at 22:12):

but to Chris' point, I'd expect any static memory size < 4GB in a pooling allocator on a device with constrained memory to be closer to a value that one would expect to be able to support 500 concurrently used linear memories for; like 4 MB or something

Taylor Thomas (May 02 2024 at 22:13):

Yep, this is all starting to make some sense. Might document this or write a blog post so there are some more example out there

Stream: wasmtime

Topic: ✔ Issue with pooling allocator

Taylor Thomas (May 02 2024 at 19:15):

Taylor Thomas (May 02 2024 at 19:17):

Chris Fallin (May 02 2024 at 19:26):

Chris Fallin (May 02 2024 at 19:27):

Taylor Thomas (May 02 2024 at 19:29):

Taylor Thomas (May 02 2024 at 19:32):

Taylor Thomas (May 02 2024 at 19:35):

Taylor Thomas (May 02 2024 at 19:35):

Chris Fallin (May 02 2024 at 19:41):

Chris Fallin (May 02 2024 at 19:41):

Lann Martin (May 02 2024 at 19:41):

Taylor Thomas (May 02 2024 at 21:29):

Peter Huene (May 02 2024 at 21:42):

Taylor Thomas (May 02 2024 at 21:44):

Taylor Thomas (May 02 2024 at 21:46):

Taylor Thomas (May 02 2024 at 21:58):

Chris Fallin (May 02 2024 at 21:58):

Chris Fallin (May 02 2024 at 21:59):

Taylor Thomas (May 02 2024 at 22:00):

Peter Huene (May 02 2024 at 22:06):

Peter Huene (May 02 2024 at 22:07):

Taylor Thomas (May 02 2024 at 22:08):

Peter Huene (May 02 2024 at 22:12):

Taylor Thomas (May 02 2024 at 22:13):

Taylor Thomas (May 02 2024 at 22:51):

Notification Bot (May 02 2024 at 22:51):