Stream: git-wasmtime

Topic: wasmtime / issue #9914 Memory stay even the engine out of...


view this post on Zulip Wasmtime GitHub notifications bot (Dec 30 2024 at 02:34):

giteewif opened issue #9914:

I use the wasmtime API to load a wasm binary into a Module. When the Module goes out of scope in my Rust code, I measure the process memory and find that it is still quite high. Does the memory used for compilation stick around even after the Module goes out of scope? In Rust, I would expect it to be freed once the Module is dropped.

view this post on Zulip Wasmtime GitHub notifications bot (Dec 30 2024 at 02:36):

giteewif edited issue #9914:

I use the wasmtime API to load a wasm binary into a Module. When the Module goes out of scope in my Rust code, I measure the process memory and find that it is still quite high. Does the memory used for compilation stick around even after the Module goes out of scope? In Rust, I would expect it to be freed once the Module is dropped. I also tested with the Engine going out of scope. The test is shown in the figure below.
![image](https://github.com/user-attachments/assets/ef9c83c2-8bc4-4fb0-95a7-3decd978a16d)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 06 2025 at 15:36):

alexcrichton commented on issue #9914:

Once the Engine, Module, and Store have all been deallocated then the memory is released back to the OS. Could you detail a bit more what statistic you're measuring in get_memory_use and get_detailed_memory_info and what they are printing out in your snippet above?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 07 2025 at 02:41):

giteewif commented on issue #9914:

I think so too. get_memory_use and get_detailed_memory_info are based on /proc/self/smaps; they read the RSS or VmRSS values. The detailed test is below.

fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();

    {
        // Everything created in this scope (engine, linker, module, instance,
        // store) is dropped before call_engine returns.
        let (engine, linker) = init_engine_linker().unwrap();

        // {

        let module = load_module(&engine, &wasm_bytes).unwrap();

        // get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
        // let module3 = load_module(&engine, &wasm_bytes).unwrap();

        // call_fork()?;

        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();

        call_func(&instance, "_start", store).unwrap();

        // get_memory_use(std::process::id().to_string().as_str()).unwrap();
        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();

    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(())
}

output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)
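
For reference, a minimal sketch of what a /proc-based helper like the ones above could look like. The real get_memory_use/get_detailed_memory_info implementations aren't shown in this thread, so this is only an assumption that reads VmRSS from /proc/&lt;pid&gt;/status rather than summing smaps entries:

use std::fs;

// Hypothetical stand-in for the get_memory_use helper described above:
// print the VmRSS line from /proc/<pid>/status (Linux only).
fn get_memory_use(pid: &str) -> std::io::Result<()> {
    let status = fs::read_to_string(format!("/proc/{pid}/status"))?;
    for line in status.lines() {
        if line.starts_with("VmRSS:") {
            println!("{line}");
        }
    }
    Ok(())
}

Summing the Rss: fields of /proc/&lt;pid&gt;/smaps would give a per-mapping breakdown, which is presumably what get_detailed_memory_info does.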

view this post on Zulip Wasmtime GitHub notifications bot (Jan 07 2025 at 02:43):

giteewif edited a comment on issue #9914:

I think so too. get_memory_use and get_detailed_memory_info are based on /proc/self/smaps; they read the RSS or VmRSS values. The detailed test is below.

fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();

    {
        // Everything created in this scope (engine, linker, module, instance,
        // store) is dropped before call_engine returns.
        let (engine, linker) = init_engine_linker().unwrap();

        // {

        let module = load_module(&engine, &wasm_bytes).unwrap();

        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();

        call_func(&instance, "_start", store).unwrap();

        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();

    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(())
}

output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 07 2025 at 02:43):

giteewif edited a comment on issue #9914:

I think so too. get_memory_use and get_detailed_memory_info are based on /proc/self/smaps; they read the RSS or VmRSS values. The detailed test is below.

fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();

    {
        // Everything created in this scope (engine, linker, module, instance,
        // store) is dropped before call_engine returns.
        let (engine, linker) = init_engine_linker().unwrap();

        let module = load_module(&engine, &wasm_bytes).unwrap();

        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();

        call_func(&instance, "_start", store).unwrap();

        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();

    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(())
}

output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 07 2025 at 17:20):

alexcrichton commented on issue #9914:

I'm not personally super familiar with these statistics from Linux, so I don't know exactly what the problem is myself. That being said, you might be seeing some side effects of using Tokio as a runtime through wasmtime-wasi, which persists data structures and such.

Regardless though I'm not aware of how Engine, Store, Module, etc, could leak something at this time. There's always the possibility of a bug, however.

Can you clarify which of these statistics you're looking at in particular? For example did you expect "Total RSS" to go down to what it was before?

view this post on Zulip Wasmtime GitHub notifications bot (Jan 08 2025 at 02:46):

giteewif commented on issue #9914:

I don't think it is a Tokio problem; the wasmtime-wasi context is only created when WasiCtxBuilder's build is called.

At the beginning, the process only takes about 4 MB of RSS, and it increases to about 20 MB after loading the Module. In the load_module function I just call Module::new().

As I understand it, RSS represents the real physical memory the process actually uses. It seems that Module::new() is what takes the extra memory, and I think that comes from compiling the Module. My expectation was that this memory would be released when the Module, or the Engine, goes out of scope, which would mean "Total RSS" goes back down to about 4-10 MB. However, the total RSS stays around 20 MB.

Could it come from the Cranelift compiler? I'm not sure.

Here is more concise info:
![image](https://github.com/user-attachments/assets/abc7d615-5f6f-445a-8ccb-7dd0275c5221)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 08 2025 at 02:47):

giteewif edited a comment on issue #9914:

I don't think it is a Tokio problem; the wasmtime-wasi context is only created when WasiCtxBuilder's build is called.

At the beginning, the process only takes about 2 MB of RSS, and it increases to about 22 MB after loading the Module. In the load_module function I just call Module::new().

As I understand it, RSS represents the real physical memory the process actually uses. It seems that Module::new() is what takes the extra memory, and I think that comes from compiling the Module. My expectation was that this memory would be released when the Module, or the Engine, goes out of scope, which would mean "Total RSS" goes back down to about 3-10 MB. However, the total RSS stays around 20 MB.

Could it come from the Cranelift compiler? I'm not sure.

Here is more concise info:
![image](https://github.com/user-attachments/assets/abc7d615-5f6f-445a-8ccb-7dd0275c5221)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 08 2025 at 15:44):

alexcrichton commented on issue #9914:

Oh you know this is probably related to the Rayon thread pool for parallel compilation of the module. Could you try configuring Config::parallel_compilation(false) and see if that improves the RSS you're measuring?
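
For example, a minimal sketch of that configuration (assuming an otherwise default Config; parallel_compilation is available when wasmtime's default parallel-compilation feature is enabled):

use wasmtime::{Config, Engine};

fn build_engine() -> wasmtime::Result<Engine> {
    let mut config = Config::new();
    // Compile on the calling thread instead of spawning a Rayon worker pool.
    config.parallel_compilation(false);
    Engine::new(&config)
}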

view this post on Zulip Wasmtime GitHub notifications bot (Jan 08 2025 at 15:53):

cfallin commented on issue #9914:

It's also possible that allocations with the system allocator (regular Box/Vec stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.

(For what it's worth, this is the standard in C/C++ programs too: one tests for leaks with Valgrind, not by looking at RSS.)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 08 2025 at 15:53):

cfallin edited a comment on issue #9914:

It's also possible that allocations with the system allocator (regular Box/Vec stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.

(For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)
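
To illustrate that point with a standalone sketch (not from this thread): allocate a batch of buffers through the system allocator, drop them, and read VmRSS before and after. Depending on the allocator's trimming behavior, the "after" value may stay well above the starting value even though nothing is leaked:

use std::fs;

// Read the VmRSS value (in kB) from /proc/self/status (Linux only).
fn vm_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    let line = status.lines().find(|l| l.starts_with("VmRSS:"))?;
    line.split_whitespace().nth(1)?.parse().ok()
}

fn main() {
    println!("before: {:?} kB", vm_rss_kb());
    {
        // Allocate ~128 MB as many small buffers (below the allocator's mmap
        // threshold), then drop them all at the end of this scope.
        let chunks: Vec<Vec<u8>> = (0..2048).map(|_| vec![1u8; 64 * 1024]).collect();
        println!("allocated: {:?} kB ({} chunks)", vm_rss_kb(), chunks.len());
    }
    // The buffers are freed here, but the allocator may keep the pages mapped
    // on its freelist rather than returning them to the OS.
    println!("after drop: {:?} kB", vm_rss_kb());
}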

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 02:08):

giteewif commented on issue #9914:

> Oh you know this is probably related to the Rayon thread pool for parallel compilation of the module. Could you try configuring Config::parallel_compilation(false) and see if that improves the RSS you're measuring?

I gave it a try, and the measured RSS did improve, so it is at least partly associated with the Rayon thread pool.
![image](https://github.com/user-attachments/assets/bc1afc41-88e2-4946-8f31-bc67ee585d37)

> It's also possible that allocations with the system allocator (regular Box/Vec stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.
>
> (For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)

Got it, I will test with Valgrind.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 02:11):

giteewif edited a comment on issue #9914:

> Oh you know this is probably related to the Rayon thread pool for parallel compilation of the module. Could you try configuring Config::parallel_compilation(false) and see if that improves the RSS you're measuring?

I gave it a try, and the measured RSS did improve, so it is at least partly associated with the Rayon thread pool.
![image](https://github.com/user-attachments/assets/bc1afc41-88e2-4946-8f31-bc67ee585d37)

> It's also possible that allocations with the system allocator (regular Box/Vec stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.
>
> (For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)

Got it, I will test with Valgrind. I also think the system allocator is the probable reason, because I see a similar situation in other wasm runtimes.

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 04:38):

alexcrichton closed issue #9914:

I use the wasmtime API to load a wasm binary into a Module. When the Module goes out of scope in my Rust code, I measure the process memory and find that it is still quite high. Does the memory used for compilation stick around even after the Module goes out of scope? In Rust, I would expect it to be freed once the Module is dropped. I also tested with the Engine going out of scope. The test is shown in the figure below.
![image](https://github.com/user-attachments/assets/ef9c83c2-8bc4-4fb0-95a7-3decd978a16d)

view this post on Zulip Wasmtime GitHub notifications bot (Jan 09 2025 at 04:38):

alexcrichton commented on issue #9914:

Ah ok, nice! Chris also brings up excellent points which I would definitely echo (and completely forgot about at the start of this thread...)

In any case I don't think we have any leaks here in Wasmtime, so I'm going to close this. If you see suspicious behavior though in Valgrind let us know!


Last updated: Jan 24 2025 at 00:11 UTC