giteewif opened issue #9914:
I use the wasmtime API to load a Wasm binary and get a `Module`. When the module goes out of scope in my Rust code, I measure the process memory and find that it is still high. Does the memory used for compilation stay around even after the module goes out of scope? In Rust, I expected it to be freed once the module is dropped.
giteewif edited issue #9914:
I use the wasmtime API to load a Wasm binary and get a `Module`. When the module goes out of scope in my Rust code, I measure the process memory and find that it is still high. Does the memory used for compilation stay around even after the module goes out of scope? In Rust, I expected it to be freed once the module is dropped. I also tested dropping the engine. The test is like the figure below.
![image](https://github.com/user-attachments/assets/ef9c83c2-8bc4-4fb0-95a7-3decd978a16d)
alexcrichton commented on issue #9914:
Once the `Engine`, `Module`, and `Store` have all been deallocated, the memory is released back to the OS. Could you detail a bit more what statistic you're measuring in `get_memory_use` and `get_detailed_memory_info`, and what they are printing out in your snippet above?
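As a minimal illustration of the ownership point above (a stand-in analogy, not wasmtime's actual internals): a `Module` keeps its `Engine`'s compiled-code storage alive via reference counting, so the backing allocation can only be returned once every handle has been dropped.

```rust
use std::sync::Arc;

// Stand-in for shared compiled-code storage; wasmtime's real types are
// more involved, but the Arc-style sharing is the point being made.
#[allow(dead_code)]
struct CompiledCode(Vec<u8>);

fn main() {
    let engine_code = Arc::new(CompiledCode(vec![0u8; 1024]));
    let module_code = Arc::clone(&engine_code); // the Module shares it

    drop(engine_code); // dropping the Engine handle alone frees nothing
    assert_eq!(Arc::strong_count(&module_code), 1);

    drop(module_code); // last handle gone: now the allocation is freed
}
```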
giteewif commented on issue #9914:
I also think so. `get_memory_use` and `get_detailed_memory_info` are based on /proc/self/smaps; they read the RSS or VmRSS. Detailed test below:
```rust
fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();
    {
        let (engine, linker) = init_engine_linker().unwrap();
        // {
        let module = load_module(&engine, &wasm_bytes).unwrap();
        // get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
        // let module3 = load_module(&engine, &wasm_bytes).unwrap();
        // call_fork()?;
        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();
        call_func(&instance, "_start", store).unwrap();
        // get_memory_use(std::process::id().to_string().as_str()).unwrap();
        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();
    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(()) // was missing: `main` returns Result<()>
}
```
output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)
giteewif edited a comment on issue #9914:
I also think so. `get_memory_use` and `get_detailed_memory_info` are based on /proc/self/smaps; they read the RSS or VmRSS. Detailed test below:
```rust
fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();
    {
        let (engine, linker) = init_engine_linker().unwrap();
        // {
        let module = load_module(&engine, &wasm_bytes).unwrap();
        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();
        call_func(&instance, "_start", store).unwrap();
        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();
    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(()) // was missing: `main` returns Result<()>
}
```
output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)
giteewif edited a comment on issue #9914:
I also think so. `get_memory_use` and `get_detailed_memory_info` are based on /proc/self/smaps; they read the RSS or VmRSS. Detailed test below:
```rust
fn call_engine() -> Result<()> {
    let (wasm_bytes, work_path) = get_wasm_bytes().unwrap();
    {
        let (engine, linker) = init_engine_linker().unwrap();
        let module = load_module(&engine, &wasm_bytes).unwrap();
        let (instance, store) = init_func_call_env(&engine, &linker, &module, &work_path).unwrap();
        call_func(&instance, "_start", store).unwrap();
        get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    }
    Ok(())
}

fn main() -> Result<()> {
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    call_engine().unwrap();
    println!("after");
    get_memory_use(std::process::id().to_string().as_str()).unwrap();
    get_detailed_memory_info(std::process::id().to_string().as_str()).unwrap();
    Ok(()) // was missing: `main` returns Result<()>
}
```
output
![image](https://github.com/user-attachments/assets/9beae311-4497-41ef-abef-ca9721b89533)
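For reference, a minimal sketch of what a helper like `get_memory_use` might look like, assuming it reads VmRSS from `/proc/self/status` (Linux only; `get_memory_use` and `get_detailed_memory_info` are the issue author's own helpers, so this is a guess at their shape, not their actual code):

```rust
use std::fs;

/// Read VmRSS in kB from /proc/self/status (Linux only).
/// Returns None if the file or field is missing.
fn vm_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    status
        .lines()
        .find(|line| line.starts_with("VmRSS:"))
        .and_then(|line| line.split_whitespace().nth(1))
        .and_then(|kb| kb.parse().ok())
}

fn main() {
    println!("VmRSS: {:?} kB", vm_rss_kb());
}
```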
alexcrichton commented on issue #9914:
I'm not personally super familiar with these statistics from Linux, so I don't know exactly what the problem is myself. That being said, you might be seeing some side effects of using Tokio as a runtime through `wasmtime-wasi`, which persists data structures and such.

Regardless, I'm not aware of how `Engine`, `Store`, `Module`, etc., could leak anything at this time. There's always the possibility of a bug, however. Can you clarify which of these statistics you're looking at in particular? For example, did you expect "Total RSS" to go down to what it was before?
giteewif commented on issue #9914:
I don't think this is a Tokio problem; the wasmtime-wasi context is only created when `WasiCtxBuilder`'s `build` is called.

At the beginning, wasmtime takes only 4 MB of RSS; it increases to about 20 MB after loading the `Module`. In the `load_module` function, I just call `Module::new()`.

As I understand it, RSS represents the real physical memory the process uses. It seems that `Module::new()` takes more memory, and I think it comes from compiling the module. My expectation was that this memory would be released when the module (or the engine) goes out of scope, meaning "Total RSS" should go down to 4-10 MB. However, the total RSS stays around 20 MB.

Could it come from the Cranelift compiler? I'm not sure.

Here is more concise info:
![image](https://github.com/user-attachments/assets/abc7d615-5f6f-445a-8ccb-7dd0275c5221)
giteewif edited a comment on issue #9914:
I don't think this is a Tokio problem; the wasmtime-wasi context is only created when `WasiCtxBuilder`'s `build` is called.

At the beginning, wasmtime takes only 2 MB of RSS; it increases to about 22 MB after loading the `Module`. In the `load_module` function, I just call `Module::new()`.

As I understand it, RSS represents the real physical memory the process uses. It seems that `Module::new()` takes more memory, and I think it comes from compiling the module. My expectation was that this memory would be released when the module (or the engine) goes out of scope, meaning "Total RSS" should go down to 3-10 MB. However, the total RSS stays around 20 MB.

Could it come from the Cranelift compiler? I'm not sure.

Here is more concise info:
![image](https://github.com/user-attachments/assets/abc7d615-5f6f-445a-8ccb-7dd0275c5221)
alexcrichton commented on issue #9914:
Oh, you know, this is probably related to the Rayon thread pool used for parallel compilation of the module. Could you try configuring `Config::parallel_compilation(false)` and see if that improves the RSS you're measuring?
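A sketch of that configuration (the `Config::parallel_compilation` knob is behind wasmtime's `parallel-compilation` cargo feature, which is on by default; error handling and the rest of the setup elided):

```rust
use wasmtime::{Config, Engine};

fn make_engine() -> anyhow::Result<Engine> {
    // Compile on the calling thread only: no Rayon worker pool is
    // spawned, so its per-thread stacks never show up in RSS.
    let mut config = Config::new();
    config.parallel_compilation(false);
    Engine::new(&config)
}
```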
cfallin commented on issue #9914:
It's also possible that allocations with the system allocator (regular
Box
/Vec
stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.(For what it's worth, this is the standard in C/C++ programs too: one tests for leaks with Valgrind, not by looking at RSS.)
cfallin edited a comment on issue #9914:
It's also possible that allocations with the system allocator (regular
Box
/Vec
stuff in the compiler for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely even (otherwise new allocations would always be expensive because there would be no freelisted already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks.(For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)
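The Valgrind workflow described here might look like the following (the binary name is a placeholder; note that "still reachable" memory held on allocator freelists is expected and is not a leak):

```shell
# Build with debug info so Valgrind can resolve stack traces.
cargo build

# "definitely lost" blocks are genuine leaks; "still reachable"
# memory retained by the allocator's freelists is expected.
valgrind --leak-check=full ./target/debug/your-binary
```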
giteewif commented on issue #9914:
> Oh, you know, this is probably related to the Rayon thread pool used for parallel compilation of the module. Could you try configuring `Config::parallel_compilation(false)` and see if that improves the RSS you're measuring?

I have given it a try, and the measured RSS improved, so this is partly associated with the Rayon thread pool.

![image](https://github.com/user-attachments/assets/bc1afc41-88e2-4946-8f31-bc67ee585d37)

> It's also possible that allocations with the system allocator (regular `Box`/`Vec` stuff in the compiler, for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely, even (otherwise new allocations would always be expensive, because there would be no freelisted, already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks. (For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)

I got it; I will test with Valgrind.
giteewif edited a comment on issue #9914:
> Oh, you know, this is probably related to the Rayon thread pool used for parallel compilation of the module. Could you try configuring `Config::parallel_compilation(false)` and see if that improves the RSS you're measuring?

I have given it a try, and the measured RSS improved, so this is partly associated with the Rayon thread pool.

![image](https://github.com/user-attachments/assets/bc1afc41-88e2-4946-8f31-bc67ee585d37)

> It's also possible that allocations with the system allocator (regular `Box`/`Vec` stuff in the compiler, for example) are returned to the allocator, but the allocator does not return the memory to the OS. Pretty likely, even (otherwise new allocations would always be expensive, because there would be no freelisted, already-mapped memory!). For that reason, I think one would have to have Valgrind-level instrumentation to truly determine that we have no leaks. (For what it's worth, this is the standard in C/C++ programs too: one usually tests for leaks with Valgrind, not by looking at RSS.)

I got it; I will test with Valgrind. And I think the system allocator is the probable cause, since I see a similar situation in another Wasm runtime.
alexcrichton closed issue #9914:
I use the wasmtime API to load a Wasm binary and get a `Module`. When the module goes out of scope in my Rust code, I measure the process memory and find that it is still high. Does the memory used for compilation stay around even after the module goes out of scope? In Rust, I expected it to be freed once the module is dropped. I also tested dropping the engine. The test is like the figure below.
![image](https://github.com/user-attachments/assets/ef9c83c2-8bc4-4fb0-95a7-3decd978a16d)
alexcrichton commented on issue #9914:
Ah ok, nice! Chris also brings up excellent points, which I would definitely echo (and which I completely forgot about at the start of this thread...)
In any case I don't think we have any leaks here in Wasmtime, so I'm going to close this. If you see suspicious behavior though in Valgrind let us know!
Last updated: Jan 24 2025 at 00:11 UTC