Stream: git-wasmtime

Topic: wasmtime / issue #6541 The `__deregister_frame` is slow o...


view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:30):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. Is there an easy solution here or a solution?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:32):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. Is there an easy solution here or a solution?

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:33):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:47):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. Although our module is quite large. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:53):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. When hot-reloading after cargo watch detects a change when a single code line changes it used to take 2 secs, but somewhere in last few months, this has somehow gotten very slow. Although our module is quite large. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:53):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms. Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. When hot-reloading after cargo watch detects a change when a single code line changes it used to take 2 secs, but somewhere in last few months, this has somehow gotten very slow. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 14:54):

TimonPost edited issue #6541:

We have a precompiled wasm module of about 337,2MB that if dropped the UnwindRegistration::drop takes around ~22 secs (can quite vary) to finish. Sometimes it takes over 100 seconds. I did some benchmarking and we have 266664 unwind registers and the time per frame deregister is for me around 0.0843227 ms.

This dropping happens when a wasmtime::Module drops.

Three runs:

Deregister 266664 frames in 22485 ms (84.3227 microsec/frame)
Deregister 266664 frames in 22448 ms (84.1817 microsec/frame)
Deregister 266664 frames in 22232 ms (83.3742 microsec/frame)

A fast deregister/register time is vital to us. When hot-reloading after cargo watch detects a change when a single code line changes it used to take 2 secs, but somewhere in last few months, this has somehow gotten very slow. Is there an easy solution here or a solution?

This can be very easy reproduced:

#[test]
fn deserialize_test_memory() {
    let engine = Engine::new(&Config::new().epoch_interruption(true)).unwrap();

    let path = "path-to-precomipled-bytes";

    unsafe {
        let deserialized: wasmtime::Module = Module::deserialize_file(&engine, path).unwrap();

        let time = Instant::now();
        println!("Dropping module...");
        drop(deserialized);
        println!("Dropped module {:?}", time.elapsed());
    }
}

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 16:46):

fitzgen commented on issue #6541:

Unfortunately there is nothing we can do to speed up the implementation of system libraries.

You can avoid the need to interact with libunwind by disabling (optional but enabled-by-default) native unwind info: https://docs.rs/wasmtime/latest/wasmtime/struct.Config.html#method.native_unwind_info

This configuration option only exists to help third-party stack capturing mechanisms, such as the system’s unwinder or the backtrace crate, determine how to unwind through Wasm frames. It does not affect whether Wasmtime can capture Wasm backtraces or not.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 19:19):

TimonPost commented on issue #6541:

Can the behavior of frame deregistration be decoupled from the Drop trait and be optionally triggered in a controlled manner? For example, what if wasmtime allows one to manually fetch 'unwind frames' and allow manually disposing of them. Or what would happen if the frames were pushed to a separate thread for deregistration? Are there any potential downsides or considerations to keep in mind when pursuing this approach?

While I understand the typical approach of performing cleanup on drop, having more control over the process would be really nice to have control over this. As at the moment it is very implicit behavior and it took me a while to pinpoint this exact function deep down the code stack.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 19:20):

TimonPost edited a comment on issue #6541:

Can the behavior of frame deregistration be decoupled from the Drop trait and be optionally triggered in a controlled manner? For example, what if wasmtime allows one to manually fetch 'unwind frames' and allow manually disposing of them. Or what would happen if the frames were pushed to a separate thread for deregistration? Are there any potential downsides or considerations to keep in mind when pursuing this approach?

While I understand the typical approach of performing cleanup on drop, having more control over the process would be really nice to have control over this. As at the moment it is very implicit behavior and it took me a while to pinpoint this exact function deep down the code stack. And also, disabling the behavior will work but has its own downsides that we loose some unwinding information, which ideally we would want to preserve.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 19:46):

bjorn3 commented on issue #6541:

Deregistration of unwind info needs to happen before the memory of the module is deallocated to prevent dangling pointers. You could move the Module to a background thread before dropping it to avoid blocking on deregistration of unwind info. This has the risk of runaway memory usage if you load new modules faster than they can be dropped however.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 08 2023 at 20:08):

jameysharp commented on issue #6541:

I may not have understood the details here, but if I did, I think @bjorn3's suggestion is the best you can do if you must use the platform-native unwinding support. To avoid unbounded memory usage I think you could use sync_channel to create a bounded queue that blocks the main threads if the background thread falls behind on dropping unused modules.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 09 2023 at 09:41):

TimonPost commented on issue #6541:

When i search for this native_unwind_infoconfiguration option it seems to never be used, only validated. It looks like the PR that introduced this option removed this line https://github.com/bytecodealliance/wasmtime/pull/4643/files#diff-908a422cef87524478865a23918af79040c970e02a3cc6862f6eb508710606f9L1436, which registered the option in cranelift compiler. When I add this manually to my config I do get 0 unwind registers for macos, if i don't do that, I still get all my unwind frames being inserted.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 09 2023 at 10:47):

TimonPost commented on issue #6541:

Opened A PR #6547, let me know if this assumption is correct, that at the moment it isn't possible to disable unwind frames. From my local debugging with this branch, it does look like this drastically improves unloading from +20 secs to a few micro secs.


Last updated: Oct 23 2024 at 20:03 UTC