alexcrichton edited issue #3523:
We have implemented this repo https://github.com/girazoki/wasmtime-try to try something that has been happening us with some latest code change we have done at moonbeam. The code change is https://github.com/PureStake/moonbeam/pull/989.
Without the PR, wasmtime takes around 15 seconds to compile the wasm. However, if we include the change (removal of the trait) wasmtime takes less than a second to compile. We wonder how these differences can be so big with such a small code change.
Both runtimes are in https://github.com/girazoki/wasmtime-try to try them out
Any help would be appreciated, thanks!
cfallin commented on issue #3523:
@alexcrichton That's a good question; ordinarily we'd definitely want to chase something like this, but if the RA2 transition happens soon-ish then the payoff of the time it would take to understand the issue and update the old regalloc, and be confident in the fix, doesn't seem worth it. Especially as many of the details of regalloc.rs have fallen out of my mental L2 cache by now...
These sorts of things do put pressure on the transition, though, so hopefully the moving pieces needed for that will come together soon!
alexcrichton commented on issue #3523:
@adamrk given your interest on this issue you might be interested in this one as well (detailed here) but if not no worries!
crystalin commented on issue #3523:
Not sure if that would help, but here are the perf (you can open in https://www.speedscope.app/ ) for when it is loading the wasm: https://drive.google.com/file/d/1LSid_h9vs8LGtbaQ8z41CD3vn4-y4g2U/view?usp=sharing
(Here is where most of the work goes it seems)
![20211111_164744](https://user-images.githubusercontent.com/329248/142000660-5bb26ba7-6458-4bda-8813-8075fd3bc00b.jpg)
.
crystalin edited a comment on issue #3523:
Not sure if that would help, but here are the perf (you can open in https://www.speedscope.app/ ) for when it is loading the wasm: https://drive.google.com/file/d/1LSid_h9vs8LGtbaQ8z41CD3vn4-y4g2U/view?usp=sharing
(Here is where most of the work goes it seems)
crystalin edited a comment on issue #3523:
Not sure if that would help, but here are the perf (you can open in https://www.speedscope.app/ ) for when it is loading the wasm: https://drive.google.com/file/d/1LSid_h9vs8LGtbaQ8z41CD3vn4-y4g2U/view?usp=sharing
(Here is where most of the work goes it seems
![image (1)](https://user-images.githubusercontent.com/329248/142000825-4c5a5f37-ba7e-41f3-9b19-31989555b883.png)
)
crystalin edited a comment on issue #3523:
Not sure if that would help, but here are the perf (you can open in https://www.speedscope.app/ ) for when it is loading the wasm: https://drive.google.com/file/d/1LSid_h9vs8LGtbaQ8z41CD3vn4-y4g2U/view?usp=sharing
(Here is where most of the work goes it seems)
![image (1)](https://user-images.githubusercontent.com/329248/142000825-4c5a5f37-ba7e-41f3-9b19-31989555b883.png)
alexcrichton labeled issue #3523:
We have implemented this repo https://github.com/girazoki/wasmtime-try to try something that has been happening us with some latest code change we have done at moonbeam. The code change is https://github.com/PureStake/moonbeam/pull/989.
Without the PR, wasmtime takes around 15 seconds to compile the wasm. However, if we include the change (removal of the trait) wasmtime takes less than a second to compile. We wonder how these differences can be so big with such a small code change.
Both runtimes are in https://github.com/girazoki/wasmtime-try to try them out
Any help would be appreciated, thanks!
alexcrichton labeled issue #3523:
We have implemented this repo https://github.com/girazoki/wasmtime-try to try something that has been happening us with some latest code change we have done at moonbeam. The code change is https://github.com/PureStake/moonbeam/pull/989.
Without the PR, wasmtime takes around 15 seconds to compile the wasm. However, if we include the change (removal of the trait) wasmtime takes less than a second to compile. We wonder how these differences can be so big with such a small code change.
Both runtimes are in https://github.com/girazoki/wasmtime-try to try them out
Any help would be appreciated, thanks!
adamrk commented on issue #3523:
Sorry, just noticed this. Yeah, I'll take a look at it.
JoshOrndorff commented on issue #3523:
@alexcrichton We have hit what appears to be the same issue in the Moonbeam project again. This time it is in this wasm file.
Following your example, @girazoki, explored this:
RUST_LOG=wasmtime_cranelift wasmtime compile moonbase_runtime.wasm ... DEBUG wasmtime_cranelift::compiler > FuncIndex(1276) translated in 13.571936365s TRACE wasmtime_cranelift::compiler > FuncIndex(1276) timing info ======== ======== ================================== Total Self Pass -------- -------- ---------------------------------- 0.510 0.510 Translate WASM function 13.056 0.030 Compilation passes 0.001 0.001 Control flow graph 0.002 0.002 Dominator tree 0.000 0.000 Loop analysis 0.001 0.001 Pre-legalization rewriting 0.005 0.005 Dead code elimination 0.009 0.009 Global value numbering 0.044 0.033 Loop invariant code motion 0.000 0.000 Remove unreachable blocks 0.210 0.210 Remove constant phi-nodes 0.222 0.222 VCode lowering 0.060 0.060 VCode post-register allocation finalization 0.029 0.029 VCode emission 0.000 0.000 VCode emission finalization 12.443 12.443 Register allocation 0.006 0.006 Binary machine code emission 0.009 0.009 Layout full renumbering ======== ======== ==================================
How did you figure out the human-readale or even the mangled name of the function to know where to put the
#[inline(never)]
?
alexcrichton commented on issue #3523:
Oh nice, thanks for that! I was able to find the function-in-question with:
RUST_LOG=wasmtime_cranelift=debug ./target/release/wasmtime compile moonbase_runtime.compact.wasm
which prints out:
... DEBUG wasmtime_cranelift::compiler > FuncIndex(1776) translated in 70.436892ms DEBUG wasmtime_cranelift::compiler > FuncIndex(679) translated in 69.31585ms DEBUG wasmtime_cranelift::compiler > FuncIndex(2050) translated in 61.266995ms DEBUG wasmtime_cranelift::compiler > FuncIndex(1574) translated in 68.276648ms DEBUG wasmtime_cranelift::compiler > FuncIndex(1622) translated in 75.956103ms DEBUG wasmtime_cranelift::compiler > FuncIndex(1983) translated in 85.31272ms DEBUG wasmtime_cranelift::compiler > FuncIndex(1347) translated in 282.235891ms DEBUG wasmtime_cranelift::compiler > FuncIndex(1276) translated in 19.01582367s
which points to function 1276 as the offender here.
Using
wasm-opt
I got:$ wasm-opt --extract-function-index=1276 ./moonbase_runtime.compact.wasm -o extract.wasm extracting <xcm_executor::XcmExecutor<Config>\20as\20xcm::v2::traits::ExecuteXcm<<Config\20as\20xcm_executor::config::Config>::Call>>::execute_xcm_in_credit::hf77207caf74f0f13
which I can confirm the extracted wasm file takes ~20s to compile, quite too long!
alexcrichton edited issue #3523:
Original Issue
We have implemented this repo https://github.com/girazoki/wasmtime-try to try something that has been happening us with some latest code change we have done at moonbeam. The code change is https://github.com/PureStake/moonbeam/pull/989.
Without the PR, wasmtime takes around 15 seconds to compile the wasm. However, if we include the change (removal of the trait) wasmtime takes less than a second to compile. We wonder how these differences can be so big with such a small code change.
Both runtimes are in https://github.com/girazoki/wasmtime-try to try them out
Any help would be appreciated, thanks!Current status
Modules with way-too-long-compile-times:
alexcrichton commented on issue #3523:
I'm going to close this given the discussion on https://github.com/bytecodealliance/wasmtime/issues/4060. The new regalloc probably improves things here but these sorts of outliers are inevitable with Cranelift's design goals.
alexcrichton closed issue #3523:
Original Issue
We have implemented this repo https://github.com/girazoki/wasmtime-try to try something that has been happening us with some latest code change we have done at moonbeam. The code change is https://github.com/PureStake/moonbeam/pull/989.
Without the PR, wasmtime takes around 15 seconds to compile the wasm. However, if we include the change (removal of the trait) wasmtime takes less than a second to compile. We wonder how these differences can be so big with such a small code change.
Both runtimes are in https://github.com/girazoki/wasmtime-try to try them out
Any help would be appreciated, thanks!Current status
Modules with way-too-long-compile-times:
Last updated: Jan 24 2025 at 00:11 UTC