adambratschikaye labeled issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
adambratschikaye opened issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
adambratschikaye labeled issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
adambratschikaye closed issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
jameysharp commented on issue #4923:
My draft fix for this is in #4939.
jameysharp reopened issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
jameysharp closed issue #4923:
It seems that a
wasm
source with a chain of blocks each containing alocal.get
instruction can have quadratic compile time becausecan_optimize_var_lookup
will traverse the entire chain when handling eachlocal.get
instruction. As an example, the followingwat
takes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )
and a flamegraph: ![flamegraph](https://user-images.githubusercontent.com/90606735/191045431-3e7626ef-7c29-4343-b784-4f9f7205d9d5.svg) shows that almost the entire time is spent in
can_optimize_var_lookup
. Note that thewasm
file is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gz
and unzip it (or take the abovewat
and convert it towasm
).- Compile it with
time target/release/wasmtime compile blocks.wasm
.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51
Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core Processor
Extra Info
I'm working on a PR which should help with this.
Last updated: Jan 24 2025 at 00:11 UTC