adambratschikaye labeled issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
adambratschikaye opened issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
adambratschikaye labeled issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
adambratschikaye closed issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
jameysharp commented on issue #4923:
My draft fix for this is in #4939.
jameysharp reopened issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
jameysharp closed issue #4923:
It seems that a
wasmsource with a chain of blocks each containing alocal.getinstruction can have quadratic compile time becausecan_optimize_var_lookupwill traverse the entire chain when handling eachlocal.getinstruction. As an example, the followingwattakes 2.4 seconds to compile on my machine:(module (func (param i32) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) (block (local.set 0 (i32.add (i32.const 1) (local.get 0)))) ... repeated 15,000 times ) )and a flamegraph:  shows that almost the entire time is spent in
can_optimize_var_lookup. Note that thewasmfile is 147 KB.Steps to Reproduce
- Download the attached
blocks.wasm.gzand unzip it (or take the abovewatand convert it towasm).- Compile it with
time target/release/wasmtime compile blocks.wasm.Expected Results
I'd expect the compilation to be on the order of 100ms as it is for other modules with a single function of similar size on my machine.
Actual Results
The compilation takes over 2 seconds.
Versions and Environment
Cranelift version or commit: commit
27435ae398bcce74bf990b9683e5c47c3f6c5d51Operating system: Ubuntu 20.04, Linux kernel 5.4.0-122-generic
Architecture:
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian Address sizes: 43 bits physical, 48 bits virtual CPU(s): 64 On-line CPU(s) list: 0-63 Thread(s) per core: 2 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model: 49 Model name: AMD EPYC 7302 16-Core ProcessorExtra Info
I'm working on a PR which should help with this.
Last updated: Dec 13 2025 at 19:03 UTC