github-actions[bot] commented on Issue #2165:
Subscribe to Label Action
cc @bnjbvr
<details>
This issue or pull request has been labeled: "cranelift"
Thus the following users have been cc'd because of the following labels:
- bnjbvr: cranelift
To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.
Learn more.
</details>
bjorn3 commented on Issue #2165:
I used
perf record -e instructions:u -e cycles:u
and then
perf report
looking at the estimated values in the header after selecting an event. I know there is an easier way with perf, but I can't find the exact command.
abrown commented on Issue #2165:
I usually use
perf stat
for that type of thing, but I meant something different: what did you use as input for cranelift-reader when you measured?
bjorn3 commented on Issue #2165:
The two test files I was using were:
<details>
; ptrs: 100
function u0:0(i64) system_v {
; symbol _ZN9mini_core13drop_in_place17hf38f2fd3a61ef36bE
; instance Instance { def: DropGlue(DefId(1:231 ~ mini_core[8787]::drop_in_place[0]), Some([NoisyDropInner; 2])), substs: [[NoisyDropInner; 2]] }
; sig ([*mut [NoisyDropInner; 2]]; c_variadic: false)->()

    ss0 = explicit_slot 8
    ss1 = explicit_slot 8
    ss2 = explicit_slot 8
    sig0 = (i64) system_v
    sig1 = (i64) system_v
    fn0 = colocated u0:0 sig0
    fn1 = colocated u0:0 sig1

block0(v0: i64):
    stack_store v0, ss0
    jump block1

block1:
    v1 = iconst.i64 0
    v2 = stack_load.i64 ss0
    brz v1, block5
    jump block8

block5:
    v16 = iconst.i64 0
    jump block4(v16)

block8:
    v27 = stack_load.i64 ss0
    v29 = iconst.i64 0
    v30 = iadd v27, v29
    jump block7(v27)

block4(v11: i64):
    v13 = icmp_imm eq v11, 2
    v14 = bint.i8 v13
    v15 = uextend.i32 v14
    brz v15, block3
    jump block2

block3:
    v4 = stack_load.i64 ss0
    v6 = imul_imm.i64 v11, 0
    v7 = iadd v4, v6
    v8 = iconst.i64 1
    v9 = iadd.i64 v11, v8
    stack_store v7, ss1
    v10 = stack_load.i64 ss1
    call fn0(v10)
    jump block4(v9)

block7(v22: i64):
    v24 = icmp eq v22, v30
    v25 = bint.i8 v24
    v26 = uextend.i32 v25
    brz v26, block6
    jump block2

block6:
    v18 = iconst.i64 1
    v19 = imul_imm v18, 0
    v20 = iadd.i64 v22, v19
    stack_store.i64 v22, ss2
    v21 = stack_load.i64 ss2
    call fn1(v21)
    jump block7(v20)

block2:
    v100 = stack_addr.i64 ss0
    return
}
and
; ptrs: 0 1 3 10 11 13
function u0:0(i64) system_v {
    ss0 = explicit_slot 8

block0(v0: i64):
    v1 = iadd_imm v0, 1
    v2 = iconst.i64 2
    v3 = iadd v1, v2
    v4 = iconst.i8 0
    store v4, v3
    v10 = stack_addr.i64 ss0
    v11 = iadd_imm v10, 1
    v12 = iconst.i64 2
    v13 = iadd v11, v12
    v14 = iconst.i8 0
    store v14, v13
    return
}
</details>
About 25% of the measured time is spent parsing. I took the two tests from a library I am working on and repeated them 10_000 times in a row inside the test process.
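For anyone who wants to reproduce this kind of split without perf, here is a rough sketch of the repeat-and-time approach. It is not the harness I actually used: `time_phases` and `parse_fraction` are hypothetical helper names, and the two closures in `main` are stand-ins for the parser (cranelift-reader in my case) and for the code under test. It measures wall-clock time rather than instruction or cycle counts, so the numbers will not match a perf report exactly.

```rust
use std::time::{Duration, Instant};

/// Runs `parse` followed by `analyze` `iters` times and returns the
/// accumulated wall-clock time spent in each phase.
/// (Hypothetical helper, not part of Cranelift.)
fn time_phases<T>(
    iters: usize,
    mut parse: impl FnMut() -> T,
    mut analyze: impl FnMut(T),
) -> (Duration, Duration) {
    let mut parse_time = Duration::ZERO;
    let mut analyze_time = Duration::ZERO;
    for _ in 0..iters {
        let start = Instant::now();
        let parsed = parse();
        parse_time += start.elapsed();

        let start = Instant::now();
        analyze(parsed);
        analyze_time += start.elapsed();
    }
    (parse_time, analyze_time)
}

/// Fraction of the total time spent parsing, in the range 0.0..=1.0.
/// Returns 0.0 when nothing was measured at all.
fn parse_fraction(parse_time: Duration, analyze_time: Duration) -> f64 {
    let total = parse_time + analyze_time;
    if total.is_zero() {
        return 0.0;
    }
    parse_time.as_secs_f64() / total.as_secs_f64()
}

fn main() {
    // Stand-in input; a real run would use the text of the CLIF test files.
    let input = "v1 = iconst.i64 0\nv2 = stack_load.i64 ss0\n";
    let (parse_time, analyze_time) = time_phases(
        10_000,
        // Stand-in for the parser: split the text into whitespace tokens.
        || input.split_whitespace().collect::<Vec<&str>>(),
        // Stand-in for the code under test: count the tokens.
        |tokens: Vec<&str>| {
            std::hint::black_box(tokens.len());
        },
    );
    println!(
        "parsing: {:.1}% of measured time",
        100.0 * parse_fraction(parse_time, analyze_time)
    );
}
```

Timing each phase separately inside the loop keeps the parsing cost visible even when both phases are repeated many times in the same process. With phases this short, the timer overhead itself is not negligible, so treat the result as a rough estimate.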
Last updated: Dec 23 2024 at 12:05 UTC