alexcrichton opened issue #6012:
Currently Cranelift will emit a stack map for a trap instruction. This is frequently done with
.take_stack_map()
, however, and some "compound opcodes", such as float-to-int conversion on x64, have many traps throughout their body. Only one trap will get the stack map when they theoretically all need the stack map.Additionally, though, Wasmtime doesn't actually need stack maps at trap opcodes since after a trap an instance will be terminated and all stack-based roots will become unrooted and candidates for GC.
Given this I think it would be best to remove all stack maps from traps since they're otherwise just taking up space in the compiled artifact and aren't getting used for anything.
cfallin commented on issue #6012:
This seems correct to me (i.e., safe to remove); and furthermore, might result in decent codegen improvements in code using reftypes. This is because in addition to the stackmap metadata itself, the safepoint-at-every-trap implies spilling all ref values to the stack. So code that (i) has refs and (ii) has other ops that trap, like divides or non-truncating float/int conversions, might see speedups.
Last updated: Nov 22 2024 at 17:03 UTC