wasmtime / Issue #2614 Avoid creating 0-sized nops in x64... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / Issue #2614 Avoid creating 0-sized nops in x64...

Wasmtime GitHub notifications bot (Jan 28 2021 at 07:23):

Sorry, in this case the doc-comment is inaccurate (my fault for the ambiguity!) -- the requirement imposed by the emission code, and what should be documented, is that gen_nop() returns a nonzero-sized NOP if the requested size is nonzero. An infinite loop could arise only if that is violated.

I don't think it makes sense to otherwise artificially limit the interface with an assert -- a zero-sized NOP is a legitimate need in some cases (e.g. when one must return an instruction but no machine code is needed).

Would you mind changing the PR to update the doc-comment on the MachInst interface instead? Thanks!

Wasmtime GitHub notifications bot (Jan 28 2021 at 07:29):

cfallin commented on Issue #2614:

Ah, actually, there is one thing that this change fixes: gen_nop(x) with x == 16*n would produce a zero-sized NOP. That won't arise with current alignment constants, but it's a potential future issue. I think what we actually want is std::cmp::min(x, 16) (i.e., clamp to an upper bound of 16 bytes). Thanks for the scrutiny here!

Wasmtime GitHub notifications bot (Jan 28 2021 at 07:59):

kaseyc commented on Issue #2614:

Thanks! Updated the condition and the machinst comment.

Given that the 16*x edge case is currently not possible to generate, I imagine that writing a test for it is not possible.

Both the aarch64 and arm32 backends have asserted minimum sizes. Should they also be updated to allow a zero-sized return (in a follow up commit)?

Wasmtime GitHub notifications bot (Jan 28 2021 at 13:52):

bnjbvr commented on Issue #2614:

Is it possible it's actually min(x, 15), note the 15 and not 16, since we want something meaningful modulo 16? A 16-sized nop would be the same as 0 (mod 16), and there's no preferred encoding for such a nop.

Wasmtime GitHub notifications bot (Jan 28 2021 at 18:14):

cfallin commented on Issue #2614:

So I'm paging this back in more now: I note that we also had gen_zero_len_nop(). We could eliminate that and just do a gen_nop(0); I'm not sure why we had the separation in the first place.

Re: x86 max nop size -- yes, 15, not 16, agreed.

Wasmtime GitHub notifications bot (Jan 28 2021 at 21:24):

kaseyc commented on Issue #2614:

Re 15 vs 16, the nop constructor also seems to think 16 is a valid value:
https://github.com/bytecodealliance/wasmtime/blob/d1c1cb6a25c7256c82811d6b9ddc60c2a60da075/cranelift/codegen/src/isa/x64/inst/mod.rs#L569

Although the actual emitter only handles up to 9 bytes in a single instruction src

Going above 9 seems to require more special casing. AMD provides a table for up to 15, and newer Intel processors may also be able to handle it:
AMD Manual
LLVM #1
LLVM #2

This is getting slightly off topic . But it seems like 15 is a reasonable max size.

It also looks like there are no actual uses of gen_zero_len_nop, so if you prefer I could fold that into gen_nop everywhere.

Wasmtime GitHub notifications bot (Jan 29 2021 at 05:20):

kaseyc commented on Issue #2614:

I've updated the code to change the max size to 15.

gen_zero_len_nop can't be entirely removed, as it is part of the regalloc crate's Function trait. Though the MachInst implementations could dropped in favor of gen_nop(0).

Last updated: Apr 17 2025 at 15:03 UTC