wasmtime / Issue #2290 Register source locations for inst... · git-wasmtime

Stream: git-wasmtime

Topic: wasmtime / Issue #2290 Register source locations for inst...

Wasmtime GitHub notifications bot (Oct 13 2020 at 16:56):

Many instruction formats in the new x64 backend accept a RegMem parameter, implying that they may load from memory. This may cause a trap if the memory address is out-of-bounds and, in order to return the correct trap code, we should be registering the source location of the instruction, e.g.:
if let Some(srcloc) = *srcloc {
    sink.add_trap(srcloc, TrapCode::HeapOutOfBounds);
}
Currently, some formats do accept a RegMem parameter but do not correctly register the possible trap location. This currently passes the spec tests because these formats are not used in their Mem variant, but for correctness and completeness we should register the trap locations any time it is possible that an instruction may load from memory.

The current convention for doing so appends a srcloc: Option<SourceLoc> field to the format and expects the programmer to correctly choose between None and Some(ctx.srcloc(insn)) when instatiating a new instruction format. I propose we change this: instead, attach the Option<SourceLoc>, or perhaps even a plain SourceLoc, to SyntheticAmode. This approach has advantages:

any programmer attempting to do a load/store will be forced to decide how the trap will be reported (but not when only accessing registers)

perhaps the helper functions in lower.rs can automatically attach a SourceLoc to any SyntheticAmodes created

the Inst formats no longer need an extra field filled with None in many cases

If we agree that the above approach is a good one, this issue should be closed not only

when that refactoring is complete but

also when any instruction formats that can load/store properly registers traps for their RegMem::Mem variants

Wasmtime GitHub notifications bot (Oct 13 2020 at 16:57):

abrown commented on Issue #2290:

cc: @julian-seward1, @bnjbvr, @cfallin

Wasmtime GitHub notifications bot (Oct 13 2020 at 17:03):

abrown edited Issue #2290:

Many instruction formats in the new x64 backend accept a RegMem parameter, implying that they may load from memory. This may cause a trap if the memory address is out-of-bounds and, in order to return the correct trap code, we should be registering the source location of the instruction, e.g.:
if let Some(srcloc) = *srcloc {
    sink.add_trap(srcloc, TrapCode::HeapOutOfBounds);
}
Currently, some formats do accept a RegMem parameter but do not correctly register the possible trap location. This passes the spec tests because these formats are not used in their RegMem::Mem variant, but for correctness and completeness we should register the trap locations any time it is possible that an instruction may load from memory.

The current convention for doing so appends a srcloc: Option<SourceLoc> field to the format and expects the programmer to correctly choose between None and Some(ctx.srcloc(insn)) when instatiating a new instruction format. I propose we change this: instead, attach the Option<SourceLoc>, or perhaps even a plain SourceLoc, to SyntheticAmode. This approach has advantages:

any programmer attempting to do a load/store will be forced to decide how the trap will be reported (but not when only accessing registers)

perhaps the helper functions in lower.rs can automatically attach a SourceLoc to any SyntheticAmodes created

the Inst formats no longer need an extra field filled with None in many cases

If we agree that the above approach is a good one, this issue should be closed not only

when that refactoring is complete but

also when any instruction formats that can load/store properly registers traps for their RegMem::Mem variants

Wasmtime GitHub notifications bot (Oct 26 2020 at 14:43):

bnjbvr commented on Issue #2290:

I agree that there's an issue with the current design, and indeed the RegMem::Mem variant could have its source location. (No need for Option<SourceLoc> in fact, since SourceLoc already has a default value meaning invalid that we could reuse to mean "undefined".) This would become a large issue when input_to_reg_mem can actually embed a load in an address-mode operand.

Unfortunately, this also means that the SyntheticAmode would then become even much bigger, going against the general plan of making Vcode's inst data fit into 32 bytes or fewer. It would be nice to find a way to properly pack here without inflating the RegMem too much.

To wit: these source locations are also only useful when not using implicit bounds checking (via signal handling and the 4GB memory page trick). When using implicit bounds checking, this data information has no use at all. We could imagine adding a trait to lowering, to decide whether we're using explicit bounds checks or not, and that trait would select a different type for RegMem and/or SyntheticAmode, but that lead to a lot of binary code duplication in the Cranelift-embedding binary, I expect...

Wasmtime GitHub notifications bot (Oct 28 2020 at 04:02):

cfallin commented on Issue #2290:

So at a high level, I think this issue implies several sub-problems to solve:

We want to design the API for creating MachInsts such that it is not possible to accidentally create an inst that should have a SourceLoc attached without that SourceLoc. I think that we can probably achieve this by auditing the helpers that we create.

We want to be sure to register the trap-codes and trap-locations at the appropriate offsets in the sink

We want to store the SourceLocs efficiently, that is, ideally not inflate the size of every RegMem if we can help it.

It occurred to me that we could solve the last two issues simultaneously by keeping source-location / trap info in a side-table, indexed by instruction (i.e., a sparse instruction-index to info map), and emitting this metadata to the sink as we iterate over the insts, outside of the machine backend's particular MachInstEmit method. We already have a notion of metadata outside the instruction, in the form of the "is this a safepoint" bit; perhaps this is another.

(In a sense, both of these bits are orthogonal to, and outside the scope of, the machine instruction itself; the MachInst is some instruction/operation that updates machine state, and the safepoint bit and trap info are metadata carried by the emission pipeline out to side-tables on the machine code.)

So perhaps we have a notion of InsnMetadata records; each instruction has zero or more such records; they are held by the VCode container in a sparse manner, sorted by instruction index (i.e., kept in the same order as the insts); and the VCode container promises to keep the order in sync, if e.g. it reorders or removes insts (e.g., move elision after RA).

Then we have some trait that the InsnMetadata implements that "emits" the metadata to the MachBuffer sink; these emit-calls are interleaved with instruction emit calls as appropriate.

Thoughts? I can sketch this in more detail if there's general agreement that we should have a framework something like this.

Wasmtime GitHub notifications bot (Oct 28 2020 at 16:22):

abrown commented on Issue #2290:

When we call Inst::foo(...), how would Inst have a way to stuff the metadata into into VCode? I don't see how they reach one another.

(As an aside about Inst size, I have wondered why we are not using #[repr(u8)] more liberally to fit stuff into 32 bytes: in the longest case I can think of, we would have an Inst with the following conceptual fields, {bool, opcode, source loc, dst reg, src from shifted amode}. If we applied #[repr(u8)], we could get this case down to inst tag/1 + bool/1 + opcode/2 + source loc/4 + dst reg/4 + src tag/1 + SytheticAMode tag/1 + Amode tag/1 + simm32/4, + base reg/4 + index reg/4 + shift/1 which equals 28 bytes. I'm not saying we shouldn't store stuff in a side table, that may still make sense, but why not pack things tighter? Or perhaps Rust won't pack things this way?)

Wasmtime GitHub notifications bot (Oct 28 2020 at 16:40):

cfallin commented on Issue #2290:

Re: #[repr(u8)]: my understanding from @bnjbvr's explanation is that there are alignment/padding restrictions that come along with nested enums -- so if every field were in one struct, we could indeed pack more tightly, but we haven't gone there. (It could be an option if we have a way to keep the ergonomics somehow; maybe a "packed" and "unpacked" form; needs more thought.)

When we call Inst::foo(...), how would Inst have a way to stuff the metadata into into VCode? I don't see how they reach one another.

Indeed, the idea is that the creator of the Inst would attach the metadata as it passes the instruction to the LowerCtx. Right now we have (slightly hackishly) ctx.emit() and ctx.emit_safepoint(); perhaps instead we could have ctx.emit_with_metadata(), and then have various constructors for the metadata info/flags (InsnMetadata::builder().with_mem_trap_loc(...).build() maybe, and .is_safepoint() as well to set the safepoint bit?).

Anyway, just a thought; perhaps it's too much infrastructure for just a few things; but it at least enables sparsity and gives the ability to only optionally include the data (with zero cost if not), which is good :-)

Wasmtime GitHub notifications bot (Oct 28 2020 at 16:51):

abrown commented on Issue #2290:

Makes sense to me. @bnjbvr, @julian-seward1?

Wasmtime GitHub notifications bot (Oct 28 2020 at 17:57):

bnjbvr commented on Issue #2290:

Re: #[repr(u8)]: my understanding from @bnjbvr's explanation is that there are alignment/padding restrictions that come along with nested enums -- so if every field were in one struct, we could indeed pack more tightly, but we haven't gone there. (It could be an option if we have a way to keep the ergonomics somehow; maybe a "packed" and "unpacked" form; needs more thought.)

Yes, exactly this: each different enum adds a new word for the discriminant (not sure if it applies under certain conditions like "when the enum may be heap-allocated" or not; this matches what I've observed with the Inst enum, though). Even if we flattened all the enums into a single one, and then used a newtype as suggested by @fitzgen, we'd not have enough space; the ideal packing would need the RegMem SIB shift offset (3 bits) to be on the same byte as the overall emulated discriminant, if I remember correctly my back-of-the-envelope computation.

Indeed, the idea is that the creator of the Inst would attach the metadata as it passes the instruction to the LowerCtx. Right now we have (slightly hackishly) ctx.emit() and ctx.emit_safepoint(); perhaps instead we could have ctx.emit_with_metadata(), and then have various constructors for the metadata info/flags (InsnMetadata::builder().with_mem_trap_loc(...).build() maybe, and .is_safepoint() as well to set the safepoint bit?).

The Metadata struct sounds good to me! If it has only two fields (is_safepoint + trap's sourceloc), and in no cases both are needed at the same time, the Builder pattern may be overhead, and it may increase Rustc compile times for the already-quite-large lower functions (but that's still another issue, that can be handled separately).

Wasmtime GitHub notifications bot (Nov 11 2020 at 01:05):

cfallin closed Issue #2290:

Many instruction formats in the new x64 backend accept a RegMem parameter, implying that they may load from memory. This may cause a trap if the memory address is out-of-bounds and, in order to return the correct trap code, we should be registering the source location of the instruction, e.g.:
if let Some(srcloc) = *srcloc {
    sink.add_trap(srcloc, TrapCode::HeapOutOfBounds);
}
Currently, some formats do accept a RegMem parameter but do not correctly register the possible trap location. This passes the spec tests because these formats are not used in their RegMem::Mem variant, but for correctness and completeness we should register the trap locations any time it is possible that an instruction may load from memory.

The current convention for doing so appends a srcloc: Option<SourceLoc> field to the format and expects the programmer to correctly choose between None and Some(ctx.srcloc(insn)) when instatiating a new instruction format. I propose we change this: instead, attach the Option<SourceLoc>, or perhaps even a plain SourceLoc, to SyntheticAmode. This approach has advantages:

any programmer attempting to do a load/store will be forced to decide how the trap will be reported (but not when only accessing registers)

perhaps the helper functions in lower.rs can automatically attach a SourceLoc to any SyntheticAmodes created

the Inst formats no longer need an extra field filled with None in many cases

If we agree that the above approach is a good one, this issue should be closed not only

when that refactoring is complete but

also when any instruction formats that can load/store properly registers traps for their RegMem::Mem variants

Last updated: Oct 23 2024 at 20:03 UTC