Stream: git-wasmtime

Topic: wasmtime / PR #1956 machinst x64: branch table, spill/rel...


view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 17:04):

bnjbvr opened PR #1956 from x64 to main:

This makes it possible to run a basic C hello world with wasmtime. It's a large PR but the commit splitting should make it easier to review. I'll add emit tests for div/idiv before landing, but reviewing shouldn't be blocked on this.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 17:04):

bnjbvr requested cfallin and julian-seward1 for a review on PR #1956.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 17:04):

bnjbvr requested cfallin and julian-seward1 for a review on PR #1956.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Maybe rename JmpRel32 to PCRel32 (since this is not a jump)?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Perhaps u32::try_from(offset).expect("rip-relative can't hold >= U32_MAX values")?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Maybe add a clarifying note here that off_into_table is an addend and the label-use resolution adds the target-relative-to-this-entry to the addend so that the result is target-relative-to-JT-start?

(I neglected to document this logic in the analogous over on the aarch64 side; feel free to do a drive-by edit with the same comment over there...)

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Maybe add a note here that we don't need to worry about islands, despite exceeding the "worst-case inst size" (with the potentially large inline targets array), because on x64 our only label-use type right now has a 2GB range? If we later consider using shorter-range label references, we'll have to revisit this.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

I think there might be a slightly shorter way to test for zero/nonzero: test Rdivisor, Rdivisor / jnz .... Then we don't need the zero-immediate. Saves a byte, anyway...

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Suggestion: could we make the ShowWithRRU pretty-print format here use assembler-parseable syntax, i.e., lea 0(%rip), %rdi? This makes it slightly more mechanical (hence easier to do by hand, and easier to possibly automate later) to verify that the golden bytes match the instruction.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

It might be worthwhile to go ahead and box up / out-of-line some of this, like we did with JTSequenceInfo, to keep the Inst size down.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Perhaps we should have an enum for this, like the InstSize enum over in aarch64? (We could make it #[repr(u8)] and assign the arms the above numeric values.)

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Use the test instruction (as above) to save an immediate byte?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

I wonder if there's a way to merge these signed cases with the above unsigned to reduce duplication a bit. Especially if the unsigned checks (and divide itself) also become a meta-instruction, the lowering is more similar above and below, I think?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

On a little further thought after our conversation yesterday: I think we should actually probably avoid this fixed offset, because there is a chance that instructions are inserted between the use (jump) and destination (after trap). In particular, we might get spills/reloads; or a later optimization pass, if we eventually have some on the VCode, could rearrange things. So I think we should do here what you did for the signed case below, and wrap up all of the checks into one VCode meta-instruction.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

Yes, I'd suggest something like the NarrowValueMode stuff, but deferring that till later is definitely OK!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 19:24):

cfallin created PR Review Comment:

It seems in some places the backend measures size in bits, while in others (e.g. size field in a divide VCode instruction) it is in bytes. Might be good to standardize on bits (given that all the types, e.g. I8, are named as such), and use the enum suggested above here as well; it could have a from_bits as on InstSize in aarch64.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 10:36):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 10:36):

bnjbvr created PR Review Comment:

Interestingly, there's also a PCRel32 in addition to JmpRel32: the former adds the value read at the offset before patching, while the latter does subtract 4 to the read value before patching (because control flow instructions are relative to the start of the next instruction on x64). This is one of these cases here. Should i rename JmpRel32 to PCRel32Disp4 or something like this instead?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 14:43):

bnjbvr created PR Review Comment:

Yep, it'll probably be part of a greater refactoring though; some instructions use a boolean to distinguish 32 bits encodings from 64 bits encodings, not implementing all the other ones. Maybe cg_clif wants the smaller sizes encodings? But if not, implementing only 32 and 64 bits might be sufficient in general.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 14:43):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 14:45):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 14:45):

bnjbvr created PR Review Comment:

Added to the todo list, will probably make more sense once all the instructions have been implemented.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 02 2020 at 16:47):

bnjbvr updated PR #1956 from x64 to main:

This makes it possible to run a basic C hello world with wasmtime. It's a large PR but the commit splitting should make it easier to review. I'll add emit tests for div/idiv before landing, but reviewing shouldn't be blocked on this.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:09):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:09):

julian-seward1 created PR Review Comment:

As a general comment, I think we should try to avoid generating 8- and 16- bit instructions as much as possible, and instead generate 32-bit insns where we ignore the upper 24 or 16 bits of the result. The reason is to avoid partial register stalls. These are documented in the Intel Opt manual, in my copy Sec 3.5.2.4 (Partial Register Stalls), with similar comments in following sections (Partial XMM Register Stalls, Partial Flag Register Stalls).

Also, 16-bit instructions which contain an immediate field are subject to the stalls described in Sec 3.4.2.3 (Length Changing Prefixes).

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:49):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:49):

julian-seward1 created PR Review Comment:

Maybe LabelUses need to be architecture-specific, then, if they are not already?

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:53):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 05:53):

julian-seward1 created PR Review Comment:

I agree; but I also don't think this is critical. We could quite reasonably change cmp $0, .. into tst .. (and maybe some other patterns) in future work to optimise the x64 isel, so we don't delay landing this one further.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:02):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:02):

julian-seward1 created PR Review Comment:

Or have a naming scheme that makes the units explicit? For example, offset_szW -- offset in words. offset_szB -- offset in bytes. offset_sz -- offset in bits. Also, useful variants -- offset_szBlg2 -- log 2 of size in bytes, etc.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 created PR Review Comment:

also for and, or, xor, mul

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 created PR Review Comment:

I think this needs to show the inverse of trap_code (viz, trap_code ^ 1) since what this currently displays is "skip the trap if trap_code is true".

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 created PR Review Comment:

Please could you add here, a comment showing what instructions are actually emitted? It's pretty hard to infer from reading the code. Same for SignExtendRaxRdx et al.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 created PR Review Comment:

Given that cwd is 6690, cdq is 90 and cqo is 4890, (or something like that), and there's only three possibilities, I think it would be clearer to the reader just to write the relevant sink.putN calls directly. These widening insns are ancient 1970s-era stuff and don't fit in the "standard" RM->R scheme.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 06:47):

julian-seward1 created PR Review Comment:

Per comments I added to one of Chris's comments, I think we should generate 32-bit insns instead of 8- or 16-bit insns except where we absolutely can't avoid them (eg, stores), so as to avoid partial register stall hazards.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 09:21):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 09:21):

bnjbvr created PR Review Comment:

They are architecture-specific, already: this is defined in inst/mod.rs.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 09:47):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 09:47):

bnjbvr created PR Review Comment:

Good point, I'll revisit this by re-checking all the lowering.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 11:58):

bnjbvr updated PR #1956 from x64 to main:

This makes it possible to run a basic C hello world with wasmtime. It's a large PR but the commit splitting should make it easier to review. I'll add emit tests for div/idiv before landing, but reviewing shouldn't be blocked on this.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 03 2020 at 12:33):

bnjbvr merged PR #1956.


Last updated: Jan 24 2025 at 00:11 UTC