Stream: git-wasmtime

Topic: wasmtime / PR #1718 Rework of MachInst isel, branch fixup...


view this post on Zulip Wasmtime GitHub notifications bot (May 16 2020 at 02:08):

cfallin opened PR #1718 from machinst-codebuffer to master:

This patch includes:

Overall, on bz2.wasm, the results are:

    wasmtime full run (compile + runtime) of bz2:

    baseline:   9774M insns, 9742M cycles, 3.918s
    w/ changes: 7012M insns, 6888M cycles, 2.958s  (24.5% faster, 28.3% fewer insns)

    clif-util wasm compile bz2:

    baseline:   2633M insns, 3278M cycles, 1.034s
    w/ changes: 2366M insns, 2920M cycles, 0.923s  (10.7% faster, 10.1% fewer insns)

    All numbers are averages of two runs on an Ampere eMAG.
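
The percentages quoted above appear to be reductions relative to the baseline (time saved divided by baseline time), not speedup ratios. A minimal sketch of that arithmetic follows; the helper name `percent_reduction` is made up for illustration and is not part of the PR.

```rust
/// Relative reduction from `baseline` to `changed`, in percent.
/// Hypothetical helper, used only to re-derive the figures quoted above.
fn percent_reduction(baseline: f64, changed: f64) -> f64 {
    (baseline - changed) / baseline * 100.0
}

/// The four (baseline, changed) pairs the description attaches a percentage to.
fn quoted_pairs() -> [(&'static str, f64, f64); 4] {
    [
        ("full run, insns (M)", 9774.0, 7012.0),
        ("full run, seconds", 3.918, 2.958),
        ("compile only, insns (M)", 2633.0, 2366.0),
        ("compile only, seconds", 1.034, 0.923),
    ]
}
```

Feeding each pair from `quoted_pairs()` into `percent_reduction` gives values that round to the 28.3%, 24.5%, 10.1% and 10.7% stated in the description.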

Creating this PR now to start the review, but I still need to do the following before merging:

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2020 at 02:08):

cfallin requested bnjbvr and julian-seward1 for a review on PR #1718.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2020 at 02:09):

cfallin edited PR #1718 from machinst-codebuffer to master:

tl;dr: new new-isel; better block-ordering, handling branches in one pass. 24% faster compile+run on bz2 (28% fewer instructions); 10% faster compile (10% fewer instructions).

This patch includes:

Overall, on bz2.wasm, the results are:

    wasmtime full run (compile + runtime) of bz2:

    baseline:   9774M insns, 9742M cycles, 3.918s
    w/ changes: 7012M insns, 6888M cycles, 2.958s  (24.5% faster, 28.3% fewer insns)

    clif-util wasm compile bz2:

    baseline:   2633M insns, 3278M cycles, 1.034s
    w/ changes: 2366M insns, 2920M cycles, 0.923s  (10.7% faster, 10.1% fewer insns)

    All numbers are averages of two runs on an Ampere eMAG.

Creating this PR now to start the review, but I still need to do the following before merging:

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2020 at 06:22):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 16 2020 at 08:02):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 17 2020 at 05:03):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 17 2020 at 06:11):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 01:17):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 01:18):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 01:20):

cfallin edited PR #1718 from machinst-codebuffer to master:

tl;dr: new new-isel; better block-ordering, handling branches in one pass. 24% faster compile+run on bz2 (28% fewer instructions); 10% faster compile (10% fewer instructions).

This patch includes:

Overall, on bz2.wasm, the results are:

    wasmtime full run (compile + runtime) of bz2:

    baseline:   9774M insns, 9742M cycles, 3.918s
    w/ changes: 7012M insns, 6888M cycles, 2.958s  (24.5% faster, 28.3% fewer insns)

    clif-util wasm compile bz2:

    baseline:   2633M insns, 3278M cycles, 1.034s
    w/ changes: 2366M insns, 2920M cycles, 0.923s  (10.7% faster, 10.1% fewer insns)

    All numbers are averages of two runs on an Ampere eMAG.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

What does this mean? Can you clarify the semantics? When is it used?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

nit: it would be better not to use the word Load here since it's not a load. Maybe Compute?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Could you add some details to say what the island consists of? More generally, is there a top level description of the islands-and-deadlines algorithm somewhere?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

What if ty is a vector type? Then Inst::load_constant doesn't sound right to me. Is there some guarantee that this won't get called with such a type? If not, can you assert/panic it out?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Is there any way that this can be automatically cross-checked with reality? This sounds to me like something that could be violated somewhere down the line, but that would not break anything except in some extremely rare huge-function input, which will make it hard to track down. So some (any?) kind of cross-check scheme would be a Good Thing.

If not possible, at least add a loud comment at the top of the insn emitter to the effect that it must comply with what is claimed here.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Interpreted signed or unsigned?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

That doesn't read quite right; is it correct?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

This is a bit unclear; could you make it more precise? Is there a 1:1 mapping from CLIR blocks to VCode blocks? The use of "subgraphs" implies there isn't, but there's no clarification of the meaning of "subgraphs" here.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Does (emit island with guard jump if needed) refer to the 6 insns that follow (I think so), or does it denote further insns that need to be emitted (I think not)? It would be good to make this clearer in the comment.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

"freely permuted" .. surely they'd have to maintain the same data dependency relationships?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Since forgetting to do this might be a common mistake, can you say here how the system will fail should one forget to do that?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

For the sake of clarity, could you add "CLIR" before "instructions"?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

Can you add a 1 liner comment saying what this does?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 11:45):

julian-seward1 created PR Review Comment:

tmp doesn't give a big enough hint what this does. A better name would be new_vreg.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 12:17):

julian-seward1 edited PR Review Comment.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 12:31):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 12:31):

julian-seward1 created PR Review Comment:

Is this definition of is64 right? That seems like it's an unsigned criterion, but the general rule on Intel for 32-bit immediate fields is that they are sign extended to 64 bits as appropriate. If this logic simply moved from elsewhere in this patch, then leave it as is; but otherwise maybe change to the signed variant, using low32willSXto64 ?
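
To make the signed/unsigned distinction in this comment concrete, here is a minimal sketch of the two criteria. The thread does not show the actual `is64` definition, so both functions are assumptions: `low32_will_sign_extend_to_64` is a snake_case stand-in for the `low32willSXto64` helper mentioned above, and `exceeds_u32` is a guess at what an unsigned criterion would look like.

```rust
/// True if sign-extending the low 32 bits of `x` to 64 bits reproduces `x`,
/// i.e. `x` can be encoded as a sign-extended 32-bit immediate.
/// Stand-in for the `low32willSXto64` helper named in the review comment.
fn low32_will_sign_extend_to_64(x: u64) -> bool {
    let xs = x as i64;
    // Shift the low 32 bits to the top, then arithmetic-shift back down.
    xs == ((xs << 32) >> 32)
}

/// Assumed unsigned criterion: true if `x` does not fit in 32 bits when
/// zero-extended. The real `is64` may differ.
fn exceeds_u32(x: u64) -> bool {
    x > 0xffff_ffff
}
```

The two tests disagree on values such as `0x8000_0000`: it fits in 32 unsigned bits, but sign-extending its low 32 bits yields `0xffff_ffff_8000_0000`, not the original value. Conversely, `0xffff_ffff_8000_0000` exceeds `u32` range yet is representable as a sign-extended 32-bit immediate.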

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 12:31):

julian-seward1 created PR Review Comment:

This change concerns me somewhat. What guarantees that this assertion can't fail now?

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 12:31):

julian-seward1 created PR Review Comment:

I feel like it's a shame to lose this, because it means losing the ability to easily differentiate legitimate failures due to non-implementation of a target-independent CLIR insn, vs bugs resulting in machine-specific CLIRs being handed to us.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 15:52):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 15:52):

cfallin created PR Review Comment:

Oh, these X86* opcodes went away in the latest master, so this is just a rebase-related change. I think the rule is still (as has always been) never have a fallthrough in the big-opcode-match, and handle additions or deletions as they come, indicated by compile errors -- so if any machine-specific ops are added back in the future, we'll figure out what to do then.
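
The "no fallthrough" rule can be illustrated with a toy example. The enum and lowering function below are invented for illustration and are far smaller than Cranelift's real opcode set; the point is only that a `match` without a `_` arm stops compiling when a variant is added or removed.

```rust
/// Toy opcode set, assumed for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyOpcode {
    Iadd,
    Isub,
    Iconst,
}

/// Lowers a toy opcode to a mnemonic. There is deliberately no `_ =>` arm:
/// adding a variant to `ToyOpcode` turns this match into a compile error
/// until the new case is handled explicitly.
fn lower(op: ToyOpcode) -> &'static str {
    match op {
        ToyOpcode::Iadd => "add",
        ToyOpcode::Isub => "sub",
        ToyOpcode::Iconst => "mov-immediate",
    }
}
```

With a catch-all arm, a newly added opcode would silently fall into the default case at runtime instead of being flagged at build time.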

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:38):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:38):

cfallin created PR Review Comment:

Nothing guarantees it, but this form (ResolvedOffset) is used only when the lowering explicitly selects it, now; ordinary branches don't go through this code (the LabelUse handles them instead, and we can implement 64-bit long-form veneers there if we think we need >2GB code-size).
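
As a rough illustration of why code size beyond 2GB is the relevant limit here: a 32-bit signed displacement can only span about ±2^31 bytes. The sketch below is an assumption about the kind of range check involved, not the code from the PR; `rel32` is a hypothetical name, and it ignores the detail of which point in the instruction the displacement is measured from.

```rust
use std::convert::TryFrom;

/// Signed 32-bit displacement from `from` to `to`, or `None` if the target
/// is out of range and a longer form (for example a veneer) would be needed.
/// Hypothetical helper for illustration only.
fn rel32(from: u64, to: u64) -> Option<i32> {
    let delta = (to as i64).wrapping_sub(from as i64);
    i32::try_from(delta).ok()
}
```

A caller that asserts the result is `Some` is relying on the function never being larger than the displacement range, which is the guarantee being questioned above.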

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:39):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:39):

cfallin created PR Review Comment:

I think so, or at least, this was existing logic in the Iconst lowering prior to this patch:

https://github.com/bytecodealliance/wasmtime/blob/a75377565f830052094aa8aa72c5e7a6b787fa18/cranelift/codegen/src/isa/x64/lower.rs#L116

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:40):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:40):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:40):

cfallin created PR Review Comment:

Was part of the old API, but no reason not to rename here :-) Now it's alloc_tmp().

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:40):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:40):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin created PR Review Comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin created PR Review Comment:

Yep, modulo true deps; clarified.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:41):

cfallin created PR Review Comment:

Was correct but probably too terse; fixed.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin created PR Review Comment:

Hopefully the ASCII art and additional explanation help! This is pretty subtle so I'm happy to clarify further if needed.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin created PR Review Comment:

Added some more docs on EmitIsland to clarify.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:42):

cfallin created PR Review Comment:

Signed (clarified).

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:43):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:43):

cfallin created PR Review Comment:

Good idea! I added a debug assert to Inst::emit() that verifies that no more than worst_case_size() bytes were emitted.
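
A minimal sketch of what such a cross-check could look like, using a toy instruction type. `ToyInst`, its encodings, and the 4-byte bound are all invented for illustration; only the idea of comparing emitted bytes against `worst_case_size()` comes from the comment above.

```rust
/// Toy instruction set, assumed for illustration only.
enum ToyInst {
    Nop,
    MovImm16(u16),
}

impl ToyInst {
    /// Claimed upper bound, in bytes, on the encoding of any one instruction.
    fn worst_case_size() -> usize {
        4
    }

    /// Appends the encoding to `buf`, then checks the claimed bound in
    /// debug builds.
    fn emit(&self, buf: &mut Vec<u8>) {
        let start = buf.len();
        match self {
            ToyInst::Nop => buf.push(0x00),
            ToyInst::MovImm16(imm) => {
                buf.push(0x01);
                buf.extend_from_slice(&imm.to_le_bytes());
            }
        }
        let emitted = buf.len() - start;
        debug_assert!(
            emitted <= Self::worst_case_size(),
            "emitted {} bytes, worst case claimed {}",
            emitted,
            Self::worst_case_size()
        );
    }
}
```

Because the check runs on every emitted instruction in debug builds, an encoding that outgrows the claimed bound should show up in ordinary tests instead of only on rare, very large functions.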

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:43):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 22:43):

cfallin created PR Review Comment:

Added assert.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:04):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:04):

cfallin created PR Review Comment:

Added to the top of machinst/buffer.rs.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:04):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:04):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:04):

cfallin created PR Review Comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:05):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:05):

cfallin created PR Review Comment:

Clarified; this is the same as the original "lowered" form, but we just call out the actual semantics and purpose a bit more explicitly now rather than conflating it with the branch-lowering process.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:05):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:05):

cfallin created PR Review Comment:

Done.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:06):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:07):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 18 2020 at 23:25):

cfallin updated PR #1718 from machinst-codebuffer to master.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 04:36):

julian-seward1 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 14:17):

cfallin merged PR #1718.


Last updated: Nov 22 2024 at 17:03 UTC