Stream: git-wasmtime

Topic: wasmtime / PR #2142 x64 new backend: port ABI implementat...


view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2020 at 23:01):

cfallin opened PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2020 at 23:01):

cfallin requested bnjbvr for a review on PR #2142.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 18 2020 at 23:23):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2020 at 06:54):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2020 at 06:54):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 19 2020 at 06:54):

bjorn3 created PR Review Comment:

*for the

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 00:18):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 00:18):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 00:18):

cfallin created PR Review Comment:

Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 00:54):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

I wasn't there when the trait was designed, but: would it make sense to pass a context, or the consumer of these insts directly here? (I remember some cases where it wasn't possible to do so, because it was used in both the lowering and code emission contexts, so maybe that's irrelevant.)
This would avoid the smallvec allocation entirely here.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

nit (here and below many times): can you use pub(crate) instead?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Please expand GV here.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Same question above wrt i32 vs i64.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Is it possible that the imm is actually large enough to trigger this? If so, would it make sense to materialize the immediate into a temp register, and use a reg/reg add below? If the imm can't be this large, could the trait function accept an u32 instead of an u64?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

same question for i32 vs i64 in signature

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Just reading this doesn't help understanding why it's fine to do so, which makes me think the trait functions should be redesigned so the temp reg is also returned from gen_add_imm, or something like this.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

same remark about smallvec

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

nit: this comment can be removed, i think

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

This is a bit feeble to use a bool in this case, could it be its own enum, to enhance readability?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

ditto

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

nit: could you use expect instead of unwrap, here and below? it likely adds cases where we have a very big offset, and it just wouldn't be implemented here.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Something I've meaning to change here, just to avoid dummy memory reallocations: can you use a vec! declaration here, instead of multiple pushes? This method could even return a plain &[], since in both cases (baldrdash/not baldrdash), these arrays are statically known.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

We can use an unstable sort here, not that it matters...

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

Maybe we could rename this method get_spillslot_byte_size, then?

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 15:06):

bnjbvr created PR Review Comment:

I don't know if this has an importance in terms of speed of gen'd code, but Spidermonkey does do a single sp+imm operation here, then use stores to sp+offset for each saved register. It would also avoid the weird push 0 to keep the stack 16-bytes aligned.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:46):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:46):

cfallin created PR Review Comment:

I thought about this, but (as you suspected) this is called in contexts where we have no LowerCtx or other direct consumer of callbacks. We could statically monomorphize on a closure argument that receives instructions, but that seems unnecessarily complex and would duplicate code in the binary; I made sure to size the SmallVecs so they should fit x64 and aarch64 cases in their inline storage, so the cost is (just) copying a few Insts instead. Happy to go the other way though if you'd prefer!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:46):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:46):

cfallin created PR Review Comment:

u32 instead is much cleaner, thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:58):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 20:58):

cfallin created PR Review Comment:

Thanks for pointing this out; yeah, it is a bit convoluted. I've updated the documentation on the trait definition to clearly describe the requirements placed on the machine backend's implementation and register choices, and I've renamed get_fixed_tmp_reg() to get_stacklimit_reg() to make its purpose more explicit.

I thought for a bit if there might be a better way to do this, but I haven't managed to find one, unless we push the whole "compute this GlobalValue using only caller-saves" logic into each machine backend. gen_add_imm() could return info about what it clobbers, but that still doesn't help the machine-independent implementation choose another register on its own. Really the machine-backend author has to choose fixed scratch registers and a corresponding add-with-large-immediate lowering (on RISCs at least) and just provide them to the machine-independent part.

On x64, anyway, we can always have a full 32-bit immediate on an add, so the more tricky requirements are moot; all we need is some arbitrary caller-save register here. Hopefully simple enough but I'm open to other ideas :-)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:05):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:05):

cfallin created PR Review Comment:

Changed all of these to i32.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:08):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:08):

cfallin created PR Review Comment:

Good idea!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:10):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:10):

cfallin created PR Review Comment:

The interface back to the lowering code reasons in terms of number of spillslots, so I'd rather not convert to/from bytes internally; but I renamed this to get_number_of_spillslots_for_value(), which is hopefully more clear!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:15):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:15):

cfallin created PR Review Comment:

Done! Returning a 'static slice would require refactoring some other things to be const functions, I think, including in regalloc.rs (Writable::from_reg), or else adding lazy-inits, so I'm not sure if that's completely feasible at the moment, but perhaps we could think about this more...

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:17):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:17):

cfallin created PR Review Comment:

Good point! (I had initially thought we wanted a stable sort to ensure deterministic code output, but of course the registers will be unique, so it doesn't matter...)

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:18):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:18):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:47):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:47):

cfallin created PR Review Comment:

Good point! Done.

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 21:58):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 20 2020 at 22:35):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Aug 21 2020 at 14:47):

bnjbvr requested bnjbvr for a review on PR #2142.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr created PR Review Comment:

Should we add "not yet implemented" to this error message? This may happen in real-world programs. Alternatively, should we report an Err Result here, with an impl limit exceeded error?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr created PR Review Comment:

Can we add an assertion that this is caller-save? (This depends on the calling convention, not sure if it's available from here.)

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr created PR Review Comment:

Sorry if this is not the right place/time for this suggestion, but: I am a bit confused when it comes to distinguish Body from Call, by just looking at the names. The comments really help, for that matter. Alternatively, what do you think of renaming ABIBodyImpl to CalleeABIImpl, and ABICallImpl to CallerABIImpl? (I just realize the consecutive i make it a bit hard to decipher, with any casing... so maybe not such a great idea.)

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr created PR Review Comment:

(Another belated suggestion) Impl in a name (here, in ABIMachineImpl) makes me think that we have a concrete type, while it's actually a trait. Is there any other name that could avoid this confusion? ABIMachineSpec for instance?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 07 2020 at 16:31):

bnjbvr created PR Review Comment:

Real world programs could have large offsets (here and below) right? Should this message also contain "not implemented yet", or should the function return a Result, and here it'd be an Err with impl limit exceeded?

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin created PR Review Comment:

The stack offset here is limited by the stack-frame size, which we clamp to 128 MB in compute_arg_locs(), so I don't think we need to bubble up a Result (since it should not be reachable code for any input); but I've clarified the error message. Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin created PR Review Comment:

Done; just asserting that it's a caller-save in systemv and baldrdash.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:27):

cfallin created PR Review Comment:

Analogous to above, shouldn't happen due to stack-frame size limit; clarified messages.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:36):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:36):

cfallin created PR Review Comment:

Good idea!

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:36):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:37):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:37):

cfallin created PR Review Comment:

Ah, yes, I like the callee/caller distinction better than "body" / "call"; it's much more intuitive. I've gone ahead with these names (I think the ...ABIImpl is fine IMHO.) Thanks!

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 00:59):

cfallin updated PR #2142 from machinst-abi-x64 to main:

Previously, in #2128, we factored out a common "vanilla 64-bit ABI"
implementation from the AArch64 ABI code, with the idea that this should
be largely compatible with x64. This PR alters the new x64 backend to
make use of the shared infrastructure, removing the duplication that
existed previously. The generated code is nearly (not exactly) the same;
the only difference relates to how the clobber-save region is padded in
the prologue.

<!--

Please ensure that the following steps are all taken care of before submitting
the PR.

Please ensure all communication adheres to the code of conduct.
-->

view this post on Zulip Wasmtime GitHub notifications bot (Sep 09 2020 at 01:35):

cfallin merged PR #2142.


Last updated: Dec 23 2024 at 12:05 UTC