abrown requested cfallin for a review on PR #9505.
abrown requested wasmtime-compiler-reviewers for a review on PR #9505.
abrown opened PR #9505 from abrown:more-byte-sink
to bytecodealliance:main
:
In looking at adding auto-generated assembler code to Cranelift, I've noticed that we pass
MachBuffer
down when it is not always necessary. TheByteSink
trait (whichMachBuffer
implements) is a simplified view toput*
bytes into the buffer and it is a bit simpler to use when testing since it is also implemented byVec<u8>
. This change propagates the trait to therex
module.<!--
Please make sure you include the following information:
If this work has been discussed elsewhere, please include a link to that
conversation. If it was discussed in an issue, just mention "issue #...".Explain why this change is needed. If the details are in an issue already,
this can be brief.Our development process is documented in the Wasmtime book:
https://docs.wasmtime.dev/contributing-development-process.htmlPlease ensure all communication follows the code of conduct:
https://github.com/bytecodealliance/wasmtime/blob/main/CODE_OF_CONDUCT.md
-->
Similar to #9504, this is just an attempt at trying to simplify what is needed to actually assemble an instruction. We still need
MachBuffer
for instructions that touch memory; thinking about it, it may just be unavoidable.
abrown submitted PR review.
abrown created PR review comment:
Oh, actually let me remove this: this is for the new assembler but isn't used yet.
abrown updated PR #9505.
abrown submitted PR review.
abrown created PR review comment:
...and a little bit of code movement.
cfallin submitted PR review:
This looks fine, thanks!
Re: handling
Amode
s and the helper that briefly appeared here: I think we'll needMachBuffer
unavoidably once we have amodes that can refer to labels; that's the semantic thing that MachBuffer gives us above an arbitrary byte sequence. But totally fine to trim the close coupling at lower levels.I'd love to talk more about the assembler design w.r.t. dependencies -- intuitively I would have thought that
MachBuffer
would be an integral part of it, i.e. it's the assembler's way of providing its output (given that assemblers need label references, and to flow through other metadata like trap info and relocs from insts to the code blob), but maybe there's another use-case or need for generalization I'm missing.
cfallin has enabled auto merge for PR #9505.
intuitively I would have thought that MachBuffer would be an integral part of it
I mean, that's how it has to be today. I'm kind of wondering if it should, though: it's nice to be able to "just emit" an instruction into a byte buffer during testing and that's what I've been thinking about. The setup in
emit_tests.rs
just to be able to emit an instruction is a lot, e.g. If we're proptest-ing or fuzzing we also want the "emit a single instruction" path to be relatively quick.On the other hand, sometimes its unavoidable: I've notice that we register traps and labels using a
MachBuffer
in the encoding logic and that simply cannot be done withByteSink
. MaybeByteSink
can gain extra trait functions that do nothing for theVec<u8>
case or something like that...
abrown edited a comment on PR #9505:
intuitively I would have thought that MachBuffer would be an integral part of it
I mean, that's how it has to be today. I'm kind of wondering if it should, though: it's nice to be able to "just emit" an instruction into a byte buffer during testing and that's what I've been thinking about. The setup in
emit_tests.rs
just to be able to emit an instruction is a lot, e.g. If we're proptest-ing or fuzzing we also want the "emit a single instruction" path to be relatively quick.On the other hand, sometimes it _is_ unavoidable: I've notice that we register traps and labels using a
MachBuffer
in the encoding logic and that simply cannot be done withByteSink
. MaybeByteSink
can gain extra trait functions that do nothing for theVec<u8>
case or something like that...
cfallin commented on PR #9505:
Maybe ByteSink can gain extra trait functions that do nothing for the Vec<u8> case or something like that...
Ah! Yeah, that's an interesting idea: lift the whole interface of MachBuffer into a trait, and provide an implementation of that trait for a
Vec<u8>
.There are interesting semantic implementations of this though when label references are involved: what should the assembled bytes from a
mov rax, [rip+label]
instruction be? Zeroes in the RIP offset (i.e. labels not resolved)? One aspect of theMachBuffer
that is nice is that its types essentially force one to be aware one is referencing addresses in the buffer: you have to ask for a label to refer to it, the various amodes require a label, and if you don't bind it, (IIRC?) it should panic when finalizing.Maybe another way of reading this friction is that we need some convenience APIs? It should be possible to write an
assemble(inst: Inst) -> Vec<u8>
that is something like:fn assemble(inst: Inst) -> Vec<u8> { let mut buffer = MachBuffer::new(); inst.emit(&mut inst, ...); buffer.finalize(&VCodeConstants::default(), &ControlPlane::default()).apply_base_srcloc(SourceLoc::new(0)).data.into() }
(whew, that is a bit of a mouthful, I see why you want it!)
cfallin edited a comment on PR #9505:
Maybe ByteSink can gain extra trait functions that do nothing for the Vec<u8> case or something like that...
Ah! Yeah, that's an interesting idea: lift the whole interface of MachBuffer into a trait, and provide an implementation of that trait for a
Vec<u8>
.There are interesting semantic implementations of this though when label references are involved: what should the assembled bytes from a
mov rax, [rip+label]
instruction be? Zeroes in the RIP offset (i.e. labels not resolved)? One aspect of theMachBuffer
that is nice is that its types essentially force one to be aware one is referencing addresses in the buffer: you have to ask for a label to refer to it, the various amodes require a label, and if you don't bind it, (IIRC?) it should panic when finalizing.Maybe another way of reading this friction is that we need some convenience APIs? It should be possible to write an
assemble(inst: Inst) -> Vec<u8>
that is something like:fn assemble(inst: Inst) -> Vec<u8> { let mut buffer = MachBuffer::new(); inst.emit(&mut buffer, ...); buffer.finalize(&VCodeConstants::default(), &ControlPlane::default()) .apply_base_srcloc(SourceLoc::new(0)) .data .into() // SmallVec into Vec }
(whew, that is a bit of a mouthful, I see why you want it!)
cfallin merged PR #9505.
Last updated: Jan 24 2025 at 00:11 UTC