saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tools to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+-----------------------+ | | | +-----------+ +---------------------+ | | Compiler | ---> | ISA | ---------------------------------------------------------------+ | +-----------+ +---------------------+ | | | | | | | | v | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | ^ | | +-----+------------------------------------------------------------------------------------+ | | | v +---------------------+ | Register Allocation | +---------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage
Misc
Winch's CLI can be used by running `cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the
[message truncated]
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tools to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+-----------------------+ | | | +-----------+ +---------------------+ | | Compiler | ---> | ISA | ---------------------------------------------------------------+ | +-----------+ +---------------------+ | | | | | | | | v | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | ^ | | +-----+------------------------------------------------------------------------------------+ | | | v +---------------------+ | Register Allocation | +---------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
Misc
Winch's CLI can be used by running `cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases,
[message truncated]
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tools to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+-----------------------+ | | | +-----------+ +---------------------+ | | Compiler | ---> | ISA | ---------------------------------------------------------------+ | +-----------+ +---------------------+ | | | | | | | | v | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+----------------------+ +----------------+ +-------+----------------+-------+ | ^ | | +-----+------------------------------------------------------------------------------------+ | | | v +---------------------+ | Register Allocation | +---------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases,
[message truncated]
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tools to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+--------------------------------------+ | | | | | +-----------------------+ | | | | | | | +-----------+ +--------+ | | | | Compiler | ---> | ISA | -+ | | +-----------+ +--------+ | | | | | | | | v | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | Register Allocation | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | ^ | ^ +----------------------+ +------------------------------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tool to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+--------------------------------------+ | | | | | +-----------------------+ | | | | | | | +-----------+ +--------+ | | | | Compiler | ---> | ISA | -+ | | +-----------+ +--------+ | | | | | | | | v | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | Register Allocation | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | ^ | ^ +----------------------+ +------------------------------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tool to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+--------------------------------------+ | | | | | +-----------------------+ | | | | | | | +-----------+ +--------+ | | | | Compiler | ---> | ISA | -+ | | +-----------+ +--------+ | | | | | | | | v | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | Register Allocation | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | ^ | ^ +----------------------+ +------------------------------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
- There are a handful of
TODO
s in the code that are expected to be addressed in the next iterations. I left them there as a reminder mostly; but if preferred they could be removed and placed into issues instead.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tool to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
As of this change, the high-level structure of the compiler is as follows:
+--------------------------------------+ | | | | | +-----------------------+ | | | | | | | +-----------+ +--------+ | | | | Compiler | ---> | ISA | -+ | | +-----------+ +--------+ | | | | | | | | v | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | Register Allocation | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | ^ | ^ +----------------------+ +------------------------------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Improve error handling
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
- There are a handful of
TODO
s in the code that are expected to be addressed in the next iterations. I left them there as a reminder mostly; but if preferred they could be removed and placed into issues instead.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera has marked PR #4907 as ready for review.
saulecabrera edited PR #4907 from winch-skeleton
to main
:
This change introduces the initial skeleton for Winch, the "baseline" compiler, discussed https://github.com/bytecodealliance/rfcs/pull/28
The skeleton contains the just the necessary code for the main abstractions to support the following:
- Function prologue and epilogue (excluding stack checks)
- Calculation of local slots
- An initial pass on a generic MacroAssembler interface
- An initial pass on a x64 Assembler
- A simple CLI tool to compile WebAssembly programs (
crates/winch/src/main.rs
)- String based emission for programs that use the following WebAssembly instructions:
i32.add
,local.get
,local.set
This change doesn't fully integrate Winch with Wasmtime yet.
Overview
As of this change, the high-level structure of the compiler is as follows:
+--------------------------------------+ | | | | | +-----------------------+ | | | | | | | +-----------+ +--------+ | | | | Compiler | ---> | ISA | -+ | | +-----------+ +--------+ | | | | | | | | v | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | +> | Registers | ABI | CodeGen | --> | Register Allocation | --> | CodeGenContext | --> | Frame | MacroAssembler | Stack | | +-----------+-----+---------+ +---------------------+ +----------------+ +-------+----------------+-------+ | ^ | ^ +----------------------+ +------------------------------------------+
Next steps
As mentioned above, this change doesn't use any of Cranelift's backends for code emission, yet. These are the immediate next steps after landing this change:
- Extract the relevant Cranelift pieces into the
cranelift_asm
crate, which will be used by Winch- Swap the string based implementation of the x64 assembler and use the assembler provided by
cranelift_asm
- Add initial support for arm64 through
cranelift_asm
- Improve error handling
- Increase test coverage, and include tests in CI. Opted to not do that in this change since the emission mechanism will change, so it might be easier to wait and add verification tests once we are using Cranelift's backends.
- There are a handful of
TODO
s in the code that are expected to be addressed in the next iterations. I left them there as a reminder mostly; but if preferred they could be removed and placed into issues instead.
Misc
Winch's CLI can be used by running
cargo run -- path/to/file.wat --target=<target>
Here's a set of WebAssembly programs and their corresponding x64 output:
<details>
<summary>Basic i32 add</summary>(module (export "main" (func $main)) (func $main (result i32) (i32.const 10) (i32.const 20) i32.add) )
push rbp mov rbp, rsp mov eax, 10 add eax, 20 pop rbp ret
</details>
<details>
<summary> i32 add with local operators </summary>(module (export "main" (func $main)) (func $main (result i32) (local $foo i32) (local $bar i32) (i32.const 10) (local.set $foo) (i32.const 20) (local.set $bar) (local.get $foo) (local.get $bar) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov qword [rsp], 0 mov eax, 10 mov [rsp + 4], eax mov ecx, 20 mov [rsp], ecx mov eax, [rsp] mov edx, [rsp + 4] add edx, eax mov rax, rdx add rsp, 8 pop rbp ret
</details>
<details>
<summary>Basic i32 with function arguments </summary>(module (export "main" (func $main)) (func $main (param i32) (param i32) (result i32) (local.get 0) (local.get 1) i32.add ) )
push rbp mov rbp, rsp sub rsp, 8 mov [rsp + 4], edi mov [rsp], esi mov eax, [rsp] mov ecx, [rsp + 4] add ecx, eax mov rax, rcx add rsp, 8 pop rbp ret
</details>
<!--
Please ensure that the following steps are all taken care of before submitting
the PR.
- [ ] This has been discussed in issue #..., or if not, please tell us why
here.- [ ] A short description of what this does, why it is needed; if the
description becomes long, the matter should probably be discussed in an issue
first.- [ ] This PR contains test cases, if meaningful.
- [ ] A reviewer from the core maintainer team has been assigned for this PR.
If you don't know who could review this, please indicate so. The list of
suggested reviewers on the right can help you.Please ensure all communication adheres to the code of conduct.
-->
saulecabrera updated PR #4907 from winch-skeleton
to main
.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
I think there are two different ways we could go here: a
winch
crate in its own subdirectory, ascranelift
is, or awasmtime-winch
crate incrates/winch
with a version number synchronized with the rest of the Wasmtime crates. Given the dependency onwasmtime-environ
below I think it's probably more properly considered part of the Wasmtime family of crates, but if eventually factor that out, then it at least in theory could be its own thing usable separately (in the way that Cranelift is).but tl;dr, a crate in
crates/$NAME
is part of Wasmtime and is named aswasmtime-$NAME
...
cfallin created PR review comment:
Comment here too that this is a temporary dependency?
cfallin created PR review comment:
word_size
should probably be obtained from theMacroAssembler
or other platform-specific trait?
cfallin created PR review comment:
Can we factor out this logic into a
memset
helper on theMacroAssembler
, that either unrolls a short sequence of stores or calls a helper?
cfallin created PR review comment:
I'll note here but applies generally: overall in doc-comments we've tried to stick to capitalized-with-periods-at-end style; many comments here don't end with a
.
.
cfallin created PR review comment:
If we want to break the dependency on Wasmtime, the way to do that would be to have a
crates/winch
crate analogous tocrates/cranelift
(wasmtime-cranelift
) that pulls in the actual Winch toplevel crate (justwinch
, orwinch-codegen
or somesuch) and impls this Wasmtime trait. We can always refactor in that direction later, but if you wanted to do it now I think that could be reasonable as well.
cfallin created PR review comment:
s/Aarch/AArch64/
cfallin created PR review comment:
References to SpiderMonkey are fine here but for the benefit of readers later, links to the source (via searchfox) might be helpful -- so e.g. starting out with "This algorithm is similar to SpiderMonkey's frame layout logic in LINK... The main differences are: ..."
cfallin created PR review comment:
We should assert here that there is only one result type, right? (I think that wasmparser probably accepts multi-value Wasm by default now)
Doing the fully general thing is fine to defer to later, but we'll need to support it at some point; in Cranelift our ABI uses the arg regs (x0-x7 on aarch64) as eight returns and then has a "return value area" on the stack for more than that...
cfallin created PR review comment:
We could also have dynamic settings on the
TargetIsa
and skip the builder abstraction entirely -- in general I'm open to reconsidering complexity here (as more important than exact analogues to Cranelift design).
cfallin created PR review comment:
s/endianess/endianness/ (not a real word or in dictionaries! But that's at least how Wikipedia spells it, in addition to
target_lexicon
)
cfallin created PR review comment:
name should be
aarch64
here?
cfallin created PR review comment:
s/byets/bytes/
cfallin created PR review comment:
A general thought about two-arg vs three-arg forms (I recall having some sort of this conversation before, but don't remember all of our conclusions): would it make sense maybe to define the three-arg form as canonical in the trait, and then add a move here in the x64 implementation (mov dst, src1; add dst, src2)? And skip the move if dst == src1? That to me seems a little more general, as x64 can support it with no reduction in code quality (the move otherwise has to be generated by the translator driving the macroassembler) while it allows aarch64 and other three-arg RISCs to possibly generate better code.
cfallin created PR review comment:
If we move this crate into
winch/codegen/
, then thewinch/
subdirectory can contain a cratewinch-cli
orwinch-tools
, just like Cranelift, andwinch/src/main.rs
is the main entrypoint for awinch-cli
command. I like that a little more than having amain.rs
inside Wasmtime'scrates/
that is unrelated to Wasmtime. Thoughts?
cfallin created PR review comment:
It's worth writing a bit more about how this will extend to cases where we really do want machine-specific lowerings for Wasm ops; SIMD will be a forcing function there, if not before, I think. E.g., do we have a way for the Wasm-op translator to delegate to some machine-specific sequence of calls to masm? Or maybe we just put direct Wasm-op equivalents on MacroAssembler in that case and do the right thing on each platform. (Actually I like the latter approach best, so maybe it's OK that everything goes through this with no other conditionals/hooks.)
cfallin created PR review comment:
Ah, here we go; commented on this above but now that I read this, I'm glad it's already captured! Happy to see this kept as a TODO, if you don't want to do the refactor right away (though I suspect it may be an easy change and easier now than later as the instructions are built out)
cfallin created PR review comment:
Ah, now I understand why two-arg
add
is kind of OK actually: Wasmadd
(and other binops) always consume their args, so it's fine to overwritedst
. That actually makes a lot of sense to me, so maybe it's fine to keep two-arg forms of the masm methods, unless it later hinders lowerings of e.g. SIMD ops or address computations or ... that use temps.
cfallin created PR review comment:
A short doc-comment on the approach would be helpful here -- something about how it's a single-pass allocator, taking registers from a freelist until we run out then spill everything (or something like that)?
cfallin created PR review comment:
I'm wondering if it's possible to avoid the complexity of these macros -- I don't like them for several reasons:
- It duplicates knowledge here (the whole set of Wasm opcodes) unnecessarily;
- It's very unreadable;
- The dispatching mechanism feels somewhat "magic" as a result, in a bad way (I can't trace what's going on).
I think I'd prefer a toplevel translation driver loop that's something like:
for op in operators { match op { Operator::Add => self.visit_add(); ... } }
maybe with that
match
in a handwrittenfn visit()
. In other words, just expand out what this macro would generate by hand.Or said another way: it seems these macros are only used once. Given that, why not expand them in the place that they're used, to reduce complexity?
cfallin created PR review comment:
(Ah, reading further below to the x64 version, I see an assert there, so I think it's just missing here)
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
I'm wondering if it's possible to avoid the complexity of these macros
Yeah, definitely possible. I've addressed this in https://github.com/bytecodealliance/wasmtime/pull/4907/commits/147f901ea630bb39b2a59fa956293c6c232f4e41
FWIW, I like the visitor pattern recently introduced in
wasmparser
, but when I was exploring introducing it I couldn't find a cleaner way to manage the support/unsupported operator sets, which led to duplication. I'd still like to keep the implementation of thevisit
function and each of the emitters in thevisit.rs
module though; I expect that this file is going to get very big as more operators are added.
saulecabrera edited PR review comment.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Given that winch's goal is compilation speed I might caution against going this direction since the
Operator
-based approach to both validation and translation is known to be double-digit percentage slower when at least purely validating and would likely have similar impact to the design of winch as well.I won't disagree that the enum-based approach is clearer to read when only a small handful of opcodes are implemented, but if you'd like I can try to help manage the macro-complexity. For example a reasonable approach might be:
- Use the macro to generate a visitor which recursively calls two other visitors. The first of the two visitors would be the validator itself and the second visitor could be a winch-defined visitor for per-op translations.
- The winch-defined visitor would be defined as usual with manually-written methods per op. You would then need to also maintain a macro with a list of handwritten ops so it could serve as acting as a catch-all for all unimplemented ops to return an error.
Overall I'm hesitant to prioritize readability ergonomics here given the impact on speed. This is all equally applicable to Cranelift-based compilation but the speedup there is likely small-to-unmeasurable given that validation is such a small portion of translation time.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
I think it's reasonable to do this since this is common for all ISAs and will be cleaner if we put this in the masm. A few clarification questions:
- With this logic you mean the entire piece to zero out a locals range?
- By helper, is your idea to have this in the masm module, but not necessarily part of the trait?
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Thanks for chiming in here @alexcrichton; I wasn't aware of the speed benefits; I wrongly assumed the change was mostly around ergonomics. But taking a closer look at the details and some of the comments, your comment makes sense to me.
I'll try out your suggestion, hopefully it'll be less complex that my previous attempt.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
I do realize that macros are nontrivial to work with for sure though and I'm still not 100% satisfied with where
wasmparser
ended up. If you'd like some help I'm happy to assist, or additionally at this point getting something working is probably more prudent than the highest levels of optimization, so I'm happy to try to come work on this small part as a follow-up if you'd prefer too
cfallin submitted PR review.
cfallin created PR review comment:
Given that winch's goal is compilation speed I might caution against going this direction since the Operator-based approach to both validation and translation is known to be double-digit percentage slower when at least purely validating and would likely have similar impact to the design of winch as well.
Ah, OK, I think we're slightly talking past each other, and my hand expansion into a
match
was based on my misunderstanding of what this was generating (sorry!). My main point though wasn't about the actual matching/control-flow strategy, but about the way we generate it.Basically my point can be reduced to: there are some macros here; each macro is invoked once; can we inline the macros' expansion at that point instead? (Since this doesn't change what eventually gets compiled into the winch crate, it should be completely performance-neutral.) It didn't seem like the redundancy that the macros reduced (method args, ..) was that significant compared to the complexity of having the macros at all. And my confusion on what they were actually doing is only slightly more evidence of (either my denseness or...) the confusion imparted by macro use :-)
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Basically my point can be reduced to: there are some macros here; each macro is invoked once; can we inline the macros' expansion at that point instead?
I think that in this case, the approach would be to manually define each method expected by the visitor trait, and error out on the ones that are not supported. This satisfies the readability and the performance aspects, at the cost of having to manually define a method per operator. I'd would still like to explore if there's a middle ground in which we are able to use macros without hindering too much the readability, (e.g. following Alex's suggestion of only having a single list of unsupported operators) and perhaps the complexity can be reduced further with comments? And, thinking a bit more, if we go with this approach, as we support more and more operators, this "unsupported" list is going to shrink over time.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera created PR review comment:
Yeah this is a good idea, I've completed this and I'm happy how things are shaping up.
saulecabrera submitted PR review.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
I've tackled this too, I think that it's better to do this earlier rather than later.
saulecabrera created PR review comment:
I've tackled this as part of restructuring the multiple Winch crates, so now we have this tool in it's own directory, not related with Wasmtime at all.
ref: https://github.com/bytecodealliance/wasmtime/commit/3c6f23acb5b992fe936ec4bd9dd28f8077bd8580
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Had a chat with Alex (thanks for the help Alex!) in which we were able to simplify the macro. @cfallin let me know if you have further thoughts on this; I'd like to keep a macro if possible, assuming that it will temporary as more operators are supported. If not I'm happy to manually define all the Visitor traits in a follow up PR. In general I feel that this is cleaner than my previous attempt.
saulecabrera submitted PR review.
cfallin created PR review comment:
Right, the entire bit to zero a range of memory; and I think it'd be pretty reasonable to have it as a trait method on
MacroAssembler
that has a default body (which does some basic lowering to either unrolled stores or a loop that will work on any architecture). That also leaves room for more efficient implementations per machine later (e.g.rep movsb
on x86 machines with ERMS where that's fast, or something using SIMD, or ...).
cfallin submitted PR review.
saulecabrera edited PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
OK, yeah, I'm happy with how this looks -- thank you for iterating on this!
saulecabrera edited PR review comment.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera created PR review comment:
Makes sense to me; I've factored this out into a method in the trait with a default implementation. I've also made the code a bit simpler.
saulecabrera submitted PR review.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Yes, it's actually obtained from each ISA's ABI. I've modified this to make more obvious, so instead of passing in a u32 to the
CodeGen
constructor I'm passing in theABI
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera edited PR review comment.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Done, thanks! Sorry for not noticing this earlier!
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera created PR review comment:
That's indeed something that came to mind when looking at this piece of code. I didn't go really deep on this idea though; but I'm going to keep in mind for whe we need to introduce this for Winch.
saulecabrera submitted PR review.
saulecabrera edited PR review comment.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Added, thanks!
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera created PR review comment:
I recall having some sort of this conversation before, but don't remember all of our conclusions
Yeah we concluded that three-arg form is probably better and that we should go with it. My original intention was to do this in the next iteration once I add some actual codegen for
aarch64
but while doing the refactoring I took care of it.
saulecabrera submitted PR review.
saulecabrera created PR review comment:
Agreed, I've added this, thanks!
saulecabrera submitted PR review.
saulecabrera requested cfallin for a review on PR #4907.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
cfallin submitted PR review.
cfallin submitted PR review.
cfallin created PR review comment:
Here we need an assert that storing just one i32 will bump the start address up to
word_size
-aligned, right? (I guess one can show this is always the case on 64-bit systems; if not 8-byte aligned, but 4-byte aligned, one can be at most 4 bytes off; but that conclusion isn't generic to word size so I'd rather at least assert it here.)
saulecabrera updated PR #4907 from winch-skeleton
to main
.
saulecabrera created PR review comment:
Fixed; and yeah good catch; definitely I had assumed what you pointed out, that it's always the case in 64-bit systems.
saulecabrera submitted PR review.
saulecabrera updated PR #4907 from winch-skeleton
to main
.
cfallin submitted PR review.
cfallin merged PR #4907.
Last updated: Nov 22 2024 at 16:03 UTC