Stream: git-wasmtime

Topic: wasmtime / Issue #1728 Cranelift: improper handling of sh...


view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:28):

peterhuene opened Issue #1728:

repro.clif:

set opt_level=speed_and_size
set is_pic
target x86_64 haswell

function %foo() windows_fastcall {
    fn0 = %bar(i64)
    ss0 = explicit_slot 8
block0:
    v1 = stack_addr.i64 ss0
    call fn0(v1)
    return
}
$ clif-util compile -D repro.clif

Actual output:

.byte 85, 72, 137, 229, 87, 72, 131, 236, 40, 72, 141, 132, 36, 0, 0, 0, 0, 72, 137, 199, 232, 0, 0, 0, 0, 72, 131, 196, 40, 95, 93, 195

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   57                      push    rdi
   5:   48 83 ec 28             sub     rsp, 0x28
   9:   48 8d 84 24 00 00 00 00 lea     rax, [rsp]
  11:   48 89 c7                mov     rdi, rax
  14:   e8 00 00 00 00          call    0x19
  19:   48 83 c4 28             add     rsp, 0x28
  1d:   5f                      pop     rdi
  1e:   5d                      pop     rbp
  1f:   c3                      ret

Note the instruction at offset 9 that is loading the address of the explicit stack slot: lea rax, [rsp]. The address loaded is actually inside the callee's 32 bytes of shadow space.

I expect this instruction to actually account for the calling convention's shadow space: lea rax, [rsp+0x20].

Because of how we're currently saving callee-saved FPRs, we're actually saving the FPRs into this area of the stack as well, meaning that the callee might trash the caller's saved FPRs as they're allowed to use this space freely.

Because Cranelift itself doesn't make use of the caller-provided shadow space, this won't happen for Cranelift-to-Cranelift calls; however, calling into external functions will certainly make use of the shadow space.

26e06297957bbc06b3244bc96d2e92b4246e8d9b

Rust: 1.43.1
Operating system: Arch Linux
Arch: x86-64

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:28):

peterhuene labeled Issue #1728:

repro.clif:

set opt_level=speed_and_size
set is_pic
target x86_64 haswell

function %foo() windows_fastcall {
    fn0 = %bar(i64)
    ss0 = explicit_slot 8
block0:
    v1 = stack_addr.i64 ss0
    call fn0(v1)
    return
}
$ clif-util compile -D repro.clif

Actual output:

.byte 85, 72, 137, 229, 87, 72, 131, 236, 40, 72, 141, 132, 36, 0, 0, 0, 0, 72, 137, 199, 232, 0, 0, 0, 0, 72, 131, 196, 40, 95, 93, 195

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   57                      push    rdi
   5:   48 83 ec 28             sub     rsp, 0x28
   9:   48 8d 84 24 00 00 00 00 lea     rax, [rsp]
  11:   48 89 c7                mov     rdi, rax
  14:   e8 00 00 00 00          call    0x19
  19:   48 83 c4 28             add     rsp, 0x28
  1d:   5f                      pop     rdi
  1e:   5d                      pop     rbp
  1f:   c3                      ret

Note the instruction at offset 9 that is loading the address of the explicit stack slot: lea rax, [rsp]. The address loaded is actually inside the callee's 32 bytes of shadow space.

I expect this instruction to actually account for the calling convention's shadow space: lea rax, [rsp+0x20].

Because of how we're currently saving callee-saved FPRs, we're actually saving the FPRs into this area of the stack as well, meaning that the callee might trash the caller's saved FPRs as they're allowed to use this space freely.

Because Cranelift itself doesn't make use of the caller-provided shadow space, this won't happen for Cranelift-to-Cranelift calls; however, calling into external functions will certainly make use of the shadow space.

26e06297957bbc06b3244bc96d2e92b4246e8d9b

Rust: 1.43.1
Operating system: Arch Linux
Arch: x86-64

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:28):

peterhuene labeled Issue #1728:

repro.clif:

set opt_level=speed_and_size
set is_pic
target x86_64 haswell

function %foo() windows_fastcall {
    fn0 = %bar(i64)
    ss0 = explicit_slot 8
block0:
    v1 = stack_addr.i64 ss0
    call fn0(v1)
    return
}
$ clif-util compile -D repro.clif

Actual output:

.byte 85, 72, 137, 229, 87, 72, 131, 236, 40, 72, 141, 132, 36, 0, 0, 0, 0, 72, 137, 199, 232, 0, 0, 0, 0, 72, 131, 196, 40, 95, 93, 195

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   57                      push    rdi
   5:   48 83 ec 28             sub     rsp, 0x28
   9:   48 8d 84 24 00 00 00 00 lea     rax, [rsp]
  11:   48 89 c7                mov     rdi, rax
  14:   e8 00 00 00 00          call    0x19
  19:   48 83 c4 28             add     rsp, 0x28
  1d:   5f                      pop     rdi
  1e:   5d                      pop     rbp
  1f:   c3                      ret

Note the instruction at offset 9 that is loading the address of the explicit stack slot: lea rax, [rsp]. The address loaded is actually inside the callee's 32 bytes of shadow space.

I expect this instruction to actually account for the calling convention's shadow space: lea rax, [rsp+0x20].

Because of how we're currently saving callee-saved FPRs, we're actually saving the FPRs into this area of the stack as well, meaning that the callee might trash the caller's saved FPRs as they're allowed to use this space freely.

Because Cranelift itself doesn't make use of the caller-provided shadow space, this won't happen for Cranelift-to-Cranelift calls; however, calling into external functions will certainly make use of the shadow space.

26e06297957bbc06b3244bc96d2e92b4246e8d9b

Rust: 1.43.1
Operating system: Arch Linux
Arch: x86-64

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:28):

peterhuene assigned Issue #1728:

repro.clif:

set opt_level=speed_and_size
set is_pic
target x86_64 haswell

function %foo() windows_fastcall {
    fn0 = %bar(i64)
    ss0 = explicit_slot 8
block0:
    v1 = stack_addr.i64 ss0
    call fn0(v1)
    return
}
$ clif-util compile -D repro.clif

Actual output:

.byte 85, 72, 137, 229, 87, 72, 131, 236, 40, 72, 141, 132, 36, 0, 0, 0, 0, 72, 137, 199, 232, 0, 0, 0, 0, 72, 131, 196, 40, 95, 93, 195

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   57                      push    rdi
   5:   48 83 ec 28             sub     rsp, 0x28
   9:   48 8d 84 24 00 00 00 00 lea     rax, [rsp]
  11:   48 89 c7                mov     rdi, rax
  14:   e8 00 00 00 00          call    0x19
  19:   48 83 c4 28             add     rsp, 0x28
  1d:   5f                      pop     rdi
  1e:   5d                      pop     rbp
  1f:   c3                      ret

Note the instruction at offset 9 that is loading the address of the explicit stack slot: lea rax, [rsp]. The address loaded is actually inside the callee's 32 bytes of shadow space.

I expect this instruction to actually account for the calling convention's shadow space: lea rax, [rsp+0x20].

Because of how we're currently saving callee-saved FPRs, we're actually saving the FPRs into this area of the stack as well, meaning that the callee might trash the caller's saved FPRs as they're allowed to use this space freely.

Because Cranelift itself doesn't make use of the caller-provided shadow space, this won't happen for Cranelift-to-Cranelift calls; however, calling into external functions will certainly make use of the shadow space.

26e06297957bbc06b3244bc96d2e92b4246e8d9b

Rust: 1.43.1
Operating system: Arch Linux
Arch: x86-64

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:28):

github-actions[bot] commented on Issue #1728:

Subscribe to Label Action

cc @bnjbvr

<details>
This issue or pull request has been labeled: "cranelift"

Thus the following users have been cc'd because of the following labels:

To subscribe or unsubscribe from this label, edit the <code>.github/subscribe-to-label.json</code> configuration file.

Learn more.
</details>

view this post on Zulip Wasmtime GitHub notifications bot (May 19 2020 at 22:34):

peterhuene commented on Issue #1728:

Note that I'm currently in the process of fixing where we store callee saved FPRs as they should not be stored as explicit slots.

view this post on Zulip Wasmtime GitHub notifications bot (May 22 2020 at 19:06):

peterhuene closed Issue #1728 (assigned to peterhuene):

repro.clif:

set opt_level=speed_and_size
set is_pic
target x86_64 haswell

function %foo() windows_fastcall {
    fn0 = %bar(i64)
    ss0 = explicit_slot 8
block0:
    v1 = stack_addr.i64 ss0
    call fn0(v1)
    return
}
$ clif-util compile -D repro.clif

Actual output:

.byte 85, 72, 137, 229, 87, 72, 131, 236, 40, 72, 141, 132, 36, 0, 0, 0, 0, 72, 137, 199, 232, 0, 0, 0, 0, 72, 131, 196, 40, 95, 93, 195

Disassembly of 32 bytes:
   0:   55                      push    rbp
   1:   48 89 e5                mov     rbp, rsp
   4:   57                      push    rdi
   5:   48 83 ec 28             sub     rsp, 0x28
   9:   48 8d 84 24 00 00 00 00 lea     rax, [rsp]
  11:   48 89 c7                mov     rdi, rax
  14:   e8 00 00 00 00          call    0x19
  19:   48 83 c4 28             add     rsp, 0x28
  1d:   5f                      pop     rdi
  1e:   5d                      pop     rbp
  1f:   c3                      ret

Note the instruction at offset 9 that is loading the address of the explicit stack slot: lea rax, [rsp]. The address loaded is actually inside the callee's 32 bytes of shadow space.

I expect this instruction to actually account for the calling convention's shadow space: lea rax, [rsp+0x20].

Because of how we're currently saving callee-saved FPRs, we're actually saving the FPRs into this area of the stack as well, meaning that the callee might trash the caller's saved FPRs as they're allowed to use this space freely.

Because Cranelift itself doesn't make use of the caller-provided shadow space, this won't happen for Cranelift-to-Cranelift calls; however, calling into external functions will certainly make use of the shadow space.

26e06297957bbc06b3244bc96d2e92b4246e8d9b

Rust: 1.43.1
Operating system: Arch Linux
Arch: x86-64


Last updated: Nov 22 2024 at 17:03 UTC