Stream: git-wasmtime

Topic: wasmtime / issue #9402 Move constant pools in compiled co...


Wasmtime GitHub notifications bot (Oct 08 2024 at 21:23):

alexcrichton opened issue #9402:

Today Cranelift's constant pools are located in the .text section of the executable, typically right after the function itself. While convenient for code generation, this exposes a possible attack vector in Wasmtime because it makes it trivial to put a "gadget" somewhere in executable memory. For example, using a sequence of v128.const instructions it would be pretty easy to assemble "machine code" at the end of a function. In the face of a bug in Cranelift, this could make it easier to amplify that bug into a sandbox escape.

As a defense-in-depth measure we should try to move the constant pools out of the .text section and into a .data or otherwise read-only section (not writable or executable). This won't be trivial to do because relocations from the text section would point at the data section, and the relocation range may not always be large enough to span the entire text section. Regardless, I wanted to file an issue about this idea.
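To make the range concern concrete, here is a minimal Rust sketch of how a backend might decide which AArch64 addressing form can reach a constant. `ConstAccess` and `pick_access` are hypothetical names for illustration, not Cranelift APIs. The limits are the architectural ones: `ldr` (literal) encodes a signed 19-bit word offset (roughly ±1 MiB), and `adrp` encodes a signed 21-bit offset in 4 KiB pages (roughly ±4 GiB).

```rust
/// Addressing forms an AArch64 backend could use to reach a constant.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum ConstAccess {
    /// Single `ldr` (literal): signed 19-bit offset scaled by 4.
    LdrLiteral,
    /// `adrp` plus a load with a low-12-bit offset.
    AdrpLdr,
    /// Neither form can reach the target.
    OutOfRange,
}

/// Both limits happen to be 2^20: bytes for `ldr` literal, pages for `adrp`.
const LIMIT: i64 = 1 << 20;

/// Picks the shortest form that can reach `target` from the instruction at `pc`.
fn pick_access(pc: u64, target: u64) -> ConstAccess {
    // Byte displacement; `ldr` literal also needs a 4-byte-aligned offset.
    let delta = target.wrapping_sub(pc) as i64;
    if delta % 4 == 0 && (-LIMIT..LIMIT).contains(&delta) {
        return ConstAccess::LdrLiteral;
    }
    // `adrp` works on 4 KiB page numbers rather than byte addresses.
    let page_delta = (target >> 12) as i64 - (pc >> 12) as i64;
    if (-LIMIT..LIMIT).contains(&page_delta) {
        return ConstAccess::AdrpLdr;
    }
    ConstAccess::OutOfRange
}
```

Under these assumptions, a constant 64 bytes past the instruction can use the literal form, while one exactly 1 MiB away already needs the `adrp`-based form.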

Wasmtime GitHub notifications bot (Oct 08 2024 at 21:30):

cfallin commented on issue #9402:

It would definitely be nice to have support for this -- in principle we could return two blobs of bytes as the result of per-function compilation instead of one, and have a relocation type that is "offset from start of this function's code to start of this function's constants".
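A rough Rust sketch of that idea, under stated assumptions: every type and function name below (`CompiledFunction`, `ConstReloc`, `layout`, and so on) is hypothetical and not taken from Cranelift. Each function yields a code blob and a constants blob, and each relocation records a pair of offsets into those two blobs. A later layout step concatenates the blobs into separate sections and resolves each relocation to absolute addresses. Alignment padding is deliberately ignored here.

```rust
/// Hypothetical relocation: a reference from an instruction in a function's
/// code blob to a constant in the same function's constants blob.
struct ConstReloc {
    /// Offset of the referencing instruction within `code`.
    code_offset: u32,
    /// Offset of the referenced constant within `constants`.
    const_offset: u32,
}

/// Hypothetical per-function result: two blobs instead of one.
struct CompiledFunction {
    code: Vec<u8>,
    constants: Vec<u8>,
    const_relocs: Vec<ConstReloc>,
}

/// A relocation resolved to absolute addresses after layout.
#[derive(Debug, PartialEq)]
struct ResolvedRef {
    from: u64,
    to: u64,
}

struct Layout {
    text: Vec<u8>,
    rodata: Vec<u8>,
    refs: Vec<ResolvedRef>,
}

/// Appends every function's code to one section and its constants to
/// another, resolving relocations as it goes. No alignment is applied.
fn layout(funcs: &[CompiledFunction], text_base: u64, rodata_base: u64) -> Layout {
    let mut out = Layout { text: Vec::new(), rodata: Vec::new(), refs: Vec::new() };
    for f in funcs {
        let code_start = text_base + out.text.len() as u64;
        let const_start = rodata_base + out.rodata.len() as u64;
        for r in &f.const_relocs {
            out.refs.push(ResolvedRef {
                from: code_start + u64::from(r.code_offset),
                to: const_start + u64::from(r.const_offset),
            });
        }
        out.text.extend_from_slice(&f.code);
        out.rodata.extend_from_slice(&f.constants);
    }
    out
}
```

A real implementation would then patch the instruction at each `from` address with the displacement to `to`, which is where the relocation-range question comes in.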

Out of curiosity, do you happen to know how ld handles .rodata references today for very large aarch64/riscv64/... binaries? I wonder if it uses its support for relaxation (assuming the most pessimistic, longest-range sequence, then shrinking if able) -- it'd be unfortunate to have to use adrp/add/ldr rather than the immediate-pcrel form of ldr for every constant. I'm not able to find anything on this at the moment...

Wasmtime GitHub notifications bot (Oct 08 2024 at 22:52):

alexcrichton commented on issue #9402:

That's an excellent question, and one I don't know the answer to myself. I can try to play around with an assembler though and see what happens perhaps!

Wasmtime GitHub notifications bot (Oct 08 2024 at 23:13):

cfallin commented on issue #9402:

I tried briefly to trigger something interesting, but got stuck trying to get clang (on macOS/aarch64) to use the short-form LDR-with-immediate instruction; for any load from rodata it seems to use an adrp/add pair.

For example with (separate files to avoid a neat optimization where clang const-folds the load of constant data):

% cat test.c
extern const char* s;

int foo() {
    return *((int*)s);
}

% cat data.c
const char* s = "1234";

% cat main.c
#include <stdio.h>
extern int foo();
int main() {
    printf("%d\n", foo());
}

I see _foo's body as

0000000100003f44 <_foo>:
100003f44: b0000028     adrp    x8, 0x100008000 <_s>
100003f48: 91000108     add     x8, x8, #0x0
100003f4c: f9400108     ldr     x8, [x8]
100003f50: b9400100     ldr     w0, [x8]
100003f54: d65f03c0     ret

Maybe it wouldn't be so bad to unconditionally emit that form actually; loads from constant pools will be relatively rare. It does burn a register though to compute the address.

Wasmtime GitHub notifications bot (Oct 09 2024 at 00:00):

alexcrichton commented on issue #9402:

Good point!

Looks like

#[no_mangle]
pub extern "C" fn foo() -> f64 {
    1.3484
}

generates

.LCPI0_0:
        .xword  0x3ff5930be0ded289
foo:
        adrp    x8, .LCPI0_0
        ldr     d0, [x8, :lo12:.LCPI0_0]
        ret

So yeah, it looks like we may want to be a tiny bit clever (don't always "just" materialize the address into a register; fold the low 12 bits into the load itself). Otherwise, solving this issue would involve always doing adrp on aarch64 and the equivalent on riscv64.
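For reference, the arithmetic behind an `adrp` plus `:lo12:` pair can be sketched in Rust as follows. The function names `adrp_lo12` and `resolve` are hypothetical. `adrp` adds a signed 21-bit page delta to the 4 KiB page of the current instruction, and the load (or `add`) supplies the low 12 bits of the target address. Any scaling of the low bits by the access size that a load encoding requires is left out of this sketch.

```rust
/// Splits a reference from `pc` to `target` into the `adrp` page delta and
/// the low 12 bits that a following `add` or load would supply.
/// Returns `None` if the page delta does not fit in a signed 21-bit field.
fn adrp_lo12(pc: u64, target: u64) -> Option<(i64, u16)> {
    const LIMIT: i64 = 1 << 20; // signed 21-bit range, in 4 KiB pages
    let page_delta = (target >> 12) as i64 - (pc >> 12) as i64;
    if !(-LIMIT..LIMIT).contains(&page_delta) {
        return None;
    }
    Some((page_delta, (target & 0xfff) as u16))
}

/// Recomputes the address that the pair produces at run time.
fn resolve(pc: u64, page_delta: i64, lo12: u16) -> u64 {
    (pc & !0xfff)
        .wrapping_add((page_delta << 12) as u64)
        .wrapping_add(u64::from(lo12))
}
```

Applied to the `_foo` disassembly earlier in the thread, `adrp_lo12(0x100003f44, 0x100008000)` gives a page delta of 5 and a low-12 value of 0, which is consistent with the `add x8, x8, #0x0` that follows the `adrp` there.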

Wasmtime GitHub notifications bot (Oct 10 2024 at 22:54):

alexcrichton added the cranelift:area:security label to Issue #9402.


Last updated: Nov 22 2024 at 16:03 UTC