akirilov-arm opened PR #17 from cfi_improvements
to main
:
This RFC proposes to improve control flow integrity for compiled WebAssembly code by utilizing two technologies from the Arm instruction set architecture - Pointer Authentication and Branch Target Identification.
cfallin submitted PR review.
cfallin created PR review comment:
It might be good to understand what other userspace context-switching implementations do here: do e.g.
setjmp()
/longjmp()
on PAuth-enabled systems somehow sign the saved state?
cfallin submitted PR review.
cfallin created PR review comment:
x86 has
ENDBR
instructions which I think are the equivalent of aarch64'sBTI
. It also has the ability to use a shadow stack to guard return addresses; I think that pointer auth covers this on aarch64 (?).Of course I don't think we need to require more than one platform implementation initially; but it's good to see that it should be pretty easy to adapt/port whatever is done here to x86.
(see e.g. this presentation for some details; I'm just learning about this too)
cfallin created PR review comment:
My instinct here is that the overhead (performance and/or code size) is probably non-negligible enough that we'll want this to be a configuration option; the overhead in comparison of testing with and without seems relatively minor (and maybe we don't need to fully double the tests; needs more thought).
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
The concern here is mainly about environments that do not support PAuth and/or BTI, which are going to see extra
NOP
s - we should be able to quantify this right now, assuming that we have a good set of benchmarks. Perhaps Sightglass could help?
akirilov-arm created PR review comment:
Yes,
ENDBR
seems quite similar to BTI, however from the presentation it is not clear to me if it can be used in the same fine-grained way, i.e. the usage of BTI can be controlled on a per-page basis, so it is possible to have a JIT runtime that generates hardened code, for instance, while the runtime implementation itself does not use the same protection, thus easing deployment for us.While the x86 shadow stack and PAuth have the same goal (back-edge CFI), they are quite different. It seems that the shadow stack will have a minimal impact on code generation (indeed, probably none at all), while unwinding will become more complicated.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
Well, here are Apple's recommendations:
setjmp
andlongjmp
should sign and authenticate the core registers (SP, FP, PC, and LR), but they do not need to worry about intermediate values becausesetjmp
can only be called synchronously, and the compiler should never schedule pointer-authentication operations interleaved with arbitrary calls.However, a general remark I have is that the users of
setjmp()
, usually operating in a traditional statically compiled environment without any intra-process security boundaries, do not necessarily have the same threat model as a WebAssembly runtime that manages several security contexts within the same process, each of which executes arbitrary untrusted code. I think that the document I linked to above and indeed the implementations one is likely to encounter in C libraries is not written with the latter use case in mind.
akirilov-arm submitted PR review.
akirilov-arm edited PR review comment.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
FWIW I think @cfallin's question is inspired by how we use setjmp/longjmp in Wasmtime to implement traps as well as for the
wasmtime-fiber
crate and the async implementation we have userland stack-switching code, so it sounds like we'll probably need to at least have some updates to the aarch64 implementation ofwasmtime-fiber
but setjmp/longjmp would otherwise be handled by the system libraries?
akirilov-arm updated PR #17 from cfi_improvements
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
I added a note about the x86 alternative to the proposal text.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
I investigated this issue a bit, and I have some further remarks:
... it sounds like we'll probably need to at least have some updates to the aarch64 implementation of
wasmtime-fiber
...Actually we do not need to change anything to make things work (refer to bytecodealliance/wasmtime#3606), but that would leave a gap in the CFI implementation.
... but setjmp/longjmp would otherwise be handled by the system libraries?
Yes, that is correct, and it looks like on Linux the system libraries do not do anything except maybe authenticate the return address, but I do not think that it is by design - maybe it is just an oversight.
As for Apple's recommendation, it appears that I was partially right - it is written in the context of their pointer authentication ABI variant as far as I can tell, which mandates that all pointers be signed immediately after being generated, so no raw pointers would be kept in registers (or writable memory). In that situation it makes sense that only the set of core registers mentioned above needs special handling. We would not have the same guarantee in our stack switching code initially, so we should consider something more sophisticated, but we could start with the same approach, especially since it should be easier to implement - it could be done with just the
NOP
versions of the instructions, I think.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
... we could start with the same approach, especially since it should be easier to implement - it could be done with just the
NOP
versions of the instructions, I think.Now bytecodealliance/wasmtime#3606 has an initial implementation of this bit as well.
akirilov-arm updated PR #17 from cfi_improvements
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
I have changed the text, so that now the usage of both PAuth and BTI is disabled by default, and the values of the respective configuration options are decoupled from the capabilities of the host CPU.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
And the proposal text reflects it as well.
bjorn3 submitted PR review.
PiotrSikora submitted PR review.
radu-matei submitted PR review.
akirilov-arm merged PR #17.
Last updated: Dec 23 2024 at 12:05 UTC