akirilov-arm edited PR #3693 from bti
to main
:
This pull request is meant to illustrate the RFC proposal to improve control flow integrity for compiled WebAssembly code by using the Branch Target Identification extension to the Arm instruction set architecture (bytecodealliance/rfcs#17), so it is not ready to be merged yet.
akirilov-arm edited PR #3693 from bti
to main
:
This pull request is meant to illustrate the RFC proposal to improve control flow integrity for compiled WebAssembly code by using the Branch Target Identification extension to the Arm instruction set architecture (bytecodealliance/rfcs#17), so it is not ready to be merged yet.
P.S. The RFC proposal has now been merged, and the changes in this PR have been updated the reflect the final version of the proposal, so they are now ready.
akirilov-arm edited PR #3693 from bti
to main
:
This pull request is meant to illustrate the RFC proposal to improve control flow integrity for compiled WebAssembly code by using the Branch Target Identification (BTI) extension to the Arm instruction set architecture (bytecodealliance/rfcs#17), so it is not ready to be merged yet.
P.S. The RFC proposal has now been merged, and the changes in this PR have been updated the reflect the final version of the proposal, so they are now ready.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
bjorn3 submitted PR review.
bjorn3 created PR review comment:
The bti feature being detected doesn't necessarily mean that the kernel supports it, right? Maybe you could try using
PROT_BTI
and if that fails try again without?
bjorn3 submitted PR review.
bjorn3 created PR review comment:
This is somewhat unfortunate code duplication.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
No, that's exactly what it means. The reason is that the 64-bit Arm architecture does not provide any mechanism to unprivileged software, i.e. userspace, to detect the features supported by the processor while bypassing the kernel (or more privileged software); indeed, it does not provide such a mechanism at all (irrespective of the kernel involvement), leaving it to the software environment to define one.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm requested alexcrichton for a review on PR #3693.
akirilov-arm requested cfallin for a review on PR #3693.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Could this use
io::Error::last_os_error()
as well to forward along the original error message?
alexcrichton created PR review comment:
Could this perhaps move to a method on
Engine
?
alexcrichton created PR review comment:
Instead of ferrying around this
bool
as a return value from this function could theis_branch_protection_enabled
testing go as a method onEngine
? Most contexts that are being given thisbool
already have theEngine
so they should be able to query the engine itself for this
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Could this perhaps be unconditionally done? I would expect that the detection of the
bti
feature would happen at a higher level than this (e.g. at the WasmtimeConfig
layer) whereenable_branch_protection
is onlytrue
if the machine actually hasbti
.Although I think it's ok to still have
cfg!(target_arch = "aarch64")
in the condition since this is still arm64-specific.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
Could this perhaps be unconditionally done? I would expect that the detection of the
bti
feature would happen at a higher level than this (e.g. at the WasmtimeConfig
layer) whereenable_branch_protection
is onlytrue
if the machine actually hasbti
.My other comment provides a bit of context, but I didn't touch the higher-level layers on purpose, so that we support as much flexibility as possible, e.g. compile ahead-of-time a module with BTI on a machine that does not support BTI, or load a module that has been compiled ahead-of-time without BTI, while the engine is set to use BTI. I think that we will need to be more careful if we are making these decisions at a higher level - for instance, distinguish between
wasmtime compile
andwasmtime run
.
akirilov-arm requested alexcrichton for a review on PR #3693.
alexcrichton submitted PR review.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Given that
isa_flags
always come from theEngine
itself could this parameter be dropped and internally this usesself.compiler().isa_flags()
?
alexcrichton created PR review comment:
Could this be dropped here since it can be inferred from the
&Engine
argument?
alexcrichton created PR review comment:
Could this perhaps still fall through to
rustix
below but useMprotectFlags::from_u32
until there's a constant forPROT_BTI
?
alexcrichton created PR review comment:
Since this an be inferred from
&Engine
-the-argument could this return value be dropped?
alexcrichton created PR review comment:
Also if it's a perf concern it might be prudent to add
engine.compiler().is_branch_protection_enabled(..)
sinceisa_flags()
returns a whole newBTreeMap
each time it's called I think. Or otherwise watch it get enabled onConfig
and ferry through thatbool
to here or similar.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
There is one case in which the flags come from a previously compiled module - line 230 in
crates/wasmtime/src/module/serialization.rs
. However, I can create a wrapper function, sayis_branch_protection_enabled_globally()
, that simply calls this one with a fixed parameter,&engine.compiler().isa_flags()
, and change most call sites to invoke it instead.
akirilov-arm created PR review comment:
See my comment about line 540 in
crates/wasmtime/src/engine.rs
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
This is a bit tricky, but the reason is the cache implementation on line 291, and in particular the call to
SerializedModule::into_parts()
on line 314, which, given the declaration ofModuleCacheEntry::get_data_raw()
, enforces a particular function signature forModule::build_artifacts()
. I tried changing the closure on line 311, so that it didn't return whateverSerializedModule::into_parts()
's return value was directly by defining a mutable flagenable_branch_protection
insideModule::from_binary()
, but then the Rust compiler complained that it could not accept a closure that captured variables in that location.
akirilov-arm submitted PR review.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
@alexcrichton I am not too happy about this part, but I tried separating the parsing of the ELF image and the changing of the memory protection done by
CodeMemory::publish()
and I ran into various borrow checker and lifetime issues that seemed beyond my Rust skills, so I went ahead with this suboptimal, but nevertheless correct solution.
alexcrichton submitted PR review.
alexcrichton created PR review comment:
Oh dear, I'll work on improving this with some refactorings but otherwise this PR looks good to me, thanks for the updates!
alexcrichton edited PR review comment.
alexcrichton created PR review comment:
Ah I ended up with https://github.com/akirilov-arm/wasmtime/pull/1 which was easier than I thought
alexcrichton submitted PR review.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
Yeah, parsing the ELF header twice definitely helps.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm edited PR review comment.
cfallin submitted PR review.
cfallin created PR review comment:
Rather than take a single
bool
here, can we take an options struct? It may seem overkill for one flag butenable_branch_protection
would not be my first guess for the meaning of a barefalse
hanging out in aMemory::new
call :-)
cfallin submitted PR review.
cfallin created PR review comment:
This logic feels a little bit brittle to me. If I understand correctly, it is determining what the "real" CLIF-level block for this target is meant to be, even if we have inserted an edge block, so that we can tell if any indirect branch might reach it; is that right?
If so, I think what we really should have is a set of lowered block indices reachable via an indirect branch, computed by the lowering-order code. We can compute this when we traverse the lowered-block's successors if we know that we're visiting out-edges of an indirect branch, I think.
cfallin created PR review comment:
Can we add a comment here noting what the
bool
arg to the closure is for?
akirilov-arm updated PR #3693 from bti
to main
.
akirilov-arm submitted PR review.
akirilov-arm created PR review comment:
I decided to define a new
enum
and pass it instead. I noticed that Windows defined some memory protection flags that were related to Control Flow Guard, its own CFI implementation, and this approach seems like a good way to leave the possibility of adding support for them in the future (if we decide to do so), while keeping the current implementation simple.
akirilov-arm updated PR #3693 from bti
to main
.
cfallin submitted PR review.
alexcrichton submitted PR review.
alexcrichton merged PR #3693.
Last updated: Jan 24 2025 at 00:11 UTC