Stream: git-wasmtime

Topic: wasmtime / PR #1930 Spectre mitigation on heap access ove...


view this post on Zulip Wasmtime GitHub notifications bot (Jun 26 2020 at 21:08):

cfallin opened PR #1930 from spectre-heap to main:

This PR adds a conditional move following a heap bounds check through
which the address to be accessed flows. This conditional move ensures
that even if the branch is mispredicted (the access is actually out of
bounds, but speculation goes down the in-bounds path), the actually
accessed address is zero (a NULL pointer) rather than the out-of-bounds
address.

The technique used is well-known and fairly standard: see, e.g., SpiderMonkey's heap access bounds check (thanks to @bnjbvr for locating this).
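
A minimal sketch of that pattern, with illustrative names only (this is not Cranelift's actual lowering): the bounds check produces a flag, and a select driven by that flag picks between the computed address and zero, so even a mispredicted branch dereferences NULL rather than an attacker-chosen address.

```rust
// Sketch of the cmov-based mitigation; every name here is illustrative.
fn guarded_heap_addr(heap_base: usize, heap_bound: u64, offset: u32, access_size: u32) -> usize {
    // Bounds check: does the access end past the heap bound?
    let end = offset as u64 + access_size as u64;
    let oob = end > heap_bound;

    // Candidate address, computed unconditionally.
    let addr = heap_base.wrapping_add(offset as usize);

    // Conceptually `select oob, 0, addr` in the IR, lowered to a cmov/csel:
    // a data dependency, not the branch outcome, decides which address the
    // memory access sees. Real code would also branch to a trap when `oob`
    // is true; the select only protects the speculative in-bounds path.
    if oob { 0 } else { addr }
}

fn main() {
    // In-bounds: base + offset; out-of-bounds: the NULL address.
    assert_eq!(guarded_heap_addr(0x1000, 0x10000, 8, 4), 0x1008);
    assert_eq!(guarded_heap_addr(0x1000, 0x10000, 0xffff_fffc, 8), 0);
}
```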

The mitigation is controlled by a flag that is off by default, but can
be set by the embedding, to avoid unexpected performance impact for
existing users.
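
For embedders, opting in would look roughly like the following, assuming the flag is exposed through Cranelift's shared settings and spelled enable_heap_access_spectre_mitigation (check the settings definitions for the exact name in a given version):

```rust
use cranelift_codegen::settings::{self, Configurable};

fn main() {
    let mut builder = settings::builder();
    // Flag name assumed; it may be spelled differently in other versions.
    builder
        .set("enable_heap_access_spectre_mitigation", "true")
        .expect("flag not recognized by this Cranelift version");
    let _flags = settings::Flags::new(builder);
    // `_flags` would then be handed to the ISA builder / compilation context.
}
```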

Note that the mitigation is unnecessary when we use the "huge heap"
technique on 64-bit systems, in which we allocate a range of virtual
address space such that no 32-bit offset can reach other data. Hence,
this only affects small-heap configurations.
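
The arithmetic behind that claim, with illustrative sizes (the real reservation and guard sizes are configuration-dependent):

```rust
// Illustrative numbers only: a 64-bit "huge heap" reserves enough virtual
// address space that base + index + offset stays inside the reservation for
// every 32-bit index and every supported constant offset. Speculative
// out-of-bounds accesses then hit unmapped guard pages rather than other
// data, so the cmov adds nothing in that configuration.
const INDEX_RANGE: u64 = 1 << 32; // every possible 32-bit index
const MAX_OFFSET: u64 = 1 << 31;  // assumed bound on constant offsets
const GUARD: u64 = MAX_OFFSET;    // unmapped pages past the 4 GiB window

fn main() {
    let reservation = INDEX_RANGE + GUARD;
    let worst_case = (INDEX_RANGE - 1) + MAX_OFFSET; // largest reachable offset from base
    assert!(worst_case < reservation);
    println!("reserve {} bytes of virtual address space per heap", reservation);
}
```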

The generated code is not great on aarch64 because the overflow flag must be materialized as a boolean in a register, rather than reusing flags (because the dataflow crosses basic blocks); in theory we could do much better with a later lowering, codegenning one comparison and two ops that use its flags.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 26 2020 at 21:08):

cfallin edited PR #1930 from spectre-heap to main.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 26 2020 at 21:08):

cfallin requested bnjbvr for a review on PR #1930.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:37):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:37):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:37):

bnjbvr created PR Review Comment:

Generating IR for the Spectre check seems quite dangerous at this point: any optimization could reorganize the code (move it around in the best case, delete it in the worst case), so I don't think this is the right solution here.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:37):

bnjbvr created PR Review Comment:

The risk of causing harm by disabling Spectre mitigations by default seems much worse than the performance hit from having the mitigation enabled all the time, so I think we should enable it by default, document the performance overhead explicitly here, and note that while it's on by default, there are certain environments in which it's fine to disable it.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:42):

bjorn3 submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 08:42):

bjorn3 created PR Review Comment:

Even worse, there is a legalization for select that uses branches when there is no encoding that uses cmov:

https://github.com/bytecodealliance/wasmtime/blob/c9a3f05afd45961b0b397f97c4ad79cd7a7c807d/cranelift/codegen/src/legalizer/mod.rs#L476-L513

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 21:08):

cfallin updated PR #1930 from spectre-heap to main.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 21:08):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 21:08):

cfallin created PR Review Comment:

Changed to use a specific, effectful instruction that the optimizer and legalizations do not know about.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 21:14):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 29 2020 at 21:14):

cfallin created PR Review Comment:

See below: this will need to be false by default unless we also implement it for the legacy x86 backend (or until we delete that backend), which (unless I'm mistaken) seems to be missing some of the necessary bits for cmovs.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 30 2020 at 23:52):

cfallin updated PR #1930 from spectre-heap to main.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 30 2020 at 23:53):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jun 30 2020 at 23:53):

cfallin created PR Review Comment:

Now on by default with the latest version.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 12:33):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 12:33):

bnjbvr created PR Review Comment:

nit: for symmetry with the other function, can you use a destructuring assignment here too? let (cc, offset, bound) = ...
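
For concreteness, a hypothetical shape of the suggested style (only the tuple names cc, offset, and bound come from the comment; the helper below is a stand-in, not the actual Cranelift function):

```rust
// Stand-in for the helper under review; it just returns a tuple of the same shape.
fn bounds_check_parts() -> (&'static str, i64, u64) {
    ("uge", 0, 0x1_0000)
}

fn main() {
    // Suggested style: a single destructuring binding, mirroring the sibling function.
    let (cc, offset, bound) = bounds_check_parts();
    println!("{} {} {}", cc, offset, bound);
}
```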

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 12:33):

bnjbvr submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 15:40):

cfallin updated PR #1930 from spectre-heap to main.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 15:40):

cfallin submitted PR Review.

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 15:40):

cfallin created PR Review Comment:

Done!

view this post on Zulip Wasmtime GitHub notifications bot (Jul 01 2020 at 16:23):

cfallin merged PR #1930.

