akirilov-arm labeled issue #3340:
Cranelift's AArch64 backend currently uses an unallocated encoding for the
Inst::Udf
VCode instruction. There is a risk that the encoding could be allocated in the future, so it might no longer be an undefined instruction, resulting in incorrect code generation. On the other hand, there is an instruction,UDF
, which is permanently undefined and which avoids this issue.In a previous discussion, @bnjbvr has explained the reasons for the current approach:
Yes, a few reasons come to mind. None of them seem particularly inevitable, and they could all be worked around, if we decided to use a new value, but it was simpler to choose this one for a few reasons:
- this is the same value Spidermonkey uses for triggering SIGILL, with the reasons explained there: https://searchfox.org/mozilla-central/source/js/src/jit/arm64/vixl/Constants-vixl.h#2668-2690 . If we didn't use the same value as Spidermonkey did use, either Cranelift would need a compiler option to allow selecting which undefined instruction is used (or select it among a list of predefined), or Spidermonkey should refactor its code to support different undefined instruction encodings in their aarch64 backend.
- there might be an argument that a native compiler (rustc/llvm) would use the default UDF instruction, and that using a different instruction in cranelift-generated code makes it easier to spot what's the actual issue in a test case. For instance, for the initial issue discussed here in particular, we could directly infer that it was a cranelift-generated undefined encoding not being properly handled by the OS. A sigill with another undefined encoding could have been this problem _or_ another illegal encoding problem (coming from llvm codegen in native code).
Note that
UDF
has an immediate parameter, which could be used to address the second point to an extent, e.g.UDF #0xBAD
could signify Cranelift-generated code.
akirilov-arm opened issue #3340:
Cranelift's AArch64 backend currently uses an unallocated encoding for the
Inst::Udf
VCode instruction. There is a risk that the encoding could be allocated in the future, so it might no longer be an undefined instruction, resulting in incorrect code generation. On the other hand, there is an instruction,UDF
, which is permanently undefined and which avoids this issue.In a previous discussion, @bnjbvr has explained the reasons for the current approach:
Yes, a few reasons come to mind. None of them seem particularly inevitable, and they could all be worked around, if we decided to use a new value, but it was simpler to choose this one for a few reasons:
- this is the same value Spidermonkey uses for triggering SIGILL, with the reasons explained there: https://searchfox.org/mozilla-central/source/js/src/jit/arm64/vixl/Constants-vixl.h#2668-2690 . If we didn't use the same value as Spidermonkey did use, either Cranelift would need a compiler option to allow selecting which undefined instruction is used (or select it among a list of predefined), or Spidermonkey should refactor its code to support different undefined instruction encodings in their aarch64 backend.
- there might be an argument that a native compiler (rustc/llvm) would use the default UDF instruction, and that using a different instruction in cranelift-generated code makes it easier to spot what's the actual issue in a test case. For instance, for the initial issue discussed here in particular, we could directly infer that it was a cranelift-generated undefined encoding not being properly handled by the OS. A sigill with another undefined encoding could have been this problem _or_ another illegal encoding problem (coming from llvm codegen in native code).
Note that
UDF
has an immediate parameter, which could be used to address the second point to an extent, e.g.UDF #0xBAD
could signify Cranelift-generated code.
akirilov-arm labeled issue #3340:
Cranelift's AArch64 backend currently uses an unallocated encoding for the
Inst::Udf
VCode instruction. There is a risk that the encoding could be allocated in the future, so it might no longer be an undefined instruction, resulting in incorrect code generation. On the other hand, there is an instruction,UDF
, which is permanently undefined and which avoids this issue.In a previous discussion, @bnjbvr has explained the reasons for the current approach:
Yes, a few reasons come to mind. None of them seem particularly inevitable, and they could all be worked around, if we decided to use a new value, but it was simpler to choose this one for a few reasons:
- this is the same value Spidermonkey uses for triggering SIGILL, with the reasons explained there: https://searchfox.org/mozilla-central/source/js/src/jit/arm64/vixl/Constants-vixl.h#2668-2690 . If we didn't use the same value as Spidermonkey did use, either Cranelift would need a compiler option to allow selecting which undefined instruction is used (or select it among a list of predefined), or Spidermonkey should refactor its code to support different undefined instruction encodings in their aarch64 backend.
- there might be an argument that a native compiler (rustc/llvm) would use the default UDF instruction, and that using a different instruction in cranelift-generated code makes it easier to spot what's the actual issue in a test case. For instance, for the initial issue discussed here in particular, we could directly infer that it was a cranelift-generated undefined encoding not being properly handled by the OS. A sigill with another undefined encoding could have been this problem _or_ another illegal encoding problem (coming from llvm codegen in native code).
Note that
UDF
has an immediate parameter, which could be used to address the second point to an extent, e.g.UDF #0xBAD
could signify Cranelift-generated code.
akirilov-arm labeled issue #3340:
Cranelift's AArch64 backend currently uses an unallocated encoding for the
Inst::Udf
VCode instruction. There is a risk that the encoding could be allocated in the future, so it might no longer be an undefined instruction, resulting in incorrect code generation. On the other hand, there is an instruction,UDF
, which is permanently undefined and which avoids this issue.In a previous discussion, @bnjbvr has explained the reasons for the current approach:
Yes, a few reasons come to mind. None of them seem particularly inevitable, and they could all be worked around, if we decided to use a new value, but it was simpler to choose this one for a few reasons:
- this is the same value Spidermonkey uses for triggering SIGILL, with the reasons explained there: https://searchfox.org/mozilla-central/source/js/src/jit/arm64/vixl/Constants-vixl.h#2668-2690 . If we didn't use the same value as Spidermonkey did use, either Cranelift would need a compiler option to allow selecting which undefined instruction is used (or select it among a list of predefined), or Spidermonkey should refactor its code to support different undefined instruction encodings in their aarch64 backend.
- there might be an argument that a native compiler (rustc/llvm) would use the default UDF instruction, and that using a different instruction in cranelift-generated code makes it easier to spot what's the actual issue in a test case. For instance, for the initial issue discussed here in particular, we could directly infer that it was a cranelift-generated undefined encoding not being properly handled by the OS. A sigill with another undefined encoding could have been this problem _or_ another illegal encoding problem (coming from llvm codegen in native code).
Note that
UDF
has an immediate parameter, which could be used to address the second point to an extent, e.g.UDF #0xBAD
could signify Cranelift-generated code.
bjorn3 commented on issue #3340:
Maybe use
0xC11F
as immediate? (clif) Or encode the trap code directly without requiring a side table like it does right now?
cfallin commented on issue #3340:
The requirement seems mainly to come from our desire to conform to SpiderMonkey's convention when used in that context, but I don't think we need to completely freeze the bits of our design that overlap with the integration if it prevents us from "doing the right thing" otherwise. It does seem more idiomatic / in line with what the ISA design and other tools expect to actually use the canonical "undefined" instruction for traps; and from the SM point of view, this is a slight adjustment (catch a different signal or similar) in line with other API-change and similar updates that need to happen in the normal course of updating to new crate versions.
cc @bnjbvr for any updated thoughts on this? Also @julian-seward1, if you're around: what's the latest (if any) on maintaining usage of Cranelift in SpiderMonkey, and if it's maintained, would it be a significant blocker to adapt to the use of
udf
for traps rather than the arbitrarily-selected instruction used now?
bnjbvr commented on issue #3340:
Agreed that keeping a custom convention that's required by a single Cranelift user makes it less relevant for other users, not much to add on the topic otherwise.
akirilov-arm closed issue #3340:
Cranelift's AArch64 backend currently uses an unallocated encoding for the
Inst::Udf
VCode instruction. There is a risk that the encoding could be allocated in the future, so it might no longer be an undefined instruction, resulting in incorrect code generation. On the other hand, there is an instruction,UDF
, which is permanently undefined and which avoids this issue.In a previous discussion, @bnjbvr has explained the reasons for the current approach:
Yes, a few reasons come to mind. None of them seem particularly inevitable, and they could all be worked around, if we decided to use a new value, but it was simpler to choose this one for a few reasons:
- this is the same value Spidermonkey uses for triggering SIGILL, with the reasons explained there: https://searchfox.org/mozilla-central/source/js/src/jit/arm64/vixl/Constants-vixl.h#2668-2690 . If we didn't use the same value as Spidermonkey did use, either Cranelift would need a compiler option to allow selecting which undefined instruction is used (or select it among a list of predefined), or Spidermonkey should refactor its code to support different undefined instruction encodings in their aarch64 backend.
- there might be an argument that a native compiler (rustc/llvm) would use the default UDF instruction, and that using a different instruction in cranelift-generated code makes it easier to spot what's the actual issue in a test case. For instance, for the initial issue discussed here in particular, we could directly infer that it was a cranelift-generated undefined encoding not being properly handled by the OS. A sigill with another undefined encoding could have been this problem _or_ another illegal encoding problem (coming from llvm codegen in native code).
Note that
UDF
has an immediate parameter, which could be used to address the second point to an extent, e.g.UDF #0xBAD
could signify Cranelift-generated code.
Last updated: Dec 23 2024 at 13:07 UTC