I have started playing around with approaches to verifying the correctness of ISLE rewrite rules using an SMT solver, and wanted to post some initial notes here for anyone interested. Some written notes live here. A manually-constructed SMTLIB file for a single rule that rewrites an iadd
to a subtract with immediate is here. This file currently hard-codes that the type is b64 and doesn't do anything interesting yet for register reads/writes.
Some initial open questions:
imm12_from_negated_value
feels like it should take a 64-bit value, negative it, check if it fits in 12 bits, then return the 12-bit value (and this is, after some indirection, what the external Rust implementation does). Instead, the ISLE definition is this: (decl imm12_from_negated_value (Imm12) Value)
. A question for verification is whether to specify the contract of these extractors in the direction of the ISLE function signature or the Rust function signature. This also has implications for which variables we quantify over: in this example, all 12-bit values that could be returned, or all 64-bit values that would possibly "fit". If we currently consider implementations of extractors in the trusted compute base, the types of bugs these two approaches can find are probably the same.Very cool!
(TODO: is this a rule abort or a dynamic failure?)
This is a dynamic failure, it is an assert
that will panic if it fails.
For quantifying over bitvector types, will it be easier/faster to discharge separate queries for each possible type, or encode the possible types into a single SMT query?
I guess I was imagining that we would have a separate query per type, but that is only because I didn't think it would be possible to fold it all together into one query. Does SMTLIB let you say "this is a bitvector of a dynamic bit width" and then feed a variable defined elsewhere in the query in as a value's bit width? How would you fold multiple bit-widths into the same query?
A question for verification is whether to specify the contract of these extractors in the direction of the ISLE function signature or the Rust function signature.
As discussed during the cranelift meeting today, I think we would want to specify extractors in the same order as the Rust: from "return" type to "parameter" types. This is because extractors can extract multiple values (e.g. has_type
extracts both a type and an inst from an inst).
It is also worth noting that terms can have both extractors and constructors: https://github.com/bytecodealliance/wasmtime/blob/fd2636c25afd3af8711ce5eb6ec002d009c59294/cranelift/codegen/src/clif.isle#L102-L104
The thing about multiple parameter types is a pretty strong argument… just to state the obvious once again, another argument in favor of "the Rust direction" for specifying extractors is that it would (maybe?) create an easier path to someday verifying that the Rust implementations are correct.
@sampsyo just to make sure we're on the same page -- this is a notational question I think? I'm not sure how (decl A (T U) V)
vs (decl A (V) (T U))
affects whether it's possible to e.g. verify the Rust, but possibly I'm missing some nuance here
I meant something about the "guts" of the spec rather than the notation itself: taking @Alexa VanHattum's imm12_from_negated_value
example above, one direction consists of extending (and negating) a 12-bit value, while the other consists of truncating (and negating) a 64-bit value. The idea is that the verifier could either quantify over the 64-bit value in the original CLIF instruction and rely on the "forward" spec to obtain the 12-bit value, or it could quantify over the 12-bit value and use the "backward" spec to obtain the corresponding original CLIF literal. You can imagine various efficiency & "trustworthiness" trade-offs depending on the choice.
Ah, yes, this makes more sense, thanks!
At least the intention at the semantic level is for the meaning to be the same in both directions -- I had been thinking of things in a sort of Prolog way, where we would say imm12_from_negated_value(a, b)
and this is just a predicate on a
and b
; one could imagine a translation from that to SMT formulae or whatnot. So I guess I see the ISLE terms as "directionless"; a Rust implementation of one direction or the other could then be verified against that (always quantify over the Rust args, get symbolic result, check against predicate). But you all will have more detailed thoughts on the implementation/lowering than me; just wanted to communicate the language-level intent above :-)
Right, that directionless/Prolog-esque idea very much came through when reading through ISLE files! It's a really neat idea (and makes for an interesting decision about how to set up the verifier).
Making good progress on an SMT-condition generator! Here's the output (modulo cleaning up some comments/formatting) for two iadd
rules. Currently, this is hard-coding the term semantics rather than the eventual goal of pulling them from annotations. This is also using a fairly high level model for RHS terms, which we'll want to replace soon.
The overall strategy: we have a symbolic interpreter for Patterns
for the LHS of rules that recursively lowers terms to bit vectors, while building up a set of bound SMT variables and assumptions on those variables. Starting with the outer-most instruction pattern, this creates fresh variables for each argument and maps ISLE term indices to bound names. Then a symbolic interpreter for Exprs
lowers RHS expressions to bitvector, using the assumption context built from the LHS.
Currently, we run a separate query for each bit width, so has_type
mostly falls out without reasoning about the type itself. In the example below, we run for 32-bit. Eventually, we'll want the RHS to always reason on the appropriate 32-bit or 64-bit BV, even if the query's type is something like i8.
;; ----------------------------------simple iadd-----------------------------------
;; Running verification for rule:
(rule
(lower (has_type (fits_in_64 ty)
(iadd x y)))
(value_reg (add ty (put_in_reg x) (put_in_reg y))))
;; Declaring constants:
ty : BitVector(32)
arg_0 : BitVector(32)
arg_1 : BitVector(32)
;; Adding assumptions:
(= (_ bv32 32) ty)
;; Running query:
;; Query structure: we want `<assumptions> implies <LSH> = <RSH>`
;; Negate the query, unsat implies no counterexamples exist.
(not
;; Implication
(=>
;; Assumptions
(and
(= (_ bv32 32) ty))
;; LHS = RHS
(= (bvadd arg_0 arg_1) (bvadd arg_0 arg_1))))
;; Verification succeeded
;; ----------------------------------iadd to sub-----------------------------------
;; Running verification for rule:
(rule
(lower (has_type (fits_in_64 ty)
(iadd x (imm12_from_negated_value y))))
(value_reg (sub_imm ty (put_in_reg x) y)))
;; Declaring constants:
ty : BitVector(32)
arg_0 : BitVector(32)
arg_1 : BitVector(32)
arg_imm12_from_negated_value : BitVector(12)
;; Adding assumptions:
(= (_ bv32 32) ty)
(= (bvand (bvnot (_ bv4095 32)) arg_1) (_ bv0 32))
(= arg_1 (bvneg ((_ zero_extend 20) arg_imm12_from_negated_value)))
;; Running query:
;; Query structure: we want `<assumptions> implies <LSH> = <RSH>`
;; Negate the query, unsat implies no counterexamples exist.
(not
;; Implication
(=>
;; Assumptions
(and
(= (_ bv32 32) ty)
(= (bvand (bvnot (_ bv4096 32)) arg_1) (_ bv0 32))
(= arg_1 (bvneg ((_ zero_extend 20) arg_imm12_from_negated_value))))
;; LHS = RHS
(=
(bvadd arg_0 (bvneg ((_ zero_extend 20) arg_imm12_from_negated_value)))
(bvsub arg_0 ((_ zero_extend 20) arg_imm12_from_negated_value)))))
;; Verification succeeded
I can try and wrap this up into a neater demo to share on Monday's meeting, if there's interest.
Alexa VanHattum said:
I can try and wrap this up into a neater demo to share on Monday's meeting, if there's interest.
I would be interested! This looks fantastic!
Eventually, we'll want the RHS to always reason on the appropriate 32-bit or 64-bit BV, even if the query's type is something like i8.
Do you mean lowering the RHS to the width of the register classes that it is using to implement the LHS? In x64, for example, we don't differentiate between e.g. rax
and eax, even though we will at times emit instructions that just use
eax/the lower half of
rax. I wonder if it makes sense to represent the full bit width of e.g.
rax` even when we are lowering a 32-bit or smaller LHS. This could be a small step towards catching bugs like our CVE where we spilled a 32-bit value and then reloaded it with the wrong kind of extension before using it as the address for a memory access.
@Alexa VanHattum yes, we'd love to hear more about this at the next meeting! (Please feel free to create a PR to add to the meeting agenda any time; happy to do it now as well if you'd like)
I'm mostly curious about the longer-term story for integration of semantics models on both ends (CLIF and the target ISA); I imagine that making this generic would also largely take care of the specific questions of types. I would hope that in the eventual design we don't have to pattern-match on has_type
or others as a special case, but that it just falls out of the semantics that we understand about each term.
Another random thought that comes to mind re bitvector widths is that maybe at the SMT-variable level, it would make sense for all bindings to be a bitvector large enough for any value (128 bits currently), but then the specific constraints emitted for terms use only the appropriate bits? This could be a way of avoiding the need to do a kind of type inference in your lowering to SMT, at the cost of a small semantic gap
Great, submitted a PR to add to the agenda!
Re: the currently-less-generic nature of how we're handling types, we anticipant that types may need to be a special case to facilitate verification scaling. We could, as you say, over-approximate everything as an i128, but doing so might cause us to hit total timeouts in more complicated cases where smaller bitwidths are tractable. But I totally agree that the current handling of has_type
is way more of a special case than our goal.
One open question with respect to the semantics models for CLIF and the target ISA is how to handle rules that write to intermediate ISLE terms not in the target ISA. While we could provide semantics for every intermediate term (for example, the sub_imm
term is further rewritten later), it might be nice for symbolic execution to continue across multiple rules. This would let us only provide (initially hand-specified) semantics for starting CLIF terms and final ISA terms. This is part of why we are curious about the depths of the trie for combining ISLE rules.
That's a good point, one model could just be to statically elaborate/inline rules from input language to output. At one point I actually had wanted to build the codegen in islec this way; I decided to go instead to a model that actually generates a separate Rust function per intermediate term to keep the code-size expansion down, but I didn't get to a point where I'd be able to actually measure that expansion
one potential complexity is that the language semantics allow, at least in theory, for statically-unbounded recursion (cycles can occur in the callgraph). In practice though we don't use that ability (though I think Ulrich may have in some prototype call-ABI stuff)
In any case, static elaboration and bailing on cycles feels like a good first-pass answer here, and we can solve any scalability problems (maybe by refactoring code, maybe by doing a conditionally-context-sensitive thing or using intermediate invariants) as they come up
https://unsat.cs.washington.edu/projects/serval/ seems relevant. No idea how it compares with the approach currently taken, but it seems to have been used for finding bugs in the linux ebpf jit: https://github.com/uw-unsat/jitterbug
Last updated: Jan 24 2025 at 00:11 UTC