During the last decade, code inspection for standard programming errors has largely been automated with static code analysis. Commercially available static program analysis tools are now routinely used in many software development organizations. These tools are popular because they find many (real) software bugs, thanks to three main ingredients: they are automatic, they are scalable, and they check many properties. In general, a tool that is able to check automatically (with sufficient precision) millions of lines of code against hundreds of coding rules and properties is bound to find on average about one bug (i.e., code error) every thousand lines of code.
Because basic code inspection can be achieved using automated code analysis, its cost within the software development process is typically reasonable and manageable. However, a more thorough type of checking, referred to as “software testing”, is a more costly part of the software development process and usually accounts for about 50% of the R&D budget of software development organizations.
Software testing relies on so-called “test cases” or more simply “tests”. To be efficient, tests should be generated in a relevant manner. For example, tests may be generated on the basis of information acquired from analyzing a program. Automated test generation from program analysis can roughly be partitioned into two groups: static and dynamic test generation. Static test generation consists of analyzing a program statically to attempt to compute input values that drive its executions along specific program paths. In contrast, dynamic test generation consists of executing a program, typically starting with some random inputs, while simultaneously performing a symbolic execution to collect symbolic constraints on inputs obtained from predicates in branch statements along the execution, and then using a constraint solver to infer variants of the previous inputs in order to steer program executions along alternative program paths. Since dynamic test generation extends static test generation with additional runtime information, it can be more powerful.
While aspects of scalability of dynamic test generation have been recently addressed, significant issues exist as to how to dynamically check many properties simultaneously, thoroughly and efficiently, to maximize the chances of finding bugs during an automated testing session.
Traditional runtime checking tools (e.g., Purify, Valgrind and AppVerifier) check a single program execution against a set of properties (such as the absence of buffer overflows, uninitialized variables or memory leaks). Such techniques are referred to herein as traditional passive runtime property checking. As an example, consider the program:

    int divide(int n, int d) { // n and d are inputs
        return (n/d);          // division-by-zero error if d==0
    }

The program “divide” takes two integers n and d as inputs and computes their division. If the denominator d is zero, an error occurs. To catch this error, a traditional runtime checker for division by zero would simply check whether a concrete value of d satisfies (d==0) just before the division is performed for a specific execution run, but it would not provide any insight or guarantee concerning other executions. Further, testing this program with random values for n and d is unlikely to detect the error, as d has only one chance out of 2^32 to be zero if d is a 32-bit integer. Static (and even dynamic) test generation techniques that attempt to cover specific or all feasible paths in a program will also likely miss the error, since the program has a single program path, which is covered no matter what inputs are used.
While an attempt at checking properties at runtime on a dynamic symbolic execution of a program has been reported, such an approach is likely to return false alarms whenever symbolic execution is imprecise, which is often the case in practice.
Various exemplary methods, devices, systems, etc., described herein pertain to active property checking. Such techniques can extend runtime checking by checking whether the property is satisfied by all program executions that follow the same program path.
An exemplary method includes providing software for testing; during execution of the software, performing a symbolic execution of the software to produce path constraints; injecting issue constraints into the software where each issue constraint comprises a coded formula; solving the constraints using a constraint solver; based at least in part on the solving, generating input for testing the software; and testing the software using the generated input to check for violations of the injected issue constraints. Such a method can actively check properties of the software. Checking can be performed on a path for a given input using a constraint solver where, if the check fails for the given input, the constraint solver can also generate an alternative input for further testing of the software. Various exemplary methods, devices, systems, etc., are disclosed.
Non-limiting and non-exhaustive examples are described with reference to the following figures:
Various exemplary methods, devices, systems, etc., actively search for property violations in software. For example, consider the example program “divide” presented in the Background section. By inserting a test “if (d==0) error ( )” before the division (n/d), an attempt can be made to generate an input value for d that satisfies the constraint (d==0), which is now present in the program path. This attempt to generate an input value for d can be used to detect an error. Accordingly, active property checking injects, at runtime, additional symbolic constraints on inputs that, when solvable by a constraint solver, will generate new test inputs leading to potential or certain property violations. In other words, active property checking extends runtime checking by checking whether a property is satisfied by all program executions that follow the same program path. As described herein, such a check can be performed on a dynamic symbolic execution of a given program path using a constraint solver. If the check fails, the constraint solver can generate an alternative program input triggering a new program execution that follows the same program path but exhibits a property violation. Such checking is referred to as “active” checking because a constraint solver is used to “actively” look for assignments that cause a runtime check to fail. In general, an assignment output by a constraint solver is readily mappable to an input for the program undergoing testing.
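The idea of injecting the constraint (d==0) and asking a solver for a satisfying input can be sketched as follows. This is a minimal illustration only: Python stands in for the C program, and the `solve` function is a hypothetical brute-force enumerator over a small domain rather than a real constraint solver. The names `solve`, `active_check_divide`, `Assignment` and `Constraint` are assumptions introduced for the sketch.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(constraints: List[Constraint], variables: Sequence[str],
          domain: Sequence[int] = range(-8, 9)) -> Optional[Assignment]:
    """Toy stand-in for a constraint solver: enumerate a small finite domain."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            return assignment
    return None  # no solution within the enumerated domain


def divide(n: int, d: int) -> int:
    """Python analogue of the C program; raises ZeroDivisionError if d == 0."""
    return n // d


def active_check_divide() -> Optional[Assignment]:
    # divide() has a single path, so its path constraint is simply "true".
    path_constraint: List[Constraint] = []
    # Injected constraint: the negation of the safety condition d != 0.
    violation: Constraint = lambda a: a["d"] == 0
    return solve(path_constraint + [violation], ["n", "d"])
```

The assignment returned by `active_check_divide()` maps directly to a test input: calling `divide` with it follows the same single path and triggers the division error that random testing would be unlikely to hit.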
Combined with systematic dynamic test generation, which attempts to exercise all feasible paths in a program, active property checking defines a new form of program verification.
Active property checking extends the concept of checking properties at runtime on a dynamic symbolic execution of the program by combining it with constraint solving and test generation in order to further check using a new test input whether a property is actually violated as predicted by a prior imperfect symbolic execution. In such a manner, false alarms are eliminated (e.g., never reported). Active property checking can also be viewed as systematically injecting assertions all over a program under test, and then using dynamic test generation to check for violations of those assertions.
As described herein, test generation is automated by leveraging advances in program analysis, automated constraint solving, and increasing computation power available on modern computers. To replicate the success of static program analysis in the testing space, as described herein, various exemplary techniques for active property checking are automatable, scalable and able to check many properties.
As shown in
Given the constraints (e.g., a path constraint and an associated injected constraint on that path), a solution block 130 solves the constraints using a constraint solver. As described in more detail below, a constraint solver determines whether a solution exists and, if so, it can provide as an output an assignment that satisfies the constraints; otherwise, the constraint solver indicates that no solution exists. The existence of a solution implies that a violation may occur, i.e., that the associated “looked for” issue may exist in the software. Accordingly, in the method 100 of
Overall, the method 100 provides for active property checking as a constraint solver actively uses injected constraints to identify inputs that cause a runtime check to fail (e.g., property violations). Such an approach can be combined with dynamic test generation (e.g., to exercise all feasible paths in code) to generate new tests for code verification.
In general, the exemplary method 100 involves the following three processes: normal execution of software, symbolic execution, and active checking to insert constraints. These three processes may operate in parallel or in a disjointed manner. For example, a disjointed manner may execute the software and acquire a trace that is then used for symbolic execution and active checking. Various techniques are described herein for parallel operation that can optimize constraint solving (e.g., grouping constraints, etc.). While the techniques presented pertain to examples for parallel operation, other techniques and modes of operation may be used. Hence, in various examples, the order may be altered while still achieving active property checking.
As described herein, a constraint can be injected as a formula (e.g., a line or segment of code) into software destined for testing. Such a formula may be generated by a so-called active checker. In general, checkers can be classified as passive checkers or active checkers. A passive checker for a property is a function that takes as input a finite program execution and returns an error message (e.g. “fail”) if the property is violated for the finite program execution. In contrast, an active checker for a property is a function that takes as input a finite program execution and returns a formula such that the formula is satisfiable if and only if there exists some finite program execution that violates the property (e.g., along a common, specified “path constraint”).
As mentioned, a constraint can be a formula, for example, a formula output by an active checker. Examples of active checkers and corresponding constraints include those for division by zero, array bounds and null pointer de-reference. With respect to array bounds, an active checker may insert formulas as symbolic tests prior to all array accesses. The foregoing list of active checkers is not exhaustive. Further, it is important to note that multiple active checkers can be used simultaneously. Yet further, if unrestrained, active checkers may inject many constraints all over program executions; hence, various exemplary techniques can be used to optimize injection of constraints to make active checking more tractable in practice (e.g., by minimizing calls to a constraint solver, minimizing formulas, caching strategies, etc.).
As described herein, an exemplary method can include performing a symbolic execution of software to produce path constraints; injecting issue constraints into the software where each issue constraint comprises a coded formula; solving the constraints using a constraint solver; based at least in part on the solving, generating input for testing the software; and testing the software using the generated input to check for violations of the injected issue constraints.
As described herein, static and dynamic type checking can be extended with active type checking. Efficient implementation of active property checking is presented along with trial results from testing of large, shipped WINDOWS® applications, where active property checking was able to detect several new security-related bugs.
More specifically, the discussion that follows (i) formalizes active property checking semantically and shows how it provides a new form of program verification when combined with systematic dynamic test generation; (ii) presents a type system that combines static, dynamic and active checking for a simple imperative language (e.g., to clarify the connection, difference and complementarity between active type checking and traditional static and dynamic type checking); (iii) explains how to implement active checking efficiently by minimizing the number of calls to a constraint solver, minimizing formula sizes and using two constraint caching schemes; (iv) describes an exemplary implementation of active property checking in SAGE (see, e.g., P. Godefroid, M. Y. Levin, and D. Molnar. Automated Whitebox Fuzz Testing. Technical Report MS-TR-2007-58, Microsoft, May 2007), a tool for security testing of file-reading WINDOWS® applications that performs systematic dynamic test generation of x86 binaries; and (v) presents results of trials with large, shipped WINDOWS® applications where active property checking was able to detect several new bugs in those applications.
Systematic Dynamic Test Generation
Dynamic test generation consists of running a program P under test both concretely, executing the actual program, and symbolically, calculating constraints on values stored in program variables x and expressed in terms of input parameters α. In general, side-by-side concrete and symbolic executions are performed using a concrete store Δ and a symbolic store Σ, which are mappings from program variables to concrete and symbolic values, respectively. A symbolic value is any expression sv in some theory T where all free variables are exclusively input parameters α. For any variable x, Δ(x) denotes the concrete value of x in Δ, while Σ(x) denotes the symbolic value of x in Σ. The judgment Δ├e→v means that an expression e reduces to a concrete value v, and similarly Σ├e→sv means that e reduces to a symbolic value sv. For notational convenience, it is assumed that Σ(x) is always defined and is simply Δ(x) by default if no expression in terms of inputs is associated with x. The notation Δ(x→c) denotes updating the mapping Δ so that x maps to c.
The program P manipulates the memory (concrete and symbolic stores) through statements, or commands, that are abstractions of the machine instructions actually executed. A command can be an assignment of the form x:=e (where x is a program variable and e is an expression), a conditional statement of the form if e then C else C′ where e denotes a Boolean expression, and C and C′ are continuations denoting the unique next statement to be evaluated (programs considered here are thus sequential and deterministic), or stop corresponding to a program error or normal termination.
Given an input vector $\vec{\alpha}$ assigning a value to every input parameter α, the evaluation of a program defines a unique finite program execution $s_0 \xrightarrow{C_1} s_1 \ldots \xrightarrow{C_n} s_n$ that executes the finite sequence C1 . . . Cn of commands and goes through the finite sequence s1 . . . sn of program states. Each program state is a tuple <C, Δ, Σ, pc> where C is the next command to be evaluated, and pc is a special meta-variable that represents the current path constraint. For a finite sequence w of statements (i.e., a control path w), a path constraint pcw is a formula of theory T that characterizes the input assignments for which the program executes along w. To simplify the presentation, it is assumed that all the program variables have some default initial concrete value in the initial concrete store Δ0, and that the initial symbolic store Σ0 identifies the program variables v whose values are program inputs (for all those, we have Σ0(v)=α where α is some input parameter). Initially, pc is defined to be true.
Systematic dynamic test generation consists of systematically exploring all feasible program paths of the program under test by using path constraints and a constraint solver. By construction, a path constraint represents conditions on inputs that need to be satisfied for the current program path to be executed. Given a program state <C, Δ, Σ, pc> and a constraint solver for theory T, if C is a conditional statement of the form if e then C else C′, where e reduces to the symbolic value sv, any satisfying assignment to the formula pc∧sv (respectively pc∧¬sv) defines program inputs that will lead the program to execute the then (resp. else) branch of the conditional statement. By systematically repeating this process, such a directed search can enumerate all possible path constraints and eventually execute all feasible program paths.
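A directed search of this kind can be sketched as follows. This is a simplified illustration under stated assumptions: the program under test (`run`) is hypothetical, takes a single integer input, and records each branch predicate together with the outcome taken; the `solve` function is a brute-force stand-in for a constraint solver over a small domain. The sketch discards inputs that land on an already visited path, so it does not attempt to exercise each path exactly once.

```python
from typing import Callable, List, Optional, Set, Tuple

Pred = Callable[[int], bool]
Branch = Tuple[Pred, bool]  # (branch predicate over the input, outcome taken)


def run(x: int) -> Tuple[str, List[Branch]]:
    """Hypothetical program under test, instrumented to record its path."""
    path: List[Branch] = []
    p1: Pred = lambda v: v > 10
    t1 = p1(x)
    path.append((p1, t1))
    if t1:
        p2: Pred = lambda v: v % 2 == 0
        t2 = p2(x)
        path.append((p2, t2))
        return ("big-even" if t2 else "big-odd"), path
    return "small", path


def solve(prefix: List[Branch], domain=range(-50, 51)) -> Optional[int]:
    """Toy solver: find an input on which every predicate has the given outcome."""
    for v in domain:
        if all(pred(v) == outcome for pred, outcome in prefix):
            return v
    return None


def directed_search(seed: int) -> Set[str]:
    seen_paths, labels, worklist = set(), set(), [seed]
    while worklist:
        x = worklist.pop()
        label, path = run(x)
        key = tuple(outcome for _, outcome in path)
        if key in seen_paths:
            continue  # this path was already explored
        seen_paths.add(key)
        labels.add(label)
        # Negate each branch in turn, keeping the prefix before it (pc ∧ ¬sv).
        for i, (pred, outcome) in enumerate(path):
            new_input = solve(path[:i] + [(pred, not outcome)])
            if new_input is not None:
                worklist.append(new_input)
    return labels
```

Starting from any seed, `directed_search` reaches all three paths of the hypothetical program, because each executed path contributes one solver query per branch it took.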
Such a directed search is exhaustive provided that the generation of the path constraint (including the underlying symbolic execution) and the constraint solver for the given theory T are both sound and complete, that is, for all program paths w, the constraint solver returns a satisfying assignment for the path constraint pcw if and only if the path is feasible (i.e., there exists some input assignment leading to its execution). In this case, in addition to finding errors such as the reachability of bad program statements (like assert (0)), a directed search can also prove their absence, and therefore obtain a form of program verification.
Accordingly, Theorem 1 is presented: Given a program P as defined above, a directed search using a path constraint generation and a constraint solver that are both sound and complete exercises all feasible program paths exactly once.
In this case, if a program statement has not been executed when the search is over, this statement is not executable in any context.
In practice, path constraint generation and constraint solving are usually not sound and complete. When a program expression cannot be expressed in the given theory T decided by the constraint solver, it can be simplified using concrete values of sub-expressions, or replaced by the concrete value of the entire expression. For example, if the solver handles only linear arithmetic, symbolic sub-expressions involving multiplications can be replaced by their concrete values.
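The fallback to concrete values can be illustrated with a small sketch. It assumes a solver limited to linear arithmetic over a single input α, represents a symbolic value as `coef*α + const`, and replaces one operand of a nonlinear product by its concrete value in the current run. The class `Lin` and the function `sym_mul` are hypothetical names introduced for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lin:
    """Linear symbolic value coef*alpha + const over one input alpha."""
    coef: int
    const: int

    def value_at(self, alpha: int) -> int:
        return self.coef * alpha + self.const


def sym_mul(a: Lin, b: Lin, alpha_concrete: int) -> Lin:
    """Multiply two symbolic values while staying inside linear arithmetic."""
    if a.coef == 0:  # a is a constant: the product is still linear
        return Lin(a.const * b.coef, a.const * b.const)
    if b.coef == 0:  # b is a constant: the product is still linear
        return Lin(a.coef * b.const, a.const * b.const)
    # Both operands depend on the input, so the product is nonlinear.
    # Replace b by its concrete value observed in the current execution.
    k = b.value_at(alpha_concrete)
    return Lin(a.coef * k, a.const * k)
```

For example, with α=5 in the current run, the product of α and 2α+1 is approximated by 11α. The approximation agrees with the real product for α=5 but not for other inputs, which is one source of the imprecision discussed above.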
Active Checkers
Even when sound and complete, a directed search based on path exploration alone can miss errors that are not path invariants, i.e., that are not violated by all concrete executions executing the same program path, or errors that are not caught by a program's runtime environment. For example, consider the following program:
This program takes as (untrusted) input an integer value stored in variable x. A buffer overflow in line 3 will be detected at runtime only if a runtime checker monitors buffer accesses. Such a runtime checker would thus check whether any array access of the form a[x] satisfies the condition 0≦Δ(x)<b where Δ(x) is the concrete value of array index x and b denotes the bound of the array a (b is 20 for the array buf[ ] in the foregoing example). As described herein, such a traditional runtime checker for concrete values is referred to as a passive checker.
Moreover, a buffer overflow is also possible in line 7 provided x==20, yet a directed search focused on path exploration alone may miss this error. The reason is that the only condition that will appear in a path constraint for this program is x>20 and its negation. Since most input values for x that satisfy ¬(x>20) do not cause the buffer overflow, the error will likely go undetected with a directed search as already defined.
To catch the buffer overflow on line 7, the program should be extended with a symbolic test 0≦Σ(x)<b (where Σ(x) denotes the symbolic value of array index x) just before the buffer access buf[x] on line 7. This approach will force the condition 0≦x<20 to appear in the path constraint of the program in order to refine the partitioning of its input values. An exemplary active checker for array bounds can be viewed as systematically adding such symbolic tests before all array accesses.
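The effect of the injected symbolic test can be sketched as follows. The sketch does not reproduce the program listing; it only models the two facts stated in the text, namely that the buffer access is reached when ¬(x>20) holds and that the array bound b is 20. The function names and the brute-force `solve` are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional

BOUND = 20  # b, the bound of the array buf[] described in the text


def path_constraint(x: int) -> bool:
    """Path leading to the buffer access: the branch x > 20 is not taken."""
    return not (x > 20)


def phi_buf(x: int) -> bool:
    """Symbolic test injected before the access: 0 <= x < b."""
    return 0 <= x < BOUND


def solve(constraints: List[Callable[[int], bool]],
          domain: Iterable[int]) -> Optional[int]:
    """Toy solver: return the first value satisfying every constraint."""
    for x in domain:
        if all(c(x) for c in constraints):
            return x
    return None


def find_overflow(domain: Iterable[int] = range(0, 64)) -> Optional[int]:
    # Query pc ∧ ¬φ: same path, but the bounds test fails.
    return solve([path_constraint, lambda x: not phi_buf(x)], domain)
```

Over non-negative inputs the only solution is x==20, the value that path exploration alone is unlikely to pick. If the domain also includes negative integers, those are reported as out-of-bounds indices as well.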
Formally, passive checkers and active checkers may be defined as follows.
Definition 1. A passive checker for a property π is a function that takes as input a finite program execution w, and returns “fail π” iff the property π is violated by w. Because it is assumed all program executions terminate, properties considered here are safety properties. Runtime property checkers like Purify, Valgrind and AppVerifier are examples of tools implementing passive checkers.
Definition 2. Let pcw denote the path constraint of a finite program execution w. An active checker for a property π is a function that takes as input a finite program execution w, and returns a formula φc such that the formula pcw∧¬φc is satisfiable iff there exists a finite program execution w′ violating property π and such that pcw′=pcw.
Exemplary active checkers can be implemented in various ways, for instance using property monitors/automata, program rewrite rules or type checking. They can use private memory to record past events (leading to a current program state), but, in general, they are not allowed any side effect on a program.
Further below, detailed examples are presented of how active checkers can be formally defined and implemented. Below are some examples of specifications for exemplary active property checkers.
Example 1 is Division By Zero: Given a program state where the next statement involves a division by a denominator d which depends on an input (i.e., such that Σ(d) ≠Δ(d)), an active checker for division by zero outputs the constraint φDIV=(Σ(d)≠0).
Example 2 is Array Bounds: Given a program state where the next statement involves an array access a[x] where x depends on an input (i.e., is such that Σ(x)≠Δ(x)), an active checker for array bounds outputs the constraint φBuf=(0≦Σ(x)<b) where b denotes the bound of the array a.
Example 3 is NULL Pointer Dereference: Consider a program expressed in a language where pointer dereferences are allowed (unlike our simple language SimpL). Given a program state where the next statement involves a pointer dereference *p where p depends on an input (i.e., such that Σ(p)≠Δ(p)), an active checker for NULL pointer dereference generates the constraint φNULL=(Σ(p)≠NULL).
Multiple active checkers can be used simultaneously by simply considering separately the constraints they inject in a given path constraint. In such a way, they are guaranteed not to interfere with each other (since they have no side effects). A discussion of how to combine active checkers to maximize performance appears further below.
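Considering the injected constraints separately can be sketched as one solver query per checker against the same path constraint. The sketch below is illustrative only: the path constraint, the two checker formulas and the brute-force `solve` are hypothetical, and each query asks whether the path constraint holds while that checker's formula is negated.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(constraints: List[Constraint], variables: Sequence[str],
          domain: Sequence[int] = range(-4, 25)) -> Optional[Assignment]:
    """Toy solver: enumerate assignments over a small finite domain."""
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            return assignment
    return None


def check_all(pc: List[Constraint], checkers: Dict[str, Constraint],
              variables: Sequence[str]) -> Dict[str, Optional[Assignment]]:
    """Run one independent query (pc and not phi) per checker formula phi."""
    results = {}
    for name, phi in checkers.items():
        negated: Constraint = lambda a, phi=phi: not phi(a)
        results[name] = solve(pc + [negated], variables)
    return results


# Hypothetical path: the branch x > 20 was not taken and d > 0 was taken.
PC: List[Constraint] = [lambda a: a["x"] <= 20, lambda a: a["d"] > 0]
CHECKERS: Dict[str, Constraint] = {
    "div": lambda a: a["d"] != 0,        # division by zero
    "buf": lambda a: 0 <= a["x"] < 20,   # array bounds with b = 20
}
```

On this hypothetical path, `check_all(PC, CHECKERS, ["d", "x"])` finds no division-by-zero witness, because the path constraint already forces d>0, while the array bounds query does return a violating assignment. Neither result affects the other.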
By applying an active checker for a property π to all feasible paths of a program P, we can obtain a form of verification for this property, that is stronger than Theorem 1.
Consider Theorem 2: Given a program P as defined above, if a directed search (1) uses a path constraint generation and constraint solvers that are both sound and complete, and (2) uses both a passive and an active checker for a property π in all program paths visited during the search, then the search reports “fail π” iff there exists a program input that leads to a finite execution violating π.
Proof Sketch: Assume there is an input assignment $\vec{\alpha}$ that leads to a finite execution w of P violating π. Let pcw be the path constraint for the execution path w. Since path constraint generation and constraint solving are both sound and complete, we know by Theorem 1 that w will eventually be exercised with some concrete input assignment $\vec{\alpha}'$. If the passive checker for π returns “fail π” for the execution of P obtained from input $\vec{\alpha}'$ (for instance, if $\vec{\alpha}' = \vec{\alpha}$), the proof is finished. Otherwise, the active checker for π will generate a formula φc and call the constraint solver with the query pcw∧¬φc. The existence of $\vec{\alpha}$ implies that this query is satisfiable, and the constraint solver will return a satisfying assignment from which a new input assignment $\vec{\alpha}''$ is generated ($\vec{\alpha}''$ could be $\vec{\alpha}$ itself). By construction, running the passive checker for π on the execution obtained from that new input $\vec{\alpha}''$ will return “fail π”.
Note that in the foregoing example, both passive checking and active checking are used to obtain the result (see also the example for buffer overflow). In practice, however, symbolic execution, path constraint generation, constraint solving, passive and active property checking are typically not sound and complete, and therefore active property checking reduces to testing.
Active Type Checking
An exemplary framework is described below for specifying checkers, which illustrates their complementarity with traditional static and dynamic checking. The framework includes aspects of “hybrid type checking” as it observes that type-checking a program statically is undecidable in general, especially for type systems that permit expressive specifications. Therefore, the framework aims to satisfy the need to handle programs for which one cannot decide statically that the program violates a property, but may in fact satisfy the property. The hybrid type checking approach can automatically insert run-time checks for programs in a language λH in cases where typing cannot be decided statically.
The exemplary framework extends aspects of hybrid type checking to active checking. A particular example implements active checking with a simple imperative language CSimpL and a type system that supports integer refinement types, in which types are defined by predicates, and subtyping is defined by logical implication between these predicates.
Also described below is an exemplary method for compiling programs that can either statically reject a program as ill-typed, or insert casts to produce a well-typed program. In this example, each cast performs a run-time membership check for a given type and raises an error if a run-time value is not of the desired type.
A key property of various exemplary approaches is that the run-time check is a passive checker in the sense that a post-compilation program computes a function on its own execution that returns “fail φ” if and only if the run-time values violate a cast's type membership check.
Defined below is a side-by-side symbolic and concrete evaluation of the language CSimpL to generate symbolic path conditions from program executions and symbolic membership checks from casts. As described herein, these symbolic membership checks are active checkers with respect to the run-time membership checks. Therefore, a type environment can be thought of as specifying a property: a first attempt is made to prove statically that this property holds, or to reject the program statically. Where no static decision can be reached for some portions of the program, casts are inserted. The inserted casts give rise to passive and active checkers for the particular property.
Two examples of specifying properties with type environments are presented with checks for division by zero and integer overflow. Various cases are discussed where different type environments can be combined to simultaneously check different properties.
Simple Language with Casts CSimpL
Semantics and a type system for an imperative language with casts, CSimpL, are described below, which allows for demonstrating active type checking.
A value v is either an integer i or a Boolean constant b. An operand o is either a value or a variable reference x. An expression e is either an operand or an operator application op(o1 . . . on) for some operator name op and operands o1 . . . on. An operator denotation is a partial function from tuples of values to a value. A concrete store Δ is a map from variables and operator names to values and operator denotations respectively.
CSimpL supports integer refinement and Boolean types. Type Bool classifies Boolean expressions and Boolean values true and false. Integer refinement types have the form {x: Int|t} for some Boolean expression t whose only free variable may be x. A refinement type denotes the set of integers that satisfy the Boolean expression. A refinement type T is said to be a subtype of a refinement type S, written T<: S, if the denotation of T is a subset of the denotation of S. A value v is said to have type T, written vεT, either if v is a Boolean value and T is Bool or if v is an integer in the denotation of T. Note that this value typing relation is decidable.
A type environment ┌ is a map from variables and operator names to types and operator signatures respectively. An operator signature has the form op(S1 . . . Sn): T where S1 . . . Sn are the types of the parameters and T is the type of the result. A cast set G is a type environment whose domain contains only variables.
As described herein, a concrete store Δ corresponds to a type environment ┌, written Δε┌, if for any variable x, one has Δ(x)ε┌(x), and for any operator op, one has ┌(op)=op(S1 . . . Sn): T and Δ(op)=f, where f is an operator denotation defined on any value tuple v1 . . . vn with viεSi for 0<i≦n and such that f(v1 . . . vn)εT. A concrete store Δ satisfies a cast set G, written ΔεG, if for any variable x in the domain of G, we have Δ(x)εG(x).
Given two refinement types T={x: Int|t1} and S={x: Int|t2}, the intersection of T and S, denoted T∩S, is defined to be the refinement type {x: Int|t1∧t2}. Assuming that two type environments ┌1 and ┌2 agree on the return types of operators, ┌1∩┌2 is defined point-wise.
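Refinement types, value typing and intersection can be illustrated with a small sketch in which the predicate t is given as a Python function. The class `Refinement` and the helper `is_subtype_on` are hypothetical names. The subtype helper only tests a finite sample of integers, so it is a bounded check rather than a decision procedure for T<: S.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Refinement:
    """Models {x: Int | t}, with the predicate t given as a Python function."""
    pred: Callable[[int], bool]

    def contains(self, v: object) -> bool:
        """Value typing v ε T: v is an integer satisfying the predicate."""
        return isinstance(v, int) and not isinstance(v, bool) and self.pred(v)

    def intersect(self, other: "Refinement") -> "Refinement":
        """T ∩ S = {x: Int | t1 ∧ t2}."""
        return Refinement(lambda x: self.pred(x) and other.pred(x))


def is_subtype_on(t: Refinement, s: Refinement, sample: Iterable[int]) -> bool:
    """Bounded check of T <: S on a finite sample; not a proof."""
    return all(s.contains(v) for v in sample if t.contains(v))
```

For instance, intersecting a "non-zero" type with a "fits in a signed byte" type yields a type that rejects both 0 and 200, and on any sample the intersection behaves as a subtype of each of its operands.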
A program C consists of commands and is defined by the following grammar:
In this example, each non-halting command is annotated with a cast set specifying the type assumptions that must be checked dynamically before the command is executed. Additionally, the assignment command also specifies a cast on the right hand side expression.
A command C contains a failed cast under Δ either if the cast set of C is not satisfied by Δ or if C is of the form (G)x:=<T>e; C with e evaluating to some value v such that v∉T.
Theorem 3. (Type preservation.) Let Δ and ┌ be a concrete store and a type environment such that Δε┌. Then the following two properties hold:
1. If ┌├e: T and Δ├e→v, then ┌├v:T.
2. If ┌├C and <C,Δ>→<C′,Δ′>, then ┌├C′ and Δ′ε┌.
Theorem 4. (Progress.) Let Δ and ┌ be a concrete store and a type environment such that Δε┌. If ┌├C, then either <C,Δ>→<C′,Δ′>, or C contains a failed cast under Δ.
Casts Insertion
The typing relation defined above does not give an algorithm for checking whether an arbitrary program is well-typed because it relies on checking subtyping which is undecidable in general. In practice, it is common to use a theorem prover that can validate or invalidate some subtyping assumptions and fail to produce a definitive answer on others. As described herein, a theorem prover is modeled by an algorithmic subtyping relation that, given two refinement types T and S, can either fail to produce an answer, written T<:alg?S, return true, written T<:algok S, or return false, written T≮:algok S such that T<:algok S and T≮:algok S imply T<: S and T≮: S respectively.
The following two theorems establish the static properties of the compilation algorithm:
To show that the compiled program and the result of the compilation are equivalent at run-time, a k-step evaluation relation is first introduced. A program C0 and a store Δ0 are said to make k steps producing a program Ck and a store Δk, written <C0,Δ0>→k<Ck,Δk>, if <Ci,Δi>→<Ci+1,Δi+1> for 0≦i<k.
The following theorem establishes that the result of the compilation algorithm is equivalent to the original program by stating that if the latter can make k steps then the former either can make exactly the same k steps or fail on an inserted cast along the road:
Below, an exemplary side-by-side symbolic and concrete evaluation of a compiled program with casts is described, which constructs both path constraints and active checker constraints.
This approach introduces two auxiliary judgments for use in the definition of the side-by-side evaluation judgment. A symbolic value sv has type T provided that an input constraint φ is satisfied, written svεTφ, if T={x: Int|t} and φ=t[x:=sv], where t[x:=sv] denotes the Boolean expression obtained by substituting sv for x in t. This judgment is referred to herein as symbolic value typing.
A symbolic store Σ is said to satisfy a cast set G provided that an input constraint set φ is satisfied, written Σ├Gφ, if for any variable x in the domain of G, one has Σ(x)εG(x)φx and φ=∪x{φx}. This judgment is referred to herein as symbolic cast checking.
The side-by-side program evaluation rules are defined in terms of the concrete evaluation judgments discussed above. The concrete evaluation rules can be recovered by removing all the symbolic artifacts from the side-by-side rules. The rules X-IF1 and X-IF2 are similar to the corresponding rules described above. The key difference here is the use of the cast checking judgment Σ├Gφ, whose constraint set φ abstracts the concrete type membership checks performed to satisfy G.
The path constraints and the cast checking constraints generated by side-by-side evaluation can be used to compute concrete inputs which drive the program to a failed cast. One can write Σ∘α⃗→Δ to represent substituting input parameters by values and then reducing each entry of Σ to obtain a concrete store. One can write α⃗(φ)→v to mean substituting values for variables in φ and reducing to a value. For a concrete store Δ, one can write ΔΣ if there exists an α⃗ that reduces Σ to Δ.
One can now prove that a checker constraint generated at a program point is an active checker for the property of failing a runtime type membership check. That is, if one takes a path constraint pc and a constraint φεφ, then pc∧¬φ is satisfiable if and only if there is some execution along the same path that causes a runtime check to fail. The main idea is that this should hold because φ is an abstraction of the run-time membership check. This is stated formally as follows:
1. If there exists an input assignment α⃗ such that α⃗(pck∧¬φ)→true with Σ0∘α⃗→Δ′0 for some φεφ, then <C0,Δ′0>→i<Ci,Δ′i> and Ci contains a failed cast under Δ′i for some 0≦i≦k.
2. If there exists an input assignment α⃗ such that α⃗(pc0)→true and Σ0∘α⃗→Δ′0 with <C0,Δ′0>→k<Ck,Δ′k> where Ck contains a failed cast under Δ′k, then α⃗(pck∧¬φ)→true for some φεφ.
Some Active Checker Examples
Division by Zero. As an example, an active checker for division by zero errors is described. First, the refinement type notzero is defined as the set of values x of type Int such that x≠0. Then, the type of the division operator div is defined as taking an Int and a notzero argument. Finally, a semantic implementation function for division is needed, which in this example is the standard division function for integers. These are shown as exemplary type and implementation for the division operator:
Overall, the example of
Integer overflow/underflow. Integer overflow and underflow and related machine arithmetic errors are a frequent cause of security-critical bugs. As described herein, an exemplary approach defines upper and lower bounds for signed and unsigned integer types and then inserts a check whenever a potentially unsafe assignment is carried out. This approach then follows by capturing these checks with exemplary refinement types 800 of
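The two checkers described above can be illustrated with a minimal Python sketch, assuming that a refinement type is modeled as a predicate over integers and that a cast is a runtime type membership check. The names Refinement, cast, CastFailure, NOTZERO, INT32, div and assign_int32 are hypothetical and chosen only for illustration; the 32-bit signed bounds and the use of Python floor division as the "standard" division are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Callable


class CastFailure(Exception):
    """Raised when a value is not a member of the refinement type it is cast to."""


@dataclass(frozen=True)
class Refinement:
    """Models a refinement type {x: Int | pred(x)} (hypothetical encoding)."""
    name: str
    pred: Callable[[int], bool]


def cast(ty: Refinement, value: int) -> int:
    """Runtime type membership check: return value if it inhabits ty, else fail."""
    if not ty.pred(value):
        raise CastFailure(f"{value} is not a member of {ty.name}")
    return value


# notzero = {x: Int | x != 0}
NOTZERO = Refinement("notzero", lambda x: x != 0)
# Assumed bounds for a signed 32-bit integer type.
INT32 = Refinement("int32", lambda x: -2**31 <= x <= 2**31 - 1)


def div(n: int, d: int) -> int:
    """div takes an Int and a notzero argument; the cast guards the divisor."""
    return n // cast(NOTZERO, d)  # floor division stands in for integer division


def assign_int32(value: int) -> int:
    """Check inserted at a potentially unsafe assignment to a 32-bit variable."""
    return cast(INT32, value)
```

In this sketch a failed cast surfaces as a CastFailure exception, which plays the role of the failed runtime check that an active checker tries to trigger.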
Combining Checkers
Finally, attention is focused on the question of combining checkers for different integer refinement properties. The following definition defines what it means for one environment to be a restriction of another.
Definition 3. (Environment restriction.) It can be stated that ┌<: ┌′ if the following two conditions hold:
1. If ┌├xi: Si, and ┌′├xi: Si′ for some xi, then Si<:Si′
2. If ┌├op(T1 . . . Tn): T and ┌′├op(T1′ . . . Tn′): T for some operator op, then Ti<: Ti′ for 1≦i≦n.
It is desirable to ensure that a restricted environment provides for more checking. In particular, if two environments ┌ and ┌′ are used, where ┌<: ┌′, to insert casts into the same program, it is desirable to ensure that ┌ does not “disable” the casts inserted by ┌′. The following lemma states this formally:
1. If ┌├C and Δε┌, then ┌′├C and Δε┌′.
2. Let ┌├C0C0′ and ┌′├C0C0″. Let Δ0 be a concrete store such that Δ0ε┌. Then if <C0′,Δ0>→k<Ck′,Δk>, and Ck′ contains a failed cast under Δk, then <C0″,Δ0>→i<Ci″,Δi> and Ci″ contains a failed cast under Δi, for some 0≦i≦k.
Suppose one wants to check both integer overflow and division by zero simultaneously, where existing type environments ┌int and ┌div check each property individually. In this setup, it is desirable to construct a new type environment ┌ such that if a program compiled with ┌int fails on some input, then the same program compiled with ┌ fails on that input, and similarly for ┌div. The following lemma indicates that the intersection ┌=┌int∩┌div is a restriction of both ┌int and ┌div:
Together with the monotonicity property, this gives a desired result. Specifically, to check both properties, an exemplary approach compiles with the intersection environment, giving the following theorem:
Active checkers can be viewed as injecting additional constraints in path constraints in order to refine the partitioning on input values. In practice, active checkers may inject many such constraints all over program executions, making path explosion even worse than with path exploration alone. Hence, several optimizations are presented below to help make active checking tractable in practice.
Minimizing Calls to the Constraint Solver
As already discussed, (negations of) constraints injected by various active checkers in a same path constraint can be solved independently one-by-one since they have no side effects. This is called a naive combination of checker constraints.
However, the number of calls to the constraint solver can be reduced by bundling together constraints injected at the same or equivalent program states into a single conjunction. If pc denotes the path constraint for a given program state, and φC1, . . . , φCn are a set of constraints injected in that state by each of the active checkers, one can define the combination of these active checkers by injecting the formula φC=φC1∧ . . . ∧φCn in the path constraint, which will result in the single query pc∧(¬φC1∨ . . . ∨¬φCn) to the constraint solver. As described herein, one can also bundle in the same conjunction constraints φCi injected by active checkers at different program states anywhere in between two conditional statements, i.e., anywhere between two constraints in the path constraint (since those program states are indistinguishable by that path constraint). This combination reduces the number of calls to the constraint solver but, if the query pc∧(¬φC1∨ . . . ∨¬φCn) is satisfiable, a satisfying assignment produced by the constraint solver may not satisfy all the disjuncts, i.e., it may violate only some of the properties being checked. Hence, this is called a weakly-sound combination.
A strongly-sound, or “sound” for short, combination can be obtained by making additional calls to the constraint solver using the procedure or function:
The foregoing function can be called to compute a strongly-sound combination of active checkers. For example, one can call CombineActiveCheckers(Ø, pc, φC1, . . . , φCn) where this call returns a set I of input values that covers all the disjuncts that are satisfiable in the formula pc∧(¬φC1∨ . . . ∨¬φCn). The function first queries the solver with the disjunction of all the (negated) checker constraints (line 1). If the solver returns UNSAT, it is known that all of these disjuncts are unsatisfiable (line 2). Otherwise, it is possible to check the solution x returned by the constraint solver against each checker constraint to determine which disjuncts are satisfied by solution x (line 3). (This is a model-checking check, not a satisfiability check; in practice, this can be implemented by calling the constraint solver with the formula (∧i(bi⇔¬φCi))∧pc∧(∨i bi) where bi is a fresh Boolean variable which evaluates to true iff ¬φCi is satisfied by a satisfying assignment x returned by the constraint solver; determining which disjuncts are satisfied by x can then be performed by looking up the values of the corresponding bits bi in solution x.) Then, these checker constraints are removed from the disjunction (line 4) and the solver is queried again, until all checker constraints that can be satisfied have been satisfied by some input value in I. If t out of the n checkers can be satisfied in conjunction with the path constraint pc, this function requires at most min(t+1, n) calls to the constraint solver, because each call removes at least one checker from consideration. Obtaining strong soundness with fewer than t calls to the constraint solver is not possible in the worst case. Note that the naive combination defined above is strongly-sound, but always requires n calls to the constraint solver.
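The procedure described above can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the listing of the original function: it is written iteratively rather than with an accumulator argument, constraints are modeled as Python predicates over an assignment, and solve is a hypothetical brute-force stand-in for a real constraint solver that enumerates a small finite domain.

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Assignment = Dict[str, int]
Constraint = Callable[[Assignment], bool]


def solve(formula: Constraint, variables: Sequence[str],
          domain: Sequence[int]) -> Optional[Assignment]:
    """Hypothetical solver: enumerate a finite domain; None means UNSAT."""
    for values in product(domain, repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if formula(candidate):
            return candidate
    return None


def combine_active_checkers(pc: Constraint, checkers: Sequence[Constraint],
                            variables: Sequence[str], domain: Sequence[int]
                            ) -> Tuple[List[Assignment], int]:
    """Return inputs covering every violable checker, plus the number of solver calls."""
    inputs: List[Assignment] = []
    remaining = list(checkers)
    calls = 0
    while remaining:
        calls += 1
        # Line 1: query pc AND (NOT phi_1 OR ... OR NOT phi_n) over remaining checkers.
        x = solve(lambda a: pc(a) and any(not phi(a) for phi in remaining),
                  variables, domain)
        if x is None:
            # Line 2: UNSAT, so none of the remaining checkers can be violated.
            break
        inputs.append(x)
        # Lines 3-4: drop every checker whose violation is witnessed by x.
        remaining = [phi for phi in remaining if phi(x)]
    return inputs, calls
```

With a path constraint x ≥ 0 and three checker constraints of which only two can be violated (t=2, n=3), the sketch issues min(t+1, n)=3 solver calls and returns two inputs.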
It is worth emphasizing that none of these combination strategies attempt to minimize the number of input values (solutions) needed to cover all the satisfiable disjuncts. This could be accomplished by first querying the constraint solver with the conjunction of all checker constraints to check whether any solution satisfies all these constraints simultaneously, i.e., to check whether their intersection is non-empty. Otherwise, one could then iteratively query the solver with smaller and smaller conjunctions to force the solver to return a minimum set of satisfying assignments that cover all the checker constraints. This procedure may require in the worst case O(2^n) calls to the constraint solver; the problem can be shown to be NP-complete by a reduction from the NP-hard SET-COVER problem.
Weakly and strongly sound combinations capture possible overlaps, inconsistencies or redundancies between active checkers at equivalent program states, but are independent of how each checker is specified: they can be applied to any active checker that injects a formula at a given program state. Also, the above definition is independent of the specific reasoning capability of the constraint solver. In particular, the constraint solver may or may not be able to reason precisely about combined theories (abstract domains and decision procedures) obtained by combining individual constraints injected by different active checkers. However, as described herein, any level of precision is acceptable and useful.
Some Minimizing Formulas
In general, minimizing the number of calls to the constraint solver should not be achieved at the expense of using longer formulas. Various exemplary strategies, described above, for combining constraints injected by active checkers can also reduce formula sizes.
For instance, consider a path constraint pc and a set of n constraints φC1 . . . φCn to be injected at the end of pc. The naive combination makes n calls to the constraint solver, each with a formula of length |pc|+|φCi|, for all 1≦i≦n. In contrast, the weak combination makes only a single call to the constraint solver with a formula of size |pc|+Σ1≦i≦n|φCi|, i.e., a formula (typically much) smaller than the sum of the formula sizes with the naive combination. The strong combination makes, in the worst case, n calls to the constraint solver with formulas of size |pc|+Σ1≦i≦j|φCi| for all 1≦j≦n, i.e., possibly bigger formulas than the naive combination. But often, the strong combination makes fewer calls than the naive combination, and matches the weak combination in the best case (when none of the disjuncts ¬φCi are satisfiable).
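The size comparison above can be checked on concrete numbers with a short Python sketch. The function names and the sample lengths (a path constraint of length 100 and three checker constraints of lengths 3, 4 and 5) are hypothetical and serve only to illustrate the three formulas.

```python
from typing import List


def naive_sizes(pc_len: int, checker_lens: List[int]) -> List[int]:
    """One query per checker, each of size |pc| + |phi_Ci|."""
    return [pc_len + c for c in checker_lens]


def weak_size(pc_len: int, checker_lens: List[int]) -> int:
    """A single query of size |pc| + sum of all |phi_Ci|."""
    return pc_len + sum(checker_lens)


def strong_worst_case_sizes(pc_len: int, checker_lens: List[int]) -> List[int]:
    """Worst case: n queries of size |pc| + sum_{i<=j} |phi_Ci| for j = 1..n."""
    sizes, running = [], 0
    for c in checker_lens:
        running += c
        sizes.append(pc_len + running)
    return sizes


pc_len, checker_lens = 100, [3, 4, 5]
print(sum(naive_sizes(pc_len, checker_lens)))              # 312
print(weak_size(pc_len, checker_lens))                     # 112
print(sum(strong_worst_case_sizes(pc_len, checker_lens)))  # 322
```

On these sample numbers the weak combination sends the least formula text to the solver, while the strong combination in its worst case sends slightly more than the naive one.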
In practice, path constraints pc tend to be long, much longer than injected constraints φCi. A simple optimization includes eliminating the constraints in pc which do not share symbolic variables (including by transitivity) with the negated constraint c to be satisfied. This unrelated constraint elimination can be done syntactically by constructing an undirected graph G with one node per constraint in pc∪{c} and one node per symbolic (input) variable such that there is an edge between a constraint and a variable iff the variable appears in the constraint. Then, starting from the node corresponding to constraint c, one performs a (linear-time) traversal of the graph to determine which constraints c′ in pc are reachable from c in G. At the end of the traversal, only the constraints c′ that have been visited are kept in the conjunction sent to the constraint solver, while the others are eliminated.
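A minimal Python sketch of the graph traversal described above follows. It assumes that each constraint is represented only by the set of symbolic variable names it mentions; the function name eliminate_unrelated and this representation are hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Set


def eliminate_unrelated(pc_vars: List[FrozenSet[str]],
                        c_vars: FrozenSet[str]) -> List[int]:
    """Return indices of constraints in pc reachable from c via shared variables."""
    # Edges of the bipartite graph: variable -> constraints mentioning it.
    var_to_constraints: Dict[str, List[int]] = {}
    for idx, variables in enumerate(pc_vars):
        for v in variables:
            var_to_constraints.setdefault(v, []).append(idx)

    kept: Set[int] = set()
    seen_vars: Set[str] = set(c_vars)
    queue = deque(c_vars)  # start the traversal from the variables of c
    while queue:
        v = queue.popleft()
        for idx in var_to_constraints.get(v, []):
            if idx in kept:
                continue
            kept.add(idx)
            # Transitivity: variables of a kept constraint become reachable too.
            for w in pc_vars[idx]:
                if w not in seen_vars:
                    seen_vars.add(w)
                    queue.append(w)
    return sorted(kept)


pc = [frozenset("ab"), frozenset("bc"), frozenset("d"), frozenset("ef"), frozenset("c")]
print(eliminate_unrelated(pc, frozenset("a")))  # [0, 1, 4]
```

Each constraint and each variable is visited at most once, so the traversal is linear in the size of the graph.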
With unrelated constraint elimination and the naive checker combination, the size of the reduced path constraint pci may vary when computed starting from each of the n constraints φCi injected by the active checkers. In this case, n calls to the constraint solver are made with the formulas pci∧¬φCi, for all 1≦i≦n. In contrast, the weak combination makes a single call to the constraint solver with the formula pc′∧(∨i¬φCi) where pc′ denotes the reduced path constraint computed when starting with the disjunction ∨i¬φCi. It may be shown that |pc′|≦Σi|pci|, and therefore that the formula used with the weak combination is again smaller than the sum of the formula sizes used with the naive combination. Loosely speaking, the strong combination includes again both the naive and weak combinations as two possible extremes.
Some Caching Strategies
Regardless of the chosen strategy for combining checkers at a single program point, constraint caching can significantly reduce the overhead of using active checkers.
To illustrate the benefits of constraint caching, consider a NULL dereference active checker and the program Q:
This program has O(2^k) possible execution paths. A naive application of a NULL dereference active checker results in O(k·2^k) additional calls to the constraint solver, while local constraint caching eliminates the need for any additional calls to the constraint solver.
More specifically, program Q has 2^k+1 executions, where 2^k of those dereference the input pointer x k times each. A naive approach to dynamic test generation with a NULL dereference active checker would inject k constraints of the form x≠NULL at each dereference of *x during every such execution of Q, which would result in a total of k·2^k additional calls to the constraint solver (i.e., k calls for each of those executions).
To limit this expensive number of calls to the constraint solver, a first optimization consists of locally caching constraints in the current path constraint in such a way that syntactically identical constraints are never injected more than once in any path constraint; note that path constraints are simply conjunctions. Such an optimization is applicable to any path constraint, with or without active checkers. The correctness of this optimization is based on the following observation: if a constraint c is added to a path constraint pc, then for any longer pc′ extending pc, one has pc′⇒pc (where ⇒ denotes logical implication) and pc′∧¬c will always be unsatisfiable because c is in pc′. In other words, adding the same constraint multiple times in a path constraint is pointless since only the negation of its first occurrence has a chance to be satisfiable.
Constraints generated by active checkers can be dealt with by injecting those in the path constraint like regular constraints. Indeed, for any constraint c injected by an active checker either at the end of a path constraint pc or at the end of a longer path constraint pc′ (i.e., such that pc′⇒pc), the following holds:
if pc∧¬c is unsatisfiable, then pc′∧¬c is unsatisfiable; conversely, if pc′∧¬c is satisfiable, then pc∧¬c is satisfiable (and has the same solution).
Therefore, one can check ¬c as early as possible, i.e., in conjunction with the shorter pc, by inserting the first occurrence of c in the path constraint. If an active checker injects the same constraint later in the path constraint, local caching will simply remove this second redundant occurrence.
By injecting constraints generated by active checkers into regular path constraints and by using local caching, a given constraint c, like x≠NULL in the previous example, will appear at most once in each path constraint, and a single call to the constraint solver will be made to check its satisfiability for each path, instead of k calls as with the naive approach without local caching. Moreover, because the constraint x≠NULL already appears in the path constraint due to the if statement on line 4 before any pointer dereference *x on lines 6 or 7, it will never be added again to the path constraint with local caching, and no additional calls will be made to the constraint solver due to the NULL pointer dereference active checker for this example.
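Local caching as described above can be sketched in Python as a path constraint that refuses syntactically identical constraints. The class name PathConstraint and the representation of constraints as strings are hypothetical; the sketch models a single path on which a branch first contributes x != NULL and a NULL dereference checker then injects the same constraint at k later dereferences.

```python
from typing import List, Set


class PathConstraint:
    """Conjunction of constraints with local (per-path) syntactic caching."""

    def __init__(self) -> None:
        self._constraints: List[str] = []
        self._seen: Set[str] = set()

    def add(self, constraint: str) -> bool:
        """Append constraint unless an identical one is already present.

        Returns True if the constraint was added, False if it was cached away.
        """
        if constraint in self._seen:
            return False
        self._seen.add(constraint)
        self._constraints.append(constraint)
        return True

    def __len__(self) -> int:
        return len(self._constraints)


k = 5
pc = PathConstraint()
pc.add("x != NULL")  # contributed by the branch that tests the pointer
# The checker injects the same constraint at each of the k dereferences.
injected = sum(pc.add("x != NULL") for _ in range(k))
print(injected, len(pc))  # 0 1
```

Because the checker's constraint is already in the path constraint, none of the k injections survives and no additional solver call is needed for this path.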
Another optimization consists of caching constraints globally: whenever the constraint solver is called with a query, this query and its result are kept in a (hash) table shared between execution paths during a directed search. The effect of both local and global caching is measured empirically and discussed further below.
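A global cache of this kind can be sketched in Python as a hash table placed in front of the solver. The class name GlobalSolverCache, the textual representation of queries, and the choice of a SHA-1 digest of the query text as the table key are assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict


class GlobalSolverCache:
    """Hash table of solver queries and results shared across execution paths."""

    def __init__(self, solver: Callable[[str], str]) -> None:
        self._solver = solver
        self._table: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def query(self, formula: str) -> str:
        # Assumed keying scheme: digest of the query text.
        key = hashlib.sha1(formula.encode("utf-8")).hexdigest()
        if key in self._table:
            self.hits += 1
            return self._table[key]
        self.misses += 1
        result = self._solver(formula)  # only reached on a cache miss
        self._table[key] = result
        return result

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A single instance would be shared by all test generation tasks of a search, so that a query already answered on one path is not sent to the solver again on another.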
Examples of Active Checkers in SAGE
Various exemplary trials were performed. The trials implemented active checkers as part of a dynamic test generation tool called SAGE (Scalable, Automated, Guided Execution). The SAGE tool uses a tool called iDNA to trace executions of WINDOWS® programs, then virtually re-executes these traces with the TruScan trace replay framework. During re-execution, SAGE checks for file read operations and marks the resulting bytes as symbolic. As re-execution progresses, SAGE generates symbolic constraints for the path constraint. After re-execution completes, SAGE uses the constraint solver Disolver to generate new input values that will drive the program down new paths. SAGE then completes this cycle by testing and tracing the program on the newly generated inputs. The new execution traces obtained from those new inputs are sorted by the number of new code blocks they discover, and the highest ranked trace is expanded next to generate new test inputs and repeat the cycle. Note that SAGE does not perform any static analysis.
An active checker in SAGE first registers a TruScan callback for specific events that occur during re-execution. For example, an active checker can register a callback that fires each time a symbolic input is used as an address for a memory operation. The callback then inspects the concrete and symbolic state of the re-execution and decides whether or not to emit an active checker constraint. If the callback does emit such a constraint, SAGE stores it in the current path constraint.
SAGE implements a generational search: given a path constraint, all the constraints in that path are systematically negated one-by-one, each negated constraint is placed in a conjunction with the prefix of the path constraint leading to it, and the resulting formula is passed to the constraint solver. Constraints injected by active checkers are inserted in the path constraint and treated as regular constraints during a generational search.
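The query generation step of such a search can be sketched in Python as follows. This is a simplified illustration, not SAGE's implementation: constraints are plain strings, negation is written textually, and the function name generational_queries is hypothetical.

```python
from typing import List


def generational_queries(path_constraint: List[str]) -> List[List[str]]:
    """For each constraint, build the conjunction: prefix leading to it + its negation."""
    queries = []
    for i, constraint in enumerate(path_constraint):
        prefix = path_constraint[:i]
        queries.append(prefix + [f"not ({constraint})"])
    return queries


# The last constraint stands for one injected by an active checker; it is
# negated exactly like the branch constraints before it.
for q in generational_queries(["a > 0", "b < 5", "x != NULL"]):
    print(" and ".join(q))
```

Each resulting conjunction is an independent solver query, and a satisfying assignment for it yields a new test input.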
Because trials pertain to x86 machine-code traces, some information desirable for use as part of an exemplary active checker approach is not immediately available. For example, when SAGE observes a load instruction with a symbolic offset during re-execution, it is not clear what the bound should be for the offset. As described herein, a workaround for these limitations leverages the TruScan infrastructure. During re-execution, TruScan observes calls to known allocator functions. By parsing the arguments to these calls and their return values, as well as detecting the current stack frame, TruScan builds a map from each concrete memory address to the bounds of the containing memory object. An exemplary approach uses the bounds associated with the memory object pointed to by the concrete value of the address as the upper and lower bound for an active bounds check of the memory access.
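The address-to-bounds map described above can be sketched in Python with a sorted list of allocation base addresses. The class name AllocationMap and its methods are hypothetical and do not reflect TruScan's actual interface; the sketch also assumes that recorded objects do not overlap and ignores deallocation and stack frames.

```python
import bisect
from typing import List, Optional, Tuple


class AllocationMap:
    """Maps a concrete address to the bounds of the memory object containing it."""

    def __init__(self) -> None:
        self._bases: List[int] = []  # kept sorted
        self._sizes: List[int] = []  # parallel to _bases

    def record_alloc(self, base: int, size: int) -> None:
        """Record an object observed at an allocator call (base = return value)."""
        i = bisect.bisect_left(self._bases, base)
        self._bases.insert(i, base)
        self._sizes.insert(i, size)

    def bounds_of(self, addr: int) -> Optional[Tuple[int, int]]:
        """Return (lower, upper) inclusive bounds of the containing object, if any."""
        i = bisect.bisect_right(self._bases, addr) - 1
        if i < 0:
            return None
        base, size = self._bases[i], self._sizes[i]
        if addr < base + size:
            return (base, base + size - 1)
        return None
```

An active bounds checker could then inject the constraint lower ≦ address ≦ upper on the symbolic address expression, using the bounds looked up for its concrete value.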
Evaluation
Trials were performed using an exemplary approach that extends SAGE with active checkers. This approach was applied to two media-parsing applications widely used on a WINDOWS® operating system.
For each application, microbenchmarks were run to quantify the marginal cost of active checking during a single symbolic execution task and to measure the effectiveness of various exemplary optimizations. Also performed were long-running SAGE searches with active checkers to investigate their effectiveness at finding bugs. These searches were performed on a 32-bit computing device with the WINDOWS® VISTA® operating system. The device included two dual-core AMD OPTERON® 270 processors running at 2 GHz, with 4 GB of RAM and a 230 GB hard drive; all four cores were used in each search.
Microbenchmarks
Exemplary trials demonstrate that active checkers produce more test cases than path exploration at a reasonable cost. As already explained, using checkers increases total run time but also generates more tests. For example, all checkers with naive combination for Media 2 create 5122 test cases in 1226 seconds, compared to 1117 test cases in 761 seconds for the case of no active checkers; this gives 4.5 times as many test cases for 61% more time spent in this case.
As explained, the naive combination generates more tests than the strong combination, which itself generates more tests than the weak combination. Perhaps surprisingly, most of the extra time is spent in symbolic execution, not in solving constraints. This may explain why the differences in runtime between the naive, strong and weak cases are relatively small.
Trials were also run with a “basic” set of checkers that consisted only of Array Bounds and DivByZero active checkers; these trials produced fewer test cases, but had little to no runtime penalty for test generation for both test programs.
Exemplary trials demonstrated that the weak combination had the lowest overhead. Observations indicated that the solver time for weak combination of disjunctions was the lowest for Media 2 runs with active checkers and tied for lowest with the naive combination for Media 1. The strong combination generates more test cases, but surprisingly takes longer than the naive combination in both cases. For Media 1, this is due to the strong combination hitting one more 5-second timeout constraint than the naive combination. For Media 2, it is postulated that this is due to the overhead involved in constructing repeated disjunction queries. Because disjunctions in both cases have fairly few disjuncts on average (around 4 or 5), this overhead dominates for the strong combination, while the weak one is still able to make progress by handling the entire disjunction in one query.
Exemplary trials demonstrate that unrelated constraint elimination is important for checkers. The trial implementation of the unrelated constraint optimization introduces additional common subexpression variables. Each of these variables defines a subexpression that appears in more than one constraint. In the worst case, the maximum possible size of a list of constraints passed to the constraint solver is the sum of the number of these variables, plus the size of the path constraint, plus the number of checker constraints injected. Trials collected the maximum possible constraint list size (Max CtrList Size) and the mean size of constraint lists produced after the unrelated constraint optimization (Mean CtrList Size). The maximum possible size does not depend on choice of weak, strong, or naive combination, but the mean list size is slightly affected. In the Media 2 microbenchmarks, it was observed that the maximum possible size jumps dramatically with the addition of checkers, but that the mean size stays almost the same. Furthermore, even in the case without checkers, the mean list size was 100 times smaller than the maximum. The Media 1 case was less dramatic, but still showed post-optimization constraint lists an order of magnitude smaller than the maximum. These results show that unrelated constraint optimization is a key factor to be considered to efficiently implement active checkers.
Macrobenchmarks
For macrobenchmarks, SAGE was run for 10 hours starting from the same initial media file, and generated test cases with no checkers, and with the weak and strong combination of all 13 checkers. Each test case was then tested by running the program with AppVerifier, configured to check for heap errors. For each crashing test case, the checker kinds responsible for the constraints that generated the test were recorded. Since a SAGE search can generate many different test cases that exhibit the same bug, crashing files were bucketed by the stack hash of the crash, which included the address of the faulting instruction. Also reported was a bucket kind, which is a NULL pointer dereference, a read access violation (ReadAV), or a write access violation (WriteAV). It is possible for the same bug to be reachable by program paths with different stack hashes for the same root cause. The trials always reported the distinct stack hashes. Also computed was the hit rate for global caching during each SAGE search.
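The bucketing step described above can be sketched in Python. The way the stack hash is computed here (a SHA-1 digest over the faulting instruction address followed by the return addresses on the stack) and the names stack_hash and bucket_crashes are assumptions for illustration; the source states only that the hash included the address of the faulting instruction.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def stack_hash(frames: Sequence[int]) -> str:
    """Hash a crash stack; frames[0] is the faulting instruction address."""
    text = "|".join(f"{addr:#x}" for addr in frames)
    return hashlib.sha1(text.encode("ascii")).hexdigest()


def bucket_crashes(crashes: Iterable[Tuple[str, Sequence[int]]]) -> Dict[str, List[str]]:
    """Group crashing test files by the stack hash of their crash."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for test_file, frames in crashes:
        buckets[stack_hash(frames)].append(test_file)
    return dict(buckets)
```

Test cases that crash with the same stack fall into the same bucket, so many generated files that exhibit one bug are reported once, while the same root cause reached through different stacks still yields distinct buckets.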
As demonstrated, checkers can find bugs missed by path exploration.
The type of checkers whose constraint found the crash bucket is also indicated in
As demonstrated, checker yield can vary widely.
As demonstrated, local and global caching are effective. Local caching can remove a significant number of constraints during symbolic execution. For Media 1, an 80% or more local cache hit rate was observed (see
To measure the impact of global caching on macrobenchmark runs, code was added that dumps to disk the SHA-1 hash of each query to the constraint solver, and then computes the global cache hit rate. For Media 1, all searches showed roughly a 93% hit rate, while for Media 2, 27% was observed. These results show that there are significant redundancies in queries made by different test generation tasks during the same SAGE search.
Additional Trials
Exploratory SAGE searches were performed on several other applications, including two applications shipped as part of MICROSOFT® OFFICE® 2007 and two media parsing layers. In one of the applications and one of the media layers, the division by zero checker and the integer overflow checker each created test cases leading to previously-unknown division by zero errors. In the other cases, the trials also discovered new bugs in test cases created by checkers.
In general, the more one checks for property violations, the more one should find software errors. As described herein, active property checking is defined and trials performed using exemplary dynamic property checking methods based on dynamic symbolic execution, constraint solving and test generation. Trials demonstrate how active type checking extends conventional static and dynamic type checking. Various exemplary optimization techniques can implement active property checkers efficiently. Trial results for several large shipped WINDOWS® applications demonstrated how active property checking was able to detect several new bugs in those applications.
Exemplary Computing Device
Computing device 1500 may have additional features or functionality. For example, computing device 1500 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in
Computing device 1500 may also contain communication connections 1516 that allow the device to communicate with other computing devices 1518, such as over a network. Communication connections 1516 are one example of communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data forms. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Number | Date | Country | |
---|---|---|---|
20090265692 A1 | Oct 2009 | US |