This application relates to concurrent test generation techniques.
Given the omnipresence of software in today's society, there is a great need to develop effective verification technologies for software. In industry, software testing and coverage-based metrics are still the predominant techniques to find correctness and performance issues in software systems. Recently, there has been extensive interest in both sequential test generation methods and predictive testing for concurrent programs.
In the past decade, there has been extensive interest in concolic execution for automatically generating tests to increase path coverage of sequential programs. These techniques combine symbolic execution for path exploration with powerful satisfiability modulo theory (SMT) solvers to compute inputs to previously unexplored branches or paths. To allow for a scalable and complete branch or path exploration, these techniques generally fall back upon concrete values observed during execution to handle non-linear computations or calls to external library functions, for which no good symbolic representation is available. The term concolic execution captures the combination of concrete and symbolic path exploration.
Discovering concurrency bugs is inherently hard due to the nondeterminism in multi-thread scheduling. One approach to discover concurrency bugs is based on systematic testing using stateless model checking. Another popular approach uses predictive analysis techniques. In predictive analysis, concurrency bugs are targeted by first observing multi-threaded execution traces on a given test input. Assume that the observed execution trace did not violate any embedded checks for concurrency issues, such as assertions, NULL pointer dereferences, deadlocks, or data races. Predictive analysis then tries to statically find a feasible permutation of the concurrent events of the observed trace, such that the permuted trace violates some property.
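The idea of predictive analysis can be illustrated with a minimal Python sketch. It is not the analysis of any particular tool: it assumes straight-line threads without locks, so that every permutation preserving per-thread program order is feasible, and it enumerates permutations explicitly instead of querying a solver. The names `interleavings`, `violating_schedules`, `set_x` and `check_x` are hypothetical.

```python
def interleavings(threads):
    """Yield every merge of the per-thread event lists that keeps each thread's order."""
    if all(len(t) == 0 for t in threads):
        yield []
        return
    for i, events in enumerate(threads):
        if events:
            remaining = threads[:i] + [events[1:]] + threads[i + 1:]
            for tail in interleavings(remaining):
                yield [events[0]] + tail


def set_x(value):
    def action(state):
        state["x"] = value
        return True          # a plain write never violates a check
    return action


def check_x(value):
    def action(state):
        return state["x"] == value   # False models a failed assertion
    return action


def violating_schedules(threads, initial_state):
    """Return the names of all permuted traces in which some check fails."""
    bad = []
    for schedule in interleavings(threads):
        state = dict(initial_state)
        if not all(action(state) for _, action in schedule):
            bad.append([name for name, _ in schedule])
    return bad


# Events of an observed trace, grouped by thread (assumed example program).
t1 = [("T1: x=1", set_x(1)), ("T1: assert x==1", check_x(1))]
t2 = [("T2: x=2", set_x(2))]
bad = violating_schedules([t1, t2], {"x": 0})
```

In this example the observed order (T2 first) passes, while the permutation that places the write of T2 between the write and the assertion of T1 is reported as violating.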
In one aspect, a method to test a concurrent program includes performing a concolic multi-trace analysis (CMTA) to analyze the concurrent program by taking two or more runs over many threads and generating a satisfiability modulo theory (SMT) formula to select inputs, schedules and parts of threads from one or more predetermined runs; using an SMT solver to find possible solutions to such an SMT formula thus generating specific input values, parts of thread selections and thread schedules; and executing the so created concurrent test runs.
Advantages of the preferred embodiments may include one or more of the following. The system increases structural code coverage for concurrent programs by generating new test inputs and thread schedules that are extensions/compositions of previously observed test runs. The system extends test input generation methods used in sequential programs with predictive analysis for the concurrent setting. For example, the system looks for uses (variable reads) of shared variables that lead to previously uncovered code parts, and then finds appropriate definitions or defs (variable writes) in some other test runs that may feasibly be intertwined. The intertwining of multiple multi-threaded runs is formulated as an SMT problem, in a manner similar to concolic execution with predictive analysis, to simultaneously consider alternate test inputs and thread schedules. Unlike previous extensions of concolic execution to concurrent programs based on global program structure, the instant approach targets branch coverage on the code of each individual thread, as this approach is more scalable. The search is guided by selection heuristics, providing a relatively complete predictive exploration in the limit (although exhaustive exploration should be avoided in practice). The resulting test generation tool can generate tests and schedules for concurrent programs and can successfully generate interesting tests, thus increasing structural code coverage. Other benefits include fast operation and low cost for improving structural test coverage. Due to the low cost and rapid analysis, the system can expose interesting concurrency issues/bugs. Since test data inputs and thread schedules are generated automatically, this can result in significant operational cost reduction in industrial practice. The system can automatically generate test inputs and thread execution schedules for concurrent programs that would allow it to increase structural code coverage of such concurrent programs.
As discovering concurrency issues is inherently hard due to the non-determinism in concurrent thread scheduling, instead of trying to cover all possible thread schedules, the system focuses on the industrial practice of measuring structural code coverage, and provides a methodology to automatically generate test input values and test schedules that would cover previously uncovered parts of the program.
A concolic multi-trace analysis (CMTA) system that efficiently increases code coverage in concurrent programs is disclosed.
This system addresses the test generation problem for concurrent multi-threaded programs. The system addresses the generation of tests that will increase structural coverage of such programs. Thus, we are not necessarily interested in covering all possible thread interleavings unless this would increase structural coverage as well. However, having generated a set of relevant test inputs, it is always possible to perform a full predictive analysis as discussed above for each such test input. The system can use interesting def-use pairs, where a definition (def) represents a write of a shared variable in some thread, and a use represents a read of that variable in some other thread. The system can search over the space of such def-use pairs and exploit the fact that many different tests (inputs or schedules) may already be available or are easy to generate. By observing already available test runs for various writes to shared variables, the system can select parts of previously observed tests, and interject them into other tests to target previously unseen def-use pairs, thereby leading to new interesting concurrent program behaviors. In the following, previously observed multi-threaded execution fragments that end in a write to a shared variable are referred to as interlopers.
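As an illustration of the def-use notion used here, the following Python sketch extracts cross-thread def-use pairs from a recorded trace. The `Event` record and the function name `cross_thread_def_use_pairs` are assumptions made for this example and do not describe the data structures of the described system.

```python
from typing import NamedTuple


class Event(NamedTuple):
    thread: str   # thread identifier, e.g. "T1"
    kind: str     # "write" (def) or "read" (use)
    var: str      # name of the shared variable


def cross_thread_def_use_pairs(trace):
    """Return (def_index, use_index) pairs in which a read observes the most
    recent write to the same variable and that write came from another thread."""
    last_write = {}   # variable -> index of its latest write so far
    pairs = []
    for index, event in enumerate(trace):
        if event.kind == "write":
            last_write[event.var] = index
        elif event.kind == "read":
            writer = last_write.get(event.var)
            if writer is not None and trace[writer].thread != event.thread:
                pairs.append((writer, index))
    return pairs


trace = [
    Event("T1", "write", "x"),
    Event("T2", "read", "x"),    # use of the def by T1
    Event("T2", "write", "y"),
    Event("T2", "read", "y"),    # same-thread pair, not reported
    Event("T1", "read", "y"),    # use of the def by T2
]
pairs = cross_thread_def_use_pairs(trace)
```

Pairs that have not yet been observed in any run are the ones an interloper insertion would try to create.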
The system generates an interesting set of test inputs and thread schedules to start with, if none is provided; and efficiently searches for feasible interlopers that may result in new relevant def-use pairs. Branch coverage is provided. The system determines def-use pairs that lead to previously uncovered branches. Sequential test generation methods are used to generate inputs preventing context switching.
While sequential test generation methods are able to quickly cover large parts of the program sequentially even for concurrent programs, the system of
The system uses interloper executions to generate appropriate SMT queries. The system reasons and analyzes multiple test runs (each test run contains many threads), and merges the test runs so that a previously uncovered code portion becomes coverable. This is better than current predictive analysis techniques as implemented in CHESS, INSPECT, FUSION, and even TICK, which only reason about a single test run (over many threads).
To perform CMTA, the system requires each test run to record additional information to guide, for example, the selection of potential target branches of interest and the selection of test runs of interest (including an interloper test run). To do so, the databases 60-64 record information about which code structures are covered by the various test runs, which shared variables are written at various test runs, among others.
The system takes advantage of state-of-the-art sequential test generation methods that are generally able to quickly cover a large part of the program in terms of branches even for concurrent programs. Indeed, the branches of the concurrent program that are not covered using sequential methods alone are often due to interesting synchronizations between threads of the concurrent program that are worth exploring deeper. By focusing on such branches after all sequentially coverable branches are reached, the system is able to explore synchronization-related branches. Like predictive analysis, the system looks for alternate interleavings of observed events, but in multiple traces, not a single trace. Furthermore, the concolic approach generates alternate test inputs that can cover a branch or a target path. Thus, we try to cover branches or paths by generating appropriate SMT queries where the solver tries to find both a particular thread schedule and a required test input. Finally, note that in an active testing framework, many runtime bugs can be encoded as branches that are covered as part of structural code coverage.
The system uses sequential test generation methods as long as they are able to increase coverage on individual threads. In one embodiment, the concolic execution tool Crest is used as the test input generator. Upon coverage saturation, the system uses CMTA to generate new test inputs and thread schedules to cover previously uncovered branches in one of the threads. After generating new test inputs and thread schedules using CMTA, the system extends these new tests using sequential test generation again. This means, the system follows the generated test in terms of inputs and schedule up to the previously uncovered branch in some thread Ti. Then, given that a new branch in thread Ti was covered, the system tries to further explore potentially other previously uncovered parts of the program by exploring continuations of the test only along the thread Ti, i.e. without allowing additional context switches.
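The alternation between the two phases can be sketched as a small driver loop. This is only an outline under stated assumptions: `sequential_gen` and `cmta_gen` are hypothetical callables standing in for a sequential test generator and the CMTA step, and each simply returns the set of branches its next test would cover.

```python
def run_campaign(all_branches, sequential_gen, cmta_gen):
    """Alternate sequential generation and CMTA until coverage stops growing."""
    covered = set()
    log = []
    while covered != all_branches:
        phase = "sequential"
        newly = sequential_gen(covered) - covered
        if not newly:                      # coverage saturation reached
            phase = "cmta"
            newly = cmta_gen(covered) - covered
            if not newly:                  # neither phase makes progress
                break
        covered |= newly
        log.append((phase, sorted(newly)))
    return covered, log


# Stub generators for an assumed program with four branches: b3 needs a
# specific interleaving, and b4 becomes sequentially reachable once b3 is hit.
def sequential_gen(covered):
    reachable = {"b1", "b2"}
    if "b3" in covered:
        reachable |= {"b4"}
    return reachable


def cmta_gen(covered):
    return {"b3"}


covered, log = run_campaign({"b1", "b2", "b3", "b4"}, sequential_gen, cmta_gen)
```

The third log entry corresponds to extending the CMTA-generated test by sequential exploration along the thread that reached the new branch.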
CMTA is used to find a new thread schedule and new test inputs to cover a previously unreached branch. First, the system selects a target branch of interest, and corresponding traces that have been previously observed to come close to the target branch. Each previously observed trace has a given test input and thread schedule that it followed. The system also remembers for each test run which statements and branches in each thread are traversed, as well as the shared variables that are written to during the test. Assume that the uncovered branch depends on some set of shared variables S. Generally, the test condition may not be in terms of shared variables, but by intra-thread value tracing, the system can obtain S and then choose candidate interloper trace segments from the set of so-far obtained traces, such that the interloper trace segments result in a shared variable state over S that satisfies the condition on the target branch. These interloper segments may contain executions of multiple threads. The system can apply filtering heuristics to choose an appropriate interloper segment to insert into one of the runs that came close to the target branch. Then, the system formulates an SMT problem that tries to find a viable test input and thread schedule of the modified original run, which also contains the chosen interloper segment.
The goal is to achieve high branch coverage on each thread. For that, the system first considers each thread separately and covers as many branches as possible using traditional concolic testing. Then, the system tries to cover the uncovered branches by the concurrent test generation technique. The intuition behind this approach is that many bugs in concurrent programs are sequential bugs that do not relate to any specific interleavings of concurrent execution of the programs. The idea is to catch those bugs by sequential testing, which is cheaper than concurrent testing, without requiring to consider the interleaving space. Then, concurrent test generation aims to cover the remaining uncovered branches by exploring the input space and the interleaving space simultaneously to find a combination that would cause the branch to be taken.
In order to perform sequential testing of a concurrent program, the system first executes the program with a set of random inputs, I, to obtain a concurrent trace of the program (represented by ρ). Then, the system focuses on sequential testing of each thread Ti at a time. Based on the observed trace, the system generates a trace ρ′, which represents a sequential execution of Ti, by enforcing a set of ordering constraints between the events of different threads in ρ. These constraints ensure that in ρ′: (1) thread Ti is created, and (2) thread Ti is executed sequentially and without any interference from other threads (if possible) after it is created until it is completed. To do so, the system generates happens-before relations on the events of ρ to enforce the schedule to be the same as ρ until thread Ti is created, and then to enforce all of the events of other threads (after Ti is created) to happen after the last event of Ti. In cases where the complete sequential execution of Ti is not possible due to some synchronization, the system uses corresponding orderings between the events of different threads in ρ to let Ti complete.
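A minimal Python sketch of the reordering from ρ to ρ′ is given below. It assumes that no synchronization prevents Ti from running to completion, and it represents the happens-before constraints implicitly by producing the reordered event list; the function name `sequentialize` and the event encoding are hypothetical.

```python
def sequentialize(trace, target, create_index):
    """Reorder a trace so that `target` runs without interference once created.

    trace:        list of (thread, label) events in observed order
    target:       identifier of the thread Ti to be tested sequentially
    create_index: index of the event that creates the target thread
    """
    prefix = trace[:create_index + 1]          # same schedule until Ti exists
    rest = trace[create_index + 1:]
    target_events = [e for e in rest if e[0] == target]
    other_events = [e for e in rest if e[0] != target]
    # All other-thread events after the creation happen after Ti's last event;
    # the relative order inside each thread is preserved.
    return prefix + target_events + other_events


rho = [
    ("T0", "create T1"),
    ("T0", "a"),
    ("T1", "b"),
    ("T0", "c"),
    ("T1", "d"),
]
rho_prime = sequentialize(rho, "T1", 0)
```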
For sequential testing of thread Ti, the system applies traditional concolic testing starting with input set, I, and following the schedule implied by ρ′ until Ti is completed. Traditional concolic testing then performs a DFS and collects a set of path constraints corresponding to the inner-most uncovered branch in Ti. A satisfiable solution for these constraints provides a set of inputs for the next round in concolic testing.
Assume that there is a branch in thread Ti which cannot be covered by sequential testing. Suppose that there is a run, represented by run, that hits the conditional statement corresponding to the uncovered branch. Also, suppose that x is a shared variable whose value affects the condition. The main idea of our concurrent test generation is to generate schedule/inputs in which the last write to x before the branch in run is overwritten by another write to x which will cause the branch to be taken. To that end, we find an interloper segment from a run (could be different from run), with a write to x, that could be “soundly” inserted after the last write to x in run and search for possible inputs that will cause the branch to be taken after the segment is inserted in run.
Next an exemplary process for concurrent test generation is discussed. The process gets as input a set of successful runs of the program, runSet, and a set branchSet of branches that are left uncovered during sequential testing. Initially, runSet mostly contains sequential runs, but over time it accumulates multi-threaded executions as well.
First, an uncovered branch is selected by selectBranch as the target to be covered. selectBranch uses heuristics to select a branch, e.g. the depth of the branch in the CFG, number of failures in targeting to cover the branch, etc. Then, for the selected branch, the process picks a set of runs (runChoices) from runSet that hit the branch condition. Obviously, the branch condition is false in all of these runs. In lines 3-17, the process iterates over the runs in runChoices until the process finds an appropriate segment and corresponding inputs that would likely cause the branch to be taken after the segment is inserted in the run. At line 4, the process picks a run run from runChoices and then finds the set aVarSet of shared variables whose values affect the branch condition by performing a traditional def-use analysis on run. In lines 6-17, the system analyzes these variables to find a segment containing a write to the selected variable that can be inserted after the last write to the variable in run. For an affecting variable aVar, let <w,r> be a pair of write/read events where w represents the last write event to aVar before the branch and r represents the read event reading the value of aVar just before the branch in run. In fact, the write to aVar in the segment can be inserted anywhere between w and r in run. The pseudo-code is as follows:
The interloper segments should be selected in such a way that they could be inserted soundly in run. At a minimum, threads executing in the segment should be at the same locations as they are at the insertion point in run. The system defines a global location as a tuple <loc1, loc2, . . . > where loci is the location of thread Ti. Recall that a location contains both the statement identifier as well as an instance identifier. Given a run, the global location can be computed at each point by looking at the last event of each thread in the run before that point. In lines 9-17, the process goes over the global locations at an event e, such that w<e<r, where < represents the order of the events in run, and tries to find an appropriate set of interloper candidates. Given a global location gLoc, an affecting variable aVar, and a set of runs runSet, algorithm findInterloperSegments returns a set of segments from runSet that can be inserted soundly (and not necessarily atomically) at any point with global location gLoc. All segments end with a write to the shared variable aVar. In lines 12-17, the process goes over the interloper segments and calls the multi-trace predictive analysis engine which encodes the set of all feasible runs of the program that result from inserting a specific segment at gLoc in run as a set of constraints. Then, an SMT solver is used over the concolic execution to search for inputs and a schedule that would cause the branch to be taken. If such inputs/schedule exist, then the process stops the search and executes the program with the found inputs according to the corresponding schedule which guarantees the branch to be taken.
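The computation of a global location from a run can be sketched in a few lines of Python. The event encoding is an assumption for this example: each event is a pair of a thread identifier and a location, and a location is itself a pair of statement identifier and instance identifier. The function name `global_location` is hypothetical.

```python
def global_location(run, point, threads):
    """Return the tuple <loc1, loc2, ...> holding at index `point` of `run`.

    The location of each thread is taken from its last event before `point`;
    a thread with no earlier event is reported as None.
    """
    last = {thread: None for thread in threads}
    for thread, location in run[:point]:
        last[thread] = location
    return tuple(last[thread] for thread in threads)


run = [
    ("T1", ("s1", 0)),
    ("T2", ("s7", 0)),
    ("T1", ("s2", 0)),
]
before_third = global_location(run, 2, ["T1", "T2"])
at_end = global_location(run, 3, ["T1", "T2"])
```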
The process for finding the interloper is discussed next. The process for finding interlopers from a set of given runs runSet is based on a global location gLoc and an affecting variable aVar. The set of all runs is analyzed in which there is at least one write to aVar (line 2). For each run, the process iterates over the set of writes to aVar and finds candidate segments containing a write as their last event while starting at a global location consistent with gLoc. To that end, for a selected write to aVar, the process performs a static backward analysis in the corresponding run, until we reach a gLoc-consistent location. Some threads may be active in the run without causally affecting the write to aVar. Requiring the location of such threads to match with gLoc is too restrictive and could miss useful segments. Therefore, as the process goes backward in the run, the process adds events to the segment only if the selected write is causally dependent on the event. The process keeps track of the threads corresponding to such events, represented by threadSet. At each location loc in the run, the process checks whether the projection of global location loc to the threads in threadSet is equal to the projection of global location gLoc to this set of threads; i.e. gLoc|threadSet=loc|threadSet. If this check passes, the segment is added to the candidate segment set. The pseudo-code is as follows:
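The following Python sketch illustrates the backward search described in this paragraph; it is an illustrative reconstruction under stated assumptions and not the original listing. Causal dependence is approximated: an event is treated as relevant if it belongs to a thread already in threadSet, or if it writes a variable that a later segment event reads. The `Event` record and helper `location_before` are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    thread: str
    loc: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def location_before(run, point):
    """Map each thread to the location of its last event before `point`."""
    last = {}
    for event in run[:point]:
        last[event.thread] = event.loc
    return last


def find_interloper_segments(g_loc, a_var, run_set):
    segments = []
    for run in run_set:
        for wi, write in enumerate(run):
            if a_var not in write.writes:
                continue
            segment, thread_set = [write], {write.thread}
            needed = set(write.reads)      # reads still lacking a def in the segment
            p = wi
            while True:
                here = location_before(run, p)
                # Compare gLoc and the current location projected to threadSet.
                if all(here.get(t) == g_loc.get(t) for t in thread_set):
                    segments.append(list(segment))
                    break
                if p == 0:
                    break                  # no gLoc-consistent start in this run
                p -= 1
                event = run[p]
                if event.thread in thread_set or event.writes & needed:
                    segment.insert(0, event)
                    thread_set.add(event.thread)
                    needed -= event.writes
                    needed |= event.reads
    return segments


run = [
    Event("T1", "a0"),
    Event("T1", "a1", writes=frozenset({"y"})),
    Event("T2", "b1", writes=frozenset({"z"})),          # causally irrelevant
    Event("T1", "a2", reads=frozenset({"y"}), writes=frozenset({"x"})),
]
near = find_interloper_segments({"T1": "a1", "T2": "b9"}, "x", [run])
far = find_interloper_segments({"T1": "a0", "T2": "b0"}, "x", [run])
near_locs = [[e.loc for e in seg] for seg in near]
far_locs = [[e.loc for e in seg] for seg in far]
```

In both calls the location of T2 differs from gLoc, yet a segment is still found because T2 does not causally affect the write to x and is therefore excluded from the projection.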
Next, a Multi-Trace Predictive Analysis with Input Generation process is discussed. Given a run run, containing an uncovered branch br with the affecting read r, an interloper segment seg with a candidate write w, and a global location gLoc in run representing the insertion point, the process symbolically encodes a set of feasible runs, in which the schedule is the same as in run until reaching gLoc and then the interloper segment is inserted (not necessarily atomically) at gLoc in a way that r is guaranteed to read the value written by w. The inputs of the program are treated symbolically, allowing SMT solvers to simultaneously search for inputs and a schedule that would cause br to be taken. The event sequence in run before the insertion point is called the prefix segment, and the events after the insertion point and before br are called the main segment.
CMTA is based on the CTPs of the main and the interloper segments, which already represent program inputs symbolically. Let CTPmain and CTPint denote the CTPs of the main and interloper segments, respectively. The process ensures that the location of each thread in the interloper segment is the same at the beginning of both segments. Therefore, threads in the interloper segment should have a maximum common prefix of locations in both CTPmain and CTPint. The threads may then diverge after this prefix in the segments. To avoid duplication when inserting the interloper segment in the main segment, it should be ensured that each thread is at each location at most once in the predicted run. Let deTimain and deTiint represent the first events of thread Ti in CTPmain and CTPint, respectively, at which the two segments diverge. Since the segments diverge after deTimain and deTiint, this means that for each thread after this point we should consider events either from the main segment or from the interloper segment. This will be enforced using indicator bits (see below, item 6).
Suppose that Emain and Eint represent the set of events in the main and interloper segments, respectively. Note that not all of these events may be required for prediction. Indeed, certain events may be inconsistent with each other, if they originated from diverging runs. Therefore, for each event $e_i \in E_{main} \cup E_{int}$ the process considers an indicator bit $b_{e_i}$.
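The role of the indicator bits in keeping the selection consistent can be illustrated with a small check, written here as ordinary Python rather than as solver constraints. The sketch covers only the exclusive-choice requirement stated above, namely that after the divergence point each thread contributes events from the main segment or from the interloper segment but not both. The function name and the dictionaries are hypothetical.

```python
def exclusive_choice_ok(bits, main_after, int_after):
    """Check an assignment of indicator bits against the exclusive-choice rule.

    bits:       event -> bool, True if the event is part of the predicted run
    main_after: thread -> main-segment events after the divergence point
    int_after:  thread -> interloper-segment events after the divergence point
    """
    for thread in set(main_after) | set(int_after):
        uses_main = any(bits.get(e, False) for e in main_after.get(thread, []))
        uses_int = any(bits.get(e, False) for e in int_after.get(thread, []))
        if uses_main and uses_int:
            return False                  # thread would follow both segments
    return True


main_after = {"T1": ["m1", "m2"]}
int_after = {"T1": ["i1"], "T2": ["i2"]}

consistent = exclusive_choice_ok({"m1": True, "i2": True}, main_after, int_after)
inconsistent = exclusive_choice_ok({"m1": True, "i1": True}, main_after, int_after)
```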
A procedure can insert an interloper segment with the goal of forcing the execution of a target branch br. Towards that end, the process includes identifying a tuple of the form <w,r>, where w and r are the last write and read events, respectively, for a variable affecting the valuation of br. In general, however, to cover all partial orders induced by shared variable accesses in different threads, the process can explore a potential insertion of interlopers between each tuple <w′,r′>, where w′ and r′ are the definition and use, respectively, of a shared variable sh, say, occurring along a def-use chain leading to a variable impacting the valuation of br. This is because any change to the value of sh between events w′ and r′ propagates to br, potentially affecting it.
Motivated by the above discussion, let Tup be the set of all tuples of the form <w′,r′>, where w′ and r′ are the definition and use, respectively, of a shared variable occurring along a def-use chain leading to a variable impacting the valuation of br. The test generation algorithm can be updated as follows. The process can add an outer loop that enumerates each subset Tup′ of Tup. Then, as discussed above, each def-use tuple in this subset is a candidate for interloper insertion. This is accomplished by identifying an event etup for each tup ∈ Tup′ where an interloper can be inserted. As before, the interloper can be identified via a call to findInterloperSegments. The constraints for the SMT solver need to be modified to ensure consistency for the simultaneous insertion of all |Tup′| interlopers. This modification will explore only those partial orders that are generated by shared variable accesses occurring in the set of runs runSet. This is why this procedure only guarantees relative completeness, i.e., with respect to the set runSet. This is similar to other predictive and concolic techniques that are biased towards observed test runs. In general, the dynamic tests can be supplemented by static analysis. However, in the limit, this procedure may not scale due to an explosion in the number of runs that may be generated. It is contemplated that prioritization schemes over the set of interleavings can be used in order to excite a given branch.
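The outer loop over subsets of Tup can be sketched with the Python standard library. Enumerating smaller subsets first is an assumption of this example, chosen as one simple prioritization; the empty subset is skipped because it inserts no interloper. The function name `tuple_subsets` is hypothetical.

```python
from itertools import combinations


def tuple_subsets(tup):
    """Yield every non-empty subset Tup' of Tup, smallest subsets first."""
    items = list(tup)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield subset


# Two def-use tuples <w', r'> along the chain leading to the branch condition.
tup = [("w1", "r1"), ("w2", "r2")]
subsets = list(tuple_subsets(tup))
```

The number of subsets grows as 2^|Tup| − 1, which is the explosion that the prioritization schemes mentioned above are meant to contain.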
Based on the given run (including the prefix and main segments) and an interloper segment, a formula ΦMCTP is built such that ΦMCTP is satisfiable if there exist inputs and a schedule that would cause br to be taken, where the schedule follows the prefix segment and then interleaves the execution of threads in the main and interloper segments. The formula
$$\Phi_{MCTP} = \Phi_{MCTP}^{FP} \wedge \Phi_{MCTP}^{PO} \wedge \Phi_{MCTP}^{ST} \wedge \Phi_{MCTP}^{\pi} \wedge \Phi_{MCTP}^{BR} \wedge \Phi_{MCTP}^{AWR} \wedge \Phi_{MCTP}^{ind}$$
is constructed as follows ($\Phi_{MCTP}^{FP} = \Phi_{MCTP}^{\pi} = \Phi_{MCTP}^{ind} = \mathit{true}$ initially).
The invention may be implemented in hardware, firmware or software, or a combination of the three. Preferably the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
By way of example, a block diagram of a computer to support the system is discussed next. The computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus. The computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM. I/O controller is coupled by means of an I/O bus to an I/O interface. I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link. Optionally, a display, a keyboard and a pointing device (mouse) may also be connected to I/O bus. Alternatively, separate connections (separate buses) may be used for I/O interface, display, keyboard and pointing device. Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
The invention has been described herein in considerable detail in order to comply with the patent Statutes and to provide those skilled in the art with the information needed to apply the novel principles and to construct and use such specialized components as are required. However, it is to be understood that the invention can be carried out by specifically different equipment and devices, and that various modifications, both as to the equipment details and operating procedures, can be accomplished without departing from the scope of the invention itself.
The present application is a non-provisional application of and claims priority to Provisional Application Ser. 61/657,107, filed Jun. 8, 2012, the content of which is incorporated by reference.
Number | Date | Country
---|---|---
61657107 | Jun 2012 | US