1. Technical Field
The present invention relates to software analysis and, more particularly, to verifying a safety reachability property in already-deployed systems.
2. Description of the Related Art
Analyzing the runtime behavior of large-scale computer systems is challenging due to the lack of accurate models describing the structure and behavior of such systems. Even when models exist, they usually pertain to an old configuration of the system and do not include all changes made after the system was deployed and tuned. Moreover, it is often hard to get access to the actual system to carry out such modeling/analysis. Typically, all one has access to is system logs and traces.
Hidden Markov models have been extracted from a given set of logs using a variety of extant techniques. However, existing techniques mainly infer a single Markov model which assigns concrete probabilities to the transitions of the model. In practice, it is very difficult to deduce a precise Markov model from only a finite set of system logs, as the deduced probabilities will carry uncertainties which existing techniques fail to take into account.
A method for model checking of deployed systems is shown that includes learning an interval discrete-time Markov chain (IDTMC) model of a deployed system from system logs; and checking the IDTMC model with a processor to determine a probability of violating one or more probabilistic safety properties. Checking the IDTMC model includes splitting the probability into a linear part and a non-linear part; calculating the linear part exactly using affine arithmetic; and over-approximating the non-linear part using interval arithmetic.
A system for model checking of deployed systems is shown that includes a model learning module configured to learn an interval discrete-time Markov chain (IDTMC) model of a deployed system from system logs; and a processor configured to check the IDTMC model to determine a probability of violating one or more probabilistic safety properties, wherein the probability is split into a linear part and a non-linear part. The processor includes an affine module configured to calculate the linear part exactly using affine arithmetic and an interval module configured to over-approximate the non-linear part using interval arithmetic.
These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
The present principles provide verification of probabilistic safety properties on already-deployed systems in two steps: learning an interval discrete-time Markov chain (IDTMC) from system logs, and analyzing the IDTMC model for quantitative and reliability properties.
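As an illustration of the first step only, the following minimal Python sketch estimates interval transition probabilities from logged state sequences. The function name learn_idtmc, the trace format (lists of state labels), and the normal-approximation confidence interval with parameter z are all assumptions of this sketch; the description does not fix a particular learning method.

```python
from collections import Counter, defaultdict
import math


def learn_idtmc(traces, z=1.96):
    """Estimate (low, center, high) transition probabilities from traces.

    Assumption of this sketch: each trace is a list of state labels, and
    the interval is a normal-approximation confidence interval around the
    observed transition frequency.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        # Count every observed transition src -> dst.
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1

    model = {}
    for src, row in counts.items():
        n = sum(row.values())
        model[src] = {}
        for dst, c in row.items():
            p = c / n  # observed frequency, used as the interval center
            half = z * math.sqrt(p * (1.0 - p) / n)
            model[src][dst] = (max(0.0, p - half), p, min(1.0, p + half))
    return model


traces = [["a", "b", "a", "b"], ["a", "a", "b"]]
model = learn_idtmc(traces)
```

With few observations the intervals are wide, which is the uncertainty that a single concrete Markov model would hide.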
Referring now to
The model is then checked in block 104 to establish whether it meets safety properties by computing a sound over-approximation of the probability of reaching a desired state. To accomplish this, block 104 splits an IDTMC transition interval probability matrix P into a central transition probability matrix Pc and an interval matrix E over affine arithmetic error terms that encode the uncertainty of the original transition interval probability matrix P. The central matrix Pc is built using the centers of the original intervals of P, determined as the means given by the underlying learning method used. Computation may then be split into a constant value c, a linear part over affine error terms l(e), and a remaining non-linear part. Block 106 computes the linear part using affine arithmetic, producing error bounds for computing the probability of satisfying a given property. Affine arithmetic is used in block 106 because, unlike interval arithmetic, it preserves the relations between quantities that share error terms; employing interval arithmetic across the board would lose those relations and, with them, precision. This allows precise computation of first-order terms.
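The difference between the two arithmetics can be seen on a single row of a two-state chain. The sketch below is illustrative only: the class name Affine, the helper interval_add, and the error-term label "e1" are hypothetical names. It represents a probability p in [0.25, 0.75] and its complement 1−p, then adds them in both arithmetics.

```python
class Affine:
    """Affine form: center + sum(coeff * eps), with each eps in [-1, 1]."""

    def __init__(self, center, terms=None):
        self.center = center
        self.terms = dict(terms or {})  # error-term label -> coefficient

    def __add__(self, other):
        terms = dict(self.terms)
        for label, coeff in other.terms.items():
            # Shared error terms combine, so correlations are kept.
            terms[label] = terms.get(label, 0.0) + coeff
        return Affine(self.center + other.center, terms)

    def bounds(self):
        radius = sum(abs(c) for c in self.terms.values())
        return (self.center - radius, self.center + radius)


def interval_add(a, b):
    """Plain interval addition; operands are treated as independent."""
    return (a[0] + b[0], a[1] + b[1])


p = Affine(0.5, {"e1": 0.25})    # p in [0.25, 0.75]
q = Affine(0.5, {"e1": -0.25})   # q = 1 - p, same error term, opposite sign

affine_sum = (p + q).bounds()                          # (1.0, 1.0)
interval_sum = interval_add(p.bounds(), q.bounds())    # (0.5, 1.5)
```

The affine result keeps the fact that the row sums to one, whereas the interval result admits sums between 0.5 and 1.5 because the dependence between p and 1−p has been lost.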
Block 108 exploits the error bounds using interval arithmetic to compute the non-linear part. Block 110 then tests the deployed system using the formal analysis, where the above-described over-approximation may be computed in polynomial time and provides confidence metrics in the results. The model checking of block 104 finds probabilities of failure in the system. If a probability of a problem is high enough to cause concern, block 110 implements a set of conditions to replicate the circumstances predicted by the model to cause the problem.
Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. The medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) may be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.
Embodiments described herein may be entirely hardware, entirely software or may include both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
Referring now to
A DTMC may be defined as a 4-tuple M≡(S,s0,P,l), where S is a finite set of states, s0∈S is the initial state, P is a stochastic matrix, and l:S→2^AP is a labeling function which assigns to each state s∈S the set of atomic propositions α∈AP that are valid in s, where AP denotes a finite set of atomic propositions. The element pij of the square matrix P denotes the transition probability from state si to state sj. Therefore, pij∈[0,1] and, for all i, Σj pij=1. An IDTMC may then be defined as a 4-tuple Ml≡(S,s0,Pl,l), where Pl is an interval-valued matrix. The IDTMC may equivalently be viewed as a set of DTMCs, each of whose stochastic matrices P lies within the interval matrix Pl. Model checking a stochastic property then means computing the set:
{P|P=ProbM(s,Ψ),∀M∈Ml}.
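The view of an IDTMC as a set of DTMCs can be made concrete with a small membership check. The following Python sketch is a hedged illustration: the nested-list representation, the function names is_stochastic and is_member, and the tolerance tol are assumptions, not part of the definitions above.

```python
def is_stochastic(P, tol=1e-9):
    """True if every entry is in [0, 1] and every row sums to 1."""
    return all(
        all(0.0 <= p <= 1.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in P
    )


def is_member(P, P_int, tol=1e-9):
    """True if the DTMC matrix P lies inside the interval matrix P_int.

    P_int[i][j] is a (low, high) pair bounding the entry P[i][j].
    """
    return is_stochastic(P, tol) and all(
        lo - tol <= p <= hi + tol
        for row, int_row in zip(P, P_int)
        for p, (lo, hi) in zip(row, int_row)
    )


# Two-state interval matrix: row 0 is uncertain, row 1 is absorbing.
P_int = [
    [(0.25, 0.75), (0.25, 0.75)],
    [(0.0, 0.0), (1.0, 1.0)],
]

inside = is_member([[0.5, 0.5], [0.0, 1.0]], P_int)    # within all intervals
outside = is_member([[0.9, 0.1], [0.0, 1.0]], P_int)   # 0.9 exceeds 0.75
```

Note that the row-sum condition excludes some combinations that fit the intervals entry by entry, such as 0.75 and 0.75 in row 0.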
The set of states may be split into those having definite probabilities, either of 1 or of 0, and those which are uncertain. The probabilities may be expressed as a vector νk, where the elements of the vector are the probabilities ProbM(si,Ψ,k) that a path of length k, starting from a state si, satisfies the property Ψ. This vector may be defined recursively as νk=P′ν(k−1)+b, where P′ is a square matrix extracted from the transition probability matrix P by removing all the rows i such that si has a definite probability of 1 or 0. The components of the vector b may be defined as
b[i] = Σ_{j∉Imaybe} pij·νk[j], for all i∈Imaybe,
where Imaybe is the set of indices corresponding to states having uncertain probabilities and νk[j] is the jth element of the vector νk.
In the bounded case, where k<+∞, the recursion may be unrolled completely, starting with ν0=0. The probability that a path of length zero satisfies the property Ψ is defined to be zero for all states of uncertain probability. To solve the unbounded case, the system of linear equations ν=P′ν+b is solved.
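For a single concrete DTMC (fixed point probabilities rather than intervals), the bounded and unbounded computations can be sketched as follows. The function names, the example matrix P_prime, and the vector b are hypothetical; the unbounded solver is a plain Gaussian elimination that assumes the matrix I−P′ is nonsingular.

```python
def bounded_reach(P_prime, b, k):
    """Unroll v_k = P' v_(k-1) + b for k steps, starting from v_0 = 0."""
    v = [0.0] * len(b)
    for _ in range(k):
        v = [sum(p * x for p, x in zip(row, v)) + bi
             for row, bi in zip(P_prime, b)]
    return v


def unbounded_reach(P_prime, b):
    """Solve v = P' v + b, i.e. (I - P') v = b, by Gauss-Jordan elimination.

    Assumes I - P' is nonsingular.
    """
    n = len(b)
    # Augmented matrix [I - P' | b].
    A = [[(1.0 if i == j else 0.0) - P_prime[i][j] for j in range(n)] + [b[i]]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Two uncertain states; state 0 reaches the target directly with 0.25.
P_prime = [[0.0, 0.5],
           [0.5, 0.0]]
b = [0.25, 0.0]

v3 = bounded_reach(P_prime, b, 3)      # [0.3125, 0.125]
v_inf = unbounded_reach(P_prime, b)    # approximately [1/3, 1/6]
```

The bounded values increase with k toward the unbounded solution, as expected for reachability probabilities.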
Solving the bounded case involves computing the n objective values of the following linear programming problems:
where each εij is an interval component of an interval matrix E that represents the uncertainty of the model. The interval components εij are bounded on each side by an interval eij. The coefficients αij represent error weights of the symbolic error variables εij. Any appropriate linear programming solver can solve the above problems. However, a particularly efficient method for solving the present linear programming problem is shown hereinbelow.
The linear programming problem above can be decomposed into n smaller problems of the form:
with εi≡[−ei,ei]. For a feasible tuple (ε1, . . . , εn), εi is said to be positively or negatively saturated according as εi equals ei or −ei. Exploiting the fact that there always exists a maximizing feasible solution that saturates all but possibly one variable, the maximization problem reduces to determining which variables need to be saturated positively and which need to be saturated negatively; this, in turn, automatically determines the values assigned to all the variables εi. This reduces to an instance of the weighted median problem, solvable in linear time.
In the unbounded case, the solution can similarly be reduced to the solution of a system of interval linear equations. Solving such a system exactly is NP-hard, but numerical techniques may be used to approximate the set of solutions efficiently.
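The saturation argument can be illustrated with a short Python sketch. Because the exact problem statement is not reproduced above, the sketch assumes the per-row problem is: maximize Σ αi·εi subject to Σ εi = 0 and −ei ≤ εi ≤ ei (the zero-sum constraint keeps each row of the perturbed matrix summing to one). The function name maximize_row is hypothetical, and the sketch sorts the coefficients, which costs O(n log n) rather than the linear time attainable with a weighted median selection.

```python
def maximize_row(alpha, e):
    """Maximize sum(alpha[i] * eps[i]) under an assumed problem form:
    sum(eps) == 0 and -e[i] <= eps[i] <= e[i] for every i.

    Greedy strategy: start with every variable negatively saturated, then
    raise the variables with the largest weights until the sum is zero.
    """
    eps = [-r for r in e]   # all variables negatively saturated
    budget = sum(e)         # total amount to add so that sum(eps) == 0
    order = sorted(range(len(alpha)), key=lambda i: alpha[i], reverse=True)
    for i in order:
        step = min(2.0 * e[i], budget)  # raise eps[i] as far as allowed
        eps[i] += step
        budget -= step
        if budget <= 0.0:
            break
    value = sum(a * x for a, x in zip(alpha, eps))
    return eps, value


alpha = [3.0, 1.0, 2.0]
e = [0.125, 0.25, 0.125]
eps, value = maximize_row(alpha, e)   # eps == [0.125, -0.25, 0.125]
```

In the returned solution at most one variable lies strictly between its bounds, namely the one at which the budget runs out, in line with the property stated above.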
Referring now to
Having described preferred embodiments of a system and method for probabilistic model checking of systems with ranged probabilities (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.
This application claims priority to provisional application Ser. No. 61/543,839 filed on Oct. 6, 2011, incorporated herein by reference. This application also claims priority to provisional application Ser. No. 61/546,759 filed on Oct. 13, 2011, incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20130091080 A1 | Apr 2013 | US |
Number | Date | Country | |
---|---|---|---|
61543839 | Oct 2011 | US | |
61546759 | Oct 2011 | US |