This invention relates to the prediction and detection of conflicts and errors in a collaborative operational environment, and more particularly to improved conflict and error prediction and detection (CEPD) systems and methods employing prognostics and diagnostics.
Error detection is a fundamental challenge in a wide range of enterprises including systems employed in manufacturing, production, supply, and in service industries. It can be critical to the proper operation of distributed systems that require collaboration among multiple participants among whom errors are unavoidable. Extensive research has been devoted to the problem, and yet all known models, methodologies, algorithms, protocols and tools are lacking in one respect or another, and even basic concepts relating to the proper approach for different systems/networks are not well understood.
This patent specification is organized in chapters, two of which (Chapters 1 and 3) include further background information. While some related research and other work are discussed, the work discussed is not necessarily prior art, and the discussion itself does not constitute prior art and is not to be construed as an admission of prior art.
The present invention provides, as one aspect thereof, a method of preventing and detecting conflicts and errors through prognostics and diagnostics in a system having a plurality of cooperative units each configured to collaborate with other cooperative units. The method comprises modeling the system with a plurality of constraints that must be satisfied by cooperative units, wherein the constraints are indicative of potential conflicts and errors in the system and have relationships indicative of how conflicts and errors propagate between units, and applying conflict and error prevention and detection (CEPD) logic configured to detect conflicts and errors that have occurred, and to identify conflicts and errors before they occur, based on whether the constraints are satisfied or unsatisfied.
The objects and advantages of the present invention will be more apparent upon reading the following detailed description in conjunction with the accompanying drawings.
For the purpose of promoting an understanding of the principles of the invention, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated device and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
Conflict and error prognostics and diagnostics are critical in distributed and decentralized operations where e-Business and e-Commerce activities require collaboration among multiple participants among whom conflicts and errors (CEs) are unavoidable. Preventing, tracking, detecting, tracing, diagnosing, and resolving CEs are fundamental challenges in collaboration. Various models, methodologies, algorithms, protocols, and tools have been developed to manage CEs.
In previous research, a conflict has been defined as an inconsistency between the goals, dependent tasks, associated activities, or plans for sharing resources of cooperative units (Co-Us) (Yang, 2004). According to this definition, a conflict occurs whenever an inconsistency arises between two or more units in a system. An error has been defined as any input, output, or intermediate result that has occurred in a system and does not meet the system specification, expectation, or comparison objective (Klein, 1997). CE detection is the most thoroughly studied of these tasks because of its unique importance to collaboration. If CEs in a system are not detected, (1) CEs cannot be diagnosed and resolved; (2) more CEs may occur due to propagation; and (3) collaboration among units in the system is eventually damaged.
The prevalent method of detecting conflicts has been the use of layered constraints (Barber et al., 2001; Klein, 1992; Li et al., 2002; Shin et al., 2006). Each unit must satisfy a set of predefined constraints which can be categorized into different layers, e.g., goal layer, plan layer, belief layer, and task layer. Any violation of constraints that involves two or more units is a conflict. Conflict detection has been mostly studied in collaborative design (Ceroni and Velasquez, 2003; Jiang and Nevill Jr, 2002; Lara and Nof, 2003). The ability to detect conflicts in distributed design activities is vital to their success because multiple designers tend to pursue individual (local) goals prior to considering common (global) goals.
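The layered-constraint idea can be illustrated with a minimal Python sketch. The names Constraint, detect, STATE, and CONSTRAINTS, as well as the example units and numbers, are hypothetical and are not taken from the cited work. The sketch assumes that each constraint records its layer and the units it involves, and it reports a violated constraint as a conflict when two or more units are involved. Treating a single-unit violation as a local violation is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Constraint:
    name: str
    layer: str                 # e.g. "goal", "plan", "belief", "task"
    units: FrozenSet[str]      # units that the constraint involves
    is_satisfied: Callable[[Dict[str, dict]], bool]


def detect(constraints: List[Constraint],
           state: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Return (conflicts, local_violations) as lists of constraint names."""
    conflicts, local_violations = [], []
    for c in constraints:
        if c.is_satisfied(state):
            continue
        # A violation involving two or more units is a conflict.
        if len(c.units) >= 2:
            conflicts.append(c.name)
        else:
            local_violations.append(c.name)
    return conflicts, local_violations


# Hypothetical state of two units that share a machine with 8 hours of capacity.
STATE = {
    "U1": {"machine_hours": 6, "due": 10, "finish": 12},
    "U2": {"machine_hours": 5},
}

CONSTRAINTS = [
    Constraint("shared_capacity", "plan", frozenset({"U1", "U2"}),
               lambda s: s["U1"]["machine_hours"] + s["U2"]["machine_hours"] <= 8),
    Constraint("u1_due_date", "task", frozenset({"U1"}),
               lambda s: s["U1"]["finish"] <= s["U1"]["due"]),
    Constraint("u2_nonnegative_hours", "task", frozenset({"U2"}),
               lambda s: s["U2"]["machine_hours"] >= 0),
]

conflicts, local_violations = detect(CONSTRAINTS, STATE)
print(conflicts, local_violations)
```

In this example the capacity constraint is shared by both units, so its violation is reported as a conflict, whereas the missed due date involves only U1.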
Various domain-specific methods have been developed to detect errors. Despite the enormous amount of completed and ongoing research, three basic research questions in CE prognostics and diagnostics have not been well answered:
In this research, an agent-based modeling approach is explored to formally model a system for CEPD. The CEPD logic is developed based on previous work. Analytical methods and simulation experiments are used to validate the methodology. Four performance measures are defined to evaluate the CEPD logic and compare it with the Traditional CEPD Algorithm. Results show that (1) the agent-based modeling approach is suitable for CEPD in various systems/networks, and (2) the CEPD logic outperforms the Traditional CEPD Algorithm.
The purpose of this research is to develop and apply CEPD logic to prevent and detect CEs. The prevention logic is applied to analyze CE propagation to prevent CEs that may occur. The detection logic is applied to analyze CE propagation to identify causes of CEs. A CE can be prevented only if it is prognosed by the prevention logic and there is sufficient time to take prevention actions before it occurs. Three research problems are defined:
Research problem 1: Not all CEs that have occurred in a system are detected.
Research problem 2: Causes of certain CEs are unknown.
Research problem 3: CEs are not prevented through prognostics.
Five research questions are defined to address the three research problems:
Three basic research assumptions are as follows:
To prevent and detect CEs in a system can significantly
The overall goal of this research is to design and develop effective methods to prevent and detect CEs through prognostics and diagnostics. The specific objectives are to:
CHAPTER 3 reviews related work. CHAPTER 4 illustrates the developed methodology. CHAPTER 5 and CHAPTER 6 validate the methodology with analytical tools and simulation experiments, respectively. CHAPTER 7 concludes and summarizes the methodology.
As the first step to prevent errors, error detection has gained much attention, especially in assembly and inspection. For instance, researchers (Najjari and Steiner, 1997) have studied an integrated sensor-based control system for a flexible assembly cell which includes an error detection function. An error knowledge base has been developed to store information about previous errors that had occurred in assembly operations, and the corresponding recovery programs which had been used to correct them. The knowledge base provides support to both error detection and recovery. In addition, a similar machine learning approach to error detection and recovery in assembly has been discussed. To realize error recovery, failure diagnostics has been emphasized as a necessary step after detection and before recovery. Notably, error detection and recovery are often integrated in assembly.
Automatic inspection has been applied in various manufacturing processes to detect, identify, and isolate errors or defects with computer vision. It is mostly used to detect defects on printed circuit boards (Chang et al., 2005; Moganti and Ercal, 1995; Rau and Wu, 2005) and dirt in paper pulps (Calderon-Martinez and Campoy-Cervera, 2006; Duarte et al., 1999). The use of robots has enabled automatic inspection of hazardous materials (e.g., Wilson and Berardo, 1995) and inspection in environments that human operators cannot access, e.g., pipelines (Choi et al., 2006). Automatic inspection has also been adopted to detect errors in many other products such as fuel pellets (Finogenoy et al., 2007), print contents of soft drink cans (Ni, 2004), oranges (Cai et al., 2006), aircraft components (Erne et al., 1999), and micro drills (Huang et al., 2006). The key technologies involved in automatic inspection include but are not limited to computer or machine vision, feature extraction, and pattern recognition (Chen et al., 1999; Godoi et al., 2005; Khan et al., 2005).
Process monitoring, or fault detection and diagnostics in industrial systems, has become a new sub-discipline within the broad subject of control and signal processing (Chiang et al., 2001). Four procedures are associated with process monitoring (Raich and Cinar, 1996), although there appears to be no standard terminology for them: (1) Fault detection is a procedure to determine if a fault has occurred; (2) Fault identification is a procedure to identify the observation variables most relevant to diagnosing the fault; (3) Fault diagnostics is a procedure to determine which fault has occurred, or the cause of the observed out-of-control status; (4) Process recovery is a procedure to remove the effect of the fault.
Another term, fault isolation, is also widely used and defined as the procedure to determine the exact location of the fault or faulty component (Gertler, 1998). Fault isolation provides more information than a fault identification procedure in which only the observation variables associated with the fault are determined. Fault isolation does not provide as much information as a fault diagnostics procedure, however, in which the type, magnitude, and time of the fault are determined (Raich and Cinar, 1996). A commonly used term in the literature is FDI (fault detection and isolation), which includes both fault detection and isolation procedures.
Three approaches to manage faults for process monitoring are summarized in
The three approaches do not differentiate conflicts from errors. This is because (1) a fault that is detected, identified, or diagnosed may not be the root cause; (2) even if a fault is determined to be the root cause, systems are not modeled in a way that reveals which units caused the fault and how they caused it, i.e., whether the fault is a conflict or an error. Differentiating between conflicts and errors is important for error recovery, conflict resolution, and CE diagnostics.
The three fault management approaches discussed in 3.2 can be classified according to the way that a system is modeled. In the analytical approach, quantitative models are used which require the complete specification of system components, state variables, observed variables, and functional relationships among them for the purpose of fault management. The data-driven approach can be considered an effort to develop qualitative models in which previous and current data obtained from a system are used. Qualitative models usually require less information about a system than quantitative models. The knowledge-based approach uses qualitative models and other types of models. For instance, pattern recognition can use multivariate statistical techniques whereas the signed directed graph is a typical dependence model which represents the cause-effect relationships in the form of a directed graph (Deb et al., 1995).
Similar to algorithms used in quantitative and qualitative models for process monitoring, fault detection and diagnostics algorithms have been developed in other areas. Significant research has been conducted to find optimal and near-optimal test sequences to diagnose faults (Deb et al., 1995; Pattipati and Alexandridis, 1990; Pattipati and Dontamsetty, 1992; Raghavan et al., 1999a, b; Shakeri et al., 1995; Shakeri et al., 2000; Tu et al., 2002; Tu et al., 2003; Tu and Pattipati, 2003). The research started with fault diagnostics in electronic and electromechanical systems with a single fault (Pattipati and Alexandridis, 1990). It was assumed that there is at most one fault or faulty state in a system at any time. An X-windows based software tool, TEAMS (Testability Engineering And Maintenance System), was developed for testability analysis of large systems containing as many as 50,000 faults and 45,000 test points (Deb et al., 1995). TEAMS can be used to model individual systems and generate near-optimal diagnostic procedures. The research has been expanded to multiple fault diagnostics (Shakeri et al., 1995; Shakeri et al., 2000; Tu et al., 2002; Tu et al., 2003; Tu and Pattipati, 2003) in various real-world systems including the space shuttle main propulsion system. Test sequencing algorithms with unreliable tests (Raghavan et al., 1999b) and multivalued tests (Tu et al., 2003) have also been studied.
The goal of the test sequencing problem is to design a test algorithm that is able to unambiguously identify the occurrence of any system state (faulty or fault-free state) using the tests in the test set while minimizing the expected testing cost (Pattipati and Alexandridis, 1990). The test sequencing problem belongs to the general class of binary identification problems. The problem of diagnosing a single fault is a perfectly observed Markov decision problem (MDP). The solution to the MDP is a deterministic AND/OR binary decision tree with OR nodes labeled by the suspect set of system states and AND nodes denoting tests (decisions). It is well known that the construction of the optimal decision tree is an NP-complete problem (Pattipati and Alexandridis, 1990). The research conducted integrates concepts from information theory and heuristic search to subdue the computational explosion of the optimal test sequencing problem (Pattipati and Alexandridis, 1990).
To diagnose a single fault in a system, the relationship between the faulty states and tests can be modeled by a directed graph (DG; digraph model). Once a system is described in a digraph model, the full-order dependences among failure states and tests can be captured by a binary test matrix, also called a dependency matrix. The single-fault test strategy for a system can be expressed with an AND/OR decision tree.
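As a rough illustration of how a dependency matrix may be derived from a digraph model, the following Python sketch marks entry (i, j) as 1 when test j is reachable from failure state i along propagation edges. The function names, the example graph, and the matching rule in diagnose_single_fault are assumptions made for illustration. They presume perfectly reliable binary tests and at most one fault, and they are not the procedure of the cited work.

```python
from collections import deque


def dependency_matrix(edges, faults, tests):
    """Row i, column j is 1 if test j can observe failure state i."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    matrix = []
    for fault in faults:
        # Breadth-first search over fault-propagation edges.
        seen, queue = {fault}, deque([fault])
        while queue:
            node = queue.popleft()
            for nxt in adjacency.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        matrix.append([1 if t in seen else 0 for t in tests])
    return matrix


def diagnose_single_fault(matrix, faults, outcomes):
    """Return the faults whose row matches the observed test outcomes (1 = failed)."""
    return [f for f, row in zip(faults, matrix) if row == outcomes]


# Hypothetical digraph: f1 propagates to f2; tests t1..t3 observe the failure states.
FAULTS = ["f1", "f2", "f3"]
TESTS = ["t1", "t2", "t3"]
EDGES = [("f1", "f2"), ("f1", "t1"), ("f2", "t2"), ("f3", "t2"), ("f3", "t3")]

D = dependency_matrix(EDGES, FAULTS, TESTS)
print(D)
print(diagnose_single_fault(D, FAULTS, [0, 1, 0]))
```

Because every row of the example matrix is distinct and non-zero, each single fault produces a unique pattern of test outcomes, which is what allows a decision tree over the tests to isolate it.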
In complex systems with large numbers of components and/or subsystems, it is possible that multiple faults occur at the same time. It is therefore necessary to construct optimal and near-optimal test sequences for multiple fault diagnostics. The computational complexity of solving the optimal multiple-fault diagnostics problem is super exponential and is much more difficult than the single-fault diagnostics problem (Shakeri et al., 2000). With the assumption that faulty components are replaced by 100% functional components, an extended strategy to diagnose multiple faults can be developed.
Other strategies (Tu and Pattipati, 2003) and algorithms (Tu et al., 2002; Tu et al., 2003) have been developed to diagnose multiple faults using digraph models, DG=(V, E), which are different from those applied in other research. The nodes V represent the set of components or tests, and an edge (h_i, h_j)∈E represents the fault propagation from h_i to h_j (Tu et al., 2003). The dependency matrix is derived from this digraph model. Other researchers have used digraph models to diagnose faults in hypercube microprocessors (Feng et al., 1996). The directed graph is a powerful tool to describe dependences among system components. Several important issues have been brought up in light of the intensive research on the test sequencing problem:
Another interesting aspect of the research on fault diagnostics algorithms is the list of assumptions discussed in several articles:
There is a critical difference between assumptions 3 and 4. Assumption 3 is related to diagnostics ability. When an unambiguous test detects a fault, the conclusion is that the fault has occurred with 100% probability. Nevertheless, this conclusion could be wrong if false positive is not zero according to the test (diagnostics) reliability described in assumption 4. When an unambiguous test does not detect a fault, the conclusion is that the fault has not occurred with 100% probability. Similarly, this conclusion could be wrong if false negative is not zero. Unambiguous tests have better diagnostics ability than ambiguous tests. If a fault has occurred, ambiguous tests conclude that the fault has occurred with a probability less than one. Similarly, if a fault has not occurred, ambiguous tests conclude that the fault has not occurred with a probability less than one. In summary, if assumption 3 is true, a test gives only two results: a fault has occurred or has not occurred, always with 100% probability. If both assumptions 3 and 4 are true, (1) a fault must have occurred if the test concludes that the fault has occurred; (2) a fault must have not occurred if the test concludes that the fault has not occurred.
Both the test sequencing and CE detection problems deal with systems with large numbers of components where multiple faults or errors may occur at the same time. The major weakness of the fault diagnostics algorithms discussed here is their centralized approach to CE detection and diagnostics. When a system is large and complex (various dependences among nodes exist), developing the optimal test sequence becomes infeasible due to the time it requires. Developing a near-optimal test sequence requires less time, but the resulting test sequence increases detection cost. The characteristics of the centralized approach are (1) algorithms and necessary system information are controlled by a central control unit; (2) fault detection is performed sequentially. An alternative approach is to develop decentralized algorithms that can utilize distributed agents and their effective collaboration to detect and diagnose CEs. System resources are better utilized by allocating detection tasks to distributed agents that execute tasks simultaneously and communicate with each other following protocols.
There are at least two reasons why centralized CE detection and diagnostics are widely applied: (1) traditionally, CEs have been detected only in the output of a system. Decentralized detection is not necessary. Centralized detection is sufficient and subsequent diagnostics follows the same framework; (2) systems are often designed in a way that is difficult for decentralized CE detection and diagnostics. For instance, fault diagnostics algorithms discussed above were first developed to diagnose errors in electronic and electromechanical systems in which PDAs are difficult to apply.
Another drawback of the above fault diagnostics algorithms is the lack of system models. The relationship between nodes and the relationship between faults have not been considered until recently (Tu et al., 2003). Algorithms are expected to perform better if a system is appropriately modeled. The difference may be more apparent for large (many components) and complex (components are dependent on each other) systems than for small (a few components) and simple (components are independent of each other) systems.
A conflict has been defined in previous research (Yang, 2004) as an inconsistency between Co-Us' goals, dependent tasks, associated activities, or plans of sharing resources. A Co-U is an autonomous working unit in a system that performs tasks to achieve its local goals, and coordinates with other Co-Us to accomplish the common goals of a set of Co-Us in the system. An error is always restricted to a local scope (within a Co-U). A conflict and error detection model (CEDM) has been developed to detect CEs in a distributed collaborative environment supported by conflict and error detection protocol (CEDP). The CEDP is invoked when a CE is detected or system (collaboration) specifications are changed. A software agent, conflict and error detection agent (CEDA), is installed in each Co-U of a collaborative network to perform CE detection coordinated by CEDP. CEDAs work together to identify CEs by applying active middleware that supports the CEDP to evaluate the detection process and exchange detection information among Co-Us.
Active middleware was first developed to optimize the performance of e-Work interactions in heterogeneous, autonomous, and distributed environments (Anussornnitisarn, 2003; Nof, 2003). It has been slightly revised and applied to a distributed environment to detect CEs (Yang, 2004). Active middleware is an enabling layer of information and information flow, database, modeling tools, and decision support system, which has inputs from a distributed yet networked environment. A CEDM is the integration of active middleware, CEDP, and CEDAs, and its output is the CEs detected.
An important issue, CE propagation, has been addressed in previous research by applying active middleware and agent technologies (Chen and Nof, 2007; Yang and Nof, 2004a, b; Yang et al., 2005). An error inside a Co-U's boundary might cause a conflict because of task dependence, or common resources sharing between Co-Us. The task dependence can be treated as the interaction between Co-Us when they cooperate to achieve a common goal. If the task dependence becomes complicated, which is often the case in a complex system, CE propagation might occur because of the complex correlation between Co-Us. This kind of propagation is common in a supply network or a design process. If one participant cannot fulfill a task request (service agreement) because of an internal error, this error event will also affect other participants' plan and cause a widespread conflict. Consequently, CEs propagate in the collaboration network and affect more Co-Us if they are not resolved in time, or are incorrectly resolved by a Co-U's local-view solutions. Most recently, research at the PRISM (Production, Robotics, and Integration Software for Manufacturing and Management) Center has focused on fundamental principles to address CE propagation by developing algorithms and protocols (Chen, 2009; Chen and Nof, 2007, 2009, 2010, 2011) based on previous work related to error recovery and conflict resolution (Avila-Soria, 1998; Lara and Nof, 2003; Velasquez et al., 2008).
Two measures, conflict-severity and detect-ability have been defined to evaluate conflict situations and detection abilities respectively (Yang and Nof, 2004a, b). Conflict-severity is a measure of how serious a conflict is and is calculated by summing up all weights of unsatisfied constraints that are the benchmark of conflict detection. The detect-ability is a function of detection accuracy, cost, and time. Both measures provided necessary decision-making information for detection as well as resolution. Moreover, a cost-damage analysis has been applied to determine optimal detection policy. A viability measure has been developed to examine detection policy and evaluate detection performance.
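The conflict-severity calculation described above can be sketched in a few lines of Python. The function name and the example weights are hypothetical. The sketch only assumes that each constraint carries a weight and a satisfied/unsatisfied status, and it does not attempt to reproduce the detect-ability function, whose form is not given here.

```python
def conflict_severity(constraints):
    """Sum the weights of all unsatisfied constraints.

    `constraints` is an iterable of (weight, satisfied) pairs.
    """
    return sum(weight for weight, satisfied in constraints if not satisfied)


# Hypothetical example: three weighted constraints, two of them unsatisfied.
example = [(5, True), (3, False), (2, False)]
severity = conflict_severity(example)
print(severity)
```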
In summary, substantial research on CE detection has been conducted at the PRISM Center. Basic decentralized error prediction and detection algorithms (Chen and Nof, 2007, 2010, 2011) and conflict detection protocols (Yang and Nof, 2004a, b) have been developed. Different detection policies have also been studied. Based on the previous research, the current research aims at developing a general methodology that can be used to model systems and prevent and detect CEs with optimized logic. The new modeling method and CEPD logic are critical to the success of a wide range of enterprises where CEs are unavoidable and must be prevented and detected.
Recently, several researchers have applied Petri nets in fault detection and diagnostics (Chung et al., 2003; Georgilakis et al., 2006; Ushio et al., 1998), fault analysis (Rezai et al., 1995a; Rezai et al., 1995b; Rezai et al., 1997), and conflict detection and resolution (Shiau, 2002). Research conducted before 1990 on error detection and recovery using Petri nets has been summarized by two researchers (Zhou and DiCesare, 1993). Petri nets are a formal modeling and analysis tool for discrete event or asynchronous systems. For hybrid systems that have both event-driven and time-driven (synchronous) elements, Petri nets can be extended to global Petri nets to model both discrete time and event elements. Petri nets have also been used in conflict detection and resolution (Shiau, 2002) as the extension of goal structure tree (O'Hare and Jennings, 1996) and the E-PERT (Extended Project Estimation and Review Technique) diagram (Barber et al., 2001).
Research on fault detection and diagnostics with Petri nets was conducted in the context of fault diagnostics in discrete event systems (Sampath et al., 1995; Zad et al., 2003). Using a finite state machine (FSM), researchers defined the notion of diagnosability and provided a construction procedure for the diagnoser that can detect faults in diagnosable systems (Sampath et al., 1995). To detect and diagnose faults with Petri nets, some of the places in a Petri net are assumed observable and others are not. All transitions in the Petri net are also unobservable. Unobservable places, i.e., faults, indicate that the numbers of tokens in those places are not observable, whereas unobservable transitions indicate that their occurrences cannot be observed (Chung et al., 2003; Ushio et al., 1998). The objective of the detection and diagnostics is to identify the occurrence and type of a fault based on observable places within finite steps of observation after the occurrence of the fault. To detect and diagnose faults with Petri nets, system modeling is complex and time-consuming because faulty transitions and places must be included in a model. Research on this subject has mainly been an extension of previous work using FSM and has made limited progress.
Conflicts can be categorized into three classes (Barber et al., 2001): goal conflicts, plan conflicts, and belief conflicts. Goal conflicts are modeled with intended goal structure (Barber et al., 2001) which is extended from goal structure tree (O'Hare and Jennings, 1996). Plan conflicts are modeled with E-PERT diagram. Three classes of conflicts are modeled by Petri nets with the help of four basic modules (Zhou et al., 1992), i.e., sequence, parallel, decision, and decision-free, to detect conflicts in a multi-agent system. Each agent's goal and plan are modeled by separate Petri nets (Shiau, 2002), and many Petri nets are integrated using a bottom-up approach (Zhou and DiCesare, 1993; Zhou et al., 1992) with three types of operations (Shiau, 2002): AND, OR, and Precedence. The synthesized Petri net is analyzed to detect conflicts.
The Petri net based approach for conflict detection developed so far has been rather limited. The approach has placed more emphasis on the modeling of a system and its agents than on the analysis process through which conflicts are detected. The Petri net model that has been applied is in fact a goal-based or plan-based approach, not an agent-based approach. With the agent-based approach, an agent is modeled only once. In other approaches, including the Petri net based approach for conflict detection, an agent is modeled multiple times due to the multiple goals, plans, or tasks the agent has. Also, the Petri net based approach is a static approach rather than a dynamic one; in a dynamic approach, multiple resources of each type, different plans according to token attributes, and time for transitions must be considered and studied.
Research has also been conducted to diagnose faults in discrete event systems with a decentralized approach (Wenbin and Kumar, 2006). Distributed diagnostics can be performed by diagnosers communicating with each other either directly or through a coordinator. Alternatively, diagnostics decisions can be made completely locally without combining the information gathered (Wenbin and Kumar, 2006).
To summarize, Petri nets have been applied to fault and conflict detection in different ways. To detect and diagnose faults, both normal and faulty transitions and places are modeled with Petri nets and detection and diagnostics are executed by utilizing information from observable places. For conflict detection and resolution, only normal transitions and places are modeled. The prevention and prognostics of conflicts and errors have not been addressed with Petri nets and there has been no available approach that can detect both conflicts and errors. The analysis of Petri nets is difficult and has not been well studied, especially for multiple attributes in discrete event systems. The preliminary studies on distributed detection and diagnostics of faults and the use of agent technology for conflict detection are interesting and in line with current research.
Place/transition (P/T) nets, or Petri nets, have been intensively studied as one of the formal methods for verifying the correctness of systems since they were originally introduced by Dr. Petri in 1962. In formal methods, systems are described as mathematical objects, which makes it possible to handle characteristics such as non-determinism and concurrency in systems where CEs are difficult to prevent and detect. Eq. 3.1 defines a P/T net as a tuple:
$N = \langle P, T, F, W, M_0 \rangle$ Eq. 3.1
P is a finite set of places and T is a finite set of transitions. The places P and transitions T are disjoint (P∩T=∅). F⊂(P×T)∪(T×P) is the flow relation. W:((P×T)∪(T×P))→N is the arc weight mapping. W(f)=0 for all f∉F, and W(f)>0 for all f∈F. M0:P→N is the initial marking representing the initial distribution of tokens.
To define the firing condition and firing rule of the P/T net, four basic concepts are introduced: (1) If (p,tr)∈F for a transition tr and a place p, then p is an input place of tr; (2) If (tr,p)∈F for a transition tr and a place p, then p is an output place of tr; (3) Let a∈P∪T. The set •a={a′|(a′,a)∈F} is called the pre-set of a, and the set a•={a′|(a,a′)∈F} is the post-set of a; (4) M(p) denotes the number of tokens in place p in marking M. The firing condition is defined as:
Transition tr∈T is M-enabled (or enabled in M), written as M[tr⟩, if
M(p)≧W(p,tr), ∀p∈•tr Eq. 3.2
An M-enabled transition tr may fire, producing the successor marking M′, written as M[tr⟩M′.
Eq. 3.3 defines the firing rule:
M′(p)=M(p)−W(p,tr)+W(tr,p), ∀p∈P Eq. 3.3
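The firing condition (Eq. 3.2) and the firing rule (Eq. 3.3) can be sketched in Python as follows. The dictionary-based representation and the names is_enabled, fire, PRE, and POST are assumptions made for illustration: PRE[tr][p] stands for W(p,tr) over the pre-set of tr, and POST[tr][p] stands for W(tr,p) over its post-set. Arcs that are absent have weight zero and are simply omitted.

```python
def is_enabled(marking, pre, tr):
    """Eq. 3.2: tr is M-enabled if M(p) >= W(p, tr) for every input place p."""
    return all(marking.get(p, 0) >= w for p, w in pre[tr].items())


def fire(marking, pre, post, tr):
    """Eq. 3.3: M'(p) = M(p) - W(p, tr) + W(tr, p) for every place p."""
    if not is_enabled(marking, pre, tr):
        raise ValueError(f"transition {tr} is not enabled")
    successor = dict(marking)
    for p, w in pre[tr].items():
        successor[p] = successor.get(p, 0) - w
    for p, w in post[tr].items():
        successor[p] = successor.get(p, 0) + w
    return successor


# Hypothetical net: t1 consumes two tokens from p1 and produces one token in p2.
PRE = {"t1": {"p1": 2}}
POST = {"t1": {"p2": 1}}
m0 = {"p1": 3, "p2": 0}

m1 = fire(m0, PRE, POST, "t1")
print(m1, is_enabled(m1, PRE, "t1"))
```

After one firing only a single token remains in p1, so t1 is no longer enabled in the successor marking.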
Depending on whether CE states (states that represent conflicts and errors) are modeled with P/T nets, there are two different ways for CEPD. If CE states (places) are not modeled with P/T nets, the prevention and detection process needs to check if all normal states (states without CEs) in a P/T net can be reached. A CE is detected, diagnosed, or prognosed if a normal state cannot be reached. If CE states are modeled with P/T nets, the prevention and detection process is aimed at (1) detecting, diagnosing, and prognosing CE states that may be reached; (2) checking if all normal states can be reached. A CE is detected, diagnosed, or prognosed if a CE state is reached or a normal state cannot be reached. The basic task of CEPD process with P/T nets is to determine if certain states can be reached.
A P/T net that includes CE states is often used to detect CEs dynamically. CEs are detected on the fly while the system being monitored is executing various tasks. In most cases, the CE detection process only determines if any CE state is reached and does not consider normal states. The main disadvantage of this approach is the difficulty to include all possible CE states of a system in a P/T net. If a CE state is not modeled with the P/T net, this CE cannot be detected.
On the other hand, a P/T net that includes only normal states is often used to predict CEs statically. CEs are predicted by checking if all normal states can be reached before a system starts executing tasks. This approach can be challenging in terms of the time and cost required when a P/T net includes a large number of transitions and states. It is sometimes used to detect CEs dynamically. In that case a time stamp is associated with each state and CEs are detected when a state is not reached by the time specified by the time stamp. Both approaches are similar when they are used to detect CEs dynamically.
Several techniques are used to determine if a state can be reached in a P/T net, including the reachability graph, the coverability graph, and structural analysis. A reachability graph may include an infinite number of reachable states if the net is not k-safe. A k-safe net (k≧0 is an integer constant) is a net in which no reachable marking contains more than k tokens in any place. The CEPD process will not terminate if a P/T net is not k-safe. Even if a P/T net is k-safe, there can be as many as (k+1)^|P| reachable markings (|P| is the total number of places in the net). On the other hand, coverability graphs do not provide accurate information about reachable states. Structural analysis is therefore preferred because it can prove some properties without constructing the graph. Specifically, this motivates the use of place invariants for CEPD.
Let N=⟨P,T,F,W,M0⟩ be a P/T net. The corresponding incidence matrix C:P×T→Z is the matrix whose rows correspond to places and whose columns correspond to transitions. Column tr∈T denotes how the firing of tr affects the marking of the net: C(p,tr)=W(tr,p)−W(p,tr).
Markings of a P/T net can be written as column vectors, e.g., the initial marking of the P/T net in
For a P/T net N with incidence matrix C, a solution of the equation C^T x=0 such that x≠0 is called a place invariant, or P-invariant, of N. A P-invariant is a vector with one entry for each place. For instance, x1=(1 1 1 0 0 0 0)^T, x2=(0 0 1 1 0 0 1)^T, and x3=(0 0 0 1 1 1)^T are all P-invariants for the P/T net shown in
M^T x=(M0+Cu)^T x=M0^T x+(Cu)^T x=M0^T x+u^T C^T x=M0^T x Eq. 3.4
P-invariants therefore can be used to determine if certain states in a P/T net cannot be reached. For instance, P-invariant x2 indicates that all reachable markings M must satisfy Eq. 3.5:
M(p3)+M(p4)+M(p7)=M0(p3)+M0(p4)+M0(p7)=1 Eq. 3.5
P-invariants cannot be used to determine if certain states in a P/T net can be reached. That is, for a P-invariant x, if a marking M is identified such that M^T x=M0^T x, it cannot be concluded that marking M is reachable. Moreover, it can be difficult to determine the marking of all states when a P/T net includes many states and transitions.
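The use of P-invariants to rule out markings can be sketched in Python as follows. The seven-place mutual-exclusion net encoded in `C` is a hypothetical example constructed for this sketch, not the net of the figure. It is chosen so that the vector (0 0 1 1 0 0 1) is a P-invariant that yields a condition of the same form as Eq. 3.5. The function names are likewise illustrative.

```python
from typing import List

Vector = List[int]


def dot(a: Vector, b: Vector) -> int:
    return sum(x * y for x, y in zip(a, b))


def is_p_invariant(C: List[Vector], x: Vector) -> bool:
    """True when x != 0 and C^T x = 0 (C has one row per place, one column per transition)."""
    return any(x) and all(dot(list(col), x) == 0 for col in zip(*C))


def violates_invariant(m: Vector, m0: Vector, x: Vector) -> bool:
    """A marking with M^T x != M0^T x cannot be reached from M0."""
    return dot(m, x) != dot(m0, x)


# Hypothetical net: two three-step processes (p1-p3, p5-p7) sharing a lock place p4.
# Rows are p1..p7, columns are t1..t6.
C = [
    [-1, 0, 1, 0, 0, 0],   # p1
    [1, -1, 0, 0, 0, 0],   # p2
    [0, 1, -1, 0, 0, 0],   # p3 (critical section of process 1)
    [0, -1, 1, 0, -1, 1],  # p4 (lock)
    [0, 0, 0, -1, 0, 1],   # p5
    [0, 0, 0, 1, -1, 0],   # p6
    [0, 0, 0, 0, 1, -1],   # p7 (critical section of process 2)
]
m0 = [1, 0, 0, 1, 1, 0, 0]
x = [0, 0, 1, 1, 0, 0, 1]
both_critical = [0, 0, 1, 0, 0, 0, 1]  # both processes in their critical sections

unreachable = is_p_invariant(C, x) and violates_invariant(both_critical, m0, x)
```

A marking that satisfies the invariant is only a candidate: the check can exclude markings but cannot confirm that a marking is reachable.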
Table 3.1 summarizes seven CEPD methods. Three important findings are:
In summary, the analytical, data-driven, and knowledge-based CEPD methods are domain-specific methods. They can be applied to a limited number of systems. The other two centralized methods, diagnostics algorithms and Petri net, are generic methods but do not consider relationships between CEs and cannot prognose or diagnose CEs. The current research aims at developing both the Centralized CEPD Logic and Decentralized CEPD Logic with the consideration of relationships between system components and between CEs for prognostics and diagnostics.
The Centralized CEPD Logic is developed based on five centralized CEPD methods. The Decentralized CEPD Logic is developed based on CEDM and detection algorithms described in Table 3.1. The CEDM requires further development to model relationships between CEs and identify best CEPD logic. The detection algorithms are limited to sequential production/service lines. The review of related work also provides foundation for system modeling and visualization of the CEPD logic.
To clearly model a system, five basic concepts are defined:
Definition 1: Co-U. A cooperative unit, Co-U, is an autonomous working unit in a system that performs tasks to achieve its local goals and collaborates with other Co-Us to accomplish the common goals of a set of Co-Us in the system. A Co-U can be defined with Eq. 4.1:
u(i,t)={π(i,t),(i,t)} Eq. 4.1
u(i,t) is Co-U i in the system at time t. i is the index of Co-Us and is a nonnegative integer. The value of i is unique for each Co-U. π(i,t) is the set of constraints in the system at time t that need to be satisfied by Co-U i without collaborating with other Co-Us. Let con(r,t) denote constraint r in the system at time t. r is a nonnegative integer and the index of constraints. The value of r is unique for each constraint. π(i,t) is thus a set of constraints (con(r1,t), con(r2,t), . . . ) that need to be satisfied by u(i,t) without collaboration. (i,t) is the state of Co-U i at time t, which describes what has occurred with Co-U i by time t. It includes necessary and sufficient information to determine whether or not the constraints in π(i,t) are satisfied. Suppose π(i1,t) is the constraint set for u(i1,t) and π(i2,t) is the constraint set for u(i2,t) (i1≠i2). Then any two constraints con(r1,t)∈π(i1,t) and con(r2,t)∈π(i2,t) are different, i.e., con(r1,t)≠con(r2,t).
Definition 2: Co-net. A Co-net is a coordination network that enables cooperation, collaboration, and coordination among a group of Co-Us in a system. A Co-net r is defined with Eq. 4.2:
n(r,t)={Ω(r,t),con(r,t),θ(r,t)} Eq. 4.2
n(r,t) is Co-net r in a system at time t. r is the index of Co-nets and is a nonnegative integer. The value of r is unique for each Co-net. Note that r is the index of both Co-nets and constraints. Ω(r,t) is the set of Co-Us in n(r,t) that must collaborate to satisfy con(r,t). Let N(Ω(r,t)) denote the number of Co-Us in Co-net r at time t. N(Ω(r,t)) is a positive integer and N(Ω(r,t))≧2. θ(r,t) is the state of Co-net r at time t, which describes what has occurred with Co-net r by time t. θ(r,t) includes necessary and sufficient information to determine whether or not con(r,t) is satisfied.
Constraints and Co-nets have a one-to-one relationship. Each constraint can have one and only one corresponding Co-net. Each Co-net satisfies one and only one constraint. If a constraint does not have any corresponding Co-net, collaboration among Co-Us is not required to satisfy the constraint. Any Co-net must include two or more Co-Us (N(Ω(r,t))≧2). Each Co-net n(r,t) is unique in a system at t and is identified by r. Two Co-nets r1 and r2 (r1≠r2) may include the same set of Co-Us (Ω(r1,t)=Ω(r2,t)) and are still different Co-nets because any two constraints in a system are different at t (con(r1,t)≠con(r2,t)).
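Definitions 1 and 2 can be mirrored by simple data structures. The following Python sketch is one possible encoding under stated assumptions: the class and field names are hypothetical, the state fields are left as opaque values, and only the rule N(Ω(r,t))≧2 and the shared index r of a Co-net and its constraint are enforced.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet


@dataclass(frozen=True)
class Constraint:
    r: int      # unique constraint index
    t: float    # time at which the constraint applies


@dataclass
class CoU:
    i: int                                              # unique Co-U index
    t: float
    local_constraints: FrozenSet[Constraint] = frozenset()  # pi(i,t)
    state: Any = None                                   # Co-U state at time t


@dataclass
class CoNet:
    r: int                    # shared index of the Co-net and its constraint
    t: float
    members: FrozenSet[int]   # indices of the Co-Us in Omega(r,t)
    state: Any = None         # theta(r,t)

    def __post_init__(self) -> None:
        if len(self.members) < 2:
            raise ValueError("a Co-net must include two or more Co-Us")

    @property
    def constraint(self) -> Constraint:
        return Constraint(self.r, self.t)


# Two Co-nets over the same Co-Us remain different because their constraints differ.
net_a = CoNet(r=1, t=0.0, members=frozenset({4, 11}))
net_b = CoNet(r=2, t=0.0, members=frozenset({4, 11}))
```

Keying the Co-net on r rather than on its member set reflects the statement that two Co-nets may contain the same Co-Us and still be distinct.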
Definition 3: Conflict/Error (CE); Definition 4: Error; Definition 5: Conflict. Let
denote that Co-Us/Co-nets satisfy and dissatisfy constraints, respectively. Let CE(r,t) represent any CE. Eq. 4.3 defines a CE:
∃CE(r,t),iff con(r,t) is not satisfied, ∀r,t Eq. 4.3
Let E(u(r,i,t)) and C(n(r,t)) represent an error and a conflict, respectively. Eq. 4.4 defines an error and Eq. 4.5 defines a conflict.
The purpose of this research is to develop and apply CEPD logic for Co-nets and Co-Us in a system (
State 1: An error has been detected at t (Eq. 4.6). t1≦t2≦t implies t1 is before or at t2, and t2 is before or at t. r, i, and t1 are known at t since t2≦t.
∃E(u(r,i,t1)), such that r,i, and t1 are known at time t2, ∀r,i,t1,t2,t,t1≦t2≦t Eq. 4.6
State 2: An error has not been detected at t (Eq. 4.7). t1≦t≦t2 implies t1 is before or at t, and t is before or at t2. At least one of r, i, and t1 is unknown at t because t≦t2.
∃E(u(r,i,t1)), such that at least one of r∪i∪t1 is unknown at time t2, ∀r,i,t1,t2,t,t1≦t≦t2 Eq. 4.7
State 3: A conflict has been detected at t (Eq. 4.8).
∃C(n(r,t1)), such that r and t1 are known at time t2, ∀r,t1,t2,t,t1≦t2≦t Eq. 4.8
State 4: A conflict has not been detected at t (Eq. 4.9).
∃C(n(r,t1)), such that at least one of r∪t1 is unknown at time t2, ∀r,t1,t2,t,t1≦t≦t2 Eq. 4.9
State 5: An error has been prognosed at t (Eq. 4.10). t1>t≧t2 implies t1 is after t, and t is after or at t2. r, i, and t1 are known at t because t≧t2.
∃E(u(r,i,t1)), such that r,i, and t1 are known at time t2, ∀r,i,t1,t2,t,t1>t≧t2 Eq. 4.10
State 6: An error has not been prognosed at t (Eq. 4.11). t1>t2≧t implies t1 is after t2, and t2 is after or at t. At least one of r, i, and t1 is unknown at t because t2≧t.
∃E(u(r,i,t1)), such that at least one of r∪i∪t1 is unknown at time t2, ∀r,i,t1,t2,t,t1>t2≧t Eq. 4.11
State 7: A conflict has been prognosed at t (Eq. 4.12).
∃C(n(r,t1)), such that r and t1 are known at time t2, ∀r,t1,t2,t,t1>t≧t2 Eq. 4.12
State 8: A conflict has not been prognosed at t (Eq. 4.13).
∃C(n(r,t1)), such that at least one of r∪t1 is unknown at time t2, ∀r,t1,t2,t,t1>t2≧t Eq. 4.13
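The timing conditions of States 1 through 8 can be summarized in a short Python sketch. The function name `ce_status` and the string labels are hypothetical. Here t1 is the time at which the CE occurs, t2 is the time at which its identifying information becomes known, and t is the current time. Two simplifications are assumed: the boundary case t2=t, where the conditions of the equations overlap, is resolved in favor of "detected" or "prognosed", and a future CE whose information becomes known only after t is always reported as "not prognosed".

```python
def ce_status(t1: float, t2: float, t: float) -> str:
    """Classify a CE at current time t from its occurrence time t1 and knowledge time t2."""
    if t1 <= t:
        # The CE has already occurred: it is detected once its information is known.
        return "detected" if t2 <= t else "not detected"
    # The CE occurs after t: it is prognosed if its information is already known.
    return "prognosed" if t2 <= t else "not prognosed"


# Hypothetical time stamps for illustration.
examples = {
    (1, 2, 3): ce_status(1, 2, 3),  # t1 <= t2 <= t
    (1, 5, 3): ce_status(1, 5, 3),  # t1 <= t <= t2
    (5, 2, 3): ce_status(5, 2, 3),  # t1 > t >= t2
    (5, 4, 3): ce_status(5, 4, 3),  # t1 > t2 >= t
}
```

The same classification applies to errors and conflicts, since Eqs. 4.6 through 4.13 differ only in whether the Co-U index i is part of the known information.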
State 9: An error has been detected at t as the cause of another error (Eq. 4.14). The CE on the left side of the symbol
is the cause of the CE on the right side of the symbol. For instance, if E(u(r1,i1,t1)) causes E(u(r2,i2,t2)) directly or indirectly, E(u(r1,i1,t1)) is the cause of E(u(r2,i2,t2)) and this relationship is expressed as
State 10: An error has not been detected at t as the cause of another error (Eq. 4.15).
State 11: An error has been detected at t as the cause of a conflict (Eq. 4.16).
State 12: An error has not been detected at t as the cause of a conflict (Eq. 4.17).
State 13: A conflict has been detected at t as the cause of an error (Eq. 4.18).
State 14: A conflict has not been detected at t as the cause of an error (Eq. 4.19).
State 15: A conflict has been detected at t as the cause of another conflict (Eq. 4.20).
State 16: A conflict has not been detected at t as the cause of another conflict (Eq. 4.21).
To prevent and detect CEs in a system, it is important to understand system topologies and task dependences (Chen and Nof, 2007). Tasks are products and services requested by customers. A system is task-driven and must complete tasks through networks of Co-Us. A task may be divided into subtasks. Each subtask may be further divided into several other subtasks. A Co-U that is needed to complete a task is assigned one or more subtasks. Task dependences are ways of collaboration among Co-Us.
A task T is requested by customer(s). In total, 23 Co-Us work collaboratively to complete this task. Six types of collaboration are defined:
The six types of collaboration are defined in terms of the need of collaboration and work flow. They can also be categorized according to the order of collaboration, i.e., first-order collaboration and high-order collaboration. First-order collaboration exists between two Co-Us if one of two conditions is met: (1) the two Co-Us collaborate to provide or receive products/services; (2) one of the two Co-Us directly provides products/services to the other Co-U. The six types of collaboration defined in
High-order collaboration exists between two Co-Us if one of them indirectly provides products/services to the other. Both first-order and high-order collaboration may exist between two Co-Us at the same time. First-order collaboration is most concerned in CEPD and high-order collaboration is always analyzed through first-order collaboration. In the rest of this dissertation, the term ‘collaboration’ is used to refer to first-order collaboration unless otherwise specified. With the above definitions, the six types of collaboration can be described mathematically. Some examples in
1. CP: (u(4,t),u(11,t))P, (u(1,t),u(17,t),u(22,t))P;
2. CR: (u(17,t),u(22,t))R, (u(5,t),u(22,t))R;
3. OO: u(13,t)→u(20,t), u(6,t)→u(16,t), and u(16,t)→u(6,t);
4. MO: (u(1,t),u(17,t),u(22,t))P→u(20,t);
5. OM: u(18,t)→(u(12,t),u(15,t))R;
6. MM: (u(4,t),u(11,t))P→(u(17,t),u(22,t))R.
When two Co-Us do not collaborate, they are task independent (TI). In total, there are seven types of relationships among Co-Us: TI, CP, CR, OO, MO, OM, and MM. TI and any of the six types of collaboration are mutually exclusive. The six types of collaboration are not mutually exclusive. CP is implied if MO or MM exists. CR is implied if OM or MM exists. OO is implied if any of MO, OM, and MM exists. Both MO and OM are implied if MM exists. MO and OM are mutually exclusive.
Constraints defined in Section 4.1 can be divided into two categories (Table 4.1): capability constraints and task constraints. A task constraint determines what products/services Co-U(s) needs to provide to the other Co-U(s). A capability constraint determines conditions a Co-U or a Co-net must meet.
Each constraint consists of three parts:
Table 4.2 describes the six constraints in Table 4.1 with six types of collaboration. Note that con(2,7) and con(6,7) are two different constraints but have the same requirement which needs to be satisfied by different Co-Us.
(u(49,20), u(38,20), u(77,20))P (u(33,20), u(34,20)P
Constraints are related through Co-Us. For instance, con(3,10) and con(4,t) in Table 4.1 are related through u(11,10)/u(11,t). The relationship between constraints reflects high-order collaboration between Co-Us and the relationship between CEs. Two constraints can be dependent or independent. When two constraints con(r1,t1) and con(r2,t2) are dependent, there are two types of dependences: inclusive and mutually exclusive. con(r1,t1) and con(r2,t2) are inclusive if con(r1,t1)⊂con(r2,t2). This is defined in Eq. 4.22.
p(con(r2,t2) is not satisfied|con(r1,t1) is not satisfied)=1, p(con(r1,t1) is not satisfied)≠0 Eq. 4.22
Similarly, con(r2,t2) and con(r1,t1) are inclusive if con(r2,t2)⊂con(r1,t1) (Eq. 4.23).
p(con(r1,t1) is not satisfied|con(r2,t2) is not satisfied)=1, p(con(r2,t2) is not satisfied)≠0 Eq. 4.23
p(con(r1,t1) is not satisfied)=p(con(r2,t2) is not satisfied) if both Eq. 4.22 and Eq. 4.23 are met. con(r1,t1)⊕con(r2,t2) indicates con(r1,t1) and con(r2,t2) are mutually exclusive, which is defined in Eq. 4.24.
p(con(r2,t2) is not satisfied|con(r1,t1) is not satisfied)=0, p(con(r1,t1) is not satisfied)≠0 Eq. 4.24
To meet Eq. 4.24, the joint probability p(con(r2,t2) is not satisfied ∩ con(r1,t1) is not satisfied) must be zero. This implies that Eq. 4.25 is met.
p(con(r1,t1) is not satisfied|con(r2,t2) is not satisfied)=0, p(con(r2,t2) is not satisfied)≠0 Eq. 4.25
A constraint con(r1,t1) is independent if Eq. 4.26 is met.
p(con(r1,t1) is not satisfied|con(r2,t2) is not satisfied)=p(con(r1,t1) is not satisfied), ∀con(r2,t2) Eq. 4.26
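The dependence conditions of Eqs. 4.22 through 4.26 can be checked numerically when the relevant probabilities are available. The Python sketch below is illustrative only. The function name `relationship` and its labels are hypothetical. It takes p1 and p2 (the probabilities that con(r1,t1) and con(r2,t2) are not satisfied) and p12 (the probability that both are not satisfied), and it tests independence for a single pair rather than against all other constraints as Eq. 4.26 requires.

```python
from fractions import Fraction
from typing import List


def relationship(p1: Fraction, p2: Fraction, p12: Fraction) -> List[str]:
    """Label the relationship between two constraints from exact probabilities."""
    labels = []
    if p1 > 0 and p12 == p1:
        labels.append("con1 subset of con2")   # Eq. 4.22: p(2 fails | 1 fails) = 1
    if p2 > 0 and p12 == p2:
        labels.append("con2 subset of con1")   # Eq. 4.23: p(1 fails | 2 fails) = 1
    if p1 > 0 and p2 > 0 and p12 == 0:
        labels.append("mutually exclusive")    # Eqs. 4.24 and 4.25
    if p12 == p1 * p2:
        labels.append("independent")           # pairwise form of Eq. 4.26
    return labels or ["dependent (other)"]


# Hypothetical probabilities for illustration.
inclusive = relationship(Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))
exclusive = relationship(Fraction(1, 4), Fraction(1, 4), Fraction(0))
independent = relationship(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
```

Exact fractions are used so that the equality tests are not affected by floating-point rounding. With estimated probabilities, a tolerance would be needed instead.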
The agent-oriented, constraint-based P/T net is developed in this research to model systems and visualize CEPD logic dynamically. The main advantages of this approach are:
In
The CEPD logic can be developed to detect, diagnose, and prognose CEs in a system and the results and flow of the logic can be visualized with a P/T net. The CEPD logic determines how constraints in a system are examined. For instance, in
The assumption of an invariant model indicates that constraints, Co-Us, and the relationship between constraints do not change between tb and te. Assumptions for the CEPD logic are discussed in the next section.
Following the discussion in Sections 3.3 and 4.5, six assumptions for the CEPD logic are summarized as follows:
The Centralized CEPD Logic starts CEPD at a certain time and evaluates all constraints sequentially. A central control unit executes the logic and controls data and information, which are stored in two tables: a constraint (C) table and a constraint relationship (R) table. The C table describes each constraint and specifies the first-order dependence between and among Co-Us. The R table describes the relationship between constraints and specifies the high-order dependence among Co-Us. Table 4.3 is an example of C table. Table 4.4 is an example of R table. They describe the system modeled in
A P/T net can be constructed to represent the Centralized CEPD Logic. An example is shown in
The places P(1,t
The Centralized CEPD Logic starts at TSa. Suppose it ends at TEa (a is a positive integer and the index of execution). The logic can complete four tasks:
The Centralized CEPD Logic can be executed multiple times, e.g., between TS1 and TE1, TS2 and TE2, and TS3 and TE3. These times satisfy the condition tb≦TS1≦TS2≦TS3≦te. Because the logic is executed sequentially, two adjacent executions do not overlap, e.g., TE1≦TS2 and TE2≦TS3.
The Centralized Logic is usually complex (with many steps) and requires a large database to store data and information for an entire system. An alternative way of performing CEPD is to use intelligent agents to perform CEPD tasks in parallel. Each constraint employs a PDA that detects, diagnoses, and prognoses the CE related to the constraint. The prognostics and diagnostics of CEs are performed by sending “CE” and “No CE” messages from one PDA to another.
To apply the Decentralized CEPD Logic, each agent stores information about one constraint. A C table (e.g., Table 4.3) is not necessary because only information about one constraint is needed. Each agent maintains an R table. Table 4.4 is divided into four tables, Table 4.7, Table 4.8, Table 4.9, and Table 4.10, each of which is used by an agent. If a constraint is independent, e.g., con(8,t4), the agent it employs does not have an R table.
The Decentralized CEPD Logic described in Table 4.6 is activated if an agent (r,t) is in one of three situations: (1) receives a “No CE” message; (2) receives a “CE” message; (3) current time tc is equal to or larger than t.
Three CEPD approaches are evaluated in this research, the Centralized CEPD Logic (Table 4.5), the Decentralized CEPD Logic (Table 4.6), and the Traditional CEPD Algorithm discussed in Section 3.3. Two main characteristics of the Traditional CEPD Algorithm are: (1) it is a centralized approach, i.e., it detects CEs sequentially; (2) it does not consider the relationship between CEs, i.e., it cannot perform CE prognostics and diagnostics. The CEPD approaches are applied to different types of networks whose topologies affect the performance of the approaches. This chapter analyzes the three CEPD approaches for three types of constraint networks.
A network is comprised of nodes and links that connect them. In a constraint network, each constraint is a node. The links describe the relationships between constraints. Two types of links, inclusive and exclusive links, are used to describe the two types of relationships, inclusive and exclusive (defined in 4.3), respectively, between any two constraints when they are related. The inclusive link has directions, i.e., con(r1,t1)⊂con(r2,t2) is different than con(r2,t2)⊂con(r1,t1), whereas the exclusive link is undirected, i.e., con(r1,t1)⊕con(r2,t2) is the same as con(r2,t2)⊕con(r1,t1). A constraint network can have both directed and undirected links.
Directed networks can be cyclic, meaning they contain closed loops of links, or acyclic, meaning they do not contain closed loops of links. A constraint network can be cyclic, e.g., both con(r1,t1)⊂con(r2,t2) and con(r2,t2)⊂con(r1,t1) exist, or acyclic. Let • denote a constraint, → denote the inclusive relationship, and - denote the exclusive relationship. There can be five possible relationships between two constraints con(r1,t1) and con(r2,t2) (
The study of network topologies has a long history stretching back at least to the 1730s. The classic model of a network, the random network, was first discussed in the early 1950s (Solomonoff and Rapoport, 1951) and was rediscovered and analyzed in a series of papers published in the late 1950s and early 1960s (Erdos and Renyi, 1959, 1960, 1961). Most random network models assume (1) undirected links; (2) at most one link between any two nodes; and (3) no link from a node to itself. The degree of a node, d, is the number of links connected to the node. In a random network with n nodes and probability p to connect any pair of nodes, the maximum number of links in the network is ½n(n−1). The probability pd that a node has degree d is
which is also the fraction of nodes in the network that have degree d. The mean degree
The random network is homogeneous in the sense that most nodes in the network have approximately the same number of links. A typical example of the random network is the US highway system (Barabasi, 2002; Jeong, 2003). If a city is a node and the highway between two cities is a link, most cities in the system connect to approximately the same number of highways. Another type of networks that have been extensively studied is the small-world network (e.g., Watts and Strogatz, 1998). Compared to the random network, the small-world network has higher clustering coefficient, which means that there is a heightened probability that two nodes will be connected directly to one another if they have another neighboring node in common. The small-world network model is, however, not a very good model of real networks (Newman et al., 2006). There is little evidence that any known real-world network is substantially similar to the small-world network.
A common characteristic of both random and small-world networks is that the probability of having a highly connected node, i.e., a large d, decreases exponentially with d; nodes with a large number of links are practically absent (Barabasi and Albert, 1999). Another type of network that has been studied extensively and does capture the topology of many real-world networks is the scale-free network (e.g., Albert, et al., 1999; Barabasi and Albert, 1999; Broder et al., 2000; Price, 1965). In scale-free networks, the probability pd that a node has degree d follows a power law distribution, i.e., pd∝d^−γ, where γ is between 2.1 and 4 for real-world scale-free networks (Barabasi and Albert, 1999). Compared to the nodes in random and small-world networks, nodes with large d have a better chance to occur in scale-free networks.
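The contrast between the two degree distributions can be made concrete with a short Python sketch. It assumes the standard binomial degree distribution for a random network in which each of the n−1 possible links of a node exists independently with probability p, and a power law c·d^−γ over degrees d≧1 truncated at a large cutoff. The values n=50, p=0.1, γ=3, and the cutoff of 100,000 are illustrative choices, and the function names are hypothetical.

```python
from math import comb


def random_degree_pmf(n: int, p: float, d: int) -> float:
    """Binomial probability that a node has degree d in a random network (assumed model)."""
    return comb(n - 1, d) * p**d * (1 - p) ** (n - 1 - d)


def power_law_constant(gamma: float, d_max: int) -> float:
    """Normalization constant c so that the sum of c * d**-gamma over d = 1..d_max is 1."""
    return 1.0 / sum(d**-gamma for d in range(1, d_max + 1))


# Random network: the mean degree equals (n - 1) * p.
n, p = 50, 0.1
pmf = [random_degree_pmf(n, p, d) for d in range(n)]
mean_rn = sum(d * q for d, q in enumerate(pmf))

# Scale-free network with exponent gamma = 3.
gamma, d_max = 3, 100_000
c = power_law_constant(gamma, d_max)
mean_sfn = c * sum(d ** (1 - gamma) for d in range(1, d_max + 1))
```

With these illustrative values the random network has mean degree (n−1)p=4.9, while the power law with γ=3 has a normalization constant of about 0.83 and a mean degree of about 1.37.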
Both random and scale-free networks have small-world effect. The distance l between two nodes in a network can be defined as the minimum number of links existing in the network that connect the two nodes. If the distance l between nodes in a network scales logarithmically or slower with network size for fixed mean degree, the network has small-world effect. It can be shown that l=log n/log(p(n−1)) for random networks (Barabasi, 2002; Newman, 2003), and l=log n/log log n for scale-free networks (Cohen and Havlin, 2003). This implies that nodes in both random and scale-free networks are close to each other. Another important characteristic of scale-free networks is that they are resilient to random failures (Albert et al., 2000; Cohen et al., 2000) but are vulnerable to targeted attacks (Cohen et al., 2001).
Another type of network worth mentioning is the Bose-Einstein condensation network (Bianconi and Barabasi, 2001b). The Bose-Einstein network was discovered in an effort to model the competitive nature of networks. Many real-world scale-free networks are formed following the rule that older nodes have a higher probability of obtaining links. This is true in many real networks. For example, an old business has more customers than a new business; a Web site that was established ten years ago links to many more Web sites than a Web site established one year ago does. This rule is not always true, however, and a very good example is Google.com, which is a relatively new Web site but links to a great many other Web sites. Motivated by this and probably many other examples, a fitness model (Bianconi and Barabasi, 2001a) was proposed to assign a fitness parameter ηi to each node i. A node with higher ηi has a higher probability of obtaining links. ηi is randomly chosen from a distribution ρ(η). When ρ(η) follows certain distributions, e.g., ρ(η)=(λ+1)(1−η)^λ, λ>1, a Bose-Einstein condensation network forms.
The Bose-Einstein network shows the “winner-takes-all” phenomenon observed in competitive networks. This means that the fittest node acquires a finite fraction of the total links (about 80%; Bianconi and Barabasi, 2001b) in the network. The fraction is independent of the size of the network. In contrast, the fittest node's share of all links decreases to zero in the scale-free network. The fittest node corresponds to the lowest energy level in the Bose gas. The fittest node acquiring most links corresponds to the phenomenon that many particles are at the lowest energy level when the temperature is close to absolute zero. Bose-Einstein condensation was predicted in 1924 by Bose and Einstein and was first created in 1995 by Cornell and Wieman, who received the Nobel Prize in Physics in 2001 together with Ketterle for their work on Bose-Einstein condensation. Table 5.1 summarizes the three types of networks discussed here.
The three types of networks discussed here can be used to model constraint networks for CEPD. There are many real-world examples in which constraint networks can be modeled with one of the three networks:
Three types of networks, RN, SFN, and BECN, are studied to validate the newly developed CEPD logic, evaluate its performance, and compare it with the Traditional CEPD Algorithm:
p≠0. The mean degree
The exponent −3 is the median of most real-world scale-free networks (−2.1 to −4; Barabasi and Albert, 1999). The constant
is fixed by the requirement of normalization, which gives Σpk=1 when n→∞;
To provide a reliable validation of the CEPD logic, it is desirable that the three networks have the same number of nodes and links. The number of nodes n is the same for all three networks as n→∞. The degree distribution function cd^−γ of an SFN is a power series (d→∞), and the mean degree of an SFN converges if γ>2 and diverges if 1<γ≦2 (Arfken, 2005). The mean degree
To analyze the CEPD approaches, a list of parameters and assumptions is defined:
The above definitions are the foundation to evaluate the CEPD approaches with four performance measures:
The Centralized CEPD Logic starts at TS from Step 3 in Table 4.5 because the FIFO queue is empty at TS. Agents in the Decentralized CEPD Logic start at TS from Step 1c in Table 4.6. The Traditional CEPD Algorithm starts at TS, and the central control unit randomly and uniformly selects a node to detect CEs. All three approaches can proceed if and only if there is at least one node that needs to be satisfied before TS. The probability that no nodes can be found that need to be satisfied before TS is
Because n→∞ and 0<TS<T,
This indicates that all three approaches start CEPD by detecting CEs at one or more nodes regardless of the value of TS.
After randomly and uniformly selecting a node at Step 4 in Table 4.5, the Centralized CEPD Logic moves to Step 5 to detect CEs. There is a pCEr+(1−pCE)(1−r) probability that a CE is detected and a pCE(1−r)+(1−pCE)r probability that a CE is not detected. When a CE is detected, Steps 6 and 7 are executed and the logic moves to Step 11. Steps 8 through 10 are skipped because the exclusive relationship between constraints is not considered. From Steps 11 to 20, the logic finds all nodes that are related to the current node and marks them as having CEs. When a CE is not detected, the logic repeats Steps 6 through 8 and marks all nodes that are related to the current node as not having CEs. Essentially, once a node is found at Step 3, the Centralized CEPD Logic marks all nodes that are related to the node whether or not a CE is detected.
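The marking of related nodes can be sketched as a graph traversal. The Python sketch below is an interpretation based on the inclusive relationship of Eq. 4.22 and is not a transcription of the steps in Table 4.5. It assumes that a link a→b means con(a)⊂con(b), so that a CE at a implies a CE at b, and, by contraposition, that no CE at b implies no CE at a. The function name `propagate` and the example network are hypothetical.

```python
from collections import deque
from typing import Dict, Set


def propagate(links: Dict[int, Set[int]], start: int, ce_detected: bool) -> Set[int]:
    """Return the nodes marked after detection at `start` (including `start`).

    links[a] holds every b with con(a) subset of con(b).
    """
    if ce_detected:
        adjacency = links                      # a CE spreads along the link direction
    else:
        adjacency = {}                         # "no CE" spreads against the link direction
        for a, targets in links.items():
            for b in targets:
                adjacency.setdefault(b, set()).add(a)

    marked, queue = {start}, deque([start])
    while queue:                               # breadth-first search over related nodes
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in marked:
                marked.add(nxt)
                queue.append(nxt)
    return marked


# Hypothetical constraint network: 1 -> 2 -> 3 and 4 -> 3.
links = {1: {2}, 2: {3}, 4: {3}}
with_ce = propagate(links, 1, ce_detected=True)
without_ce = propagate(links, 3, ce_detected=False)
```

Every marked node is a node that no longer requires its own detection step, which is why the size of the connected component containing the selected node matters in the analysis that follows.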
1. Random Network (RN)
The performance of the Centralized CEPD Logic depends on the mean degree
guarantees a giant component appears in the RN. Since (n−1)p=π2/7.20, it requires that
The larger
is, the larger is the giant component. When pin=pout=0 and pboth=1,
is the largest,
and the giant component includes about 49% of all nodes in the RN. Let pg denote the portion of nodes in the giant component, pg is the function of
pboth=1 indicates that all links in the RN have two directions, so the RN becomes an undirected network. There is a 49% probability that the node selected at Step 4 belongs to the giant component. Apart from the giant component, there are many smaller components that fill the portion of the RN not occupied by the giant component and their average size is
When pboth=1, the average size of small components is 3.32.
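The figures of 49% and 3.32 can be checked numerically. The Python sketch below assumes the standard results for an undirected random network with mean degree z: the giant component fraction S solves S=1−e^(−zS), and the mean size of the component containing a node outside the giant component is 1/(1−z+zS). These formulas are assumptions of the sketch, since the corresponding expressions are not legible in this text. The function names are hypothetical, and z=π²/7.20 is taken from the discussion above.

```python
import math


def giant_component_fraction(z: float, iterations: int = 200) -> float:
    """Solve S = 1 - exp(-z * S) by fixed-point iteration (assumed standard result)."""
    s = 0.5
    for _ in range(iterations):
        s = 1.0 - math.exp(-z * s)
    return s


def mean_small_component_size(z: float, s: float) -> float:
    """Mean size of the component of a node outside the giant component (assumed result)."""
    return 1.0 / (1.0 - z + z * s)


z = math.pi**2 / 7.20                      # mean degree, about 1.37
s = giant_component_fraction(z)            # about 0.49
size_exact = mean_small_component_size(z, s)
size_rounded = mean_small_component_size(z, 0.49)
```

With the unrounded fraction the mean small-component size comes out slightly above 3.32. Using the fraction rounded to 0.49 gives approximately 3.32.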
When 0<pboth<1, the RN includes both directed and undirected links. The RN includes only directed links if pboth=0. A network is called a directed network if it has only directed links. Research has been conducted to study the size of the giant component in the directed network (Angeles Serrano and De Los Rios, 2007; Broder et al., 2000; Dorogovtsev et al., 2001; Newman et al., 2001; Tadic, 2001). Various results have been suggested under different assumptions. Because the connectivity of directed networks is more complex than the connectivity of undirected networks, there are also various definitions, e.g., strongly connected component, weakly connected component, IN component, and OUT component. More research is needed to help understand the characteristics of directed networks and networks with both directed and undirected links. In the following discussion, a network is a directed network if pboth<1.
Comparison of
It can be reasonably assumed that the time needed to find related nodes according to the relationship and mark the nodes is negligible compared to the detection time (mean is
Proof:
Suppose there is a directed network (Net 1) with pboth<1. It is possible to replace each directed link in Net 1 with an undirected link to form an undirected network, Net 2 (pboth=1). The TT of Net 2 is not larger than the TT of Net 1 because (1) some nodes require detection (Step 5) in Net 1, but do not require detection in Net 2; (2) because of (1), more communication time between nodes and the central control unit is needed in Net 1; (3) because of (1) and (2), more nodes may require CE detection (Step 3 in Table 4.5) in Net 1. This further increases the TT of Net 1. For each directed network, a corresponding undirected network can be formed. For each pair of such networks, the TT of the directed network is larger than or equal to that of the undirected network. Hence,
Comparison of
A critical difference between the undirected and directed networks is that certain nodes (say n′ nodes) that require detection for n′ times (one time for each node) in the directed network require detection for only one time in the undirected network. Let j be the number of times Step 5 is executed.
p
=1≦
This conclusion is true for the RN, SFN, and BECN when the Centralized CEPD Logic is applied.
Proof:
Suppose there are n′ nodes that (1) are in the same component and require detection once in the undirected network; (2) require detection n′ times in the directed network. In the undirected network, CA=1 with probability pCEr and CA=0 with probability 1−pCEr.
Apparently,
therefore. This completes the proof.
Comparison of
First, note that
Proof:
In the undirected network,
with probability pCEr and PA=0 with probability
m is the number of CEs prognosed and 0≦m≦n′. In the directed network, PA2=0 because each of n′ nodes needs to be detected.
Comparison of
Because
Calculation of
For an arbitrary value of TS, at least one node that needs to be satisfied before TS can be found at Step 3 because n→∞. This indicates j≧1. Note that j is the number of times Step 5 is executed.
i.e.,
Proof:
Let n″ be the number of nodes left in the undirected RN after j steps. n″≧0.51n−3.32(j−1). When
j∈o(n) because
If j∈o(n), the probability that no nodes that need to be satisfied before j(
When TS+j(
This indicates at least one node that needs to be satisfied before j(
When
and TT=T−TS. This conclusion is true for the RN, SFN, and BECN when the Centralized CEPD Logic is applied. j is the largest integer less than
Calculation of
When j=1,
When j increases,
If j is large such that 0.51^j→0,
is large and
do not conflict and can both be true at the same time. For example, if
0.51^j=1.42×10^−6≈0 and
This result can be understood as follows. When j is large, there is almost a 100% probability (1−0.51^j≈1) that a node belonging to the giant component is selected at Step 3. The
Calculation of
When j=1,
When j increases,
If j is large such that 0.51^j·j→0,
Calculation of
When j=1,
has two parts,
is the percentage of nodes that must be satisfied before TS+
where the second
is the average of the time that nodes must be satisfied; the first
is the average of the time that CEs exist. When j increases,
There is no simple form for
Where
If j is large such that 0.51^j→0,
Scale-Free Network (SFN):
The degree distribution of the SFN is
Because the exponent is −3, the SFN has a giant component whose size scales linearly with the size of the network, n, but does not fill the whole network (Aiello et al., 2000; Newman et al., 2006). The size of the giant component can be described as pgn, where 0<pg<1. Besides the giant component, there are many non-giant components in the SFN. The second largest components have a size of Θ(log n) (Aiello et al., 2000). Because
the TT,
Bose-Einstein Condensation Network (BECN):
In the BECN, the fittest node has 80% of the total links, and the remaining 20% of links are uniformly and randomly assigned to other nodes. The mean total number of links is
The fittest node has on average 0.55n links. Each of all other nodes has on average
nodes. The average size of the giant component is between 0.55n and 0.63n when pboth=1.
Proof:
The average number of nodes, including the fittest node and nodes that link to the fittest node directly is 0.55n+1≈0.55n. The nodes that link to the fittest node directly also link to other nodes. The average number of links between the nodes that link to the fittest node, and the nodes that do not link to the fittest node directly is
At most 0.55n×0.126 nodes link to the fittest node indirectly and are one node away from the fittest node. Continuing this calculation, the average total number of nodes that link to the fittest node indirectly is at most 0.55n(0.126+0.126^2+ . . . )≦0.08n. The size of the giant component is therefore between 0.55n and 0.63n when pboth=1. This completes the proof.
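The bound in the proof is a geometric series and can be checked numerically. The following is a minimal sketch that assumes only the two constants used in the proof (0.55 and 0.126); the variable names are illustrative and do not appear in the specification.

```python
# Numerical check of the geometric-series bound used in the BECN proof.
direct_fraction = 0.55   # fraction of nodes linked directly to the fittest node
ratio = 0.126            # per-step factor used in the proof

# Closed form of 0.126 + 0.126**2 + ... is ratio / (1 - ratio).
indirect_fraction = direct_fraction * ratio / (1.0 - ratio)

# Upper bound on the giant-component fraction: direct plus indirect nodes.
giant_upper = direct_fraction + indirect_fraction

print(f"indirect fraction <= {indirect_fraction:.4f}")
print(f"giant component fraction <= {giant_upper:.4f}")
```

The indirect fraction evaluates to about 0.079, which is consistent with the stated bound of 0.08n and with an upper limit of 0.63n for the giant component.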
Note that all the nodes that do not link to the giant component directly or indirectly belong to small components because the average number of links they have is 0.28. Theorem 1 summarizes the analysis of the Centralized CEPD Logic.
Theorem 1:
the Centralized CEPD Logic has the following properties when it is applied to the RN, SFN, or BECN if
1. The total CEPD time TTp
2. The mean coverage ability
3. The mean prognostics ability
4. The mean total damage
Without the condition
p
=1≦
therefore) and j is large such that (1−pg)^j·j→0 (j is the number of times Step 5 is executed and pg is the portion of nodes that belong to the giant component):
1. The mean coverage ability
2. The mean prognostics ability
3. The mean total damage
and 0<pg<1 for the SFN; the fittest node of the BECN has 80% of links and 0.55≦pg≦0.63 for the BECN.
With the Decentralized CEPD Logic (Table 4.6), an agent is deployed at each node. For nodes that need to be satisfied before TS, their corresponding agents start CEPD from Step 1c. Without rigorous proof, it is clear that the
Because agents perform CEPD in parallel, the
where
is the mean number of nodes that need to be satisfied before TS. If
i.e.,
0<c<∞, then
Because the average distance in an undirected RN is roughly l=log n/log((n−1)p) and the average distance in the giant component is less than log 0.49n/log((n−1)p), the upper bound of
A similar result can be obtained for the undirected SFN, i.e.,
For the undirected BECN,
When n→∞,
In many real-world situations,
and n=10000. With the Centralized CEPD Logic, TT=T−TS=0.5T because
With the Decentralized CEPD Logic,
With the Decentralized CEPD Logic, it is clear that
for any of the undirected RN, SFN, and BECN. The further analysis of
at least one node that needs to be satisfied before TS belongs to the giant component. For all the nodes that need to be satisfied after TS and belong to the giant component,
The
This result is valid if TT≦T−TS. Note that when ti=0, TT=max (td).
The calculation of
The
This result is valid if TT≦T−TS. It is expected that both
Even with the assumption that ti=0, the exact form of
Note that the above analysis is based on the assumption that only nodes that need to be satisfied before TS are detected for CEs. According to Steps 1c and 2c of the Decentralized CEPD Logic in Table 4.6, nodes that need to be satisfied after TS may also be detected for CEs. One of the conditions in the analysis of the Centralized CEPD Logic is
One of the conditions in the analysis of the Decentralized CEPD Logic is ti=0. Because both conditions need to be satisfied for the comparison between the two approaches,
This means
which can be rewritten as
This indicates that
which means in a period of
There are therefore two situations: (a) only nodes that need to be satisfied before TS are detected for CEs, i.e., Steps 1c and 2c are executed by agents for nodes that need to be satisfied before TS; (b) not only nodes that need to be satisfied before TS, but also nodes that need to be satisfied after TS are detected for CEs. Because
situation (b) has the following properties: (1) TT=T−TS; (2) CA≈pCEr; (3) the
Theorem 2:
the Decentralized CEPD Logic has the following properties when it is applied to the RN, SFN, or BECN:
1. The total CEPD time TTp
2. The mean coverage ability
3. The mean prognostics ability PAp
4. The mean total damage
When pboth=1,
and only nodes that need to be satisfied before TS are detected for CEs (situation (a)), the Decentralized CEPD Logic has the following properties:
for the BECN; TT=max (td)) if ti=0;
when TT≦T−TS.
when TT≦T−TS and ti=0;
when TT≦T−TS and ti=0.
When
and nodes are detected for CEs when they are due for detection (situation (b)), the Decentralized CEPD Logic has the following properties:
pg=0.49 for the RN;
and 0<pg<1 for the SFN; the fittest node of the BECN has 80% of links and 0.55≦pg≦0.63 for the BECN.
The Traditional CEPD Algorithm detects CEs sequentially without considering the relationship between nodes. Regardless of the value of pboth and the types of networks, TT=T−TS,
PA=0, and
Theorem 3 summarizes the analysis of the Traditional CEPD Algorithm.
Theorem 3:
the Traditional CEPD Algorithm has the following properties when it is applied to the RN, SFN, or BECN if
Table 5.2, Table 5.3, and Table 5.4 summarize the analysis results in the three theorems. Analysis results show that (1) the Decentralized CEPD Logic performs better than the Centralized CEPD Logic, and the Centralized CEPD Logic performs better than the Traditional CEPD Algorithm in terms of the mean of all four performance measures regardless of the type of undirected network; (2) both the Centralized and Decentralized CEPD Logic perform better over the undirected network than over the corresponding directed network in terms of the mean of all four performance measures; (3) the Centralized CEPD Logic performs better over the undirected BECN than over the undirected RN in terms of the mean of CA; (4) the Traditional CEPD Algorithm has the same performance regardless of the type of network (RN, SFN, or BECN; directed or undirected).
[Table 5.2, Table 5.3, and Table 5.4: analysis results for the three CEPD approaches; table contents not recoverable.]
The objectives of the experiments are to: (1) verify the analytical results of the three CEPD approaches; (2) explore properties of the Centralized and Decentralized CEPD Logic that have not been analytically studied; (3) discover the properties of the CEPD approaches when certain conditions are relaxed; (4) identify improvements for the Centralized and Decentralized CEPD Logic to obtain better performance.
The experiments are conducted by executing a software program using AutoMod (AutoMod Version 11.1, 1998-2003) to simulate the three CEPD approaches. Three networks, RN, SFN, and BECN, must be generated in the experiments. Because AutoMod allows at most 200 entities, n=150 for each network. In the SFN,
still holds when n=150. The parameters of the three networks are:
1≦d≦149;
The fittest node has 80% of the total links, i.e., 82 links, and the other 20% of links, i.e., 21 links, are uniformly and randomly assigned to other nodes.
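The BECN parameters above can be turned into a small generator. The following is a sketch only, under the assumption that the 21 remaining links join distinct pairs of non-fittest nodes chosen uniformly at random; the function names `generate_becn` and `component_size` are hypothetical and are not part of the AutoMod program used in the experiments.

```python
import random
from collections import deque

def generate_becn(n=150, fittest_links=82, other_links=21, rng=random):
    """Return an adjacency dict; node 0 plays the role of the fittest node."""
    adj = {v: set() for v in range(n)}
    # The fittest node links to `fittest_links` distinct other nodes.
    for v in rng.sample(range(1, n), fittest_links):
        adj[0].add(v)
        adj[v].add(0)
    # Assumption: each remaining link joins two distinct non-fittest nodes
    # chosen uniformly at random; an already existing link is redrawn.
    placed = 0
    while placed < other_links:
        a, b = rng.sample(range(1, n), 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            placed += 1
    return adj

def component_size(adj, start):
    """Breadth-first search: number of nodes reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen)

adj = generate_becn(rng=random.Random(1))
print(component_size(adj, 0) / len(adj))  # fraction of nodes in the fittest node's component
```

Because the fittest node and its 82 neighbors are always connected, the component containing the fittest node has at least 83 of the 150 nodes in every run of this sketch.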
The generation of the RN and BECN is straightforward. To generate the SFN, a four-step procedure has been used in previous research (Molloy and Reed, 1995; Newman et al., 2001; Newman et al., 2006):
As long as
is even, the procedure can always be completed. If
is odd, the procedure repeats Step 1 until
becomes even. In Step 3, each pair has two different nodes. If two nodes are the same, they are not paired up and Step 3 is repeated until two different nodes are selected. Two nodes are paired up only once. If two randomly selected nodes have already been paired up, Step 3 is repeated to find another pair of nodes. A possible failure that has not been discussed previously can be illustrated with the following example.
Suppose a network has three nodes, node 1, node 2, and node 3. Node 1 has one link; node 2 has one link; and node 3 has two links. There are four numbers on the list generated in Step 2: 1, 2, 3, 3. It is possible that the numbers 1 and 2 are first selected from the list. This causes a failure because node 3 cannot link to itself.
A small revision of the procedure can avoid the failure: nodes that have more links are paired up first. To generate a pair of nodes, the first node is chosen according to the number of links it has, i.e., the number of times it appears on the list; the second node is chosen randomly and uniformly from the list. The revision can avoid the situation in which a node with more links has to link to itself. Note, however, that nodes with more links already have a higher probability of being chosen under the original procedure. The revision therefore may not be necessary. In the experiments, the four-step procedure described above is used to generate the SFN. If the failure discussed here occurs, all pairings are deleted and the procedure repeats Step 3.
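The pairing rules and the restart-on-failure behavior described above can be sketched in a few lines. This is an illustration under stated assumptions, not the AutoMod implementation used in the experiments: degrees are assumed to follow a power law with exponent 3 on 1≦d≦n−1, the second node of each pair is drawn uniformly from the list entries that are still valid (which has the same effect as redrawing until a valid entry is found), and the names `sample_degrees` and `pair_stubs` are hypothetical.

```python
import random

def sample_degrees(n, exponent=3.0, rng=random):
    """Draw one degree per node with P(d) proportional to d**-exponent (assumed form),
    repeating the draw until the sum of degrees is even."""
    support = list(range(1, n))
    weights = [d ** -exponent for d in support]
    while True:
        degrees = rng.choices(support, weights=weights, k=n)
        if sum(degrees) % 2 == 0:
            return degrees

def pair_stubs(degrees, rng=random, max_restarts=1000):
    """Pair list entries into links with no self-links and no repeated pairs.
    If the remaining entries cannot be paired, delete all pairings and restart."""
    # The list on which node i appears degrees[i] times.
    template = [node for node, d in enumerate(degrees) for _ in range(d)]
    for _ in range(max_restarts):
        stubs = template[:]
        rng.shuffle(stubs)
        edges = set()
        failed = False
        while stubs:
            a = stubs.pop()  # uniformly random entry, since the list is shuffled
            # Valid partners: a different node that is not already paired with a.
            candidates = [i for i, b in enumerate(stubs)
                          if b != a and frozenset((a, b)) not in edges]
            if not candidates:
                failed = True  # e.g. only entries of the same node are left
                break
            b = stubs.pop(rng.choice(candidates))
            edges.add(frozenset((a, b)))
        if not failed:
            return edges
    raise RuntimeError("no valid pairing found; the degree sequence may not be realizable")

# The three-node example from the text: node 0 and node 1 have one link, node 2 has two.
print(sorted(tuple(sorted(e)) for e in pair_stubs([1, 1, 2], random.Random(0))))
```

For the three-node example the only valid result links node 2 to each of the other two nodes, so any attempt that first pairs the two one-link nodes is discarded and retried.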
To verify the analytical results, the design of the experiments is as follows:
the probability that no nodes that need to be satisfied before j(
n′≧c1n−c2(j−1) where c1 is the portion of nodes that belong to small components (c1=0.51 for the undirected RN; 0<c1<1 for the undirected SFN; 0.37≦c1≦0.45 for the undirected BECN); c2 is the size of small components (c2=3.32 for the undirected RN; c2∈θ(log n) for the undirected SFN).
for the Centralized CEPD Logic, 8.33<j<12.5. (1−pg)^j·j≠0 where pg is the size of the giant component;
where pg is the size of the giant component;
There are 18 combinations (3×6) of independent variables. Two hundred experiments are conducted for each combination, for a total of 3,600 experiments (18×200), using AutoMod (AutoMod Version 11.1, 1998-2003). The experiments aim to verify the results summarized in Table 6.1.
0.45pg
Table 6.2 summarizes the experiment results for verification. Comparing Table 6.1 and Table 6.2, there is an excellent agreement between experiment and analytical results except for two values, the
Because n=150 in the experiments, the size of some non-giant components may be close to the size of the giant component. In other words, the effect of non-giant components is not negligible when n=150. This explains why the
1. Performance of the Three CEPD Approaches Over the Undirected Networks
The experiment results in Table 6.2 show that:
These results verify the analysis summarized in Table 5.2. For the large undirected constraint network, RN, SFN, or BECN, with a giant component, the Decentralized CEPD Logic is always preferred to detect, diagnose, and prognose CEs with the largest
2. Performance of the Three CEPD Approaches Over the Directed Networks
The experiment results in Table 6.2 show that for each of the three directed networks, RN, SFN, and BECN, the Decentralized CEPD Logic outperforms both the Centralized CEPD Logic and Traditional CEPD Algorithm. Compared to the other two approaches, the Decentralized CEPD Logic has smaller
3. Performance of the Three CEPD Approaches Over Different Networks (RN, SFN, or BECN), and the Corresponding Undirected and Directed Networks
The experiment results in Table 6.2 show that the Traditional CEPD Algorithm performs the same over the corresponding directed and undirected networks. The Centralized CEPD Logic performs better over the undirected network with larger
The experiment results in Table 6.2 show that for all three networks, RN, SFN, and BECN (whether all directed or all undirected), the performance of the Centralized CEPD Logic is ranked BECN>RN>SFN. The performance of the Decentralized CEPD Logic and Traditional CEPD Algorithm is not sensitive to the type of network. The Decentralized CEPD Logic, however, performs slightly better over the SFN than over the RN and BECN because there are some large components in the SFN other than the giant component. These results verify the analysis summarized in Table 5.4. Theorem 4 summarizes the selection of CEPD approaches to detect, diagnose, and prognose CEs.
Theorem 4:
for any of the directed or undirected constraint networks, RN, SFN, or BECN, the Decentralized CEPD Logic is preferred to detect, diagnose, and prognose CEs with the largest
The CA and PA defined earlier to evaluate the performance of three CEPD approaches are not the CE coverage ability or CE prognostics ability. The CE coverage ability (CECA) is the quotient of the number of CEs that are detected, diagnosed, or prognosed, divided by the total number of CEs, 0≦CECA≦1. The CE prognostics ability (CEPA) is the quotient of the number of CEs that are prognosed, divided by the total number of CEs, 0≦CEPA≦1. The CECA and CEPA are the performance measures of interest, but cannot be analyzed mathematically because of the dependence between the numerator and denominator. Table 6.3 summarizes the experiment results of
It is expected and the results in Table 6.2 and Table 6.3 show that
In summary, the Decentralized CEPD Logic is preferred to detect, diagnose, and prognose CEs in terms of
It is assumed in the Centralized CEPD Logic that the time needed to find related nodes according to the relationship (e.g., Table 4.4) and mark the nodes (e.g., Table 4.3) is negligible. In the Decentralized CEPD Logic, it is assumed that the communication time between agents is zero, i.e., ti=0. As a result, the communication time between the central control unit and nodes in the Centralized CEPD Logic is also assumed to be zero for the purpose of valid comparison between the CEPD approaches.
In reality, the time needed to find related nodes and mark them is very small and close to zero because the task is completed automatically by computers or other computing devices. This is verified by simulation experiments which take less than 1 second of CPU time to complete one experiment. It is therefore reasonable to assume the time needed to search and mark nodes to be zero. The communication time in both Centralized and Decentralized CEPD Logic, however, may not be negligible, and is affected by many factors such as distance between nodes, routing algorithms, and transmission media. It is interesting to know how the three CEPD approaches perform when the communication time is larger than zero.
Additional experiments are conducted to study the performance of three CEPD approaches when ti≠0. In the experiments, it is assumed that ti and td follow the same distribution, i.e.,
The comparison between Table 6.4, and Table 6.2 and Table 6.3 verifies the expectations. The performance of all three CEPD approaches deteriorates when ti≠0. Among them, the Decentralized CEPD Logic is the most robust approach because its performance is affected the least when ti increases.
Based on the analysis and experiment results, it becomes clear that the performance of the Centralized CEPD Logic is sensitive to the network topology, i.e., RN, SFN, BECN, directed, or undirected. The performance of the Traditional CEPD Algorithm is not sensitive to the network topology at all. The performance of the Decentralized CEPD Logic is sensitive to the network topology to a certain degree, but not as much as the Centralized CEPD Logic. It is interesting to know if the Centralized or Decentralized CEPD Logic can take advantage of the network topology to improve their performance.
With the Decentralized CEPD Logic, each agent has local information about only one node, i.e., the constraint relationship (R) table such as Table 4.7 that describes the relationship between the node and other nodes. The only way for the Decentralized CEPD Logic to take advantage of the network topology is to let an agent have information about other nodes that are related to the node the agent is attached to. It is uncertain if this will improve the performance. Moreover, to enable an agent to obtain information about related nodes increases information overload and may be time consuming and error prone. It is therefore not recommended for agents to obtain information about related nodes.
A central control unit executes the Centralized CEPD Logic described in Table 4.5 to detect, diagnose, and prognose CEs. Because the central control unit has complete information about all the nodes in the network (e.g., Table 4.3) and their relationships (e.g., Table 4.4), it is possible to exploit the network topology to improve the performance of the Centralized CEPD Logic. Probably the most helpful improvement is to start CE detection at the node that has the most links, then go to the node that has the second most links, and continue the process until all the nodes in the network are checked or until the time at which CEPD must stop. This is a relatively minor change to the logic. In Step 3 of Table 4.5, con(r,t) is originally selected at random. In the improved logic, the con(r,t) with the most links is selected. The computation time of selecting the node with the most links is negligible.
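The selection rule described above (most links first, stopping when all nodes are checked or time runs out) can be illustrated as follows. This is a simplified sketch that covers only the node ordering and the time limit; it does not reproduce the marking and propagation steps of Table 4.5, and the names `detection_order`, `run_detection`, `links`, `detect_time`, and `time_budget` are hypothetical.

```python
def detection_order(links):
    """Order nodes by number of links, most links first.
    `links` maps each node to the set of nodes it is related to."""
    # sorted() is stable, so nodes with equal link counts keep their input order.
    return sorted(links, key=lambda node: len(links[node]), reverse=True)

def run_detection(links, detect_time, time_budget):
    """Check nodes in link-count order until all are checked or the
    remaining time is too short to check the next node."""
    checked = []
    elapsed = 0.0
    for node in detection_order(links):
        if elapsed + detect_time[node] > time_budget:
            break
        elapsed += detect_time[node]
        checked.append(node)
    return checked

# Small illustrative network: node "b" has the most links.
links = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c"},
}
detect_time = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0}
print(run_detection(links, detect_time, time_budget=2.5))  # ['b', 'c']
```

Sorting once up front keeps the selection cost small relative to detection, in line with the assumption that the time to select the node with the most links is negligible.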
The improved CEPD Logic, Network-Adaptive Centralized CEPD Logic, is expected to have better performance, i.e., it is expected that
The comparison between Table 6.5, Table 6.2, and Table 6.3 verifies the expectations.
This research studies conflict and error prevention and detection. The unique and major theoretical contributions of this research are:
The guidelines of when to apply which CEPD approaches have been developed in this research and are described as follows. Table 7.1 summarizes the recommended CEPD logic and algorithm.
The general implementation steps to apply the CEPD Logic for a complex system are as follows:
The above implementation steps can be illustrated with the electrical power grid of the western United States (Barabasi and Albert, 1999):
While the invention has been illustrated and described in detail in the drawings and foregoing description, the same is to be considered as illustrative and not restrictive in character, it being understood that only preferred embodiments have been shown and described and that all changes and modifications that come within the spirit of the invention are desired to be protected.
The following references are incorporated herein by reference:
This application claims priority to and is a continuation of U.S. application Ser. No. 13/174,424, filed Jun. 30, 2011, which in turn was a non-provisional of Provisional Application No. 61/359,994, filed Jun. 30, 2010, the entire disclosure of which is incorporated herein by reference.
Number | Date | Country
---|---|---
61359994 | Jun 2010 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 13174424 | Jun 2011 | US
Child | 14685352 | | US