1. Field of the Invention
The present invention relates to computing systems, and more particularly, to building intelligent reasoning models based on Bayesian networks.
2. Background
As a powerful framework for knowledge representation and intelligent reasoning, Bayesian networks are used in diagnostic and prognostic applications. However, with the lack of efficient tools for building high-quality Bayesian network models, the modeling process becomes a bottleneck to broad deployment of this technology. To build these models, the traditional method is to extract domain knowledge from human experts.
Conventional methods for building models rely on manual input from domain experts. Typically, domain experts are interviewed for knowledge engineering, which requires a significant amount of human interaction. The availability of experts is often limited, and human judgment about probability is systematically error-prone. Therefore, the conventional knowledge engineering approach to model building is largely a manual, labor-intensive process and hence undesirable.
Therefore, what is needed is a method and system for automatically generating Bayesian networks for intelligent reasoning such as diagnosis and prognosis with minimum manual input/human interaction.
In one aspect of the present invention, a method of building a reasoning model using relational databases is provided. The method includes identifying data objects in relational databases; determining dependency relationships between the data objects; translating the data objects into nodes of a Bayesian network; and automatically translating the dependency relationships into a graphical structure of a Bayesian network.
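By way of illustration only, the steps above can be sketched in Python. The table names (data_objects, dependencies), their columns, and the helper functions build_structure and is_acyclic are hypothetical and are not taken from this disclosure; the sketch merely assumes that the dependency relationships are already recorded as parent/child rows in a relational database.

```python
import sqlite3

def build_structure(conn):
    """Read nodes and directed edges from two hypothetical tables."""
    nodes = {
        row[0]: {"name": row[1], "kind": row[2]}
        for row in conn.execute(
            "SELECT id, name, kind FROM data_objects ORDER BY id")
    }
    edges = [
        (parent, child)
        for parent, child in conn.execute(
            "SELECT parent_id, child_id FROM dependencies "
            "ORDER BY parent_id, child_id")
    ]
    return nodes, edges

def is_acyclic(nodes, edges):
    """Kahn's algorithm: a Bayesian network structure must be a DAG."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        indegree[child] += 1
        children[parent].append(child)
    ready = [n for n, d in indegree.items() if d == 0]
    visited = 0
    while ready:
        n = ready.pop()
        visited += 1
        for c in children[n]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return visited == len(nodes)

# Illustrative data only: one host, one web application,
# one monitoring agent and one observation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_objects (id INTEGER PRIMARY KEY, name TEXT, kind TEXT);
CREATE TABLE dependencies (parent_id INTEGER, child_id INTEGER);
INSERT INTO data_objects VALUES
    (1, 'host_a', 'hardware'),
    (2, 'web_app_1', 'software'),
    (3, 'agent_1', 'monitoring_agent'),
    (4, 'alert_1', 'observation');
INSERT INTO dependencies VALUES (1, 2), (2, 3), (3, 4);
""")
nodes, edges = build_structure(conn)
```

Each row of the hypothetical data_objects table becomes a node, and each row of the hypothetical dependencies table becomes a directed edge from parent to child. The acyclicity check is included because the graphical structure of a Bayesian network is a directed acyclic graph.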
A system for building a reasoning model using relational databases is provided. The system includes at least one server for storing data of a system having numerous interconnected parts; monitoring agents for monitoring the data of the numerous interconnected parts stored in the system; an events log for storing any event observed by the monitoring agents; and relational databases for storing data objects, the data objects corresponding to the data of the numerous interconnected parts.
This brief summary has been provided so that the nature of the invention may be understood quickly. A more complete understanding of the invention can be obtained by reference to the following detailed description of the preferred embodiments thereof in connection with the attached drawings.
The foregoing features and other features of the present invention will now be described with reference to the drawings of a preferred embodiment. In the drawings, the same components have the same reference numerals. The illustrated embodiment is intended to illustrate, but not to limit the invention. The drawings include the following Figures:
The following detailed description is of the best currently contemplated modes of carrying out the invention. The description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating the general principles of the invention, since the scope of the invention is best defined by the appended claims.
According to the present invention, a method for building intelligent reasoning models, based on Bayesian networks, from relational databases is provided. Reasoning models are particularly useful for the aircraft industry; however, the method of the present invention can construct reasoning models that can be used to troubleshoot any system having a number of interconnected components, such as the complex systems created by the automotive, locomotive, marine, electronics, power generation, medical and computer industries. As more and more systems use relational databases as data repositories and event logs, the method of the present invention for automatically modeling Bayesian networks can be widely employed in other application domains.
Turning to
Host system 25 connects to a computer network (not shown) via network interface 23 (and through a network connection (not shown)). One such network is the Internet, which allows host system 25 to download applications, code, documents and other electronic information.
Read only memory (“ROM”) 19 is provided to store invariant instruction sequences such as start-up instruction sequences or basic input/output system (BIOS) sequences.
Input/Output (“I/O”) device interface 27A allows host 25 to connect to various input/output devices, for example, a keyboard, a pointing device (“mouse”), a monitor, a printer, a modem and the like. I/O device interface 27A is shown as a single block for simplicity and may include plural interfaces to interface with different types of I/O devices.
It is noteworthy that the present invention is not limited to the architecture of the computing system shown in
Turning to
A snapshot of a fragment of a Bayesian network generated from the method of the present invention is illustrated in
The web applications can be used to perform numerous functions such as document retrieval. Monitoring agents located in the third column 7 simulate web requests to the server by sending a request to a web application in the second column 5. The web application then responds to the monitoring agent by providing the requested document in a reasonable time frame. When the requested document is sent, an alert will be issued. The alerts are classified into three categories: critical, warning or normal. For example, if an observation node in the fourth or fifth column 9, 11 indicates a long delay between the request and the delivery of the document, a warning message is displayed. If the document was not received within the preset time-out threshold, a critical message is displayed indicating that immediate attention is required. Not all nodes indicate the same problem, because the observation nodes are connected to different nodes; each of the nodes is thus responsible for only a certain group of web applications or monitoring agents.
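The three alert categories described above can be illustrated with a minimal Python sketch. The function name classify_alert, the parameter names, and the numeric thresholds are assumptions chosen for illustration; the disclosure specifies only that a long delay yields a warning and that exceeding the preset time-out threshold yields a critical message.

```python
def classify_alert(delay_seconds, warning_after=2.0, timeout=10.0):
    """Map a request-to-delivery delay onto an alert category.

    delay_seconds is None when the document was never received.
    warning_after and timeout are hypothetical threshold values.
    """
    if delay_seconds is None or delay_seconds >= timeout:
        return "critical"   # not received within the time-out threshold
    if delay_seconds >= warning_after:
        return "warning"    # received, but after a long delay
    return "normal"
```

For example, under these assumed thresholds a delay of 3.0 seconds would be classified as a warning, whereas a request that never returns would be classified as critical.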
If an observation node, as shown in
It is possible to have multiple probable causes for an abnormal event. Depending on which node and which group of nodes have what kind of alert (such as critical or merely a warning as described above), the posterior probabilities of the probable causes can be computed based on the Bayesian network model to help fault isolation. For example, if a piece of hardware is slow, posterior probability might indicate how likely it will be for a particular web application to be slow or how likely a particular message is to occur. If a critical message is observed, it is possible to determine if there are problems with the related monitoring group.
Backward reasoning based on the Bayesian network model is used to diagnose which monitoring group has a problem. In the reasoning, partially observed evidence is combined with the prior knowledge about the system behavior. With the combination of the evidence and prior knowledge, the posterior probability can be computed based on probability theory. According to the updated belief expressed by the posterior probabilities, a determination can be made as to the most likely cause of the problem or failure. Software exists that provides standard algorithms to perform the reasoning task.
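The combination of prior knowledge and observed evidence can be illustrated with a simplified Python sketch of Bayes' rule. The sketch assumes that the candidate causes are mutually exclusive and exhaustive, which is a simplification and not the general inference performed over a full Bayesian network; the function name posterior, the cause names, and all probability values are hypothetical.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over mutually exclusive candidate causes.

    priors[c]      : P(cause c) before any evidence is observed
    likelihoods[c] : P(observed evidence | cause c)
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    evidence = sum(joint.values())  # P(observed evidence)
    if evidence == 0:
        raise ValueError("evidence has zero probability under every cause")
    return {c: j / evidence for c, j in joint.items()}

# Hypothetical numbers: probability of each cause, and probability
# that a critical alert is observed given that cause.
priors = {"host_slow": 0.1, "app_fault": 0.2, "no_fault": 0.7}
likelihoods = {"host_slow": 0.8, "app_fault": 0.5, "no_fault": 0.01}

beliefs = posterior(priors, likelihoods)
most_likely = max(beliefs, key=beliefs.get)
```

With these illustrative values, the updated beliefs rank the hypothetical app_fault cause highest. A reasoning engine for Bayesian networks would perform the corresponding computation over the entire graphical structure rather than over a flat list of causes.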
The relational databases, as discussed above, are comprised of multiple tables of data.
Any event that occurs in the system, such as the failure of a component on an aircraft, is recorded in an events log 30 illustrated in
From the data recorded in events log 30, the frequency with which events occur can be computed and used to estimate the probability distribution for the corresponding node. In other words, based on the observed data, a probability of the event reoccurring is computed. For example, in a web service domain, it can usually be estimated whether the Internet is slow or has heavy traffic. After the graphical structure is built and the probability distributions are obtained, the modeling process for a Bayesian network is complete. Then, using an available reasoning engine for the Bayesian network framework, intelligent reasoning based on the model can be performed.
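The frequency-based estimate described above can be illustrated with a short Python sketch. The record layout (a node name paired with an observed state), the function name estimate_distribution, and the sample events are assumptions made for illustration and do not reflect the actual schema of events log 30.

```python
from collections import Counter

def estimate_distribution(events, node, states):
    """Estimate P(state) for one node as a relative frequency.

    events is an iterable of (node_name, observed_state) pairs.
    """
    counts = Counter(state for name, state in events if name == node)
    total = sum(counts[s] for s in states)
    if total == 0:
        raise ValueError(f"no events recorded for node {node!r}")
    return {s: counts[s] / total for s in states}

# Hypothetical log: ten observations for one web application,
# plus an unrelated record for another node.
events = (
    [("web_app_1", "normal")] * 6
    + [("web_app_1", "warning")] * 3
    + [("web_app_1", "critical")] * 1
    + [("web_app_2", "normal")] * 4
)

dist = estimate_distribution(
    events, "web_app_1", ["normal", "warning", "critical"])
```

This sketch estimates only a marginal distribution for a single node. For a node having parents in the graphical structure, the same counting could be carried out separately for each combination of parent states to obtain a conditional probability table.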
The Bayesian network which is generated can display the columns of nodes in various colors to easily identify the type of node. For example, yellow could indicate hardware such as a computer, host or Internet. Red could indicate software, such as a web application or a server. Pink could indicate monitoring agents and green could indicate observations or messages.
Although the present invention has been described with reference to specific embodiments, these embodiments are illustrative only and not limiting. Many other applications and embodiments of the present invention will be apparent in light of this disclosure and the following claims.