1. Field of the Invention
This invention relates generally to a system and method for determining events in a system or process and, more particularly, to a system and method for predicting multiple faults in a system or process using a liquid state machine approach.
2. Discussion of the Related Art
Various types of systems, such as manufacturing processes, can employ many different machines operating in a variety of different manners. For some of these systems, it is critical that the system operate efficiently without interruption because failure of any part of the system may cause the whole system to go down, which could be costly. Because of this, there has been great effort in various industries to monitor certain systems in an attempt to predict failures and faults so that they may be more effectively handled prior to the failure actually occurring. For example, it is known to monitor various detectors and sensors in a system in an attempt to predict a failure of the detector or sensor before it occurs. However, given the vast number of inputs for such systems, little success in predicting faults and failures has been achieved.
Traditional approaches to fault prediction are capable of processing only single, and possibly uncorrelated, fault types. When these approaches are used to process more than one fault, they tend to provide less robust results because of the cross-talk between the various faults impinging on the network nodes. The fundamental reason for this is that the training regime typically used is based on back-propagating weight changes through the network, which is very susceptible to being trapped in local minima. Those systems that predict different faults independently do not exploit correlations and are too expensive to be used across entire data sets. In those processes that predict faults using correlating models, the execution time of the process grows either exponentially or geometrically, so they are only feasible when the number of faults to predict is low and there is a known correlation.
Fault occurrences in these types of systems are typically noisy and have a variable rate. Also, the fault occurrences have complex, non-linear dynamics that need to be uncovered for a robust prediction.
In accordance with the teachings of the present invention, a system and method are disclosed for determining events in a system or process, such as predicting fault events. The method includes providing data from the process, pre-processing the data and converting the data to one or more temporal spike trains having spike amplitudes and a spike train length. The spike trains are provided to a dynamical neural network operating as a liquid state machine that includes a plurality of neurons that analyze the spike trains. The dynamical neural network is trained by known data to identify events in the spike train, where the dynamical neural network then analyzes new data to identify events. Signals from the dynamical neural network are then provided to a readout network that decodes the states and predicts the future events.
Additional features of the present invention will become apparent from the following description and appended claims, taken in conjunction with the accompanying drawings.
The following discussion of the embodiments of the invention directed to a system and method for predicting multiple temporal events using a neural network and liquid state machine design is merely exemplary in nature, and is in no way intended to limit the invention or its applications or uses.
The present invention proposes a system and method for simultaneously predicting future occurrences of multiple fault events in a system or process, such as a production line or a manufacturing plant. The proposed approach derives its roots from spike train based neural networks and is robust and efficient in its predictions despite simultaneously modeling several faults. One example of a spike train based neural network is a liquid state machine (LSM) that uses an excitable medium, i.e., a liquid, to process temporal inputs in real-time, and simple readout units to extract temporal features from the medium and produce an estimation. While a traditional computation model relies on discrete states, such as 0/1 or on/off, that remain stable, the LSM uses continuous and transient states. The LSM functions like a body of liquid, where the inputs disturb the surface of the liquid to create unique ripples that propagate, interact and eventually fade away.
The present invention exploits the basic framework of dynamical neural networks, such as liquid state machines. Because the state of the dynamical neural network is a function of its past inputs, it is proposed that it is possible to exploit these dynamical states as a window into past events and use that information to predict or classify an impending occurrence of a future event. Furthermore, the state of the dynamical system is independent of the source from which the input was derived. Because the liquid medium of the dynamical neural network adjusts its state automatically as input events impinge upon it, a single dynamical neural network can also accept multiple series of input events. Thus, a single dynamical neural network can be used to process multiple faults simultaneously.
The present invention formulates the fault problem as a spike train based dynamic neural network. Particularly, the state of the dynamic neural network layer, composed of excitatory and inhibitory neurons, is changed due to inputs in the form of spike trains. An excitatory neuron adds signal strength to the neurons it is connected to and an inhibitory neuron attenuates signals. In one non-limiting embodiment, the neural network includes 80% excitatory neurons and 20% inhibitory neurons.
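For illustration purposes only, the non-limiting 80/20 mixture of excitatory and inhibitory neurons described above can be sketched as follows; the function name, neuron count and fixed seed are hypothetical and merely illustrative:

```python
import random

def build_population(n_neurons=100, excitatory_fraction=0.8, seed=0):
    """Assign each neuron a sign: +1 for excitatory neurons, which add
    signal strength to the neurons they connect to, and -1 for
    inhibitory neurons, which attenuate signals."""
    rng = random.Random(seed)
    n_exc = int(n_neurons * excitatory_fraction)
    signs = [+1] * n_exc + [-1] * (n_neurons - n_exc)
    rng.shuffle(signs)  # mix excitatory and inhibitory positions
    return signs

signs = build_population()
# With the defaults, 80 neurons are excitatory and 20 are inhibitory.
```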
The dynamically changing state of a network layer provides an image of the network state. This image can be of a snapshot of the network at a given time and is dependent on the history of the past spikes that impinged on the network. This image is a non-linear transformation of the input space. By training a simple one layer network on top of this dynamic network layer, it is possible to simultaneously classify and predict multiple faults at the same time in a very robust fashion.
The basic operation for processing multiple faults using dynamical neural networks is given as follows. First, the raw fault event data is preprocessed by sorting the raw fault events by fault-code and time. The events are then resampled and classified. The process then selects a spike train encoding scheme to encode temporal occurrences of faults, and determines an appropriate length of an event window, referred to as a spike train length. The process then generates a dynamical neural network, including generating a training set and a test set of spike trains. The readouts are then trained by applying a semi-supervised learning algorithm to the training data set, and the performance of the trained readouts on the test set data is then evaluated.
The encoding of the data from the various machines, detectors, sensors, etc. to generate the spike train set 32 can be performed using any suitable technique. For example, a space encoding technique can be employed where data classes are encoded with two binary digits. For example, for a four class problem, class 0 is encoded 00, class 1 is encoded 01, class 2 is encoded 10 and class 3 is encoded 11. The input events are encoded into two spike trains, a high digit train and a low digit train, and fed into the LSM on two input lines.
Also, a frequency-based encoding scheme can be employed where all of the spikes have the same magnitude. A weak stimulus is represented with a low frequency, i.e., a few spikes in a time interval, and a strong stimulus is represented with a high frequency, i.e., more spikes in the time interval.
Further, a class-based encoding scheme can be employed where the number of spikes in the corresponding interval in the spike train is decided based on the class to which the event belongs. The class to which each event belongs can be decided by several standard means including, among others, any variation of data-driven, model-driven or expert-driven clustering. An event in class 1 is encoded into one spike in the corresponding interval, an event in class 2 is encoded into two spikes in the corresponding interval, an event in class 3 is encoded into three spikes in the corresponding interval, etc.
Also, data-based encoding can be employed that maps the actual data, such as down time or frequency, of the event into the number of spikes from one spike to N spikes. For the mapping or scaling function, a square root function can be initially used, and later a log function can be used.
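The class-based and data-based encodings above can be sketched as follows; the function names and the cap of N spikes per interval are hypothetical:

```python
import math

def class_encode(class_id):
    """Class-based encoding: an event in class k is encoded into
    k spikes in the corresponding interval."""
    return class_id

def data_encode(value, max_spikes, scale="sqrt"):
    """Data-based encoding: map the actual data of the event, such as
    down time or frequency, to between 1 and N spikes. A square-root
    mapping can be used initially, and a log mapping later."""
    if scale == "sqrt":
        n = round(math.sqrt(value))
    else:
        n = round(math.log(value + 1))
    return max(1, min(max_spikes, n))

# A class-3 event yields three spikes; a down time of 49 units
# compresses to 7 spikes under the square-root mapping.
```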
The dynamical neural network has many adjustable parameters that will affect the performance and execution time of various applications. The neurons in the dynamical neural network have a refractory period where the neurons require time to recover after processing. In one embodiment, the interval for each event can be set to 25 ms and the refractory period can be set to 3 ms. Thus, each event interval can accept up to eight effective spikes. Among the various other parameters, the number of neurons and the ratio of excitatory to inhibitory neurons in the network are important. In one embodiment, 256 neurons can be employed and a 0.85 ratio of excitatory to inhibitory neurons can be used. Class accuracy is determined as the number of correct predictions divided by the number of test cases. The length of a spike train affects the performance of the system. Several variations of the spike train lengths can be tried. Each fault has different characteristics and shows peak performance on different spike train lengths. Thus, for this embodiment, there is no single optimal spike train length. It has to be estimated on a fault-by-fault or group-by-group basis.
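The relation between the event interval, the refractory period and the number of effective spikes per interval in the embodiment above is simple integer arithmetic:

```python
INTERVAL_MS = 25    # event interval in this embodiment
REFRACTORY_MS = 3   # recovery time a neuron needs after processing

# A neuron can register at most one spike per refractory period,
# so each event interval accepts up to floor(25 / 3) = 8 effective spikes.
max_effective_spikes = INTERVAL_MS // REFRACTORY_MS
```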
Each readout monitors the dynamical network states and generates its estimation. The class that corresponds to the readout with the highest value is chosen as the predicted class. Machines are seldom down in a manufacturing plant, and they are rarely down for a long time. This implies that the data distribution across the various classes is different. The number of events in the no event class is very large and the number of events in a large class is very small, so there is a large bias in the data set. In one embodiment, the number of cases in each class is counted, and the minimum number is determined. The minimum number is usually small. It is not appropriate to select the same minimum number of cases from all of the classes because that may abandon a lot of useful data in the other classes. Based on the minimum number, the maximum number of cases that will be included in the spike train data set for all classes is set. When the number of cases in a class is larger than this maximum, only a selected subset of the cases is included.

Some of the neurons 58 are selected as input neurons that receive the spike train data. Which of the other neurons 58 the input neurons are connected to determines which neurons are fired. For example, when the input neurons 64 and 66 receive a spike from the spike trains 54, they send those spikes to the neurons 58 to which they are coupled. If a neuron receives enough spikes from other neurons that the combination of the spikes exceeds a threshold, then that neuron will fire and provide a spike to the neurons to which it is coupled. Every one of the neurons 58 in the network 56 is coupled to each of the readout neurons 62.
The system 50 shows that the algorithm scales linearly in computation time with an increase in the number of faults. Normally for this kind of problem, the computation time increases exponentially given the event cross-correlation. The algorithm also shows that false alarms can be decreased, and accuracy increased, when the LSM is exposed to more fault data from the same operation. Also, by simultaneously processing multiple faults, the LSM reduces the false alarm rate as more faults are modeled because it is able to extract new correlations among the faults, thereby improving its ability to make accurate predictions.
The LSM is approximately linear in computation time with respect to the number of input variables. The event detection accuracy of the LSM is not significantly affected when the number of faults processed increases. Further, the false alarm rate of the LSM remains relatively low and constant when the number of faults processed increases. Also, the LSM is a feasible alternative for heterogeneous multi-variable prediction. Heterogeneous variables are, for example, combinations of discrete/continuous data, periodic/aperiodic signals and symbolic/numeric qualifiers/quantifiers.
The foregoing discussion discloses and describes merely exemplary embodiments of the present invention. One skilled in the art will readily recognize from such discussion and from the accompanying drawings and claims that various changes, modifications and variations can be made therein without departing from the spirit and scope of the invention as defined in the following claims.