1. Field of the Disclosed Subject Matter
The present disclosed subject matter relates to detecting, identifying and diagnosing fault events in an industrial plant, such as a refinery or petrochemical plant.
2. Description of Related Art
Conventional techniques for event detection include heuristic data-driven approaches, such as Principal Component Analysis (PCA) and parity space approaches, which develop detection models only based on statistics obtained during normal system operation. PCA based event detection generally defines normal operations based on historical relationships between measurements and determines that an event occurred when the deviation from the normal behavior crosses a user-defined limit. With respect to diagnosis, when an event is detected, the PCA model can attribute the most frequent causes to the sensor(s) most strongly correlated with certain loading vectors contributing to the detected deviation metric, and a human operator can then further diagnose and correct the situation based on prior experience.
Building such PCA models can require a large number of man-hours to screen the data to be utilized for the model, as well as to manually diagnose the causes of events when they occur. Additionally, the PCA models are generally determined by normal conditions and have low sensitivity due at least in part to not being specific to the emerging fault conditions. Furthermore, such models require additional efforts to “fine-tune” the models to suppress or eliminate false positive alerts. In addition, such models may need to be re-built each time there is a change to the equipment or control structure of the system being monitored. Furthermore, the PCA model output generally allows for relatively poor interpretation of faults, at least in part because the technique provides no direct correspondence to physical sensor variables or operational modes. The PCA model output also typically does not provide a suitable diagnostic function, at least in part because such techniques do not include an optimal estimator or classifier.
As such, there remains a need for improved systems and techniques for detecting, identifying and diagnosing fault events in an industrial plant.
The purpose and advantages of the disclosed subject matter will be set forth in and apparent from the description that follows, as well as will be learned by practice of the disclosed subject matter. Additional advantages of the disclosed subject matter will be realized and attained by the methods and systems particularly pointed out in the written description and claims hereof, as well as from the appended drawings.
To achieve these and other advantages and in accordance with the purpose of the disclosed subject matter, as embodied and broadly described, the disclosed subject matter includes techniques for detection of event conditions in an industrial plant. An exemplary technique includes receiving process data corresponding to one or more sensors, estimating normal statistics from the process data associated with normal operation of one or more components corresponding to the one or more sensors, estimating abnormal statistics from the process data with potentially abnormal operation of the one or more components, determining a fault model from the estimated normal and abnormal statistics, the fault model including a learning matrix, one or more fault indices indicating a likelihood of an occurrence of one or more fault events, and a fault threshold corresponding to the one or more sensors, receiving the one or more fault indices, the fault threshold, and further process data from the one or more sensors, determining one or more further fault indices from the further process data, applying the fault threshold to the one or more further fault indices, and indicating a further occurrence of the one or more fault events when a magnitude of the one or more further fault indices exceeds the fault threshold corresponding to the one or more sensors.
For example and as embodied here, estimating the abnormal statistics can include performing a minimum mean squared error (MMSE) fault estimate on the process data. Determining the one or more further fault indices can include performing one or more of Neyman-Pearson Hypothesis testing and generalized likelihood ratio testing (GLRT) on the further process data.
Furthermore, and as embodied here, the technique can include dynamically adjusting the fault model using the further process data. Dynamically adjusting the fault model can include continuously updating the learning matrix based on updated estimates of the normal statistics and the abnormal statistics. Additionally or alternatively, dynamically adjusting the fault model can include adjusting the fault threshold using the one or more further fault indices associated with normal and abnormal segments of the further process data received over a predetermined time window.
Additionally, and as embodied here, the fault model can include a fault sensor map to relate the one or more sensors to the one or more components, and in some embodiments, the technique can further include, when the fault event is indicated, determining a faulty component corresponding to the at least one of the one or more sensors. The fault model can further include a fault dictionary stored in a database or a memory to relate patterns of the determined faulty components to the one or more fault events and a label having an operational meaning.
In some embodiments, the fault model can further include a root cause map to relate first sensor conditions corresponding to a first fault event of a first component to second sensor conditions corresponding to a second fault event of a second component, and the technique can further include determining a faulty system or group of systems corresponding to the related first and second sensor conditions. The technique can further include partitioning the one or more sensors based at least in part on a statistical dependence among the one or more sensors from a corresponding type of measurement performed. Additionally or alternatively, the technique can include partitioning the one or more sensors by a statistical and dynamical characterization of the one or more fault events.
According to another aspect of the disclosed subject matter, techniques for identification of event conditions in an industrial plant are provided. An exemplary technique includes receiving process data corresponding to one or more sensors, estimating normal statistics from the process data associated with normal operation of one or more components corresponding to the one or more sensors, estimating abnormal statistics from the process data with potentially abnormal operation of the one or more components, determining a fault model from the estimated normal and abnormal statistics, the fault model including a learning matrix, one or more fault indices indicating a likelihood of an occurrence of one or more fault events, and a fault threshold corresponding to the one or more sensors, receiving the one or more fault indices, the fault threshold, and further process data from the one or more sensors, determining one or more further fault indices from the further process data, applying the fault threshold to the one or more further fault indices, indicating a further occurrence of the one or more fault events when a magnitude of the one or more further fault indices exceeds the fault threshold corresponding to the one or more sensors, relating the one or more components to the one or more sensors exceeding the corresponding fault threshold, and identifying a type of the fault event based on the relation of the one or more components to the one or more sensors exceeding the corresponding fault threshold.
For example and as embodied here, estimating the abnormal statistics can include performing a minimum mean squared error (MMSE) fault estimate on the process data. Determining the one or more further fault indices can include performing one or more of Neyman-Pearson Hypothesis testing and generalized likelihood ratio testing (GLRT) on the further process data.
Furthermore, and as embodied here, the technique can include dynamically adjusting the fault model using the further process data. Dynamically adjusting the fault model can include continuously updating the learning matrix based on updated estimates of the normal statistics and the abnormal statistics. Additionally or alternatively, dynamically adjusting the fault model can include adjusting the fault threshold using the one or more further fault indices associated with normal and abnormal segments of the further process data received over a predetermined time window.
Additionally, and as embodied here, the fault model can include a fault sensor map to relate the one or more sensors to the one or more components, and in some embodiments, the technique can further include, when the fault event is indicated, determining a faulty component corresponding to the at least one of the one or more sensors. The fault model can further include a fault dictionary stored in a database or a memory to relate patterns of the determined faulty components to the one or more fault events and a label having an operational meaning.
In some embodiments, the fault model can further include a root cause map to relate first sensor conditions corresponding to a first fault event of a first component to second sensor conditions corresponding to a second fault event of a second component, and the technique can further include determining a faulty system or group of systems corresponding to the related first and second sensor conditions. The technique can further include partitioning the one or more sensors based at least in part on a statistical dependence among the one or more sensors from a corresponding type of measurement performed. Additionally or alternatively, the technique can include partitioning the one or more sensors by a statistical and dynamical characterization of the one or more fault events.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and are intended to provide further explanation of the disclosed subject matter claimed.
The accompanying drawings, which are incorporated in and constitute part of this specification, are included to illustrate and provide a further understanding of the disclosed subject matter. Together with the description, the drawings serve to explain the principles of the disclosed subject matter.
Reference will now be made in detail to the various exemplary embodiments of the disclosed subject matter, exemplary embodiments of which are illustrated in the accompanying drawings. The structure and corresponding techniques of the disclosed subject matter will be described in conjunction with the detailed description of the system.
The apparatus and methods presented herein can be used for event detection and/or diagnosis in any of a variety of suitable industrial systems, including, but not limited to, processing systems utilized in refineries, petrochemical plants, polymerization plants, gas utility plants, liquefied natural gas (LNG) plants, volatile organic compounds processing systems, liquefied carbon dioxide processing plants, and pharmaceutical plants. For purpose of illustration only and not limitation, and as embodied here, the systems and techniques presented herein can be utilized to identify and diagnose fault events in a refinery or petrochemical plant.
In accordance with one aspect of the disclosed subject matter herein, exemplary techniques for detecting, identifying and diagnosing fault events in an industrial plant generally include receiving process data corresponding to one or more sensors. Normal statistics are estimated from the process data associated with normal operation of one or more components corresponding to the one or more sensors. Abnormal statistics are estimated from the process data with potentially abnormal operation of the one or more components. A fault model is determined from the estimated normal and abnormal statistics, and the fault model includes a learning matrix, one or more fault indices indicating a likelihood of an occurrence of one or more fault events, and a fault threshold corresponding to the one or more sensors. The one or more fault indices, the fault threshold, and further process data from the one or more sensors are received. One or more further fault indices are determined from the further process data. The fault threshold is applied to the one or more further fault indices. A further occurrence of the one or more fault events is indicated when a magnitude of the one or more further fault indices exceeds the fault threshold corresponding to the one or more sensors.
The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views, serve to further illustrate various embodiments and to explain various principles and advantages all in accordance with the disclosed subject matter. For purpose of explanation and illustration, and not limitation, exemplary systems and techniques for identifying and diagnosing fault events in an industrial plant in accordance with the disclosed subject matter are shown in
According to one aspect of the disclosed subject matter, with reference to
A detection processor 112 can receive the fault estimate 104 from the learning matrix 102. The detection processor can perform one or more fault event detection techniques, which can include, for example and without limitation, binary hypothesis testing, described as follows. Additionally or alternatively, a fault analysis processor 114 can perform identification and/or diagnosis, for example by mapping fault sensors corresponding to one or more fault events. As a further alternative, a root cause analysis processor 116 can perform root cause analysis of the fault, for example by temporal and/or spatial mapping of the components corresponding to one or more fault events, as discussed further herein.
For purpose of illustration, and as embodied herein, event detection can include binary hypothesis testing. For example, measurement data y[n] can be received, and observation models for normal and fault event hypotheses, respectively represented as H0 and H1, can be utilized as follows:
H0:y[n]=x[n] (1)
H1:y[n]=x[n]+f[n] (2)
As such, n can represent a time index, and x[n] and f[n] can represent the normal process data and the process data associated with one or more fault events, respectively. In some embodiments, for fault diagnosis among several different types of faulty events, the binary hypothesis framework described here can be generalized to multiple hypothesis testing with Hj for each jth type of fault.
Furthermore, and as embodied here, hypothesis testing can be performed according to a Neyman-Pearson hypothesis test, which can provide an improved or optimal detection probability at a given false positive rate. Additionally or alternatively, other suitable hypothesis tests can be performed, including and without limitation a Bayesian criterion test, which can reduce or minimize decision error for known prior data of Hj. For purpose of illustration and not limitation, and as embodied here, the Neyman-Pearson hypothesis test can be represented by following likelihood ratio testing at each time instant:
$$L(y) = \frac{p(y \mid H_1)}{p(y \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} r \tag{3}$$
p(y|H0) and p(y|H1) can represent a likelihood function associated with each hypothesis, L(y) can represent a likelihood ratio, and r can represent a threshold value. The threshold value r can be chosen based at least in part on a desired balance between the resulting detection rate and false alarm rate of the fault detection. That is, increased values of r can reduce false positive rates but can also reduce detection probability, and reduced values of r can increase detection probability but can also increase false positives. For example, and with reference to
With further reference to
The detection probability and false positive rates can be represented as
$$P_d = p(L(y) > r \mid H_1), \tag{4}$$
and
$$P_f = p(L(y) > r \mid H_0), \tag{5}$$
respectively. Generally, the detection probability and false positive rate can be considered universal, that is, not specific to particular probability distributions of x, y, and f, and can be specialized and simplified to particular forms when x and f assume certain statistical models, such as Gaussian regression models and dynamic state-space models.
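The trade-off between the detection probability and the false positive rate described above can be illustrated with a scalar example. The following is a minimal sketch, not the disclosed implementation: it assumes, purely for illustration, that y is distributed as N(0, 1) under H0 and as N(shift, 1) under H1 with shift > 0, in which case L(y) > r is equivalent to y > ln(r)/shift + shift/2. The function names and the numeric values are hypothetical.

```python
import math


def gaussian_tail(z):
    """Probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def detection_and_false_positive(r, shift):
    """Return (P_d, P_f) for the likelihood ratio test L(y) > r.

    Assumed model (illustrative only): H0: y ~ N(0, 1), H1: y ~ N(shift, 1).
    For this model L(y) = exp(shift * y - shift**2 / 2), so the test on L(y)
    is a test of y against the equivalent level t computed below.
    """
    t = math.log(r) / shift + shift / 2.0
    p_d = gaussian_tail(t - shift)  # p(L(y) > r | H1)
    p_f = gaussian_tail(t)          # p(L(y) > r | H0)
    return p_d, p_f


# Sweep the threshold: larger r lowers both P_f and P_d.
for r in (0.5, 1.0, 2.0, 8.0):
    p_d, p_f = detection_and_false_positive(r, shift=2.0)
    print(f"r={r:4.1f}  P_d={p_d:.4f}  P_f={p_f:.4f}")
```

Sweeping r in this way traces out one operating curve; a desired false positive rate then fixes r, and the corresponding detection probability follows from the assumed model.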
For example, and as embodied here, x and f can be represented as a Gaussian model, and as such, the log of the likelihood ratio, denoted as LL(y), can be represented as a function of a minimum mean squared error (MMSE) estimate of the faulty component, $\hat{f}[n]$. That is, LL(y) can be represented as
$$LL(y[n]) = g(y[n], \hat{f}[n]) = y^t[n]\, Q_y^{-1} \mu_f + y^t[n]\, Q_x^{-1} \hat{f}[n], \tag{6}$$
and the MMSE fault estimate $\hat{f}[n]$ can be represented as
$$\hat{f}[n] = \mu_f + Q_f Q_y^{-1} (y[n] - \mu_y) \tag{7}$$
where $Q_f$, $P_x = Q_x^{-1}$, and $P_y = Q_y^{-1}$ can represent a covariance matrix of the estimated process data associated with a fault event f[n], the inverse covariance of the estimated normal process data x[n], and the inverse covariance of the observed process data y[n], respectively, and $\mu_f$ and $\mu_y$ can represent the mean of the potential fault event data and the input process data, respectively. For purpose of illustration, the exemplary result described here represents estimated normal process data x[n] having a zero mean, and thus $\mu_f$ can equal $\mu_y$, for example according to eq. (2). However, it is understood that the results herein can be extended to estimated normal process data x[n] having a non-zero mean.
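For illustration only, eqs. (6) and (7) can be evaluated numerically as in the following sketch. The function names, the two-sensor example values, and the use of NumPy are assumptions introduced here; the example additionally assumes the fault data is uncorrelated with the normal data so that the observed covariance is the sum of the normal and fault covariances.

```python
import numpy as np


def mmse_fault_estimate(y, mu_f, mu_y, Q_f, Q_y):
    """Eq. (7): f_hat = mu_f + Q_f Q_y^{-1} (y - mu_y)."""
    return mu_f + Q_f @ np.linalg.solve(Q_y, y - mu_y)


def log_likelihood_ratio(y, f_hat, mu_f, Q_x, Q_y):
    """Eq. (6): y^t Q_y^{-1} mu_f + y^t Q_x^{-1} f_hat."""
    return y @ np.linalg.solve(Q_y, mu_f) + y @ np.linalg.solve(Q_x, f_hat)


# Hypothetical two-sensor example: the fault affects only the first sensor.
Q_x = np.eye(2)                # normal covariance (assumed)
Q_f = np.diag([4.0, 0.0])      # fault covariance (assumed)
Q_y = Q_x + Q_f                # observed covariance for uncorrelated x and f
mu_f = np.array([1.0, 0.0])    # fault mean; equals mu_y for zero-mean x
mu_y = mu_f.copy()
y = np.array([6.0, 0.5])       # one observation

f_hat = mmse_fault_estimate(y, mu_f, mu_y, Q_f, Q_y)
ll = log_likelihood_ratio(y, f_hat, mu_f, Q_x, Q_y)
print(f_hat, ll)
```

Using `np.linalg.solve` rather than forming explicit inverses is a numerical design choice of this sketch; it requires the covariance matrices to be non-singular.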
As described herein, both the log likelihood ratio LL(y) and the MMSE fault estimate $\hat{f}[n]$ can be determined by utilizing $Q_f$, $P_x$, $P_y$ and $\mu_f$. Furthermore, in operation, the observed process data y[n] can be obtained as a stream of measurement data received from the one or more sensors of the industrial plant. As such, $Q_f$, $P_x$, $P_y$ and $\mu_f$ can be estimated from the observed process data y[n]. For example, and as embodied herein, the normal process data y[n] can be represented as a multivariate time series, and as such, the covariance can be approximated by a sample covariance matrix estimated over K sample points, which can be represented as
$$\hat{Q}_y[n] = \frac{1}{K} \sum_{i=n-K+1}^{n} y[i]\, y^t[i] \tag{8}$$
The inverse covariance $P_y$ can be estimated as the inverse of $\hat{Q}_y$. Additionally, and as embodied herein, various constrained inverses can be used to obtain $P_y$ from $\hat{Q}_y$, as discussed further herein below.
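The windowed estimate of eq. (8) can be sketched as follows. This is a minimal illustration under stated assumptions: the samples are stored as rows of an array `Y`, the function name is hypothetical, and, as in eq. (8), no mean is subtracted inside the window.

```python
import numpy as np


def windowed_covariance(Y, n, K):
    """Eq. (8): (1/K) * sum over i = n-K+1 .. n of y[i] y[i]^t.

    Y has shape (N, d), with row i holding the sample y[i].
    """
    if n - K + 1 < 0 or n >= len(Y):
        raise ValueError("window [n-K+1, n] must lie inside the data")
    window = Y[n - K + 1 : n + 1]
    return window.T @ window / K


# Hypothetical data: four samples from two sensors.
Y = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0], [2.0, 0.0]])
Q_hat = windowed_covariance(Y, n=3, K=2)
print(Q_hat)
```

With K smaller than the number of sensors the estimate is necessarily singular, which is one reason a plain inverse may not be available.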
The fault event covariance matrix $Q_f$ can be estimated from the received streaming data and the updated estimate of the normal statistics. For purpose of illustration, the faulty component data can be uncorrelated with the normal process data, and $Q_f$ can be determined as the difference between $\hat{Q}_y$ and the normal covariance estimate $\hat{Q}_x$, and can thus be represented as
$$\hat{Q}_f[n] = \hat{Q}_y[n] - \hat{Q}_x[n]. \tag{9}$$
Symmetric non-negativity can be provided by projecting the resulting covariance estimate onto a positive convex space.
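One way to realize the projection step is sketched below. It is an assumption of this sketch that the "positive convex space" is the cone of symmetric positive semidefinite matrices, and that the projection is done by symmetrizing the difference of eq. (9) and clipping negative eigenvalues to zero; the source does not specify the projection method, and the function name is hypothetical.

```python
import numpy as np


def fault_covariance(Q_y, Q_x):
    """Eq. (9) followed by a projection onto symmetric PSD matrices.

    The eigenvalue clipping used here is an assumed choice of projection.
    """
    diff = Q_y - Q_x
    sym = 0.5 * (diff + diff.T)             # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)  # eigendecomposition of symmetric matrix
    clipped = np.clip(eigvals, 0.0, None)   # drop negative eigenvalues
    return (eigvecs * clipped) @ eigvecs.T


# Hypothetical estimates in which the raw difference is indefinite.
Q_y_hat = np.diag([3.0, 1.0])
Q_x_hat = np.diag([1.0, 2.0])
Q_f_hat = fault_covariance(Q_y_hat, Q_x_hat)
print(Q_f_hat)
```

Clipping eigenvalues yields the closest positive semidefinite matrix in the Frobenius norm to the symmetrized difference, which is why it is a common default for this kind of repair.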
The normal covariance $\hat{Q}_x[n]$ can be calculated from a predetermined set of historical process data known to be normal. Additionally or alternatively, the normal covariance $\hat{Q}_x[n]$ can be updated from the stream of measurement data received from the one or more sensors of the industrial plant during one or more periods when no fault is detected. As a further alternative, which can be used for example to obtain an initial estimate, $\hat{Q}_x[n]$ can be obtained by averaging process data y[n] over a suitably long period of time such that the time duration of fault events becomes negligible compared to the total time duration. Furthermore, the inverse of $\hat{Q}_x[n]$, represented as $\hat{P}_x$, can be estimated as described further herein below.
The mean of the potential fault event data μf can be estimated by mean-centering the process data to remove the normal process mean level and determining a local running average of the mean-centered process data. Additionally, and as embodied herein, the estimated normal process data and the measured process data can be updated, for example, using a moving average of the measured process data over a predetermined time window. Additionally or alternatively, the estimated normal process data and the measured process data can be updated using dynamic models of both the estimated normal process data x[n] and the estimated fault event process data f[n]. For example, dynamic models including state-space models can be constructed for x[n] utilizing both first principle models and recent process data cleared of faulty events, and can be represented as
x[n+1]=Ax[n]+Bu[n]+w[n] (10)
where the model coefficients A and B can be fitted or calibrated against the recent normal process data and used for updating the normal statistics. For the fault event data f[n], heuristic statistical state-space models corresponding to the dynamics of the data can be used.
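As an illustration of fitting the coefficients A and B of eq. (10) against recent normal data, the following sketch uses ordinary least squares. The choice of least squares, the function name, and the simulated data are assumptions of this sketch; here u[n] is treated as a known input sequence and the noise term w[n] is set to zero so that the fit can be checked against the values used to generate the data.

```python
import numpy as np


def fit_state_space(X, U):
    """Least-squares fit of x[n+1] ~ A x[n] + B u[n].

    X has shape (N, d) and U has shape (N, m); rows are time samples.
    """
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)
    regressors = np.hstack([X[:-1], U[:-1]])  # rows are [x[n], u[n]]
    targets = X[1:]                           # rows are x[n+1]
    theta, *_ = np.linalg.lstsq(regressors, targets, rcond=None)
    d = X.shape[1]
    return theta[:d].T, theta[d:].T           # A is (d, d), B is (d, m)


# Simulate noise-free data from hypothetical coefficients.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[1.0], [0.5]])
U = rng.normal(size=(60, 1))
X = np.zeros((60, 2))
for n in range(59):
    X[n + 1] = A_true @ X[n] + B_true @ U[n]

A_fit, B_fit = fit_state_space(X, U)
print(A_fit, B_fit)
```

With process noise present the same fit returns estimates rather than exact values, and the residuals give a starting point for the noise statistics.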
As such, $Q_f$, $P_x$, $P_y$ and $\mu_f$ can be replaced by corresponding estimates $\hat{Q}_f$, $\hat{P}_x$, $\hat{P}_y$, and $\hat{\mu}_f$, respectively, and the log likelihood ratio of eq. (6) in the Neyman-Pearson detector can thus be determined as
$$LL_g(y[n]) = g(y[n], \hat{f}[n]) = y^t[n]\, \hat{P}_y[n]\, \hat{\mu}_f[n] + y^t[n]\, \hat{P}_x[n]\, \hat{f}[n], \tag{11}$$
which can represent the generalized log likelihood ratio used in generalized likelihood ratio testing (GLRT), and the MMSE fault estimate can be represented as
$$\hat{f}[n] = \hat{\mu}_f + \hat{Q}_f[n]\, \hat{P}_y[n]\, (y[n] - \hat{\mu}_f). \tag{12}$$
As discussed herein, $Q_f$, $P_x$, $P_y$ and $\mu_f$ can be utilized to determine the generalized likelihood ratio test (GLRT) of eq. (11) and the MMSE fault estimation in eq. (12). However, estimating $P_y$ and $P_x$ as the inverse of $\hat{Q}_y$ and $\hat{Q}_x$, i.e., the sample covariance of y[n] and x[n], respectively, can be challenging when $\hat{Q}_y$ or $\hat{Q}_x$ is singular, which can occur, for example, due at least in part to insufficient data samples and/or cross-correlation among different element variables of y[n] or x[n]. As such, estimation of $P_y$ from $\hat{Q}_y$ can be regularized as
$$\hat{P}_y = \arg\min_{P \succ 0} \; -\log\det(P) + \mathrm{tr}(P \hat{Q}_y) + \lambda \|P\|_{\eta} \tag{13}$$
where $\|P\|_{\eta}$ is a matrix norm of P, which can be, for example and without limitation, the l1 norm of P when η=1. Such a norm can penalize the absolute sum over all entries of P and thus can enhance sparsity. λ can represent a weighting factor on the regularization term. For example and without limitation, when λ equals 0, eq. (13) reduces to the maximum-likelihood estimate of P, and as λ increases, the solution for P can become more sparse. Although a closed-form solution to eq. (13) can be unavailable, eq. (13) can nevertheless be solved, for example and without limitation, using a graphical lasso technique, which can include one or more variants, such as exact covariance thresholding based accelerated graphical lasso. Similar techniques can be applied to obtain $P_x$ from $\hat{Q}_x$.
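As a small worked check of the unregularized case, and assuming the sample covariance of y[n] is invertible (an assumption that fails in exactly the singular situations that motivate the regularization), setting the gradient of the objective of eq. (13) with λ=0 to zero recovers the plain inverse of the sample covariance. The symbol J is introduced here only for this derivation.

```latex
\begin{aligned}
J(P) &= -\log\det(P) + \operatorname{tr}\!\left(P \hat{Q}_y\right), \\
\nabla_P J(P) &= -P^{-1} + \hat{Q}_y = 0
\;\Longrightarrow\; \hat{P}_y = \hat{Q}_y^{-1}.
\end{aligned}
```

For λ greater than 0 the penalty term is not differentiable at zero entries, which is why an iterative solver such as the graphical lasso is used instead of this closed form.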
With reference now to
For purpose of illustration, and as embodied herein, one or more time window buffers can be utilized to collect the fault index values associated with recent normal and fault data, and can be updated as new data is processed. In this manner, the threshold level can be chosen such that a desired false positive rate and detection probability can be met using the fault indices from both buffers. Additionally or alternatively, the threshold level can be determined using metric minimization, such as linear discriminant analysis (LDA). The determined threshold level can be further smoothed to improve robustness against outliers. Such adaptive thresholding techniques can be performed automatically or, if desired, can be tunable to incorporate operator inputs. In operation, real process data can be subject to drifting or dynamic change. As such, the adaptive thresholding techniques described herein can provide suitable desired detection performance according to the recent process characteristics, which can improve the performance and usability of the detector.
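The buffer-based adaptive thresholding described above can be sketched as follows. This is a hedged illustration rather than the disclosed implementation: the class name, the default parameters, the choice of an empirical quantile of the normal buffer to meet a target false positive rate, and the exponential smoothing rule are all assumptions made for the example.

```python
from collections import deque

import numpy as np


class AdaptiveThreshold:
    """Keeps recent fault indices in two buffers and adapts a threshold."""

    def __init__(self, window=200, target_fpr=0.05, smoothing=0.2):
        self.normal = deque(maxlen=window)  # indices from recent normal data
        self.fault = deque(maxlen=window)   # indices from recent fault data
        self.target_fpr = target_fpr
        self.smoothing = smoothing
        self.threshold = None

    def add(self, fault_index, is_fault):
        """Store a new fault index in the matching buffer."""
        (self.fault if is_fault else self.normal).append(fault_index)

    def update(self):
        """Recompute the threshold from the normal buffer and smooth it."""
        if not self.normal:
            return self.threshold
        raw = float(np.quantile(list(self.normal), 1.0 - self.target_fpr))
        if self.threshold is None:
            self.threshold = raw
        else:
            self.threshold += self.smoothing * (raw - self.threshold)
        return self.threshold

    def detection_rate(self):
        """Fraction of buffered fault indices above the current threshold."""
        if not self.fault or self.threshold is None:
            return None
        return float(np.mean(np.array(self.fault) > self.threshold))
```

In use, each new fault index would be added to one buffer according to the current detection decision or an operator label, and `update()` would be called periodically; `detection_rate()` lets the target false positive rate be tuned against the detection probability seen on the fault buffer.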
With reference now to
As shown in
Referring now to
In
For example and without limitation, and as embodied herein, inverse covariance estimation can be performed according to eq. (13), as discussed above. Furthermore, inverse covariance estimation in eq. (13) with η=1 can be referred to as a covariance selection problem, and can be related to the Gaussian Graphical model (GGM) representation of the multivariate sample data. An undirected graph G can be represented by a collection of nodes and the edges connecting the nodes, which can be represented as G=(V, E), where V, E can represent the set of nodes and edge coefficients respectively. In GGM the set of nodes V can be considered as the set of variables (i.e., tags) in the data and the edge coefficients E can be determined by the inverse covariance matrix of the data, e.g., Py for y[n], as described herein. The connection between the nodes can have a statistical meaning. That is, the connection between the nodes can correspond to the conditional independence between nodes or variables. For example, unconnected nodes or variables can be considered conditionally independent, while connected nodes or variables can be considered dependent on each other.
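The relationship between the inverse covariance matrix and the graph can be sketched as follows. The tag names, the tolerance used to decide that an entry is zero, and the grouping of tags by connected components are assumptions of this illustration; the function names are hypothetical.

```python
import numpy as np


def ggm_edges(P, tags, tol=1e-8):
    """List the tag pairs whose off-diagonal precision entry is nonzero."""
    d = len(tags)
    return [(tags[i], tags[j])
            for i in range(d) for j in range(i + 1, d)
            if abs(P[i, j]) > tol]


def connected_groups(P, tags, tol=1e-8):
    """Group tags that are linked, directly or indirectly, by graph edges."""
    d = len(tags)
    adjacency = np.abs(P) > tol
    seen, groups = set(), []
    for start in range(d):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(tags[i])
            for j in range(d):
                if adjacency[i, j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Hypothetical sparse inverse covariance over four tags.
P = np.array([[2.0, -1.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
tags = ["A", "B", "C", "D"]
print(ggm_edges(P, tags))
print(connected_groups(P, tags))
```

In this example the zero entries between the pair A, B and the pair C, D mean that no edge joins the two groups, so each group can be examined separately.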
Furthermore, and as embodied herein, Py can be determined as described herein, for example for calculating the Neyman-Pearson hypothesis test and the MMSE fault estimator. Accordingly, the same Py can be utilized to directly determine the GGM graph structure of the process data. For purpose of illustration,
In operation, for example in a relatively large-scale plant or production unit, the number of tag variables can be on the order of thousands. Nevertheless, a fault event, at least in an early stage, typically occurs at a local node before propagating to other nodes. As a result, a graph such as the GGM representation of
As a further example, as illustrated in
Referring now to
For purpose of illustration and without limitation, and as embodied herein, the sequence of MMSE fault estimates $\hat{f}[n]$ calculated according to eq. (12) can be utilized to determine the faulty components corresponding to each tag variable as a function of time. In such a calculation, according to the disclosed subject matter, the mean squared error can be reduced or minimized. For example and as embodied herein, a database of estimated faults and corresponding fault labels can be represented as $Lib(\{f_i, s_i\})$, where $f_i$ can represent the ith estimated fault data and $s_i$ can represent an annotated fault label corresponding to the estimated fault data. The annotated fault label can be an operationally meaningful label, for example a textual or graphical label denoting that the fault corresponds to flooding or partial burning of a faulty component. As such, a newly detected and estimated fault can be represented as $f_n$, and classification of the fault $f_n$ can be performed. That is, the annotated label of the fault $f_n$ can be represented as
$$s_n = D(f_n, Lib(\{f_i, s_i\})) \tag{14}$$
$D(f_n, Lib(\{f_i, s_i\}))$ can represent the classification map function, which can be obtained in various ways. For example and without limitation, the classification map function can be obtained by unsupervised techniques, such as clustering or metric learning. Additionally or alternatively, the classification map function can be obtained by supervised techniques, such as by a support vector machine (SVM) technique.
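The classification map of eq. (14) can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-neighbor rule under the Euclidean distance, which is a simplification chosen here for brevity and is not one of the techniques named above; the function name and the library vectors are hypothetical, while the two labels are the examples given in the description.

```python
import numpy as np


def classify_fault(f_new, library):
    """Return the label of the library fault closest to f_new.

    library is a list of (fault_vector, label) pairs, playing the role of
    Lib({f_i, s_i}); nearest neighbor is an assumed stand-in for D.
    """
    distances = [np.linalg.norm(f_new - f_i) for f_i, _ in library]
    return library[int(np.argmin(distances))][1]


# Hypothetical library of previously estimated and annotated faults.
library = [
    (np.array([5.0, 0.0, 0.0]), "flooding"),
    (np.array([0.0, 3.0, 3.0]), "partial burning"),
]
f_n = np.array([4.2, 0.3, -0.1])
print(classify_fault(f_n, library))
```

A learned metric or an SVM, as named in the description, would replace the distance computation while keeping the same interface of a fault estimate in and a label out.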
Referring now to
Referring now to
In some embodiments, at 153, historical process data can be utilized to determine initial values for the covariance estimates $\hat{Q}_x$ and the threshold value r.
At 154, the estimated statistics of normal data and fault data can be updated from the recent process data and any new data received, and the covariance estimates $\hat{Q}_x$ and $\hat{Q}_y$ can be determined as described herein. At 155, fault estimation can be performed using the updated statistics. For example, the MMSE estimate of a potential faulty component $\hat{f}[n]$ can be determined and used to test the likelihood ratio L(y).
At 156, fault detection can be performed. For example, the log likelihood ratio LL(y) can be compared to the threshold r to determine the existence of a fault event, as described herein. Furthermore, in some embodiments, the threshold value r can be chosen based on recent process data to achieve a desired balance between the resulting detection rate and false alarm rate.
At 157, fault isolation and/or diagnosis can be performed. For example, as described herein, the MMSE estimate of the faulty component $\hat{f}[n]$ can be utilized to determine the faulty components corresponding to each tag variable as a function of time. Classification of the fault $f_n$ can be performed, for example by classification mapping, as described herein. At 158, in some embodiments, tag variables can be partitioned into groups for diagnosis and root cause analysis, as described herein.
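For illustration, the statistics update, fault estimation, and fault detection steps at 154 through 156 can be combined into a single loop as sketched below. This is a simplified sketch under several assumptions that are not taken from the description: the normal covariance is held fixed, the stream is already mean-centered with respect to normal operation, the fault mean is a plain window average, a pseudo-inverse stands in for the regularized inverse, and the threshold r is a fixed constant. The function name and the simulated data are hypothetical.

```python
import numpy as np


def run_detector(Y, Q_x, K, r):
    """Return one boolean alarm per time index n = K-1 .. N-1.

    Y: (N, d) stream, mean-centered for normal operation (assumed).
    Q_x: fixed normal covariance estimate (assumed known).
    K: window length for eq. (8); r: threshold on the index of eq. (11).
    """
    P_x = np.linalg.inv(Q_x)
    alarms = []
    for n in range(K - 1, len(Y)):
        window = Y[n - K + 1 : n + 1]
        Q_y = window.T @ window / K                  # eq. (8)
        mu_f = window.mean(axis=0)                   # local running average
        diff = Q_y - Q_x                             # eq. (9)
        w, V = np.linalg.eigh(0.5 * (diff + diff.T))
        Q_f = (V * np.clip(w, 0.0, None)) @ V.T      # assumed PSD projection
        P_y = np.linalg.pinv(Q_y)                    # stand-in for eq. (13)
        f_hat = mu_f + Q_f @ P_y @ (Y[n] - mu_f)     # eq. (12)
        ll = Y[n] @ P_y @ mu_f + Y[n] @ P_x @ f_hat  # eq. (11)
        alarms.append(ll > r)
    return np.array(alarms)


# Simulated stream: unit-variance noise, then a mean shift on sensor 0.
rng = np.random.default_rng(1)
Y = rng.normal(size=(300, 2))
Y[200:, 0] += 5.0
alarms = run_detector(Y, np.eye(2), K=50, r=5.0)
print(alarms[:100].mean(), alarms[-50:].mean())
```

In a fuller version, the fixed threshold would be replaced by the adaptive threshold described earlier, and the normal covariance would be refreshed during periods when no alarm is raised.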
Additionally or alternatively, the disclosed subject matter can include one or more of the following embodiments:
A technique for detection of event conditions in an industrial plant includes receiving process data corresponding to one or more sensors, estimating normal statistics from the process data associated with normal operation of one or more components corresponding to the one or more sensors, estimating abnormal statistics from the process data with potentially abnormal operation of the one or more components, determining a fault model from the estimated normal and abnormal statistics, the fault model including a learning matrix, one or more fault indices indicating a likelihood of an occurrence of one or more fault events, and a fault threshold corresponding to the one or more sensors, receiving the one or more fault indices, the fault threshold, and further process data from the one or more sensors, determining one or more further fault indices from the further process data, applying the fault threshold to the one or more further fault indices, and indicating a further occurrence of the one or more fault events when a magnitude of the one or more further fault indices exceeds the fault threshold corresponding to the one or more sensors.
The technique of any of the foregoing Embodiments, wherein estimating the abnormal statistics includes performing a minimum mean squared error (MMSE) fault estimate on the process data.
The technique of any of the foregoing Embodiments, wherein determining the one or more further fault indices includes performing one or more of Neyman-Pearson Hypothesis testing and generalized likelihood ratio testing (GLRT) on the further process data.
The technique of any of the foregoing Embodiments, including dynamically adjusting the fault model using the further process data.
The technique of Embodiment 4, wherein dynamically adjusting the fault model includes continuously updating the learning matrix based on updated estimates of the normal statistics and the abnormal statistics.
The technique of Embodiment 4 or 5, wherein dynamically adjusting the fault model includes adjusting the fault threshold using the one or more further fault indices associated with normal and abnormal segments of the further process data received over a predetermined time window.
The technique of any of the foregoing Embodiments, wherein the fault model includes a fault sensor map to relate the one or more sensors to the one or more components, and the technique includes, when a fault event is indicated, determining a faulty component corresponding to at least one of the one or more sensors.
The technique of Embodiment 7, wherein the fault model includes a fault dictionary stored in a database or a memory to relate patterns of the determined faulty components to the one or more fault events and a label having an operational meaning.
The technique of any of the foregoing Embodiments, wherein the fault model includes a root cause map to relate first sensor conditions corresponding to a first fault event of a first component to second sensor conditions corresponding to a second fault event of a second component, and the technique includes determining a faulty system or group of systems corresponding to the related first and second sensor conditions.
The technique of any of the foregoing Embodiments, including partitioning the one or more sensors based at least in part on a statistical dependence among the one or more sensors from a corresponding type of measurement performed.
The technique of any of the foregoing Embodiments, including partitioning the one or more sensors by a statistical and dynamical characterization of the one or more fault events.
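By way of illustration only, the detection steps of the foregoing Embodiments can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the process data are assumed to be approximately Gaussian, the inverse of the normal-operation covariance stands in for the learning matrix, the fault index is a quadratic (Mahalanobis-type) statistic, and the fault threshold is an empirical quantile of the index on normal data, chosen to fix the false-alarm rate in the spirit of Neyman-Pearson testing. For Gaussian data with a known covariance and an unknown mean shift, this quadratic statistic coincides with the GLRT statistic. All function names, the synthetic data, and the 1% false-alarm rate are hypothetical.

```python
import numpy as np


def fit_normal_statistics(normal_data):
    """Estimate mean and covariance from normal-operation samples (rows)."""
    mean = normal_data.mean(axis=0)
    cov = np.cov(normal_data, rowvar=False)
    return mean, cov


def fault_index(samples, mean, cov_inv):
    """Quadratic fault index (x - mean)^T cov_inv (x - mean) for each sample."""
    centered = np.atleast_2d(samples) - mean
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


def fit_threshold(normal_indices, false_alarm_rate=0.01):
    """Pick the threshold as an upper quantile of the index on normal data."""
    return np.quantile(normal_indices, 1.0 - false_alarm_rate)


# Synthetic stand-in for historical normal-operation data (assumption).
rng = np.random.default_rng(0)
normal_data = rng.normal(size=(2000, 3))

# Build the illustrative model: statistics, weighting matrix, threshold.
mean, cov = fit_normal_statistics(normal_data)
cov_inv = np.linalg.inv(cov)
threshold = fit_threshold(fault_index(normal_data, mean, cov_inv))

# Apply the model to further process data: one typical and one shifted sample.
further_data = np.array([[0.1, -0.2, 0.0], [6.0, 6.0, 6.0]])
further_indices = fault_index(further_data, mean, cov_inv)
fault_flags = further_indices > threshold
```

In this sketch, the second sample is displaced far from the normal operating region and is flagged, while the first is not. A dynamically adjusted variant could re-run `fit_normal_statistics` and `fit_threshold` over a sliding window of recent data.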
A technique for identification of event conditions in an industrial plant includes: receiving process data corresponding to one or more sensors; estimating normal statistics from the process data associated with normal operation of one or more components corresponding to the one or more sensors; estimating abnormal statistics from the process data associated with potentially abnormal operation of the one or more components; determining a fault model from the estimated normal and abnormal statistics, the fault model including a learning matrix, one or more fault indices indicating a likelihood of an occurrence of one or more fault events, and a fault threshold corresponding to the one or more sensors; receiving the one or more fault indices, the fault threshold, and further process data from the one or more sensors; determining one or more further fault indices from the further process data; applying the fault threshold to the one or more further fault indices; indicating a further occurrence of the one or more fault events when a magnitude of the one or more further fault indices exceeds the fault threshold corresponding to the one or more sensors; relating the one or more components to the one or more sensors exceeding the corresponding fault threshold; and identifying a type of the fault event based on the relation of the one or more components to the one or more sensors exceeding the corresponding fault threshold.
The technique of any of the foregoing Embodiments, wherein estimating the abnormal statistics includes performing a minimum mean squared error (MMSE) fault estimate on the process data.
The technique of any of the foregoing Embodiments, wherein determining the one or more further fault indices includes performing one or more of Neyman-Pearson hypothesis testing and generalized likelihood ratio testing (GLRT) on the further process data.
The technique of any of the foregoing Embodiments, including dynamically adjusting the fault model using the further process data.
The technique of Embodiment 15, wherein dynamically adjusting the fault model includes continuously updating the learning matrix based on updated estimates of the normal statistics and the abnormal statistics.
The technique of Embodiment 15 or 16, wherein dynamically adjusting the fault model includes adjusting the fault threshold using the one or more further fault indices associated with normal and abnormal segments of the further process data received over a predetermined time window.
The technique of any of the foregoing Embodiments, wherein the fault model includes a fault sensor map to relate the one or more sensors to the one or more components, and the technique includes, when a fault event is indicated, determining a faulty component corresponding to at least one of the one or more sensors.
The technique of Embodiment 18, wherein the fault model includes a fault dictionary stored in a database or a memory to relate patterns of the determined faulty components to the one or more fault events and a label having an operational meaning.
The technique of any of the foregoing Embodiments, wherein the fault model includes a root cause map to relate first sensor conditions corresponding to a first fault event of a first component to second sensor conditions corresponding to a second fault event of a second component, and the technique includes determining a faulty system or group of systems corresponding to the related first and second sensor conditions.
The technique of any of the foregoing Embodiments, including partitioning the one or more sensors based at least in part on a statistical dependence among the one or more sensors from a corresponding type of measurement performed.
The technique of any of the foregoing Embodiments, including partitioning the one or more sensors by a statistical and dynamical characterization of the one or more fault events.
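By way of illustration only, the identification steps of the foregoing Embodiments (relating sensors that exceed their fault thresholds to components through a fault sensor map, then relating the pattern of faulty components to a labeled fault event through a fault dictionary) can be sketched as follows. This is a minimal sketch under stated assumptions: the sensor tags, component names, dictionary labels, threshold values, and function names are all hypothetical, and the dictionary is held in memory as a plain mapping rather than in a database.

```python
# Hypothetical fault sensor map: sensor tag -> component it monitors.
FAULT_SENSOR_MAP = {
    "TI-101": "feed_heater",
    "FI-102": "feed_pump",
    "PI-103": "feed_pump",
    "TI-201": "reactor",
}

# Hypothetical fault dictionary: pattern of faulty components -> label
# having an operational meaning.
FAULT_DICTIONARY = {
    frozenset({"feed_pump"}): "feed pump degradation",
    frozenset({"feed_pump", "feed_heater"}): "feed section upset",
}


def faulty_components(indices, thresholds, sensor_map):
    """Return components whose sensors have a fault index magnitude above threshold."""
    return frozenset(
        sensor_map[sensor]
        for sensor, value in indices.items()
        if abs(value) > thresholds[sensor]
    )


def identify_fault(indices, thresholds, sensor_map, dictionary):
    """Map exceeded sensors to components, then look up a fault label."""
    components = faulty_components(indices, thresholds, sensor_map)
    if not components:
        return components, None  # no fault indicated
    return components, dictionary.get(components, "unlabeled fault pattern")


# Example fault indices for one time step (illustrative values only).
indices = {"TI-101": 0.2, "FI-102": 5.1, "PI-103": -4.0, "TI-201": 0.1}
thresholds = {sensor: 3.0 for sensor in FAULT_SENSOR_MAP}

components, label = identify_fault(
    indices, thresholds, FAULT_SENSOR_MAP, FAULT_DICTIONARY
)
```

Here both sensors mapped to the hypothetical feed pump exceed their thresholds, so the pattern resolves to a single faulty component and its dictionary label. A pattern absent from the dictionary is returned as unlabeled so that an operator could later assign it a label.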
While the disclosed subject matter is described herein in terms of certain preferred embodiments, those skilled in the art will recognize that various modifications and improvements can be made to the disclosed subject matter without departing from the scope thereof. Moreover, although individual features of one embodiment of the disclosed subject matter can be discussed herein or shown in the drawings of the one embodiment and not in other embodiments, it should be apparent that individual features of one embodiment can be combined with one or more features of another embodiment or features from a plurality of embodiments.
In addition to the specific embodiments claimed below, the disclosed subject matter is also directed to other embodiments having any other possible combination of the dependent features claimed below and those disclosed above. As such, the particular features presented in the dependent claims and disclosed above can be combined with each other in other manners within the scope of the disclosed subject matter such that the disclosed subject matter should be recognized as also specifically directed to other embodiments having any other possible combinations. Thus, the foregoing description of specific embodiments of the disclosed subject matter has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosed subject matter to those embodiments disclosed.
It will be apparent to those skilled in the art that various modifications and variations can be made in the method and system of the disclosed subject matter without departing from the spirit or scope of the disclosed subject matter. Thus, it is intended that the disclosed subject matter include modifications and variations that are within the scope of the appended claims and their equivalents.
This application claims priority to U.S. Provisional Application Ser. No. 61/919,854 filed Dec. 23, 2013, herein incorporated by reference in its entirety.
Number | Date | Country
---|---|---
61/919,854 | Dec 2013 | US