Machinery, such as aircraft engines and turbines, is subject to failure for numerous reasons. For example, an aircraft engine may fail due to a problem with an engine component such as a combustor or a fan. Known machinery failures are typically detected by sensors and, only once a failure is detected, is the failure reported to an operator for correction.
Conventional strategies employed for the detection of failures are typically developed based on known problems that have previously occurred in the machinery. These prior occurrences may be determined by automatically inferring sensor profiles that correspond to known abnormal behavior associated with the particular problem. However, for problems that have never had prior occurrences, failures often come without any warning or prior indication. In this situation, the cost of repair may be significantly greater than if the failure had been detected early. Furthermore, late detection of a failure may jeopardize the safety of the machinery. It would therefore be desirable to provide systems and methods to detect unknown abnormal behavior in machinery in an automatic and accurate manner.
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the embodiments. However, it will be understood by those of ordinary skill in the art that the embodiments may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the embodiments.
The present embodiments relate to a novel framework for unsupervised anomaly detection associated with industrial multivariate time-series data. Unsupervised detection may be essential in “unknown-unknown” scenarios, where operators are unaware of potential failures and have not observed any prior occurrences of such unknown failures. The framework described herein may comprise a comprehensive suite of algorithms together with data quality assessment, missing value imputation, feature generation, validation and evaluation modules. The framework may determine unknown failures based on comparing a normal engine profile model (e.g., all sensors indicating values in a normal range) with reported differences in a current state of the engine. Sensors may be associated with various measurable elements of a piece of machinery such as, but not limited to, vibration, temperature, pressure, and environmental changes. In some embodiments, determining unknown failures (e.g., evaluation) relates to discovering a failure that is about to happen (e.g., early detection). In some embodiments, determining unknown failures relates to early detection as well as to a case where a failure has happened in the past.
As used herein, the term “model” may refer to, for example, a structured model that includes information about various items and relationships between those items, and may be used to represent and understand a piece of machinery. By way of example, the model might relate to a learned model of specific types of jet engines, gas turbines, wind turbines, etc. Note that any of the models described herein may include relationships between sensors within the piece of machinery or phases of the machinery. By way of example only, a phase of a piece of machinery may relate to a function of the piece of machinery at a particular time. For example, a jet engine may be associated with a takeoff phase, an in-flight phase, a landing phase, etc.
Therefore, by comparing a current state of the engine against a normal engine profile model for a particular phase, an operator may be presented with anomalies that serve as indicators which may point to a cause of the anomalies as well as to the sensors/drivers that are behind the anomalies.
As used herein, devices, including those associated with the system 100 and any other device described herein, may exchange information via any communication network which may be one or more of a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a proprietary network, a Public Switched Telephone Network (PSTN), a Wireless Application Protocol (WAP) network, a Bluetooth network, a wireless LAN network, and/or an Internet Protocol (IP) network such as the Internet, an intranet, or an extranet. Note that any devices described herein may communicate via one or more such communication networks.
The evaluation platform 120 may receive input data 110 from a plurality of sensors, from a database, or from another system such as an onboard data collection system. The database (not shown in
The preprocessor 130 may receive the input data 110 and cleanse the input data 110 to remove spurious data associated with the plurality of sensors. For example, the preprocessor 130 may remove noise or data associated with problematic sensors, or remove data associated with a time frame when an engine was in a repair shop, which may have created data unrelated to in-use potential faults.
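As a minimal sketch of such cleansing (function and field names are hypothetical; the embodiments do not prescribe a particular implementation), out-of-range readings and readings captured during a shop visit may simply be filtered out:

```python
def cleanse(readings, valid_range, in_shop_windows):
    """Remove spurious sensor readings: out-of-range values (noise or
    problematic sensors) and readings captured while the engine was in
    a repair shop (unrelated to in-use potential faults)."""
    lo, hi = valid_range
    cleaned = []
    for r in readings:  # each reading: {"t": timestamp, "value": float}
        if not (lo <= r["value"] <= hi):
            continue  # noise or a problematic sensor
        if any(start <= r["t"] <= end for start, end in in_shop_windows):
            continue  # shop-visit data, not in-use behavior
        cleaned.append(r)
    return cleaned
```

In practice such a filter would be applied per sensor with sensor-specific valid ranges, but the structure is the same.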
The detector 140 may use one or more of a plurality of algorithms to determine anomalies associated with the input data 110. Examples of different algorithms will be discussed with respect to
In some embodiments, the system 100 may be broken down into a plurality of modules. For example, the system 100 may comprise the following modules:
Preprocessing: This module may offer automatic data imputation, relevant feature selection, and component-level compartmentalization of sensor observations. For example, preprocessing may cleanse input data to remove spurious data associated with the plurality of sensors, such as noise, data associated with problematic sensors, or data unrelated to in-use potential faults.
Feature generation: This module may offer several strategies for automatically generating feature representations that are relevant, interpretable and optimal for the problem setting.
Detector suite: This module may provide a generalized interface to a library of diverse anomaly detection algorithms. Each of the algorithms in the library may provide mechanisms for training the model by inferring the usual behavior (e.g., inferring sensor results that indicate values in a normal range) of the asset family being monitored, and then predicting anomaly scores for future observations. Algorithms may have been chosen and developed to enable the capture of a diversity of anomalies.
Alertization: This module may convert anomaly scores generated by the Detector suite into alerts in an intelligent manner, with the goal of reducing spurious and false alarms.
Evaluation: The evaluation module may provide a comprehensive scoring of methods being tested. It may include metrics of recall, precision, coverage, and false positive rate, and may also provide lead-time-to-detection and performance under specific lead-time criteria (e.g., 30 days in advance). It may also provide a comparison of the framework to data associated with a repair shop visit, as well as to existing models for anomaly detection and alert generation.
Feature importance: The feature importance module may be supported by each of the anomaly detector algorithms, wherein the algorithm may provide an importance score for each of the underlying features (or sensors) at each time instant by automatically identifying the contribution of a feature to the anomaly score at that time instant. For example, sensors or groups of sensors may be ranked based on their contribution to feature information (e.g., a number of sensors associated with each feature). This may enable a validation of the algorithmic scores, and may also enable root-cause analysis by guiding an analyst to a correct component for inspection.
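The generalized detector interface and per-feature importance scoring described above might be sketched as follows. The class and method names are illustrative assumptions, and the leave-one-out attribution shown is only one possible way to compute per-feature contributions:

```python
from abc import ABC, abstractmethod

class AnomalyDetector(ABC):
    """Generalized interface: train on normal behavior of the asset
    family, then score future observations and attribute each score
    to the underlying features (or sensors)."""

    @abstractmethod
    def fit(self, normal_observations):
        """Infer the usual (healthy) behavior from training data."""

    @abstractmethod
    def score(self, observation):
        """Return an anomaly score for a new observation."""

    def feature_importance(self, observation):
        """Per-feature contribution to the anomaly score, sketched here
        as a simple leave-one-out attribution (zero out one feature and
        measure how much the score drops)."""
        base = self.score(observation)
        importances = {}
        for name in observation:
            masked = {k: (0.0 if k == name else v) for k, v in observation.items()}
            importances[name] = base - self.score(masked)
        return importances

class MeanDistanceDetector(AnomalyDetector):
    """Toy detector: the anomaly score is the total deviation of an
    observation from the per-feature means of the training data."""
    def fit(self, normal_observations):
        keys = normal_observations[0].keys()
        n = len(normal_observations)
        self.means = {k: sum(o[k] for o in normal_observations) / n for k in keys}
        return self

    def score(self, observation):
        return sum(abs(observation[k] - self.means[k]) for k in self.means)
```

A real library entry (e.g., a density- or reconstruction-based detector) would implement the same `fit`/`score` interface, which is what lets the suite swap algorithms behind one interface.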
In a case that the system 100 detects anomalies in time-series data to a degree that an operator is alerted, the evaluation platform 120 may provide a dashboard for the operator to further analyze the anomalies. For example,
Referring now to
At a next lower level, the high-level view 210 may be broken down into a second level view such as a serial number level view 220 (e.g., an ESN view). The serial number level view 220 may break down the high-level group of machinery into serial number groupings. In the present example, the serial number view 220 may break down the high-level group 210 into five different groups with each of the five groups starting with serial number 123. In the present example, serial number grouping 123113 illustrates a high number of anomalies. In some embodiments, once a serial number group from the serial number view 220 is determined to have anomalous indicators, the particular serial number group from the serial number level view 220 may be broken down into functional subsets at a subset view 230. For example, the functional subsets may be associated with a hot section, an operational system, or a control system of machinery associated with a particular serial number (e.g., 123113).
Each subset from the subset view 230 may be broken down into a plurality of features which may be displayed in the feature view 240. In the present example, the hot section may be selected. In some embodiments, since each subset from the subset view 230 may be associated with a plurality of features, the features associated with each subset may be ranked and displayed in an order of importance. The ranking may be based on, for example, a predetermined ranking retrieved from a database, or may be based on a number of sensors associated with each feature (e.g., a greater number of sensors equates to a higher ranking). Each feature of a subset may be associated with one or more sensors, and each sensor may be associated with a desired operating value. However, in a case that the desired value is indicated as being out of range, one or more of the underlying sensors associated with the feature may be contributing to the feature having a high number of anomalies (e.g., a high anomaly score). In this case, it may be desirable to determine which sensor is indicating a high number of anomalies.
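The sensor-count ranking strategy described above can be sketched as follows (the feature-to-sensor mapping shown is a hypothetical example, not data from any actual engine):

```python
def rank_features_by_sensor_count(feature_sensors):
    """Rank features so that a greater number of associated sensors
    equates to a higher ranking, as one possible ranking strategy."""
    return sorted(feature_sensors,
                  key=lambda f: len(feature_sensors[f]),
                  reverse=True)
```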
The individual driver view 250 may break down individual features from the feature view 240 into the particular sensors associated with each feature. In this way, the data from a particular sensor, associated with one or more anomalies, may be examined. Examining individual sensors associated with anomalies may facilitate determining a future failure of an unknown type. This may indicate to an engineer or repair personnel a particular component that is failing or potentially failing.
Now referring to
At S310, time-series data associated with a piece of machinery may be received. The time-series data may be received at an evaluation system. The piece of machinery might be associated with, for example, a physical engine, a rotor, a turbine or other electrical and/or mechanical device.
The embodiments described herein may be implemented using any number of different hardware configurations. For example,
The processor 410 also communicates with a storage device 430. The storage device 430 may comprise any appropriate information storage device, including combinations of magnetic storage devices (e.g., a hard disk drive), optical storage devices, and/or semiconductor memory devices. The storage device 430 may store programs 412 and 414 for controlling the processor 410. The processor 410 performs instructions of the programs 412, 414, and thereby operates in accordance with any of the embodiments described herein. For example, the processor 410 may receive sensor data associated with a piece of machinery. In some embodiments, a preprocessor may cleanse the received sensor data. The preprocessor may be associated with a processor, a co-processor or one or more processor cores.
The processor 410 may automatically determine an anomaly associated with the piece of machinery by comparing received time-series data with a normal engine profile model associated with the piece of machinery. The normal engine profile model may be based on the piece of machinery with all related sensor values in a predicted range (i.e., a healthy state of the machinery). Furthermore, the processor 410 may automatically determine that the anomaly is not a known fault based on performing a lookup of known failure modes. Known failure modes may be determined based on fault characteristic data that is stored in a database. The fault characteristic data may comprise data such as, but not limited to, temperatures, currents, resistances, etc. that is associated with and may be used to identify known faults. For example, a specific failure mode may be associated with a component that has a specific temperature range over a period of time and exhibits a high resistance.
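The lookup of known failure modes against stored fault characteristic data might be sketched as follows. The field names, the temperature/resistance criteria, and the threshold values are illustrative assumptions drawn from the example in the text:

```python
def is_known_fault(anomaly, fault_db):
    """Check an anomaly's characteristics (e.g., temperature,
    resistance) against stored fault characteristic data. A match
    means the anomaly corresponds to a known failure mode; no match
    means it may be treated as an unknown anomaly."""
    for mode in fault_db:  # each: {"name", "temp_range", "min_resistance"}
        t_lo, t_hi = mode["temp_range"]
        if (t_lo <= anomaly["temperature"] <= t_hi
                and anomaly["resistance"] >= mode["min_resistance"]):
            return mode["name"]
    return None  # not a known fault
```

For example, a failure mode characterized by a 200-300 degree temperature range and a high resistance would match an anomaly at 250 degrees with elevated resistance, and fail to match one outside that range.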
The programs 412, 414 may be stored in a compressed, uncompiled and/or encrypted format. The programs 412, 414 may furthermore include other program elements, such as an operating system, a clipboard application, a database management system, and/or device drivers used by the processor 410 to interface with peripheral devices.
As used herein, information may be “received” by or “transmitted” to, for example: (i) the evaluation platform 400 from another device; or (ii) a software application or module within the evaluation platform 400 from another software application, module, or any other source.
In some embodiments (such as shown in
Referring back to
The covariance relationship may be calculated by first calculating a covariance σ(x,y) for each pair of features x and y. N may be set to 50 to compute the covariance over 50 cycles. Therefore, a covariance relationship formula may be as follows:
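The formula itself appears to have been elided from the text; given the surrounding definitions (features x and y, a window of N cycles), it is presumably the standard sample covariance, which in conventional notation reads:

```latex
\sigma(x, y) = \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \bar{x} \right) \left( y_i - \bar{y} \right)
```

where \bar{x} and \bar{y} denote the means of x and y over the N-cycle window.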
As illustrated at 530 in
As illustrated in the example associated with graph 540, an anomaly score over a period of seven months may be very low (e.g., less than 0.1), but over approximately a one-month period the anomaly score rapidly increases (e.g., to 1.0), which may trigger an alert that the piece of machinery being monitored may be subject to a failure. Furthermore, as illustrated at 540A, when the anomaly scores increase to a predefined level (e.g., 0.5), an alert is activated.
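The thresholding behavior described above can be sketched as follows; the 0.5 threshold mirrors the example at 540A, and the simple hysteresis-free form is an assumption (a real alertization module would likely suppress repeated or spurious alarms):

```python
def alerts(scores, threshold=0.5):
    """Return the time indices at which the anomaly score reaches a
    predefined level (e.g., 0.5), activating an alert."""
    return [t for t, s in enumerate(scores) if s >= threshold]
```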
Turning now to
Referring back to
Furthermore, and as illustrated in graph 640, an anomaly score over a period of eight months may be very low (e.g., less than 0.1), but over the next month the anomaly score rapidly increases (e.g., to 0.6). In this case, after entering an alert zone associated with a predetermined amount of anomalies (e.g., in this case the alert zone starts at 0.3), an alert that the piece of machinery may be subject to a failure may be triggered.
Turning now to
For example, time-series data 810 may be associated with a plurality of sensors. The time-series data 810 may be modified by the application of physics derived features, which are associated with physical efficiencies of the piece of machinery. The results of the application of physics derived features may be illustrated at 820. Once the physics derived features are applied, the time-series data 820, having the physics derived features applied, may then be transformed into transform results 830.
Graph 840 illustrates a comparison of an anomaly score 850 using a first technique, such as that described with respect to
Referring back to
The following illustrates various additional embodiments of the invention. These do not constitute a definition of all possible embodiments, and those skilled in the art will understand that the present invention is applicable to many other embodiments. Further, although the following embodiments are briefly described for clarity, those skilled in the art will understand how to make any changes, if necessary, to the above-described apparatus and methods to accommodate these and other embodiments and applications.
Although specific hardware and data configurations have been described herein, note that any number of other configurations may be provided in accordance with embodiments of the present invention (e.g., some of the information associated with the databases described herein may be combined or stored in external systems).
The present invention has been described in terms of several embodiments solely for the purpose of illustration. Persons skilled in the art will recognize from this description that the invention is not limited to the embodiments described, but may be practiced with modifications and alterations limited only by the spirit and scope of the appended claims.
This application claims benefit of U.S. provisional patent application No. 62/315,989, filed Mar. 31, 2016 entitled “SYSTEM AND METHOD FOR UNSUPERVISED ANOMALY DETECTION ON INDUSTRIAL TIME-SERIES DATA”, which application is incorporated herein by reference.