This application claims priority to EP Application No. 22176499.6, having a filing date of May 31, 2022, the entire contents of which are hereby incorporated by reference.
The following relates to a method and system for detecting sensor anomalies.
To reliably operate complex systems such as automated factories, plants, or electrical grids, operators rely heavily on sensor readings to understand whether the system is operating correctly. Apparent incorrect operation can result from failures in the system itself or from failures of a sensor to report accurate values. Developing a machine learning algorithm to automatically detect undesirable operating behavior is often difficult because labelled data for this task is rarely available. For this reason, algorithms for detecting undesirable operating behavior typically formulate the problem as anomaly detection. While this approach is convenient since it requires no labelled data, users typically find that the resulting algorithms frequently indicate anomalies even when the system is in fact behaving normally (i.e., a high false positive rate).
In the state of the conventional art, historical data from sensors are used to establish a “normal” model of system behavior. Often, simple parametric models such as Gaussian distributions are used: based on the historical data, a mean and a variance are learned, and any sensor observation deviating significantly from the mean (e.g., by more than 3 standard deviations) is flagged as an anomaly.
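For illustration, a minimal Python sketch of this conventional baseline (all function and variable names are chosen for illustration only; per-sensor readings are assumed to be stored column-wise in a NumPy array):

```python
import numpy as np

def fit_gaussian_baseline(history: np.ndarray):
    """Learn per-sensor mean and standard deviation from historical data.

    history: array of shape (timesteps, sensors), assumed anomaly-free.
    """
    mean = history.mean(axis=0)
    std = history.std(axis=0)
    return mean, std

def flag_anomalies(observations: np.ndarray, mean: np.ndarray,
                   std: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Flag any observation deviating more than k standard deviations from the mean."""
    return np.abs(observations - mean) > k * std

# Example usage with synthetic data
history = np.random.normal(10.0, 1.0, size=(1000, 4))    # "normal" behavior
observations = np.array([[10.2, 9.8, 15.0, 10.1]])        # one implausible reading
mean, std = fit_gaussian_baseline(history)
print(flag_anomalies(observations, mean, std))             # third sensor flagged
```

As discussed above, such per-sensor thresholds ignore the structure of the system and tend to produce many false positives.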
An aspect relates to a method and system for detecting sensor anomalies that provide an alternative to the state of the conventional art.
According to embodiments of the method for detecting sensor anomalies, the following operations are performed by components, wherein the components are software components executed by one or more processors and/or hardware components:
The system for detecting sensor anomalies comprises:
The following advantages and explanations are not necessarily the result of the object of the independent claims. Rather, they may be advantages and explanations that only apply to certain embodiments or variants.
In connection with embodiments of the invention, unless otherwise stated in the description, the terms “training”, “generating”, “computer-aided”, “calculating”, “determining”, “reasoning”, “retraining” and the like relate to actions and/or processes and/or processing steps that change and/or generate data and/or convert the data into other data, the data in particular being or being able to be represented as physical quantities, for example as electrical impulses.
The term “computer” should be interpreted as broadly as possible, in particular to cover all electronic devices with data processing properties. Computers can thus, for example, be personal computers, servers, clients, programmable logic controllers (PLCs), handheld computer systems, pocket PC devices, mobile radio devices, smartphones, or any other communication devices that can process data with computer support, as well as processors and other electronic devices for data processing. Computers can in particular comprise one or more processors and memory units.
In connection with embodiments of the invention, a “memory”, “memory unit” or “memory module” and the like can mean, for example, a volatile memory in the form of random-access memory (RAM) or a permanent memory such as a hard disk or a data carrier.
In an embodiment, the method and system improve the performance of sensor anomaly detection by incorporating additional domain knowledge about the structure of the system in the form of relational constraints.
In an embodiment, the method and system reduce the prediction error for anomaly detection in problems involving material flow (lower false positive rate).
In an embodiment, the method and system provide increased training efficiency by leveraging domain knowledge.
In an embodiment, the method and system require less data to achieve a highly performant model.
In an embodiment, the method and system help to guarantee that model predictions are consistent with physical laws (i.e., satisfy aggregation constraints).
In an embodiment, the method and system increase trustworthiness and ease of use in adopting AI-based algorithms.
In an embodiment, the method and system reduce costs that are associated with false or missed anomalies.
In an embodiment of the method and system, the extracting operation is performed by a material flow tracking system that is processing the sensor measurements.
In an embodiment of the method and system, the machine learning model processes previous sensor measurements when executing the forecasting operation.
An embodiment of the method comprises the additional operation of automatically halting at least a part of the industrial system after detecting the anomaly.
An embodiment of the method comprises the additional operation of outputting, by a user interface, an alert to an operator after detecting the anomaly.
In an embodiment of the method and system, the machine learning model has been initially trained by a Gradient-based Reconciling Propagation algorithm in order to learn trainable parameters of a projection matrix, wherein the projection matrix is used to project base forecasts to coherent forecasts in a hierarchically-coherent solution space, and wherein the coherent forecasts contain the predicted time series values.
In an embodiment of the method and system, the Gradient-based Reconciling Propagation algorithm ensures that information propagation between forecasts is restricted to nodes that are connected through an ancestral and descendant relation, by masking entries of the projection matrix with a second matrix, thereby constraining the effects of the projection matrix.
A computer program product (non-transitory computer readable storage medium having instructions, which when executed by a processor, perform actions) has program instructions for carrying out the method.
The provision device for the computer program product stores and/or provides the computer program product.
Some of the embodiments will be described in detail, with reference to the following figures, wherein like designations denote like members, wherein:
In the following description, various aspects of embodiments of the present invention will be described. However, it will be understood by those skilled in the art that embodiments may be practiced with only some or all aspects thereof. For purposes of explanation, specific numbers and configurations are set forth in order to provide a thorough understanding. However, it will also be apparent to those skilled in the art that the embodiments may be practiced without these specific details.
The described components can each be hardware components or software components. For example, a software component can be a software module such as a software library; an individual procedure, subroutine, or function; or, depending on the programming paradigm, any other portion of software code that implements the function of the software component. A combination of hardware components and software components can occur, in particular, if some of the effects according to embodiments of the invention are exclusively implemented by special hardware (e.g., a processor in the form of an ASIC or FPGA) and some other part by software.
In this embodiment of the invention, the computer program product 104 comprises program instructions for carrying out embodiments of the invention. The computer program 104 is stored in the memory 103, which renders, among other things, the memory and/or its related computer system 101 a provisioning device for the computer program product 104. The system 101 may carry out embodiments of the invention by executing the program instructions of the computer program 104 by the processor 102. Results of the invention may be presented on the user interface 105. Alternatively, they may be stored in the memory 103 or on another suitable means for storing data.
In this embodiment the provisioning device 201 stores a computer program 202 which comprises program instructions for carrying out the invention. The provisioning device 201 provides the computer program 202 via a computer network/Internet 203. By way of example, a computer system 204 or a mobile device/smartphone 205 may load the computer program 202 and carry out embodiments of the invention by executing the program instructions of the computer program 202.
The embodiments shown in the figures are described in the following.
Hierarchical time series as well as grouped time series and corresponding algorithms for forecasting are known, for example, from Hyndman, R. J., & Athanasopoulos, G. (2018): “Forecasting: principles and practice”, 2nd edition, OTexts: Melbourne, Australia, chapter 10, available on the internet at https://otexts.com/fpp2/ on 31 May 2022. The entire contents of that document are incorporated herein by reference.
The following embodiments target applications where material flow is present. For example, material flows through a factory: inputs enter the production line, are assembled into products, and eventually flow out of the various production phases. More concretely, if four wheels flow into an automobile production phase for wheel assembly, then a car with four wheels will flow out. Similarly, in electrical circuits physical laws require that the total current flowing into a node is equal to the total current flowing out of that node. In problems involving flow, the embodiments leverage the known structure of the material flow to impose additional knowledge on an anomaly detection system and achieve improved performance.
Another example would be an assembly line where weight sensors measure the weight of a first, second, third and fourth component that are entering the assembly line. The measurements of these weight sensors provide the time series values at the lowest level of the hierarchy, i.e., at the leaf nodes.
As modern manufacturing systems can be very complex, other embodiments can feed raw sensor measurements into a material flow tracking system that analyzes and/or simulates material flow in the manufacturing system. Material flow tracking systems are known from the state of the conventional art, for example from the field of material flow analysis, and are also available as readily deployable commercial products. The hierarchical time series values for the different nodes of the hierarchy can then be extracted from the output of the material flow tracking system.
At a high level, the idea is to use the structural knowledge of an industrial system (in terms of relational information) to train a machine learning model. The machine learning model is responsible for predicting what the sensor readings should be if the industrial system is working correctly. In essence, the machine learning model represents the expected normal industrial system behavior. We can then compare the expected sensor values with actual sensor values (e.g., by taking the absolute difference). A significant deviation (i.e., large residual value) indicates that the industrial system is behaving abnormally.
Hierarchical relations among time-series sensor data can be represented as a tree, i.e., a directed acyclic graph G = (V, E), where V is the set of nodes of the graph and each node is associated with a time series. The cardinality of the set, |V| = n, is the number of time series to forecast. The set of edges E ⊆ V × V represents parent-child relations, where the value of the time series at a parent node equals the sum of the values of all of its child nodes.
Let y_t = [y_{v_1,t}, …, y_{v_n,t}]^T ∈ ℝ^n denote the vector containing the values of all n time series at time t. Let y_{B,t} ∈ ℝ^m denote the values of the m bottom-level (leaf-node) time series and y_{A,t} ∈ ℝ^{n−m} the values of the aggregated time series at time t. The hierarchical relations can then be expressed as

y_t = S y_{B,t}, with S = [S_sum; I_m] ∈ {0,1}^{n×m},

where I_m is the m×m dimensional identity matrix and S_sum ∈ {0,1}^{(n−m)×m} is the summation matrix whose ith row indicates which values in y_{B,t} are aggregated to define the ith value of y_{A,t}. The concrete summation matrices for the hierarchical time-series example and for the grouped time-series setting follow from the hierarchies and groupings shown in the figures.
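For illustration, a minimal sketch of the summation matrix for a small hypothetical hierarchy (a total node aggregating an intermediate node A and a leaf B, where A in turn aggregates two leaves AA and AB; the node ordering and all values are chosen for illustration only):

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B, A = AA + AB; leaves are AA, AB, B (m = 3, n = 5)
S_sum = np.array([[1, 1, 1],    # Total aggregates all three leaves
                  [1, 1, 0]])   # A aggregates AA and AB
S = np.vstack([S_sum, np.eye(3, dtype=int)])   # S = [S_sum; I_m], shape (5, 3)

y_B = np.array([2.0, 3.0, 4.0])   # leaf values [AA, AB, B]
y = S @ y_B                       # coherent vector [Total, A, AA, AB, B]
print(y)                          # -> [9. 5. 2. 3. 4.]
```

Any vector of the form S y_B is coherent by construction, i.e., every parent value equals the sum of its children.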
Historically, the reconciliation of forecasts is commonly addressed by applying post-processing to the base forecasts. To distinguish between the reconciled forecasts and the base forecasts, we denote the base forecasts with the tilde accent, so that ŷ_{t+h} is the vector of reconciled forecasts obtained from the base forecasts ỹ_{t+h}. Previous work has shown that ỹ_{t+h} can be reconciled by the following matrix multiplications

ŷ_{t+h} = S P ỹ_{t+h},

where P ∈ ℝ^{m×n} and its values determine the propagation of the time series through aggregations or dis-aggregations. The reconciliation transformation can be viewed as a projection, where reconciliation across all levels is applied through multiplication with the matrix SP ∈ ℝ^{n×n}.
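A minimal sketch of such post-processing reconciliation, reusing the small hypothetical hierarchy from the previous example and a simple bottom-up choice of P (only one of several known choices, used here purely for illustration):

```python
import numpy as np

S_sum = np.array([[1, 1, 1], [1, 1, 0]])
S = np.vstack([S_sum, np.eye(3, dtype=int)])    # shape (n=5, m=3)

# Bottom-up reconciliation: P simply selects the m base forecasts of the leaves.
P = np.hstack([np.zeros((3, 2)), np.eye(3)])    # shape (m=3, n=5)

y_tilde = np.array([8.7, 5.4, 2.1, 2.9, 4.2])   # incoherent base forecasts
y_hat = S @ P @ y_tilde                          # coherent reconciled forecasts
print(y_hat)                                     # -> approx. [9.2 5.  2.1 2.9 4.2]
```

After reconciliation the aggregated values again equal the sums of their children, whereas the base forecasts did not satisfy this constraint.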
The embodiment uses a Gradient-based Reconciling Propagation method which aims to learn the values of a projection matrix P_o, which is a matrix of trainable parameters that projects the base forecasts into a hierarchically-coherent solution space. The resulting coherent forecasts are defined as

ŷ_{t+h} = S (S^T * P_o) ỹ_{t+h},
where * denotes an element-wise multiplication and ỹ_{t+h} is a vector of n dimensions. As this equation is differentiable, it is possible to use a gradient-based approach to learn the values of P_o which minimize the forecast error. This approach can either be used as a post-processing step to reconcile a set of base forecasts or be integrated into a neural network architecture as the output layer to yield coherent forecasts, meaning that ỹ_{t+h} can either be a set of base forecasts or the outputs of a hidden layer of n dimensions. The element-wise multiplication (S^T * P_o) ensures that the information propagation between forecasts is restricted to nodes that are connected through an ancestral and descendant relation.
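The following sketch illustrates, under simplifying assumptions (synthetic data, the small hierarchy from the earlier examples, a simple gradient-based training loop in PyTorch; none of the variable names are taken from the embodiment), how the trainable projection matrix P_o can be learned by minimizing the forecast error of the coherent forecasts S (S^T * P_o) ỹ:

```python
import torch

# Small hierarchy from the earlier examples: n = 5 series, m = 3 leaves.
S = torch.tensor([[1., 1., 1.],
                  [1., 1., 0.],
                  [1., 0., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.]])                      # shape (n, m)
mask = S.T                                            # S^T, shape (m, n)

P_o = torch.zeros(3, 5, requires_grad=True)           # trainable projection matrix
opt = torch.optim.Adam([P_o], lr=0.05)

# Synthetic training data: coherent targets and noisy, incoherent base forecasts.
y_B = torch.rand(256, 3) * 10                         # random leaf values
y_true = y_B @ S.T                                    # coherent targets, shape (256, n)
y_base = y_true + 0.5 * torch.randn_like(y_true)      # incoherent base forecasts

for step in range(500):
    opt.zero_grad()
    proj = S @ (mask * P_o)                           # element-wise mask keeps only
                                                      # ancestor/descendant links
    y_hat = y_base @ proj.T                           # coherent forecasts, shape (256, n)
    loss = torch.mean((y_hat - y_true) ** 2)          # forecast error to minimize
    loss.backward()
    opt.step()

print(loss.item())                                    # forecast error after training
```

In a real deployment, ỹ would come from base forecasting models or from a hidden layer of a neural network, as described above, rather than from synthetic data.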
The training algorithm is depicted in the figures.
The machine learning model is trained on historical sensor data to learn the normal behavior of the industrial system by a forecasting task. The training data can be obtained by recording sensor values which are known to be anomaly free. A second option is to utilize historical data that may contain anomalies, but the anomaly frequency must be low (e.g., less than 1%).
Once the machine learning model has been trained, sensor data can be fed to the machine learning model to produce a prediction about what normal sensor readings should look like. By subtracting the observed sensor readings from the predicted sensor readings, a residual value is computed. A large residual value indicates that the industrial system is operating in an anomalous state.
The algorithm depicted in the figures performs the following operations:
In a forecasting operation OP1, a machine learning model predicts time series values for all nodes, wherein the machine learning model models a material flow in an industrial system, in particular in a production line, as a hierarchical time series, wherein the hierarchical time series represents a structure of the material flow using a directed acyclic graph with a set of nodes and a set of edges, wherein each node is associated with a time series, and wherein the edges represent parent-child relations where each value of a time series at a parent node equals the sum of the respective values of its child nodes.
In a receiving operation OP2, current sensor measurements from sensors placed in the industrial system are received.
In an extracting operation OP3, observed time series values for at least some or all of the nodes are extracted from the current sensor measurements.
In a computing operation OP4, a difference between the predicted time series values and the observed time series values is computed.
In a detecting operation OP5, an anomaly is detected if the difference exceeds a threshold.
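A compact sketch of how these five operations could be wired together (purely illustrative; model, receive_measurements and extract_node_values are hypothetical placeholders for the trained forecasting model, the sensor interface and the material flow tracking system, respectively):

```python
import numpy as np

def detect_anomaly(model, receive_measurements, extract_node_values,
                   threshold: float) -> bool:
    """One pass of the detection loop over all nodes of the hierarchy."""
    y_pred = model.forecast()                     # OP1: predicted values for all nodes
    measurements = receive_measurements()         # OP2: current sensor measurements
    y_obs = extract_node_values(measurements)     # OP3: observed values per node
    residual = np.abs(y_pred - y_obs)             # OP4: difference predicted vs. observed
    return bool(np.any(residual > threshold))     # OP5: anomaly if threshold is exceeded
```

If an anomaly is detected, the additional operations described above, such as halting at least a part of the industrial system or alerting an operator via a user interface, can be triggered.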
For example, the method can be executed by one or more processors. Examples of processors include a microcontroller or a microprocessor, an Application Specific Integrated Circuit (ASIC), or a neuromorphic microchip, in particular a neuromorphic processor unit. The processor can be part of any kind of computer, including mobile computing devices such as tablet computers, smartphones or laptops, or part of a server in a control room or cloud. The above-described method may be implemented via a computer program product including one or more computer-readable storage media having stored thereon instructions executable by one or more processors of a computing system. Execution of the instructions causes the computing system to perform operations corresponding with the acts of the method described above.
The instructions for implementing processes or methods described herein may be provided on non-transitory computer-readable storage media or memories, such as a cache, buffer, RAM, FLASH, removable media, hard drive, or other computer readable storage media. Computer readable storage media include various types of volatile and non-volatile storage media. The functions, acts, or tasks illustrated in the figures or described herein may be executed in response to one or more sets of instructions stored in or on computer readable storage media. The functions, acts or tasks may be independent of the particular type of instruction set, storage media, processor or processing strategy and may be performed by software, hardware, integrated circuits, firmware, micro code, and the like, operating alone or in combination. Likewise, processing strategies may include multiprocessing, multitasking, parallel processing, and the like.
Although the present invention has been disclosed in the form of embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention.
For the sake of clarity, it is to be understood that the use of “a” or “an” throughout this application does not exclude a plurality, and “comprising” does not exclude other steps or elements.