The invention relates to a data processing device capable of performing problem diagnosis in a production system with a plurality of robots.
The invention further relates to a method for performing unsupervised diagnosis in a production system with a plurality of robots.
In a production line, an industrial robot is used together with a number of other robots or machines. In many applications, a large number of industrial robots are installed at one site. Thus, even when only one robot fails to operate properly, the whole production line may be halted. Therefore, there is a need for means to detect problems occurring at a robot in a production line with a plurality of robots, which may lead to malfunction of the robot, early enough to prevent the halt of the production line.
Detecting problems in a robot that is part of a large fleet of robots is not an easy task. Problems are not always visible to the operators in the line. For instance, synchronization problems might cause delays of a couple of seconds that go unnoticed but add up to a serious problem if they keep occurring over a longer time.
One source of information for monitoring a robot is the event log that each robot produces during operation in production. But analysis of the raw event logs is a very tedious task and does not scale. Event data are monitored for each robot in a dashboard, so there are at least as many dashboards of robot event data as there are robots in the line, which makes it nearly impossible for an operator to monitor or analyze these data.
In an embodiment, the present invention provides a data processing device capable of performing problem diagnosis in a production system with a plurality of robots, comprising: a first time series obtaining part configured to obtain historical event data used for determining some historical alarm indicator in time series and to store the historical event data as first time series data; a historic alarm indicator calculation part configured to calculate a series of historic alarm indicators using statistic characteristics of the first time series data; a threshold definition part configured to define at least one threshold value based on a statistical distribution of the historical alarm indicators; a second time series obtaining part configured to obtain operational event data during operation of the robots used for determining some operational alarm indicator in time series and to store the operational event data as second time series data; an operational alarm indicator calculation part configured to calculate a series of operational alarm indicators using statistic characteristics of the second time series data; an alarm notification part configured to give alarm notifications to one of operational, maintenance, or troubleshooting personnel for alarm indicators above the at least one threshold level; and an event highlighting part configured to highlight to one of the operational, maintenance, or troubleshooting personnel the events that mainly contribute to the operational alarm indicator, in order to determine the events that mainly contribute to the operational alarm indicator.
The present invention will be described in even greater detail below based on the exemplary figures. The invention is not limited to the exemplary embodiments. The attached new drawing sheet presents additional
In an embodiment, the present invention provides a data processing device and a method capable of performing problem diagnosis in a plurality of robots that make it possible to realize an alarm system for fault diagnosis of one or more robots in a fleet of a plurality of robots, that allow an algorithmic analysis of the log files of the robots without large engineering effort, and that reduce the complexity of the monitoring information to a degree that can be processed by a human operator.
The problem is solved with respect to the device according to the invention by a data processing device as described herein.
The problem is solved with respect to the method according to the invention by a method as described herein.
A data processing device according to the invention comprises
a first time series obtaining part configured to obtain historical event data used for determining some historical alarm indicator in time series and store the historical event data as first time series data,
a historic alarm indicator calculation part configured to calculate a series of historic alarm indicators ai,h using statistic characteristics of the first time series data,
a threshold definition part configured to define at least one threshold value based on the statistical distribution of the historical alarm indicators,
a second time series obtaining part configured to obtain operational event data during operation of the robots used for determining some operational alarm indicator ai in time series and store the operational event data as second time series data,
an operational alarm indicator calculation part configured to calculate a series of operational alarm indicators ai using statistic characteristics of the second time series data,
an alarm notification part configured to give alarm notifications to one of operational, maintenance or troubleshooting personnel for alarm indicators above the at least one threshold level,
an event highlighting part configured to highlight to one of operational, maintenance or troubleshooting personnel the events that mainly contribute to the operational alarm indicator ai, in order to determine the events that mainly contribute to the operational alarm indicator ai.
The solution according to the present invention is an algorithmic analysis of the log files that highlights robots that require attention. It derives an alarm logic from historical event data and provides alarms and additional diagnostic information about the type of problem without human intervention.
The data processing device according to the invention provides high level alarm logic robust to single nuisance alarms without the need to manually define or tune alarm thresholds.
It further provides an automatic overview of ‘hot areas’, both in terms of the production structure, for example line or cell, and in terms of the technical nature of the problems, for example application-specific, communication or mechanical.
The invention provides a generic approach to monitoring dashboards based on robot event data.
The data processing device 1 comprises a first time series obtaining part 3 configured to obtain historical event data used for determining some historical alarm indicator in time series and store the historical event data as first time series data. It further comprises a historic alarm indicator calculation part 4 configured to calculate a series of historic alarm indicators ai,h using statistic characteristics of the first time series data. It further comprises a threshold definition part 5 configured to define at least one threshold value based on the statistical distribution of the historical alarm indicators. It further comprises a second time series obtaining part 6 configured to obtain operational event data during operation of the robots 21, 22, 2n used for determining some operational alarm indicator ai in time series and store the operational event data as second time series data. It further comprises an operational alarm indicator calculation part 7 configured to calculate a series of operational alarm indicators ai using statistic characteristics of the second time series data. It further comprises an alarm notification part 8 configured to give alarm notifications to one of operational, maintenance or troubleshooting personnel for alarm indicators above the at least one threshold level. It further comprises an event highlighting part 9 configured to highlight to one of operational, maintenance or troubleshooting personnel the events that mainly contribute to the operational alarm indicator ai, in order to determine the events that mainly contribute to the operational alarm indicator (ai).
Each of the first time series obtaining part 3, historic alarm indicator calculation part 4, threshold definition part 5, second time series obtaining part 6, operational alarm indicator calculation part 7, alarm notification part 8, and event highlighting part 9 may be realized as program modules of the robot control and monitoring software.
The overall process as executed by the data processing device 1 is depicted with reference to
In an initial step S1, historic alarm indicator values are calculated based on historical data. The historic alarm indicator can be defined for time windows, e.g. hours or days, or for alarm episodes, e.g. chunks of events with less than 5 minutes between subsequent events.
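The grouping of events into alarm episodes can be sketched as follows. This is a minimal illustration, assuming that events are available as timestamps; the function name split_into_episodes and the sample times are hypothetical and not taken from the embodiment.

```python
from datetime import datetime, timedelta


def split_into_episodes(timestamps, max_gap=timedelta(minutes=5)):
    """Group event timestamps into episodes.

    A new episode starts whenever the gap to the previous event is
    max_gap or longer, so subsequent events inside one episode are
    always less than max_gap apart.
    """
    episodes = []
    for ts in sorted(timestamps):
        if episodes and ts - episodes[-1][-1] < max_gap:
            episodes[-1].append(ts)
        else:
            episodes.append([ts])
    return episodes


# Hypothetical event times (minutes after 08:00), for illustration only.
base = datetime(2018, 2, 22, 8, 0, 0)
events = [base + timedelta(minutes=m) for m in (0, 2, 4, 15, 16, 40)]
episodes = split_into_episodes(events)
episode_sizes = [len(e) for e in episodes]
print(episode_sizes)
```

An alarm indicator could then be calculated per episode instead of per fixed time window.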
Based on the distribution of the historic alarm indicator values, a threshold value is defined and selected in the following step S2.
In the next step S3, during live operation, new event data is used to calculate an operational alarm indicator. The operational alarm indicator calculation part 7 may for this purpose use the same algorithm as is used in the historic alarm indicator calculation part 4.
For values above the threshold, in step S4 alarm notifications are given to operational, maintenance, or troubleshooting personnel.
In a further step S5, the events that mainly contribute to the operational alarm indicator are highlighted to the operational, maintenance or troubleshooting personnel.
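The sequence of steps S1 to S5 can be sketched in code as follows. This is only an illustrative outline under stated assumptions: the indicator used here is a plain absolute z-score of the event count, chosen as a placeholder and not as the indicator of the embodiment; the function names train and monitor, the choice of the 95th percentile and the per-window category counts are hypothetical.

```python
from statistics import mean, stdev, quantiles


def indicator(count, avg, sd):
    # Placeholder indicator (assumption): absolute z-score of the count.
    return abs(count - avg) / sd if sd > 0 else 0.0


def train(history, percentile=95):
    """S1 and S2: historic indicators and a threshold from their distribution."""
    totals = [sum(window.values()) for window in history]
    avg, sd = mean(totals), stdev(totals)
    historic_ai = [indicator(t, avg, sd) for t in totals]
    threshold = quantiles(historic_ai, n=100, method="inclusive")[percentile - 1]
    categories = sorted({c for window in history for c in window})
    cat_avg = {c: mean(w.get(c, 0) for w in history) for c in categories}
    return {"avg": avg, "sd": sd, "threshold": threshold, "cat_avg": cat_avg}


def monitor(model, window):
    """S3 to S5: operational indicator, alarm decision, main contributors."""
    ai = indicator(sum(window.values()), model["avg"], model["sd"])
    alarm = ai > model["threshold"]
    # S5: rank categories by how far their count exceeds the historic average.
    excess = {c: window.get(c, 0) - a for c, a in model["cat_avg"].items()}
    top = sorted(excess, key=excess.get, reverse=True) if alarm else []
    return ai, alarm, top


# Hypothetical event counts per time window and category.
history = [
    {"motion": 5, "communication": 3},
    {"motion": 6, "communication": 2},
    {"motion": 4, "communication": 4},
    {"motion": 5, "communication": 2},
    {"motion": 7, "communication": 3},
]
model = train(history)
ai_a, alarm_a, top_a = monitor(model, {"motion": 5, "communication": 3})
ai_b, alarm_b, top_b = monitor(model, {"motion": 20, "communication": 3})
print(alarm_a, alarm_b, top_b)
```

The split into a training function and a runtime function mirrors the separation between the parts working on the first time series data and those working on the second time series data.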
The basic idea of the invention is to use statistical characteristics of the event log to calculate an alarm indicator (ai).
For instance, the alarm indicator for one day could be calculated by:
With
x′=number of events in an observation period (day, hour)
σ=standard deviation of the number of events
The ai captures how much of the deviation of the current number of events from the average can be accounted for by the standard deviation and how much is uncommon. Based on the statistical distribution of ai in the historical data, one or several threshold values can be defined, e.g.:
ai>0.3=>yellow alarm
ai>0.5=>orange alarm
ai>0.8=>red alarm
The values could be, for instance, the 75th, 85th and 95th percentiles of the distribution of ai.
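Deriving the three alarm levels from percentiles of the historical ai distribution can be sketched as follows. The function names define_thresholds and alarm_level and the evenly spread sample values are assumptions made for illustration; the percentile interpolation is the "inclusive" method of the Python standard library.

```python
from statistics import quantiles


def define_thresholds(historic_ai):
    """Take the 75th, 85th and 95th percentiles as yellow, orange and red."""
    pct = quantiles(historic_ai, n=100, method="inclusive")  # 99 cut points
    return {"yellow": pct[74], "orange": pct[84], "red": pct[94]}


def alarm_level(ai, thresholds):
    """Return the most severe level whose threshold is exceeded, else None."""
    for level in ("red", "orange", "yellow"):
        if ai > thresholds[level]:
            return level
    return None


# Hypothetical historic indicator values spread evenly over 0.00 .. 1.00.
historic_ai = [i / 100 for i in range(101)]
thresholds = define_thresholds(historic_ai)
print(thresholds)
print(alarm_level(0.90, thresholds))
```

Because the thresholds are read from the data, no manual tuning is needed when the typical event volume of a line changes; the training step is simply repeated.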
The parameters of the ai are determined in an initial training step on historical data.
At runtime, new ‘chunks’ of incoming events are used to calculate an updated value for the alarm indicator ai. If ai is above the threshold, an alarm with all relevant information, e.g. production line, cell, robot, is displayed to the monitoring personnel.
Furthermore, additional information highlighting the current condition of the robot/line/cell is provided. Such additional information may comprise:
A list of events that cause the high value of ai, e.g. unlikely events, uncommon frequent events, missing events and so on, or
a visualization of the alarm indicator in a hierarchical fashion, e.g. visualizing how much of ai is caused by each line/cell/robot and how much of ai is caused by each category of event, where the category of event may be one of communication, tool, application, electrical, etc.
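The data behind such a hierarchical view can be sketched as a simple roll-up. This sketch assumes that raw event counts serve as a proxy for the contribution to ai, which is a simplification; the record fields line, cell, robot and category and the function name roll_up are hypothetical.

```python
from collections import Counter


def roll_up(events):
    """Count events at every level of the line/cell/robot hierarchy
    and, separately, per event category."""
    by_node = Counter()
    by_category = Counter()
    for e in events:
        path = (e["line"], e["cell"], e["robot"])
        for depth in range(1, 4):
            by_node[path[:depth]] += 1  # line, then line/cell, then line/cell/robot
        by_category[e["category"]] += 1
    return by_node, by_category


# Hypothetical events that contributed to a high indicator value.
events = [
    {"line": "L1", "cell": "C1", "robot": "R1", "category": "motion"},
    {"line": "L1", "cell": "C1", "robot": "R1", "category": "motion"},
    {"line": "L1", "cell": "C2", "robot": "R3", "category": "communication"},
    {"line": "L2", "cell": "C4", "robot": "R7", "category": "motion"},
]
by_node, by_category = roll_up(events)
share_line_1 = by_node[("L1",)] / len(events)
print(share_line_1, by_category.most_common(1))
```

A dashboard could render by_node as a tree or treemap, so that an operator can drill down from a line to the cell and robot that account for most of the events.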
Other examples of alarm indicators can be:
Based on the frequency of overall events, event categories, or single events;
Based on the distance to the k-th nearest neighbors using some distance or similarity measure such as cosine, Jaccard, Euclidean or other;
Error rates of machine learning algorithms such as autoencoder networks, regression algorithms, Bayes classifiers or others;
Based on event probabilities and likelihoods, e.g., estimated by kernel density estimation.
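As an illustration of the nearest-neighbor variant in the list above, the following sketch uses the cosine distance to the k-th nearest historical window as the alarm indicator. The representation of a window as a vector of event counts per category, the value k=2 and all sample numbers are assumptions made for this example.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty window as maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def knn_indicator(sample, history, k=2):
    """Distance from the sample to its k-th nearest historical window."""
    distances = sorted(cosine_distance(sample, h) for h in history)
    return distances[k - 1]


# Hypothetical counts per window: [motion, communication, controller].
history = [[10, 2, 1], [9, 3, 1], [11, 2, 0], [10, 3, 1]]
ai_usual = knn_indicator([10, 2, 1], history)
ai_unusual = knn_indicator([1, 12, 1], history)
print(round(ai_usual, 4), round(ai_unusual, 4))
```

A window whose mix of event categories resembles past windows receives a value near zero, while a window dominated by a normally rare category receives a clearly larger value.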
A further advantageous embodiment can be the combination of a machine learning algorithm, e.g. one-class SVM, kNN anomaly detection, local outlier factor or autoencoder networks, to detect anomalies in the event data, and an alarm indicator to identify the events that probably cause the decision of the machine learning algorithm towards anomaly.
A further advantageous embodiment may be an integration into an alarm system, e.g. a state-based or event-based alarm system. Such an alarm system, in case it is state-based, will trigger an alarm while the score is above a threshold. In case it is event-based, it will trigger an alarm when the score exceeds a threshold.
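The difference between the two alarm system types can be sketched as follows, assuming the score is evaluated once per time step. The function names state_based and event_based and the sample scores are hypothetical.

```python
def state_based(scores, threshold):
    """Alarm is active in every step in which the score is above the threshold."""
    return [s > threshold for s in scores]


def event_based(scores, threshold):
    """Alarm fires only in the step in which the score rises above the threshold."""
    fired = []
    previous_above = False
    for s in scores:
        above = s > threshold
        fired.append(above and not previous_above)
        previous_above = above
    return fired


# Hypothetical sequence of alarm indicator scores.
scores = [0.2, 0.6, 0.7, 0.4, 0.9]
active = state_based(scores, threshold=0.5)
fired = event_based(scores, threshold=0.5)
print(active)
print(fired)
```

With these scores, the state-based variant reports an active alarm in three steps, whereas the event-based variant raises two separate alarms, one per upward crossing of the threshold.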
A further advantageous embodiment may be to include arbitrary input data like I/O or analog signals.
Input for the calculation of alarm Key Performance Indicators (KPIs) is the log file of the robots or production system. Below is an example of a log file produced by one or several robots. The example log file has the attributes category and message. A log file can have more attributes, like several message arguments or event severity.
This example shows how to calculate an alarm indicator based on the number of events in a specified time slot. Input is the historical time series as shown in the table above with events from one or several robots. The robot events have different categories, e.g. motion (e.g. collision event, path executed), communication (e.g. I/O status change, I/O card missing), controller (e.g. backup, high temperature), depending on the source of the event documented in the log. In a first step, the number of events per hour is calculated. For ten hours, this results in the following numbers per hour:
The following tables show the average and standard deviation for the total number of events and the number of events in each category:
An alarm KPI can be calculated by the formula:
with
The 10th percentile for AI is 0.68, the 25th percentile is 0.85 and the 75th percentile is 1. This can be used to define the following alarm thresholds:
Assuming that different categories of events exist, such as motion events, tool events and controller events, additional information can be provided about the main contributors to events. The contribution can, for instance, be calculated by applying the alarm indicator formula to the event categories.
During operation, the events generated during the last 60 minutes are collected from the robot and the alarm indicator is calculated. Assuming the following two example hours:
The events in hour A result in an alarm indicator AIA=0.6 1 and thus will not trigger an alarm. The events in hour B result in an alarm indicator AIB=0.6 and will result in showing a red alarm. Furthermore, the alarm indicators for hour B per category are AIB,motion=0.7, AIB,communication=1 and AIB,controller=1, implying that a possible issue is related to the motion of the robot.
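The preparatory steps of this example, counting events per hour and computing the average and standard deviation in total and per category, can be sketched as follows. The log records and the function names events_per_hour and summarize are hypothetical; the alarm indicator formula itself is not reproduced in this sketch.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean, stdev


def events_per_hour(log):
    """Count events per hour and category from (timestamp, category) records."""
    counts = defaultdict(Counter)
    for ts, category in log:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[hour][category] += 1
    return counts


def summarize(counts):
    """Average and standard deviation of the hourly counts, in total and
    per category. Hours without any event do not appear in the input and
    are therefore not part of the statistics."""
    hours = sorted(counts)
    categories = sorted({c for h in hours for c in counts[h]})
    series = {"total": [sum(counts[h].values()) for h in hours]}
    for c in categories:
        series[c] = [counts[h][c] for h in hours]  # missing category counts as 0
    return {name: (mean(values), stdev(values)) for name, values in series.items()}


# Hypothetical log records (hour, minute, category), for illustration only.
raw = [(8, 5, "motion"), (8, 20, "motion"), (8, 40, "communication"),
       (9, 1, "motion"), (9, 2, "motion"), (9, 30, "motion"), (9, 59, "motion"),
       (10, 10, "motion"), (10, 11, "motion"), (10, 12, "motion"),
       (10, 50, "controller")]
log = [(datetime(2018, 2, 22, h, m), c) for h, m, c in raw]
stats = summarize(events_per_hour(log))
print(stats["total"], stats["motion"])
```

The same statistics can be computed for the last 60 minutes during operation, so that the indicator for a new hour is evaluated against the historical average and standard deviation.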
This example shows how a machine learning model can be used to calculate an alarm KPI. Input is again the historical log file of one or several robots. The log file table is transformed into samples for a decision tree classification:
Here, event category is the category of an event from the event log, category 1st event before is the category of the event one row before the event, and category 2nd event before is the category of the event two rows before the event. Each row (apart from the ‘row from original table’ column) is used as one sample for the decision tree training, where the decision tree is trained to predict the category of the event from the categories of the 1st and 2nd events before it. This is a typical classification problem in machine learning.
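The transformation into training samples can be sketched as follows. To keep the sketch free of external libraries, a frequency lookup table is used as a stand-in for the decision tree, which is a deliberate simplification and not the classifier of the example; the error rate of the predictor on new events is then taken as one possible alarm indicator. All function names and category sequences are hypothetical.

```python
from collections import Counter, defaultdict


def make_samples(categories):
    """Build ((1st before, 2nd before), category) samples from a sequence."""
    return [((categories[i - 1], categories[i - 2]), categories[i])
            for i in range(2, len(categories))]


def fit_lookup(samples):
    """Stand-in for the decision tree: most frequent category per context."""
    table = defaultdict(Counter)
    for context, target in samples:
        table[context][target] += 1
    return {ctx: cnt.most_common(1)[0][0] for ctx, cnt in table.items()}


def error_rate(model, samples):
    """Share of events whose category was not predicted correctly.
    Contexts never seen in training count as errors."""
    wrong = sum(1 for ctx, target in samples if model.get(ctx) != target)
    return wrong / len(samples)


# Hypothetical category sequences from the historical and the live log.
historic = ["motion", "communication", "controller"] * 4
live = ["motion", "communication", "controller", "motion", "motion", "motion"]

model = fit_lookup(make_samples(historic))
ai_historic = error_rate(model, make_samples(historic))
ai_live = error_rate(model, make_samples(live))
print(ai_historic, ai_live)
```

A rising error rate indicates that the order of events no longer follows the patterns learned from the historical log, and the mispredicted events are natural candidates for highlighting.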
Other machine learning models might use regression or probability estimations (e.g. kernel density estimation) and their output to derive an alarm indicator.
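A probability-based indicator using kernel density estimation can be sketched as follows. The Gaussian kernel, the fixed bandwidth of 1.0, the scaling of the indicator to the range 0 to 1 and the hourly counts are all assumptions made for this illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated with Gaussian kernels."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in samples) / norm

    return density


def kde_indicator(density, x, peak):
    """1 minus the density relative to its historical peak:
    0 for the most typical value, close to 1 for rare values."""
    return 1.0 - min(density(x) / peak, 1.0)


# Hypothetical historic numbers of events per hour.
hourly_counts = [8, 9, 10, 10, 11, 12, 9, 10, 11, 10]
density = gaussian_kde(hourly_counts, bandwidth=1.0)
peak = max(density(c) for c in hourly_counts)

ai_typical = kde_indicator(density, 10, peak)
ai_rare = kde_indicator(density, 25, peak)
print(round(ai_typical, 3), round(ai_rare, 3))
```

Thresholds for such an indicator can again be taken from percentiles of its distribution over the historical data.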
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. It will be understood that changes and modifications may be made by those of ordinary skill within the scope of the following claims. In particular, the present invention covers further embodiments with any combination of features from different embodiments described above and below. Additionally, statements made herein characterizing the invention refer to an embodiment of the invention and not necessarily all embodiments.
The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.
Number | Date | Country | Kind |
---|---|---|---|
18158091.1 | Feb 2018 | EP | regional |
This application is a continuation of International Patent Application No. PCT/EP2018/081202, filed on Nov. 14, 2018, which claims priority to European Patent Application No. EP 18158091.1, filed on Feb. 22, 2018. The entire disclosure of both applications is hereby incorporated by reference herein.
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/EP2018/081202 | Nov 2018 | US |
Child | 16996928 | | US |