The present invention relates to a pre-processor and a diagnosis device.
In abnormality portent diagnosis technology based on clustering, it is necessary to divide multi-dimensional chronological sensor data changing over time (hereinafter referred to as “sensor data”) into a plurality of intervals (period of time) and process them. The reason is that since there is data of different states when there are a plurality of states, clusters for expressing a complicated space distribution are not necessarily able to distinguish different states, and when states are not distinguished, abnormality detection sensitivity is likely to be lowered.
For example, when current sensor data including three states of “OFF,” “ON,” and “transient” is diagnosed, although a current value is 0.5 times a set value, and it is actually abnormal “ON” current data, there are cases in which it belongs to a cluster of the state of “transient” changing over time, and it is determined to be normal data. In order to solve this problem, a method of dividing the sensor data into a plurality of state intervals and processing each state interval before the abnormality diagnosis (preprocessing) is considered to be effective.
For abnormality portent diagnosis of sensor data changing over time, various data division techniques have been developed until now. A technique of dividing a mode in accordance with a driving state using event information of a facility is disclosed in JP 2015-172945 A. A technique in which event information can be used, but when there is no event information, a state transition point is detected in accordance with a variation range of sensor data is disclosed in JP 2011-243118 A. A technique of calculating a difference between sensor data and sensor data adjacent thereto and dividing data in accordance with a value of the difference is disclosed in JP 2013-175108 A.
Using the techniques disclosed in JP 2015-172945 A, JP 2011-243118 A, and JP 2013-175108 A, it is possible to divide data within ranges disclosed therein. However, in the technique disclosed in JP 2015-172945 A, since the event information is necessary, there is a problem that it depends on a type of facility and a setting of a sensor, and division is unable to be performed when the event information is insufficient.
Further, in the technique disclosed in JP 2011-243118 A, since a known steady state is detected to estimate a transient state interval, it is a condition that a known steady state is necessary, and thus it is difficult to divide data which is not steady but changes over time.
In the technique disclosed in JP 2013-175108 A, since noise of sensor data has a large influence on a result of a difference calculation with adjacent sensor data, there is a problem in that many fine intervals caused by noise occur in a division result based on the difference even when a smoothing process is performed. Further, since a process of optimizing important parameters, such as smoothing parameters, for sensor data of various devices, for example, sensor data of a plurality of sensors having different SN ratios, is not disclosed, there is a problem in that the accuracy of data division varies from sensor to sensor.
Further, generally, actual sensor data often includes both a large variation (trend) and a small variation. In this case, there is a demand for a function that enables the user to define or check the degree of variation that is treated as noise, or whether it is necessary to treat a variation as a signal and divide the data at it, but a technique responding to this demand is not disclosed in JP 2013-175108 A, and it is difficult for the user to customize the division.
In this regard, it is an object of the present invention to provide a technique of dividing sensor data while suppressing influence of the noise for diagnosis.
A representative pre-processor of a diagnosis device according to the present invention pre-processes sensor data, and includes: an interval setting unit that acquires chronological sensor data, calculates a first index and a second index indicating different influences for noise included in the sensor data, and sets an interval in which influence of the noise is suppressed in accordance with the first index and the second index; a division processing unit that sets a dividing point at which a trend of the sensor data changes in units of intervals in which the influence of the noise is suppressed, the intervals being set by the interval setting unit; and a dividing point extraction processing unit that generates a dividing point vector from a time of the dividing point set by the division processing unit and outputs the dividing point vector.
According to the present invention, it is possible to provide a technique of dividing sensor data while suppressing influence of the noise for diagnosis.
An embodiment of the present invention relates to an abnormality portent diagnosis device or an abnormality portent diagnosis system including a pre-processor and an abnormality portent diagnosis processing unit. As an example of the embodiment, each of a pre-processor and an abnormality portent diagnosis processing unit may be a general computer, an implementation of software that performs processing in accordance with a program, or an implementation of dedicated hardware other than a general computer.
Further, it may be implemented such that dedicated hardware is incorporated into a computer, and an implementation of software is combined with an implementation of hardware. The pre-processor may be externally connected as preprocessing of the abnormality portent diagnosis processing unit, may be incorporated into the abnormality portent diagnosis processing unit and internally connected, or may be externally connected as a module shared with other data processing. Hereinafter, an embodiment will be described with reference to the accompanying drawings.
An example of dividing sensor data changing with the passage of time into a plurality of intervals and performing diagnosis on each interval or a period of time between dividing points in which a plurality of intervals are collected will be described.
An example of the abnormality portent diagnosis system includes an abnormality portent diagnosis processing unit 2, a pre-processor 1 serving as a pre-processing device thereof, and a multi-dimensional sensor 3 as illustrated in
The pre-processor 1 includes an interval setting unit 11, a data division processing unit 12, a dividing point merge processing unit 13, a division result display control unit 14, a data extraction processing unit 15, a dividing point extraction processing unit 16, and a feature quantity calculating unit 17. The abnormality portent diagnosis processing unit 2 includes an abnormality portent diagnosis device 21, a dividing point time abnormality diagnosis processing unit 22, an abnormality portent diagnosis device 23, and a comprehensive diagnosis result determination processing unit 24.
In the pre-processor 1, for sensor data collected from the multi-dimensional sensor 3 serving as a monitoring device, a temporal change indicating a variation (trend) in sensor data, that is, a temporal change in an inclination when sensor data is indicated by a graph is obtained. To this end, the interval setting unit 11 obtains a time range indicated by one inclination as an interval and obtains a temporal change in sensor data by obtaining an inclination of the sensor data for each interval.
Here, one inclination has any one of a positive value, a negative value, or a substantially zero value. The data division processing unit 12 collects consecutive intervals having the same inclination value into one and obtains a point at which an inclination value changes, that is, a point serving as a boundary point of the collected interval, as a dividing point. The dividing point merge processing unit 13 merges periods of time before and after the dividing point in accordance with a condition such as a distance (period of time) between the dividing points or the number of dividing points. In other words, the dividing point is deleted, and the two periods of time that were divided are merged into one.
The division result display control unit 14 causes the sensor data and the dividing point to be displayed on a screen, receives correction of division performed by the user, transmits correction content to the interval setting unit 11 as a parameter (a coefficient), and causes the interval setting unit 11 to perform reprocessing.
The data extraction processing unit 15 extracts sensor data for each period of time between the dividing points and transmits the extracted sensor data to the abnormality portent diagnosis processing unit 2. The dividing point extraction processing unit 16 extracts a time of the dividing point and transmits the extracted time to the abnormality portent diagnosis processing unit 2. The feature quantity calculating unit 17 calculates a feature quantity of the sensor data for each period of time between the dividing points and transmits the calculated feature quantity to the abnormality portent diagnosis processing unit 2. The feature quantity will be described later with reference to
The abnormality portent diagnosis processing unit 2 receives the sensor data, the time of the dividing point, and the feature quantity from the pre-processor 1. The abnormality portent diagnosis device 21 performs abnormality diagnosis in accordance with a known clustering technique using the sensor data of each period of time between the dividing points. The dividing point time abnormality diagnosis processing unit 22 performs abnormality diagnosis using the time of the dividing point. The abnormality portent diagnosis device 23 performs abnormality diagnosis in accordance with a known clustering technique using the feature quantity of each period of time between the dividing points.
The comprehensive diagnosis result determination processing unit 24 comprehensively determines whether or not the sensor data is abnormal for each period of time between the dividing points on the basis of the abnormality diagnosis results of abnormality portent diagnosis device 21, the dividing point time abnormality diagnosis processing unit 22, and the abnormality portent diagnosis device 23.
Intervals are set to “3,” “11,” and “31” for the sensor data of the graph 30, one inclination is calculated from the value of the sensor data for each interval, and graphs indicating relations between the calculated inclination and the time as the temporal change are indicated by graphs 31, 32, and 33.
The graph 31 indicates a graph when the interval is set to “3.” As illustrated in the graph 31, when the interval is too short, since influence of the noise on an inclination calculation is large, and the trend of the sensor data is not distinguished from the noise, the trend of the sensor data is not accurately reflected.
The graph 32 indicates a graph when the interval is set to “11.” As illustrated in the graph 32, when an appropriate interval is set, the trend of the sensor data is accurately reflected. The reason is that when a long interval is set, an inclination according to linear regression is calculated using more pieces of data, and thus there is an averaging effect. In other words, the influence of the noise is reduced by a large number of sensor data indicating the trend.
However, when the interval is too long, an excessive averaging effect occurs, leading to an adverse effect. The graph 33 indicates a graph when the interval is set to “31.” As illustrated in the graph 33, not only the noise but also the trend of part of the sensor data included in original sensor data is lost, and the trend of the sensor data is not accurately reflected.
In this situation, it is necessary to optimize the interval in order to reduce both the influence of the noise and the influence of the excessive averaging effect. Since the widths of the noise and the trend of the data are different depending on sensor data, the optimal interval is different. Next, an evaluation index used for optimizing the interval will be described.
Further, when the multi-dimensional sensor 3 detects a state of a certain device, there are cases in which the sensor data abruptly changes as in sensor data in a period of time 30a, and a pattern of the change indicates a constant pattern specific to the device. For example, there are cases in which a pattern (for example, a rising waveform) of sensor data when a device is powered on is constant, and the occurrence of this pattern may be a time reference of the sensor data. The constant pattern may be a change of the value of the sensor data from “0.”
In the example illustrated in
In
Since the index decreases as the interval increases, a part in which the inclination lasts long is useful. In the case of an interval longer than the first period of time, for example, an inclination of a straight line 36a is obtained. Hereinafter, this index is referred to as an “index A.” In order to calculate the index A, an inclination changing over time to be described later in
When the inclination is calculated from sensor data of each interval using a linear regression technique, a sum of residuals between the straight line and the sensor data calculated by the linear regression technique is calculated. The residual indicates an error between the trend and actual sensor data when the trend of the sensor data in the interval is indicated by one straight line. Then, when the sensor data in the interval follows the same trend, the residual is the error caused by the noise included in the actual sensor data.
When the interval is too long, and sensor data of different trends is included in one interval, since a phenomenon that the trend is different is also included in the residual as the error, the residual tends to be larger than usual. Due to this feature, in the present embodiment, the residual is used for evaluating the influence of the excessive averaging effect as the index.
In
In the case of an interval longer than the second period of time, for example, an inclination of the straight line 36b straddling the two trends is obtained, and although a residual according to fine positive (+) and negative (−) inclinations in a region 39 is based on noise, a residual occurs between the sensor data indicating the first trend and the sensor data indicating the second trend as in a left end and a right end of the region 39.
As described above, since a value of this index increases as the interval increases, this index is opposite to the index A, and a part in which the inclination lasts short is useful. Hereinafter, this index is referred to as an “index B.” In order to calculate the index B, an inclination changing over time to be described later in
An optimization range of the length of the interval is set to values “3” to “51” in advance. In order to perform a calculation on all odd values “3,” “5,” “7,” . . . , “49,” and “51,” the interval setting unit 11 extracts one length of the interval which is not yet used for the calculation from the optimization range and sets the extracted length as the length of the interval of the calculation target (step 51). For this reason, the odd value “3” is set in the first execution of step 51, and the odd value “5” may be set in the second execution.
Further, a period of time of the sensor data to be used for optimization (for example, an operation period of time of a target from which sensor data is detected such as eight periods of time) is set in advance. In order to perform the calculation while moving the position of the interval over all times within the period of time, the interval setting unit 11 extracts one time which is not yet used for the calculation from the set period of time, and sets the extracted time as the position of the interval of the calculation target or the time of the center of the interval (step 52). This setting will be described later with reference to
The interval setting unit 11 calculates the inclination of the graph of the sensor data in the interval set in steps 51 and 52, calculates the index B (step 53), and calculates the index A (step 54). Then, it is determined whether or not up to the last sensor data of the period of time set in advance is used as the calculation target (step 55), and when up to the last sensor data is determined not to be used as the calculation target, the process returns to step 52, the interval is moved so that the position of the next sensor data is included, and when up to the last sensor data is determined to be used as the calculation target, the process proceeds to step 56.
The linear regression indicated by (Formula 3) is performed on the sensor data at the center of the interval serving as the calculation target and the ((length of interval − 1)/2) pieces of data before and after it.
S=a×T+b (Formula 3)
Here, (Formula 3) is a formula indicating a straight line when a vertical axis is S and a horizontal axis is T. S (dependent variable) indicates a value corresponding to the sensor data, T (independent variable) indicates a time, a coefficient “a” indicates the inclination, and a coefficient “b” indicates the intercept, that is, a value corresponding to the sensor data at an end of the interval.
In
The calculation is unable to be performed on the first sensor data and the last sensor data of the period of time of the sensor data used for optimization set in advance (the period of time serving as the evaluation target) because the sensor data before or after it is insufficient, and thus zero or a value which can be distinguished from the inclination may be set. In the example of
Further, in the present embodiment, it is defined that the inclination at the center of the interval is calculated, but another position within the interval may be defined depending on an implementation state. For example, when the calculation is performed in real time, it may be defined that an inclination at a calculation time point is calculated with the sensor data included in an interval prior to the calculation time point. Further, preprocessing such as an outlier process and a normalization process may be performed before the above process.
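The inclination calculation of (Formula 3) over a moving interval can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names `fit_line` and `sliding_slopes`, the fill value of 0.0 at the edges, and the sample values are assumptions introduced only for this example.

```python
def fit_line(times, values):
    """Least-squares fit of S = a*T + b (Formula 3); returns (a, b)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, values))
    a = cov / var_t
    b = mean_s - a * mean_t
    return a, b


def sliding_slopes(times, values, length, fill=0.0):
    """Inclination at the center of each odd-length interval.

    Positions where the interval does not fit (the first and last
    (length - 1) / 2 samples) receive `fill`, an assumed placeholder.
    """
    if length < 3 or length % 2 == 0:
        raise ValueError("length must be an odd number of at least 3")
    half = (length - 1) // 2
    slopes = [fill] * len(values)
    for center in range(half, len(values) - half):
        lo, hi = center - half, center + half + 1
        a, _ = fit_line(times[lo:hi], values[lo:hi])
        slopes[center] = a
    return slopes


# Illustrative data: a rising trend followed by a steady trend.
times = list(range(7))
values = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0, 3.0]
slopes = sliding_slopes(times, values, length=3)
```

With an interval length of 3, the inclination is about 1 inside the rising part, about 0.5 at the sample where the two trends meet, and 0 in the steady part.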
Here, for example, the calculation of the index B in step 53 may be a sum of five differences obtained as follows: the coefficient “a” and the coefficient “b” of (Formula 3) are calculated from the values of the sensor data 61 to the sensor data 63 and their times T; S of (Formula 3) in which the time of the sensor data 61 is T is then calculated using the calculated coefficient “a” and coefficient “b”; a difference (corresponding to the residual) between the calculated S and the value of the sensor data 61 is calculated; and the calculation of the difference is repeated up to the sensor data 63.
In the calculation of the index A in step 54, the value 65 is stored; when the value 66 is stored, it is determined whether or not the stored value 65 and the value 66 are a pair of different signs; and the number of pairs of different signs is counted up when they are determined to be a pair of different signs. The storage process, the determination process, and the counting-up process may be performed in the same manner for values after the value 66.
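The residual sum for one interval can be sketched as follows. The sketch assumes that absolute differences are summed, because the signed residuals of a least-squares line with an intercept cancel to zero; the text does not state which form is used. The names `fit_line` and `residual_sum` and the sample values are hypothetical.

```python
def fit_line(times, values):
    """Least-squares fit of S = a*T + b (Formula 3); returns (a, b)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, values))
    a = cov / var_t
    return a, mean_s - a * mean_t


def residual_sum(times, values):
    """Sum of absolute differences between the fitted line and the data.

    Assumption: absolute values are used so that positive and negative
    residuals do not cancel.
    """
    a, b = fit_line(times, values)
    return sum(abs(s - (a * t + b)) for t, s in zip(times, values))


# Five samples following one trend: the line fits and the residual is ~0.
single_trend = residual_sum([0, 1, 2, 3, 4], [0.0, 1.0, 2.0, 3.0, 4.0])

# Five samples straddling a rising and a falling trend: the residual grows.
two_trends = residual_sum([0, 1, 2, 3, 4], [0.0, 1.0, 2.0, 1.0, 0.0])
```

The second interval contains two different trends, so a single straight line cannot follow it and the residual sum is clearly larger than for the first interval.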
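The counting of pairs of different signs can be sketched as follows. This is an illustrative reading of the description: the function name `count_sign_changes` is hypothetical, and treating an inclination of exactly zero as having no sign (so that it never forms a pair of different signs) is an assumption.

```python
def count_sign_changes(slopes):
    """Count adjacent inclination values whose signs differ.

    Each value is compared with the previously stored value; a pair is
    counted only when one is strictly positive and the other strictly
    negative (assumption: zero forms no pair).
    """
    count = 0
    previous = None
    for current in slopes:
        if previous is not None and previous * current < 0:
            count += 1
        previous = current  # store the value for the next comparison
    return count


# Inclinations dominated by noise flip sign at almost every step.
noisy = count_sign_changes([0.4, -0.3, 0.2, -0.1, 0.5])

# Inclinations following one trend keep the same sign.
smooth = count_sign_changes([0.4, 0.3, 0.2, 0.1, 0.05])
```

A short interval lets the noise dominate the inclination, so the count is high; a longer interval averages the noise and the count falls.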
Referring back to
As described above, the index A increases as the interval decreases, and the index B increases as the interval increases. For this reason, solid lines 71 and 72 of values obtained by multiplying the index A by a coefficient A increase as the interval decreases, and a dashed line 73 of a value obtained by multiplying the index B by a coefficient B increases as the interval increases. In the example of
In this case, it is possible to change degrees of importance of the index A and the index B in accordance with settings of the coefficient A and the coefficient B, and the length of the optimized interval is influenced. For example, when the coefficient A is increased for the solid line 71, it becomes the solid line 72, a degree of influence of the index A increases, the length 74 of the interval changes to the length 75 of the interval, and the interval increases. The coefficient A and the coefficient B may be set by the user or may be set in accordance with a division result of test data.
An example of another optimization evaluation will be described using (Formula 1) and (Formula 2).
Evaluation value C=index A×coefficient A+index B×coefficient B (Formula 1)
Evaluation value D=index A×index B (Formula 2)
The length of the interval in which the evaluation value C calculated by (Formula 1) is the smallest may be taken as the length of the optimized interval. Further, the length of the interval in which the evaluation value D calculated by (Formula 2) becomes maximum may be used as the length of the optimized interval. The optimization illustrated in
In the present embodiment, a technique of adjusting the coefficient A and the coefficient B in accordance with a result of correcting the division result by the user is provided. Accordingly, the user can easily customize the division process even without understanding the division process mechanism. The detailed description will be given later.
However, since noise is included in actual sensor data, there are few points at which the inclination is exactly zero. Therefore, when an absolute value of the inclination value is smaller than a preset threshold value, the inclination is defined as substantially zero, indicating a steady trend which is neither rising nor falling. In accordance with this definition, the inclination may be defined as positive or negative when the absolute value of the inclination value is larger than the same threshold value.
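The selection of the interval length by (Formula 1) and the effect of the two coefficients can be sketched as follows. The index values in `indices` are invented for illustration only and are not measured results; the names `evaluation_c` and `best_length` are hypothetical.

```python
def evaluation_c(index_a, index_b, coeff_a=1.0, coeff_b=1.0):
    """Evaluation value C of (Formula 1)."""
    return index_a * coeff_a + index_b * coeff_b


def best_length(indices, coeff_a=1.0, coeff_b=1.0):
    """Return the interval length whose evaluation value C is smallest.

    `indices` maps an interval length to a pair (index A, index B).
    """
    return min(
        indices,
        key=lambda length: evaluation_c(*indices[length], coeff_a, coeff_b),
    )


# Made-up index values: index A falls and index B rises with the length.
indices = {3: (40, 1.0), 11: (10, 4.0), 31: (2, 30.0)}

balanced = best_length(indices)                   # both coefficients 1
noise_focused = best_length(indices, coeff_a=5)   # index A weighted more
trend_focused = best_length(indices, coeff_b=20)  # index B weighted more
```

In this toy data, raising the coefficient A moves the choice to a longer interval and raising the coefficient B moves it to a shorter one, which matches the direction described for the two coefficients.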
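The three-valued inclination can be sketched as a small helper. The function name `trend_sign`, the encoding of the three values as 1, 0, and -1, and the handling of an absolute value exactly equal to the threshold (treated here as not substantially zero) are assumptions.

```python
def trend_sign(slope, threshold):
    """Classify an inclination as rising (1), falling (-1), or steady (0).

    An inclination whose absolute value is smaller than `threshold` is
    treated as substantially zero.
    """
    if abs(slope) < threshold:
        return 0
    return 1 if slope > 0 else -1


signs = [trend_sign(s, threshold=0.1) for s in [0.5, 0.05, -0.02, -0.3]]
```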
The data division processing unit 12 first sets an inclination value of a temporally first interval as an inclination A, sets a representative time of the first interval as both a start point and an end point, and sets the first interval as a target interval of the following process (step 81). The representative time of the interval may be a time corresponding to the sensor data at the center of the interval or a time corresponding to another sensor data in the interval.
The data division processing unit 12 sets a temporally next interval of the target interval set in step 81 or previous step 82 (in a loop of step 82 to step 87) as a new target interval, and sets an inclination value of the new target interval and a representative time as an inclination B and a time point B (step 82).
The data division processing unit 12 determines whether the inclination A and the inclination B have the same inclination value (step 83). When the inclination A and the inclination B are determined to have the same inclination value, the data is not divided at this end point, and thus the time of the time point B is set as a new provisional end point, and the end point is updated (step 84). When the inclination A and the inclination B are determined not to have the same inclination value, the data is divided at this end point, and the process proceeds to step 85.
The data division processing unit 12 stores the values of the inclination A, the start point, and the end point (step 85), and records information indicating that the division is performed at the end point. Then, the inclination A, the start point, and the end point are updated by setting the value of the inclination B as the inclination A and setting the value of the time point B as both the start point and the end point (step 86), and are used as a reference for finding a new dividing point.
The data division processing unit 12 determines whether or not the target interval is a temporally last interval (step 87). When the target interval is determined not to be the last interval, the process returns to step 82, a next interval of the target interval is set as a new target interval, and the process is repeated. When the target interval is determined to be the last interval, the division process ends. The target interval in step 87 is the target interval set in step 82. Further, when the process proceeds from step 84 to step 87, and the division process ends, the values of the inclination A, the start point, and the end point may be stored.
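The division loop described above can be sketched as follows. This is a simplified illustration: the function name `divide`, the tuple layout `(inclination, start, end)`, and the choice to always store the final period when the loop ends are assumptions.

```python
def divide(trends, times):
    """Group consecutive intervals with the same inclination value.

    `trends` holds one inclination value (positive, negative, or
    substantially zero, here 1, -1, 0) per interval and `times` the
    representative time of each interval. Returns a list of
    (inclination, start, end) tuples.
    """
    if not trends:
        return []
    periods = []
    inclination_a = trends[0]          # step 81: first interval
    start = end = times[0]
    for inclination_b, time_b in zip(trends[1:], times[1:]):
        if inclination_b == inclination_a:
            end = time_b               # step 84: extend the end point
        else:
            periods.append((inclination_a, start, end))  # step 85
            inclination_a = inclination_b                # step 86
            start = end = time_b
    # The loop ends after the last interval (step 87); the period still
    # in progress is stored here (assumed behavior).
    periods.append((inclination_a, start, end))
    return periods


periods = divide([1, 1, 0, 0, 0, -1], [0, 1, 2, 3, 4, 5])
```

The start and end points of each stored tuple are the dividing points, and each tuple covers one period of time between dividing points with a single trend.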
Each of the start point and the end point stored in execution of step 85 is the dividing point, and a period of time between the start point and end point is referred to as a period of time between the dividing points. Then, when a plurality of intervals are included in the period of time between the dividing points, since the inclination values (positive, negative, or substantially zero) of the included intervals are identical to one another, one period of time between the dividing points is likely to indicate one trend.
Further, in a period of time between dividing points and a temporally next period of time between dividing points, two dividing points, that is, the dividing point serving as the end point of the former period of time between the dividing points and the dividing point serving as the start point of the latter period of time between the dividing points are adjacent to each other, and thus one of the two adjacent dividing points may be regarded as one dividing point between the former period of time between the dividing points and the latter period of time between the dividing points.
In addition, when the number of periods of time between the dividing points is known, it is possible to improve the accuracy of division by merging the periods of time between the dividing points. For example, if a condition of dividing the sensor data into three is set in advance, the periods of time between the dividing points are merged in ascending order of their lengths so that three periods of time between the dividing points finally remain. In the present embodiment, the adjacent periods of time between the dividing points are merged starting from the shortest period of time between the dividing points.
To this end, the dividing point merge processing unit 13 first finds the shortest period of time between the dividing points among the periods of time between the dividing points on the basis of the dividing points stored in step 85 or the like by the data division processing unit 12 (step 91). In this example, the shorter of the two periods of time between dividing points before and after the found period of time between the dividing points is selected as a merge target (step 92). Then, the period of time between the dividing points of the selected merge target is merged with the shortest period of time between the dividing points (step 93).
In step 92 of this example, the shorter period of time between the dividing points is selected as the merge target, but the longer period of time between the dividing points may be selected as the merge target depending on actual data. Alternatively, the merge target may be selected on the basis of other merge target selection criteria set by the user.
The merged period of time between the dividing points is stored as one period of time between the dividing points, and it is determined whether or not a division condition is satisfied (step 94), and when the division condition is determined not to be satisfied, the process returns to step 91, and the process of finding the shortest period of time between the dividing points using the merged period of time between the dividing points as one period of time between the dividing points is repeated until the division condition is satisfied.
For example, the division condition may be a condition that the number of sensor data included in the shortest period of time between the dividing points is “5” or more, a condition that the number of periods of time between the dividing points is “3,” or any other condition. The merge process may be skipped depending on the purpose of division, and when the merge process is skipped, an output of the data division processing unit 12 is transmitted to the division result display control unit 14. Further, when the division condition is satisfied, the dividing point merge processing unit 13 may transmit the dividing points divided by the data division processing unit 12 to the division result display control unit 14 without merging them.
A scroll bar of a horizontal axis is displayed in order to display previous division results. A scroll bar of a vertical axis is displayed in the display portion 1411 in order to display the division results for five or more sensors. For the period of time to be displayed without scrolling, a selection menu such as 30 minutes, 1 hour, or 1 day is displayed in a display range 1412. When a click (touch) on the monitor button 1413 is detected, switching to the monitoring screen 145 is performed, and the monitoring screen is displayed.
In the monitoring screen 145, a division result of a selected sensor is enlarged and displayed with a simple screen such as a display portion 1453. In a selection menu 1451, an identifier of a selected sensor is displayed. A display range 1452 is a display allowing selection similar to the display range 1412. When a click on a set button 1454 is detected, switching to the setting screen 141 is performed, and the setting screen is displayed. Further, when the monitoring screen 145 is displayed, since there is no correction operation performed by the user, information of the sensor data and the dividing point is transmitted to a next processing unit without change.
The setting screen 141 is a screen used for the user to correct the division result, but since it is necessary to know the division process mechanism in order to directly correct the dividing point, there are cases in which correction is hard for some users. In order to solve this problem, the present embodiment provides intuitively understandable parameters. For example, as illustrated in a calculation parameter setting 1420, a trend indication accuracy slide and a noise reduction strength slide are displayed, and movement of each slide is detected so that a value according to the position of the slide can be input.
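The merge loop of steps 91 to 94 can be sketched as follows, taking as the division condition that a given number of periods of time remains. The function name `merge_periods`, the representation of a period as a `(start, end)` pair, and the sample periods are assumptions made for this illustration.

```python
def merge_periods(periods, target_count):
    """Merge the shortest period with its shorter neighbor until only
    `target_count` periods remain (assumed division condition).

    `periods` is a time-ordered list of (start, end) pairs.
    """
    periods = list(periods)
    while len(periods) > target_count and len(periods) > 1:   # step 94
        lengths = [end - start for start, end in periods]
        i = lengths.index(min(lengths))                       # step 91
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(periods)]
        j = min(neighbors, key=lambda k: lengths[k])          # step 92
        lo, hi = min(i, j), max(i, j)
        # step 93: replace the two adjacent periods with one merged period
        periods[lo:hi + 1] = [(periods[lo][0], periods[hi][1])]
    return periods


merged = merge_periods(
    [(0, 10), (10, 12), (12, 15), (15, 30), (30, 31)], target_count=3
)
```

In this example the period (30, 31) is merged first into (15, 31), then (10, 12) is merged with its shorter neighbor (12, 15), leaving three periods. Selecting the longer neighbor instead would only require replacing `min` with `max` in the neighbor selection.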
A value input in accordance with the position of the slide is converted into a parameter for the division process. For example, when a slide operation of increasing the value of the trend indication accuracy is detected, the value of the coefficient B of (Formula 1) used in the process of the interval setting unit 11 is increased, and the degree of influence of the index B is increased. Further, when a slide operation of increasing the value of the noise reduction strength is detected, the value of the coefficient A of (Formula 1) used in the process of the interval setting unit 11 is increased, and the degree of influence of the index A is increased. Accordingly, the user is able to easily control the operation of the division process. A division state of a result of changing the parameter may be displayed on a display portion 1416 in real time.
A more convenient input function is a function of receiving direct correction performed by the user in the division result screen. In a selection menu 1414, an identifier of a selected sensor is displayed, and an enlarged view of the sensor data and the division result of the sensor selected through the selection menu 1414 is displayed on a display portion 1416. Here, a display range 1415 is a display enabling selection similar to the display range 1412.
In the display portion 1416, a vertical dashed line is an example of the dividing point, and when the vertical dashed line is designated by the user, the dividing point corresponding to the designated vertical dashed line becomes a correction target. A move button 1417, an add button 1418, and a delete button 1419 are buttons for giving an instruction to perform correction of setting an offset to the position of the dividing point, an instruction to perform correction of adding the dividing point, and an instruction to perform correction of deleting the dividing point. Designation of the vertical dashed line or designation of a time with no vertical dashed line in the display portion 1416 is detected, click to each button is detected, and processes based on the detections are performed.
For example, when the user observes the division result, the user may notice that one important variation has not been divided because the width of the variation is small. In that case, when the user clicks the add button 1418 and then clicks the position at which the dividing point is to be added in the display portion 1416 of the setting screen 141, the division result display control unit 14, which has obtained the information of the click, finds a parameter with which the data is divided at the position of the dividing point designated by the click.
In a case in which a dividing point is added, the calculation interval may be too long, so that an excessive averaging effect occurs; the coefficient B of (Formula 1) may therefore be increased, raising the degree of influence of the index B, until the added dividing point is calculated. When a parameter by which the correction of the user can be realized is found, the parameter may be transmitted to the interval setting unit 11 and reflected in a subsequent division process.
Further, an all apply button 1421 for giving an instruction to apply the corrected parameter to data of a plurality of sensors, a single apply button 1422 for giving an instruction to apply the corrected parameter to data of one sensor, a store button 1423 for giving an instruction to store the parameter, and a read button 1424 for reading the stored parameter are also displayed on the setting screen 141.
For example, when a plurality of pieces of sensor data are clustered for each interval, sensor data with data loss cannot be expressed in the multi-dimensional space, so its dividing points must be made to coincide with those of sensor data of a sensor having no data loss. Therefore, in this division, the dividing points of sensor data of a selected representative sensor are applied to all pieces of sensor data.
In the example of
As a process of selecting the representative sensor, an SN ratio of sensor data of each sensor may be calculated, and a sensor having the highest SN ratio may be selected as the representative sensor. Further, a sample of sensor data of each sensor may be divided, and a sensor having the largest number of divided intervals may be selected as the representative sensor. Further, a sensor having high sensitivity for detecting abnormality in accordance with a previous abnormality portent diagnosis case may be selected as the representative sensor.
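The first selection process described above can be sketched as follows. This is only an illustrative sketch: the description does not define how the SN ratio is computed, so the ratio of the absolute mean to the standard deviation is assumed here, and the names `sn_ratio`, `select_representative`, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def sn_ratio(samples):
    """Assumed SN ratio: |mean| / population standard deviation.

    The description does not fix a formula; this definition is an assumption.
    """
    sigma = pstdev(samples)
    if sigma == 0:
        return float("inf")  # assumption: a constant signal is treated as noise-free
    return abs(mean(samples)) / sigma


def select_representative(sensor_data):
    """Return the identifier of the sensor whose data has the highest SN ratio."""
    return max(sensor_data, key=lambda sensor_id: sn_ratio(sensor_data[sensor_id]))


# Hypothetical sample data for two sensors.
sensor_data = {
    "S1": [10.0, 10.2, 9.8, 10.1, 9.9],  # small fluctuation around 10
    "S2": [10.0, 14.0, 6.0, 12.0, 8.0],  # large fluctuation around 10
}
representative = select_representative(sensor_data)
```

The other selection criteria mentioned above (the number of divided intervals, or the sensitivity in previous diagnosis cases) could be substituted by replacing the key function.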
In the example of
An example of a procedure of transmitting divisional data from the pre-processor 1 to the abnormality portent diagnosis processing unit 2 will be described with reference to
Here, in order to further facilitate processing, the time or the index is converted into a real number. In the example of
Thus, for example, “Mar. 1, 2015, 08:10:00” which is the time of the start point of the period of time between the dividing points is compared with “Jan. 1, 2015, 00:00:00” which is a preset reference time, and a time difference (seconds) between both times is calculated and converted into a real number of “5127000.”
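The conversion in this example can be checked with a short sketch. The function name `to_real` and the constant name `REFERENCE_TIME` are hypothetical; only the two times and the unit (seconds) are taken from the description.

```python
from datetime import datetime

# Preset reference time from the example above.
REFERENCE_TIME = datetime(2015, 1, 1, 0, 0, 0)


def to_real(time_point, reference=REFERENCE_TIME):
    """Convert a time into a real number: seconds elapsed since the reference time."""
    return (time_point - reference).total_seconds()


# Start point of the period of time between the dividing points.
start_point = datetime(2015, 3, 1, 8, 10, 0)
value = to_real(start_point)  # 59 days plus 8 h 10 min, expressed in seconds
```

Here 59 days (31 in January and 28 in February 2015) contribute 5,097,600 seconds and 8 hours 10 minutes contribute 29,400 seconds, giving 5,127,000.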
As apparent from the division result of
The normal data has regularity in the variation pattern (trend pattern), and when actual sensor data in a normal state and normal data are divided through the same process, a large difference does not occur in the division result indicating the variation pattern, that is, the dividing point vector. However, actual sensor data in an abnormal state, that is, abnormal data has a variation different from that of the normal data, and a new dividing point is likely to occur due to the variation, and thus it is possible to detect the abnormality on the basis of the difference in the dividing point. Examples of the difference in the dividing point include a difference in the number of dividing points and a difference in the position.
First, the dividing point time abnormality diagnosis processing unit 22 generates a normal class using the normal data as a learning process (step 1701). In order to generate the class, a label of the class is added to data belonging to the class. For this reason, the normal class is a set of data with a label indicating normality, and particularly, the normal data class is a dividing point vector in which a label is added to each element. Hereinafter, in order to distinguish from the dividing point vector of the sensor data of the diagnosis target, the dividing point vector of the normal class is referred to as a “normal vector.”
There may be a plurality of types of normal data, and there may be a plurality of types of normal labels. For example, there may be a plurality of types of detection modes of a sensor, a plurality of types of operation modes in a device serving as a sensor detection target, and different variation patterns may be present as a normal pattern in accordance with the detection mode or the operation mode.
In a case in which the variation pattern of the abnormal data is known in advance, an abnormal class may be created. Further, when there are a plurality of abnormal states, and a plurality of types of abnormal data are found out, a plurality of classes such as an abnormal class A, an abnormal class B, an abnormal class C may be generated. The generation of the class in step 1701 may be executed in advance before the abnormality diagnosis.
In order to compare the dividing point vector of the sensor data of the diagnosis target with the normal vector of the normal data, the dividing point time abnormality diagnosis processing unit 22 calculates a degree of similarity (step 1702). A specific process of calculating the degree of similarity is a process of steps 1711 to 1714 to be described later. Then, it is determined whether or not the calculated degree of similarity is lower than a preset threshold value (step 1703), and when the calculated degree of similarity is determined to be lower than the preset threshold value, the diagnosis result is determined to be abnormal (step 1704). In that case, a value indicating an abnormality may be stored in a predetermined storage region or the like.
In order to calculate the degree of similarity, the dividing point time abnormality diagnosis processing unit 22 first extracts one normal vector which has not previously been extracted from the normal class including a plurality of normal vectors (step 1711). Then, the distance between the extracted normal vector and the dividing point vector is calculated. A specific process of calculating the distance is a process of steps 1721 to 1724 to be described later.
When it is determined that a normal vector which is not extracted remains in the normal class, and the extraction is not completed (step 1713), the dividing point time abnormality diagnosis processing unit 22 causes the process to return to step 1711 and extracts other normal vectors. When the extraction is determined to be completed in step 1713, a minimum value is specified among distances calculated from a plurality of normal vectors, and a reciprocal of the specified minimum value is calculated and used as the degree of similarity (step 1714).
In order to calculate the distance, the dividing point time abnormality diagnosis processing unit 22 first extracts one piece of data which has not previously been extracted from the dividing point vector including a plurality of pieces of data (step 1721). Then, the difference from the data with the closest value in the normal vector extracted in step 1711 is calculated (step 1722).
When it is determined that data which is not extracted remains in the dividing point vector, and the extraction is not completed (step 1723), the dividing point time abnormality diagnosis processing unit 22 causes the process to return to step 1721 and extracts other data. When the extraction is determined to be completed in step 1723, a sum of the differences calculated in step 1722 is calculated as a distance.
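The distance calculation, the similarity calculation, and the threshold determination described above can be sketched as follows. This is a minimal sketch under stated assumptions: the difference is taken as an absolute value, each normal vector is assumed to be non-empty, a distance of zero is mapped to an infinite degree of similarity, and the function names and sample values are hypothetical.

```python
def distance(dividing_points, normal_vector):
    """Sum, over each element of the dividing point vector, of the difference
    from the closest element of the normal vector (absolute value assumed)."""
    return sum(
        min(abs(point - normal) for normal in normal_vector)
        for point in dividing_points
    )


def similarity(dividing_points, normal_class):
    """Reciprocal of the minimum distance over all normal vectors in the class."""
    d_min = min(distance(dividing_points, nv) for nv in normal_class)
    # Assumption: an exact match (distance 0) is treated as infinitely similar.
    return float("inf") if d_min == 0 else 1.0 / d_min


def is_abnormal(dividing_points, normal_class, threshold):
    """Abnormal when the degree of similarity is lower than the preset threshold."""
    return similarity(dividing_points, normal_class) < threshold


# Hypothetical normal class holding two normal vectors (converted real numbers).
normal_class = [[0.0, 100.0, 200.0], [0.0, 150.0, 300.0]]

near_normal = [0.0, 110.0, 200.0]        # one dividing point shifted by 10
extra_point = [0.0, 50.0, 110.0, 200.0]  # an additional dividing point at 50

flag_near = is_abnormal(near_normal, normal_class, threshold=0.05)
flag_extra = is_abnormal(extra_point, normal_class, threshold=0.05)
```

In this sketch the sum runs only over the elements of the dividing point vector of the diagnosis target, so an additional dividing point increases the distance, whereas a dividing point missing from the diagnosis target does not.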
In the above process, for example, the dividing point vector of the sensor data of the diagnosis target may be indicated by SD, the three normal vectors (the dividing point vectors of the normal class) may be indicated by ND-A, ND-B, and ND-C, each of the dividing point vectors may be the dividing point vector including the start point (the dividing point) of a plurality of intervals described above with reference to
Further, some patterns (for example, a rising waveform) of the dividing point vector may be set in advance, the time of the dividing point vector matching with the set pattern may be used as the reference time as described above with reference to
Further, since the data included in the dividing point vector is the converted real number, the difference between the two real numbers may be calculated as a part of the distance, and a combination in which the value of the difference is small may be specified as being close. A sum of differences between two real numbers may be calculated, and if a distance between SD and ND-A is indicated by DIST-A, a distance between SD and ND-B is indicated by DIST-B, and a distance between SD and ND-C is indicated by DIST-C, 1/min(DIST-A, DIST-B, DIST-C) may be calculated as the degree of similarity.
Here, the distance calculation may be a calculation in which both the difference in the number of dividing points and a difference in the position of the dividing point (the time or the converted real number) are reflected. ND-A, ND-B, and ND-C may be a plurality of classes such as a normal class A, a normal class B, and a normal class C which differ in a degree of normality, and when the degree of similarity with the normal class C with the lowest degree of normality is high, the probability of abnormality may be determined to be high. Since the degree of similarity with a plurality of classes is determined, it contributes to more accurate diagnosis and determination of a type or a cause of abnormality.
The abnormality portent diagnosis device 23 performs the abnormality diagnosis through the clustering technique using the calculated feature quantity as multi-dimensional data. For example, as a normal sample, each of a plurality of pieces of sensor data for learning is divided into a plurality of periods of time between dividing points, and a normal cluster is obtained in advance using a K-Means for each period of time between corresponding dividing points in a plurality of pieces of sensor data.
The abnormality portent diagnosis device 23 compares the sensor data of the diagnosis target divided into the periods of time between the dividing points with the normal cluster for each period of time between the dividing points, and determines that the sensor data of the diagnosis target differs from the normal cluster when the shortest distance from the center of the cluster exceeds a predetermined threshold value.
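The comparison against the normal cluster can be sketched as follows. The sketch assumes that the cluster centers for each period of time between dividing points have already been obtained in the learning process, uses the Euclidean distance as an assumed distance measure, and labels a period "abnormal" when the threshold is exceeded; the function name `diagnose_interval` and all values are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8 and later


def diagnose_interval(feature, centers, threshold):
    """Compare one feature vector with the normal cluster centers of its interval."""
    shortest = min(dist(feature, center) for center in centers)
    return "abnormal" if shortest > threshold else "normal"


# Hypothetical cluster centers learned in advance, keyed by interval index.
normal_clusters = {
    0: [(0.0, 0.0), (1.0, 1.0)],
    1: [(5.0, 5.0)],
}

# Hypothetical feature vectors of the diagnosis target, one per interval.
target = {
    0: (0.9, 1.1),  # close to the center (1.0, 1.0)
    1: (8.0, 9.0),  # distance 5.0 from the center (5.0, 5.0)
}

results = {
    interval: diagnose_interval(target[interval], normal_clusters[interval], threshold=1.0)
    for interval in target
}
```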
In this example, the diagnosis results 192, 193, and 194 are either “normal” or “abnormal,” and when any one of the diagnosis results 192 and 193 is “abnormal” as in the “data C,” the “data D,” and the “data E,” the diagnosis result 194 is determined to be “abnormal.” Further, when both of the diagnosis results 192 and 193 are “normal” as in the “data A” and the “data B,” the diagnosis result 194 is determined to be “normal.”
Here, the example in which the diagnosis result 194 is determined from the two diagnosis results 192 and 193 has been described, but a comprehensive diagnosis result may be determined on the basis of three or more diagnosis results. The comprehensive diagnosis result may be determined on the basis of the diagnosis result of the abnormality portent diagnosis device 21 according to the sensor data output from the data extraction processing unit 15, the diagnosis result of the dividing point time abnormality diagnosis processing unit 22 according to the dividing point output from the dividing point extraction processing unit 16, and the diagnosis result of the abnormality portent diagnosis device 23 according to the feature quantity output from the feature quantity calculating unit 17.
Further, the comprehensive diagnosis result determination processing unit 24 may perform a process of a statistical technique as the comprehensive determination. For example, when a proportion of “abnormal” among a plurality of diagnosis results is equal to or greater than a preset percentage, it may be determined to be “abnormal,” or when a proportion of “abnormal” calculated by weighting each of a plurality of diagnosis results is equal to or greater than a preset percentage, it may be determined to be “abnormal.”
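The proportion-based and weighted determinations can be sketched as follows. The function name `comprehensive_result`, the default percentage of 50%, and the weights are hypothetical values chosen for illustration only.

```python
def comprehensive_result(results, weights=None, min_abnormal_ratio=0.5):
    """Return "abnormal" when the (optionally weighted) proportion of
    "abnormal" diagnosis results is equal to or greater than the preset ratio."""
    if weights is None:
        weights = [1.0] * len(results)  # unweighted case: every result counts equally
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one diagnosis result with positive weight is required")
    abnormal = sum(w for r, w in zip(results, weights) if r == "abnormal")
    return "abnormal" if abnormal / total >= min_abnormal_ratio else "normal"


# Two results with a 50% ratio: any one "abnormal" makes the result "abnormal".
two_results = comprehensive_result(["abnormal", "normal"])

# Three results, unweighted: one "abnormal" out of three is below 50%.
three_unweighted = comprehensive_result(["abnormal", "normal", "normal"])

# Three results, with the first diagnosis weighted more heavily (3 of 5).
three_weighted = comprehensive_result(
    ["abnormal", "normal", "normal"], weights=[3.0, 1.0, 1.0]
)
```

With two diagnosis results and a ratio of 50%, this sketch gives the same outcome as the determination of the diagnosis result 194 from the diagnosis results 192 and 193 described above.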
Since the interval is set using the index A and the index B as described above, it is possible to perform the data division in which the noise of the sensor data is suppressed. Further, since the intervals having the same inclination state are collected, it is possible to indicate the trend, and since the dividing point is set in the intervals having the different inclination states, it is possible to indicate the change in the trend. Then, since the short period of time between dividing points is merged with another period of time between dividing points, the influence of the noise can be further reduced.
Further, since the sensor data and the dividing point are displayed, the user is able to check the dividing point, and since the coefficient A and the coefficient B are changed in accordance with the user's operation on the change in the dividing point, it is possible to correct the dividing point easily even when the user has no knowledge about the division process. Furthermore, it is possible to set the dividing point even in each piece of sensor data of a plurality of sensors, and it is also possible to perform comprehensive determination on the basis of a plurality of types of diagnosis results.
Number | Date | Country | Kind |
---|---|---|---|
2016-193276 | Sep 2016 | JP | national |