WORKING CONDITION STATE MODELING AND MODEL CORRECTING METHOD

Description

TECHNICAL FIELD

The present invention relates to the technical field of computer science, in particular to a working condition state modeling and model correcting method.

BACKGROUND

Maintenance has become increasingly important over the past few decades. Unexpected downtime places a heavy burden on maintenance and causes operational disruption, productivity loss and even production accidents, while timely maintenance is difficult to achieve with limited maintenance resources and personnel. The effectiveness of abnormality diagnosis methods often depends on the quality of the diagnosis models. Methods of establishing mathematical models can be roughly classified into two categories: mechanism analysis modeling methods and statistical modeling methods.

The mechanism analysis modeling method establishes mathematical equations between key variables and other measurable variables according to the physical and chemical laws governing the production process, and derives a system of equations that describes the process. The advantage of this modeling is that the internal structure and relationships of the system are clearly shown and the nature of the actual process is reflected. However, such a model is difficult to build, the modeling cycle is long, and the various structural and physical parameters in the model are difficult to obtain, so the method is limited in application.

The statistical modeling method treats the system as a black box and models it directly from the relationship between the input and output data of the research object, without analyzing its internal mechanism. Such a model has strong online correction capability and is applicable to highly nonlinear and seriously uncertain systems, thereby providing an effective way to model the process parameters of complex systems. However, the statistical modeling method has certain limitations. For complex nonlinear processes, the sample data generally covers only some operating regions rather than the entire region, and enlarging the range of the sample data set may make the model more complex and harder to solve.

SUMMARY

Aiming at the defects of the prior art, the present invention provides a working condition state modeling and model correcting method, which introduces expert prior knowledge based on the statistical modeling method to solve the problem that the existing statistical modeling method cannot cover the whole area.

To realize the above-mentioned purpose, the present invention adopts the following technical solution:

A working condition state modeling and model correcting method comprises the following steps:

step 1: collecting data, and arranging the data in a chronological order to form a time sequence data set;

step 2: preprocessing the time sequence data set;

step 3: clustering the preprocessed time sequence data set, computing a central point data set of the cluster, and generating a working condition data set and a working condition process data set;

step 4: counting a working condition transition probability for the working condition process data set to form a working condition transition probability model data set;

step 5: collecting the data, and detecting and processing the data;

step 6: computing a working condition state transition mode phase by phase and processing.

The step 1 comprises:

marking time sequence labels for the collected data (x₁, x₂, . . . , x_m) to form a time sequence data set (t_i, x_i1, x_i2, . . . , x_im), wherein m represents the number of parameters; t_i represents the time sequence labels which are gradually increased; and x represents different parameters.
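As a minimal illustration of this labeling step, the following Python sketch prefixes each collected sample with an increasing time label. The function name `to_time_sequence` and the integer labels starting at 0 are assumptions made for the example; in practice the labels would more likely be acquisition timestamps.

```python
from itertools import count

def to_time_sequence(samples, start=0):
    """Prefix each sample (x_1, ..., x_m) with an increasing time label t_i."""
    return [(t, *x) for t, x in zip(count(start), samples)]

# Three samples with m = 3 parameters each (illustrative values only).
samples = [(20.5, 1.2, 0.0), (20.7, 1.1, 0.0), (21.0, 1.3, 0.0)]
series = to_time_sequence(samples)
```

Each row of `series` then has the form (t_i, x_i1, . . . , x_im) used in the following steps.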

The step 2 comprises:

deleting irrelevant parameters in the time sequence data in the time sequence data set (t_i, x_i1, x_i2, . . . , x_im) to obtain a time sequence data set (t_i, x_i1, x_i2, . . . , x_in) after dimension reduction, n≤m, wherein t_i represents the time sequence labels which are gradually increased; m represents the number of parameters; n represents the number of parameters after dimension reduction; and x represents different parameters.

The dimension reduction comprises:

respectively computing a variance for each dimension of the parameters to obtain (σ₁, σ₂, . . . , σ_m); computing the mean value

$\overline{σ} = \frac{σ_{1} + σ_{2} + \dots + σ_{m}}{m}$

of the variances, and deleting the values in (σ₁, σ₂, . . . , σ_m) that are less than $\overline{σ}$ to obtain (σ₁, σ₂, . . . , σ_n), thereby obtaining a time sequence data set (t_i, x_i1, x_i2, . . . , x_in) after dimension reduction; wherein t_i represents the time sequence labels which are gradually increased; m represents the number of parameters; n represents the number of parameters after dimension reduction; x represents different parameters; and σ_m represents variances of corresponding parameters.
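The variance-based dimension reduction can be sketched as follows. This is an illustrative reading rather than a definitive implementation: the function name `reduce_dimensions` is hypothetical, the population variance (`statistics.pvariance`) is assumed because the text does not say which variance estimator is meant, and a parameter whose variance equals the mean is kept, since only values less than the mean are deleted.

```python
from statistics import mean, pvariance

def reduce_dimensions(rows):
    """rows: list of (t_i, x_i1, ..., x_im).

    Returns (kept, reduced): the indices of the retained parameters and
    the rows restricted to the time label plus those parameters.
    """
    # The time labels are not considered when computing variances.
    columns = list(zip(*(row[1:] for row in rows)))
    variances = [pvariance(col) for col in columns]
    threshold = mean(variances)
    # Delete every parameter whose variance is less than the mean variance.
    kept = [j for j, v in enumerate(variances) if v >= threshold]
    reduced = [(row[0], *(row[1 + j] for j in kept)) for row in rows]
    return kept, reduced

rows = [(0, 1.0, 10.0, 5.0), (1, 1.1, 20.0, 5.0), (2, 0.9, 30.0, 5.0)]
kept, reduced = reduce_dimensions(rows)
```

In this toy data only the second parameter varies strongly, so it is the only one retained (n = 1, m = 3).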

A k-means algorithm is used for clustering, specifically:

the input serving as a data set (x_i1, x_i2, . . . , x_in) after dimension reduction, the range of k values being [K_min, K_max];

conducting k-means clustering on the data set (x_i1, x_i2, . . . , x_in) after dimension reduction for each k value, and solving the sum of squared errors (SSE) value in clusters for each clustering result;

using cluster partitions (C₁, C₂, . . . , C_K) as output when min(SSE) is taken,

wherein C₁, C₂, . . . , C_K represent a set of clusters, and K represents the number of partitioned clusters, i.e., the number of working condition types.
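The selection of the cluster partition over the range [K_min, K_max] can be sketched with a small, dependency-free version of Lloyd's k-means algorithm. The helper names (`kmeans`, `cluster_with_min_sse`), the fixed random seed, the iteration cap and the handling of empty clusters are all assumptions of this sketch and are not specified in the text.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    # Index of the nearest centroid for every point.
    return [min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points]

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm; returns (centroids, labels, sse)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # An empty cluster keeps its previous centroid.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    sse = sum(squared_distance(p, centroids[label])
              for p, label in zip(points, labels))
    return centroids, labels, sse

def cluster_with_min_sse(points, k_min, k_max):
    """Run k-means for every k in [k_min, k_max]; keep the lowest-SSE result."""
    return min((kmeans(points, k) for k in range(k_min, k_max + 1)),
               key=lambda result: result[2])

points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centroids, labels, sse = cluster_with_min_sse(points, k_min=2, k_max=3)
```

Because the SSE tends to decrease as k grows, a pure min(SSE) rule tends to favor values near K_max, which is one reason the range of k matters.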

The generating the working condition data set and the working condition process data set comprises:

firstly, marking the cluster partitions (C₁, C₂, . . . , C_K) of the data set (x_i1, x_i2, . . . , x_in) with the working condition types to form a working condition data set expressed as (x_i1, x_i2, . . . , x_in, y_k); and simultaneously, respectively computing the central points of the cluster partitions to form a central point data set (c_k1, c_k2, . . . , c_kn, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; c represents parameters corresponding to the working condition data set (x_i1, x_i2, . . . , x_in, y_k);

then, computing a distance from each data in a cluster to a central node in the cluster, and taking a maximum distance value D_max;

finally, adding the time sequence labels for the working condition data set by taking the time sequence data set as a reference, to form a working condition process data set expressed as (t_i, x_i1, x_i2, . . . , x_in, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; t_i represents the time sequence labels which are gradually increased.
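The three sub-steps above (labeling, central points and D_max, re-attaching time labels) can be sketched together. The function name `build_condition_model` is hypothetical, Euclidean distance is assumed because the text does not name a metric, and D_max is read here as one global maximum over all clusters, which is an interpretation rather than something the text states.

```python
from math import dist

def build_condition_model(series, labels):
    """series: rows (t_i, x_i1, ..., x_in); labels: condition type per row.

    Returns (process_rows, centroids, d_max).
    """
    clusters = {}
    for row, y in zip(series, labels):
        clusters.setdefault(y, []).append(row[1:])  # drop the time label
    # Central point of each cluster: the per-parameter mean.
    centroids = {y: tuple(sum(col) / len(pts) for col in zip(*pts))
                 for y, pts in clusters.items()}
    # Largest distance from any sample to the central point of its own cluster.
    d_max = max(dist(row[1:], centroids[y]) for row, y in zip(series, labels))
    # Working condition process data set: (t_i, x_i1, ..., x_in, y_k).
    process_rows = [(*row, y) for row, y in zip(series, labels)]
    return process_rows, centroids, d_max

series = [(0, 0.0, 0.0), (1, 2.0, 0.0), (2, 10.0, 0.0), (3, 10.0, 4.0)]
labels = [0, 0, 1, 1]
process_rows, centroids, d_max = build_condition_model(series, labels)
```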

The working condition transition probability model data set is P(y_a_M+1|y_a₁, y_a₂, y_a₃, . . . , y_a_M), wherein M is a window size;

$M \leq \left\lfloor \frac{K}{2} \right\rfloor;$

K is the number of the working condition types; 1≤a₁, a₂, a₃, . . . , a_M, a_M+1≤n; and n represents the number of the parameters after dimension reduction.

In the working condition transition mode y_a₁, y_a₂, y_a₃, . . . , y_a_M, a working condition type y_a₁ appears firstly, then a working condition type y_a₂ appears and next a working condition type y_a₃ appears, and so on until the working condition type y_a_M appears, wherein 1≤a₁, a₂, a₃, . . . , a_M≤n, and n represents the number of the parameters after dimension reduction.
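One way to build the transition probability model data set is to count, for every run of M consecutive working condition types, how often each type follows it. The sketch below assumes that the probability is estimated as a relative frequency; the text only says that the probability is counted, so this estimator and the name `transition_model` are assumptions.

```python
from collections import Counter, defaultdict

def transition_model(conditions, window):
    """Estimate P(y_next | previous `window` conditions) by relative frequency.

    conditions: working condition types in chronological order.
    Returns {history_tuple: {next_condition: probability}}.
    """
    counts = defaultdict(Counter)
    for i in range(len(conditions) - window):
        history = tuple(conditions[i:i + window])
        counts[history][conditions[i + window]] += 1
    return {history: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
            for history, nexts in counts.items()}

conditions = ["A", "B", "A", "B", "C", "A", "B"]
model = transition_model(conditions, window=2)
```

Here the history (A, B) is followed once by A and once by C, so each of those two transitions receives probability 0.5.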

The collecting the data, and detecting and processing the data comprises:

collecting the data and taking n-dimensional parameters as input data (x′₁, x′₂, . . . , x′_n), wherein n represents the number of the parameters after dimension reduction, and the parameters are the same as the parameters selected in the data set (x_i1, x_i2, . . . , x_in) after dimension reduction; computing a distance from the input data to the central point data set, and taking a minimum value d of the distance;

if d≤D_max, taking the working condition type of the central point with a distance of d; adding the time sequence labels to form time sequence data (t′, x′₁, x′₂, . . . , x′_n, y′); and saving the data into a data set (t′_i, x′_i1, x′_i2, . . . , x′_in, y′_k′) to be processed;

if d>D_max, which indicates that the input data is not matched with any working condition type, modifying the working condition data set and the central point data set, wherein D_max represents the maximum value of the distance from each data in the cluster to the central node in the cluster.
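The detection rule can be sketched as a nearest-central-point lookup followed by a threshold test against D_max. The function name `classify`, the dictionary representation of the central point data set and the Euclidean metric are assumptions of this example.

```python
from math import dist

def classify(sample, centroids, d_max):
    """Return (condition, d); condition is None when d > d_max.

    centroids: {condition_type: central_point}.
    """
    condition, d = min(((y, dist(sample, c)) for y, c in centroids.items()),
                       key=lambda item: item[1])
    return (condition if d <= d_max else None), d

centroids = {"idle": (0.0, 0.0), "load": (10.0, 0.0)}
matched = classify((1.0, 0.0), centroids, d_max=2.0)
unmatched = classify((5.0, 0.0), centroids, d_max=2.0)
```

A `None` condition corresponds to the case d>D_max, in which the working condition data set and the central point data set are modified.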

The step 6 comprises:

continuously taking the working condition transition mode (y_i, y_i+1, . . . , y_M, y_M+1) with a sliding window size of M for the data set (t′_i, x′_i1, x′_i2, . . . , x′_in, y′_k′) to be processed according to the chronological order; inquiring and counting the probability p in the working condition transition probability model; if p>ϵ, continuing to compute the working condition of the time sequence of a next group of data parameters; if 0≤p≤ϵ, correcting a corresponding probability in the working condition transition probability model, wherein ϵ represents a probability value defined according to expert knowledge.
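The sliding-window check can be sketched as a generator that walks the sequence of detected working condition types and reports every transition whose modeled probability does not exceed the threshold ϵ. The name `rare_transitions` and the nested-dictionary model layout are assumptions; a transition absent from the model is treated as p = 0.

```python
def rare_transitions(conditions, model, window, threshold):
    """Yield (history, next_condition, p) for every transition with p <= threshold.

    model: {history_tuple: {next_condition: probability}}.
    """
    for i in range(len(conditions) - window):
        history = tuple(conditions[i:i + window])
        nxt = conditions[i + window]
        p = model.get(history, {}).get(nxt, 0.0)  # unseen transition: p = 0
        if p <= threshold:
            yield history, nxt, p

model = {("A",): {"B": 0.9, "C": 0.1}, ("B",): {"A": 1.0}}
flagged = list(rare_transitions(["A", "B", "A", "C"], model,
                                window=1, threshold=0.2))
```

Only the transition A→C is flagged here, because its probability 0.1 is below the illustrative threshold of 0.2.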

The correcting of the corresponding probability in the working condition transition probability model comprises:

when p=0, adding a probability value of the working condition transition mode to be corrected to the working condition transition probability model, recorded as ∈; accordingly, reducing the probability values of other working condition transition modes in the data set of the working condition transition probability model on average;

when 0<p≤ϵ, modifying the probability value of the working condition transition mode to be corrected in the working condition transition probability model, recorded as p+∈; accordingly, reducing the probability values of other working condition transition modes in the data set of the working condition transition probability model on average,

wherein ∈ represents a probability value defined according to expert knowledge, and ∈<ϵ.
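The correction rule can be sketched as follows. Two points are interpretations rather than statements from the text: "reducing on average" is read as subtracting the increment ∈ in equal shares from the other transitions that share the same history, and ∈ is assumed small enough that no probability becomes negative. The names `correct_transition`, `threshold` (for ϵ) and `increment` (for ∈) are hypothetical.

```python
def correct_transition(model, history, nxt, threshold, increment):
    """Raise an unseen or rare transition by `increment`.

    The same total amount is removed in equal shares from the other
    transitions that start from the same history.
    Returns the probability of the transition after the call.
    """
    nexts = model.setdefault(history, {})
    p = nexts.get(nxt, 0.0)
    if p > threshold:
        return p  # frequent enough: the model is left unchanged
    others = [y for y in nexts if y != nxt]
    nexts[nxt] = p + increment  # p = 0 gives increment; otherwise p + increment
    for y in others:
        nexts[y] -= increment / len(others)
    return nexts[nxt]

model = {("A", "B"): {"A": 0.5, "C": 0.5}}
new_p = correct_transition(model, ("A", "B"), "B", threshold=0.1, increment=0.02)
```

After the call the unseen transition (A, B)→B has probability 0.02 and the two existing transitions are each reduced by 0.01, so the probabilities for the history (A, B) still sum to 1.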

The present invention has the following beneficial effects and advantages:

1. The present invention is based on a statistical modeling method and introduces expert prior knowledge to correct the established model gradually, so that the model range covers the overall system working condition state, which solves the problem of low coverage rate in the mechanism analysis modeling methods and the statistical modeling method.

2. The present invention can be used as the input of an abnormal working condition diagnosis method, and can effectively improve the accuracy rate of abnormality diagnosis.

DESCRIPTION OF DRAWINGS

FIG. 1 is a flow chart of establishing a working condition state model.

FIG. 2 is a flow chart of correcting a working condition state model.

FIG. 3 is a schematic diagram of a working condition transition mode with a window size of 2.

DETAILED DESCRIPTION

The present invention will be further described in detail below in combination with the drawings and the embodiments.

To make the above-mentioned purpose, features and advantages of the present invention more clear and understandable, specific embodiments of the present invention will be described below in detail in combination with the drawings. In the following description, many specific details are elaborated to thoroughly understand the present invention. However, the present invention can be implemented in other modes different from those described herein. Those skilled in the art can make similar improvement without departing from the connotation of the present invention. Therefore, the present invention is not limited by specific embodiments disclosed below.

Unless otherwise defined, all technical and scientific terms used herein have the same meanings as those generally understood by those skilled in the art to which the present invention belongs. The terms used in the description of the present invention are intended merely to describe concrete embodiments, not to limit the present invention.

FIG. 1 is a flow chart of establishing a working condition state model.

Step 1: collecting data and forming time sequence data. The collected data, taken from a real-time database in the on-site production process, is represented as (x₁, x₂, . . . , x_m), wherein m represents the number of parameters. Time sequence labels are marked to form a time sequence data set represented as (t_i, x_i1, x_i2, . . . , x_im), wherein t_i represents the time sequence labels which are gradually increased.

Step 2: preprocessing the time sequence data parameters. The preprocessing deletes irrelevant parameters from the time sequence data set (t_i, x_i1, x_i2, . . . , x_im) to obtain a time sequence data set after dimension reduction, represented as (t_i, x_i1, x_i2, . . . , x_in), n≤m, wherein n represents the number of the parameters after dimension reduction and x represents different parameters. The specific dimension reduction process is as follows:

respectively computing a variance for each dimension of the parameters to obtain (σ₁, σ₂, . . . , σ_m); computing the mean value of the variances

$\overline{σ} = \frac{σ_{1} + σ_{2} + \dots + σ_{m}}{m};$

deleting the values in (σ₁, σ₂, . . . , σ_m) less than $\overline{σ}$ to obtain (σ₁, σ₂, . . . , σ_n); accordingly, obtaining a time sequence data set (t_i, x_i1, x_i2, . . . , x_in) after dimension reduction, wherein t_i represents the time sequence labels which are gradually increased; m represents the number of parameters; n represents the number of parameters after dimension reduction; x represents different parameters; and σ_m represents variances of corresponding parameters. The time sequence labels are not considered during dimension reduction.

Step 3: clustering the preprocessed time sequence data set, computing a central point data set of the cluster, and generating a working condition data set and a working condition process data set, comprising the following specific steps:

firstly, clustering the preprocessed time sequence data set while neglecting the time labels, i.e., the time labels have no influence on the clustering result; using a k-means algorithm for clustering; input: a data set (x_i1, x_i2, . . . , x_in) after dimension reduction, and the range [K_min, K_max] of k values, which needs to be determined according to expert knowledge; process: conducting k-means clustering on the data set (x_i1, x_i2, . . . , x_in) after dimension reduction for each k value, and solving the sum of squared errors (SSE) value in clusters for each clustering result; output: the cluster partitions C=(C₁, C₂, . . . , C_K) obtained when min(SSE) is taken, wherein C₁, C₂, . . . , C_K represent a set of clusters, and K represents the number of partitioned clusters, i.e., the number of working condition types.

Then, marking the cluster partitions (C₁, C₂, . . . , C_K) of the data set (x_i1, x_i2, . . . , x_in) with the working condition types according to the expert knowledge to form a working condition data set expressed as (x_i1, x_i2, . . . , x_in, y_k); and simultaneously, respectively computing the central points of the cluster partitions to form a central point data set (c_k1, c_k2, . . . , c_kn, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; c represents parameters corresponding to the working condition data set (x_i1, x_i2, . . . , x_in, y_k).

Next, computing a distance from each data in a cluster to a central node in the cluster, and taking a maximum distance value D_max.

Finally, adding the time sequence labels for the working condition data set by taking the time sequence data set as a reference, to form a working condition process data set expressed as (t_i, x_i1, x_i2, . . . , x_in, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; t_i represents the time sequence labels which are gradually increased.

Step 4: counting a working condition transition probability for the working condition process data set to form a working condition transition probability model data set. The working condition transition probability is counted for the working condition process data set (t_i, x_i1, x_i2, . . . , x_in, y_k) of the step 3 according to the size M of a sliding window. The formed working condition transition probability model data set is represented as P(y_a_M+1|y_a₁, y_a₂, y_a₃, . . . , y_a_M), i.e., the emergence probability of y_a₁, y_a₂, y_a₃, . . . , y_a_M→y_a_M+1 counted from the working condition process data set; namely, the corresponding probability is counted according to the emergence order of the working condition transition modes y_a₁, y_a₂, y_a₃, . . . , y_a_M, y_a_M+1 in the working condition process, wherein M is a window size;

$M \leq \left\lfloor \frac{K}{2} \right\rfloor;$

K is the number of the working condition types; 1≤a₁, a₂, a₃, . . . , a_M, a_M+1≤n; and n represents the number of the parameters after dimension reduction.

Step 5: continuing to collect the data after the model is built, and correcting the original model; collecting the data and taking n-dimensional parameters as input data (x′₁, x′₂, . . . , x′_n), wherein n represents the number of the parameters after dimension reduction, and the parameters are the same as the parameters selected in the data set (x_i1, x_i2, . . . , x_in) after dimension reduction; computing a distance from the input data to the central point data set, and taking a minimum value d of the distance; if d≤D_max, taking the working condition type of the central point with a distance of d, adding the time sequence labels to form time sequence data (t′, x′₁, x′₂, . . . , x′_n, y′), and saving the data into a data set to be processed; if d>D_max, which indicates that the input data is not matched with any working condition type, modifying the working condition data set and the central point data set, wherein D_max represents the maximum value of the distance from each data in the cluster to the central node in the cluster.

FIG. 2 shows a flow chart of correcting a working condition state model.

(1) The process of modifying the working condition data set is as follows:

directly adding the data (x′₁, x′₂, . . . , x′_n, y′) to the working condition data set.

(2) The process of modifying the central point data set is as follows:

directly adding the data (x′₁, x′₂, . . . , x′_n, y′) to the central point data set (c_k1, c_k2, . . . , c_kn, y_k).
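The two modifications can be sketched as a single update that appends the unmatched sample to the working condition data set and registers it as a new central point. How the label y′ of an unmatched sample is chosen is not described here, so the sketch assumes it is supplied from outside (for example by an expert); the function name and the list and dictionary representations are likewise assumptions.

```python
def register_new_condition(sample, label, condition_rows, centroids):
    """Add an unmatched sample to both data sets under a new condition label."""
    if label in centroids:
        raise ValueError(f"condition {label!r} already exists")
    condition_rows.append((*sample, label))  # (x'_1, ..., x'_n, y')
    centroids[label] = tuple(sample)         # the sample becomes a central point

condition_rows = [(0.0, 0.0, "idle"), (10.0, 0.0, "load")]
centroids = {"idle": (0.0, 0.0), "load": (10.0, 0.0)}
register_new_condition((5.0, 5.0), "startup", condition_rows, centroids)
```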

Step 6: computing a working condition state transition mode phase by phase and processing. The working condition transition mode is defined as y_a₁, y_a₂, y_a₃, . . . , which indicates that a working condition type y_a₁ appears firstly, then a working condition type y_a₂ appears and next a working condition type y_a₃ appears, and so on, wherein 1≤a₁, a₂, a₃≤n, and n represents the number of the parameters after dimension reduction. FIG. 3 shows a schematic diagram of a working condition transition mode with a window size of 2. The steps are as follows: continuously taking the working condition transition mode (y_i, y_i+1, . . . , y_M, y_M+1) with a sliding window size of M for the data set (t′_i, x′_i1, x′_i2, . . . , x′_in, y′_k′) to be processed according to the chronological order; inquiring and counting the probability p in the working condition transition probability model; if p>ϵ, continuing to compute the working condition of the time sequence of a next group of data parameters; if 0≤p≤ϵ, correcting a corresponding probability in the working condition transition probability model, wherein ϵ represents a probability value defined according to expert knowledge.

The process of correcting the working condition transition probability model is specifically as follows:

(1) When p=0, it indicates that the working condition transition mode appears for the first time.

The working condition transition modes to be added are assumed as y_a₁, y_a₂, y_a₃, . . . , y_a_M, y_a_M+1.

The probability value P(y_a_M+1|y_a₁, y_a₂, y_a₃, . . . , y_a_M) of the working condition transition mode y_a₁, y_a₂, y_a₃, . . . , y_a_M, y_a_M+1 to be corrected is added to the working condition transition probability model and recorded as ∈; accordingly, the probability values of other working condition transition modes in the data set of the working condition transition probability model are reduced on average.

(2) When 0<p≤ϵ, it indicates that the appearance probability of the working condition transition mode is very low. The working condition transition modes to be modified are assumed as y_a₁, y_a₂, y_a₃, . . . , y_a_M, y_a_M+1.

The probability P(y_a_M+1|y_a₁, y_a₂, y_a₃, . . . , y_a_M) of y_a₁, y_a₂, y_a₃, . . . , y_a_M, y_a_M+1 in the working condition transition probability model is modified to p+∈; accordingly, the probability values of other working condition transition modes in the data set of the working condition transition probability model are reduced on average,

wherein ∈ represents a probability value defined according to expert knowledge, and ∈<ϵ.

Claims

1. A working condition state modeling and model correcting method, characterized by comprising the following steps: step 1: collecting data, and arranging the data in a chronological order to form a time sequence data set; step 2: preprocessing the time sequence data set; step 3: clustering the preprocessed time sequence data set, computing a central point data set of the cluster, and generating a working condition data set and a working condition process data set; step 4: counting a working condition transition probability for the working condition process data set to form a working condition transition probability model data set; step 5: collecting the data, and detecting and processing the data; step 6: computing a working condition state transition mode phase by phase and processing.
2. The working condition state modeling and model correcting method according to claim 1, characterized in that the step 1 comprises: marking time sequence labels for the collected data (x₁, x₂, . . . , x_m) to form a time sequence data set (t_i, x_i1, x_i2, . . . , x_im), wherein m represents the number of parameters; t_i represents the time sequence labels which are gradually increased; and x represents different parameters.
3. The working condition state modeling and model correcting method according to claim 1, characterized in that the step 2 comprises: deleting irrelevant parameters in the time sequence data in the time sequence data set (t_i, x_i1, x_i2, . . . , x_im) to obtain a time sequence data set (t_i, x_i1, x_i2, . . . , x_in) after dimension reduction, n≤m, wherein t_i represents the time sequence labels which are gradually increased; m represents the number of parameters; n represents the number of parameters after dimension reduction; and x represents different parameters.
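4. The working condition state modeling and model correcting method according to claim 3, characterized in that the dimension reduction comprises: respectively computing a variance for each dimension of the parameters to obtain (σ₁, σ₂, . . . , σ_m); computing the mean value $\overline{σ} = \frac{σ_{1} + σ_{2} + \dots + σ_{m}}{m}$ of the variances, and deleting the values in (σ₁, σ₂, . . . , σ_m) that are less than $\overline{σ}$ to obtain (σ₁, σ₂, . . . , σ_n), thereby obtaining a time sequence data set (t_i, x_i1, x_i2, . . . , x_in) after dimension reduction; wherein t_i represents the time sequence labels which are gradually increased; m represents the number of parameters; n represents the number of parameters after dimension reduction; x represents different parameters; and σ_m represents variances of corresponding parameters.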
5. The working condition state modeling and model correcting method according to claim 1, characterized in that a k-means algorithm is used for clustering, specifically: the input serving as a data set (x_i1, x_i2, . . . , x_in) after dimension reduction, the range of k values being [K_min, K_max]; conducting k-means clustering on the data set (x_i1, x_i2, . . . , x_in) after dimension reduction for each k value, and solving the sum of squared errors (SSE) value in clusters for each clustering result; using cluster partitions (C₁, C₂, . . . , C_K) as output when min(SSE) is taken, wherein C₁, C₂, . . . , C_K represent a set of clusters, and K represents the number of partitioned clusters, i.e., the number of working condition types.
6. The working condition state modeling and model correcting method according to claim 1, characterized in that the generating the working condition data set and the working condition process data set comprises: firstly, marking the cluster partitions (C₁, C₂, . . . , C_K) of the data set (x_i1, x_i2, . . . , x_in) with the working condition types to form a working condition data set expressed as (x_i1, x_i2, . . . , x_in, y_k); and simultaneously, respectively computing the central points of the cluster partitions to form a central point data set (c_k1, c_k2, . . . , c_kn, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; c represents parameters corresponding to the working condition data set (x_i1, x_i2, . . . , x_in, y_k); then, computing a distance from each data in a cluster to a central node in the cluster, and taking a maximum distance value D_max; finally, adding the time sequence labels for the working condition data set by taking the time sequence data set as a reference, to form a working condition process data set expressed as (t_i, x_i1, x_i2, . . . , x_in, y_k), wherein y represents the working condition types and the number of y is the same as the number of the cluster partitions, i.e., k≤K; t_i represents the time sequence labels which are gradually increased.
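7. The working condition state modeling and model correcting method according to claim 1, characterized in that the working condition transition probability model data set is P(y_a_M+1|y_a₁, y_a₂, y_a₃, . . . , y_a_M), wherein M is a window size; $M \leq \left\lfloor \frac{K}{2} \right\rfloor$; K is the number of the working condition types; 1≤a₁, a₂, a₃, . . . , a_M, a_M+1≤n; and n represents the number of the parameters after dimension reduction.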
8. The working condition state modeling and model correcting method according to claim 1, characterized in that in the working condition transition mode y_a₁, y_a₂, y_a₃, . . . , y_a_M, a working condition type y_a₁ appears firstly, then a working condition type y_a₂ appears and next a working condition type y_a₃ appears, and so on until the working condition type y_a_M appears, wherein 1≤a₁, a₂, a₃, . . . , a_M≤n, and n represents the number of the parameters after dimension reduction.
9. The working condition state modeling and model correcting method according to claim 1, characterized in that the collecting the data, and detecting and processing the data comprises: collecting the data and taking n-dimensional parameters as input data (x′₁, x′₂, . . . , x′_n), wherein n represents the number of the parameters after dimension reduction, and the parameters are the same as the parameters selected in the data set (x_i1, x_i2, . . . , x_in) after dimension reduction; computing a distance from the input data to the central point data set, and taking a minimum value d of the distance; if d≤D_max, taking the working condition type of the central point with a distance of d; adding the time sequence labels to form time sequence data (t′, x′₁, x′₂, . . . , x′_n, y′); and saving the data into a data set (t′_i, x′_i1, x′_i2, . . . , x′_in, y′_k′) to be processed; if d>D_max, which indicates that the input data is not matched with any working condition type, modifying the working condition data set and the central point data set, wherein D_max represents the maximum value of the distance from each data in the cluster to the central node in the cluster.
10. The working condition state modeling and model correcting method according to claim 1, characterized in that the step 6 comprises: continuously taking the working condition transition mode (y_i, y_i+1, . . . , y_M, y_M+1) with a sliding window size of M for the data set (t′_i, x′_i1, x′_i2, . . . , x′_in, y′_k′) to be processed according to the chronological order; inquiring and counting the probability p in the working condition transition probability model; if p>ϵ, continuing to compute the working condition of the time sequence of a next group of data parameters; if 0≤p≤ϵ, correcting a corresponding probability in the working condition transition probability model, wherein ϵ represents a probability value defined according to expert knowledge.
11. The working condition state modeling and model correcting method according to claim 10, characterized in that the correcting of the corresponding probability in the working condition transition probability model comprises: when p=0, adding a probability value of the working condition transition mode to be corrected to the working condition transition probability model, recorded as ∈; accordingly, reducing the probability values of other working condition transition modes in the data set of the working condition transition probability model on average; when 0<p≤ϵ, modifying the probability value of the working condition transition mode to be corrected in the working condition transition probability model, recorded as p+∈; accordingly, reducing the probability values of other working condition transition modes in the data set of the working condition transition probability model on average, wherein ∈ represents a probability value defined according to expert knowledge, and ∈<ϵ.

Priority Claims (1)

Number	Date	Country	Kind
201811541159.9	Dec 2018	CN	national

PCT Information

Filing Document	Filing Date	Country	Kind
PCT/CN2019/075663	2/21/2019	WO	00
