This patent application claims the benefit and priority of Chinese Patent Application No. 202111261992.X, filed with the China National Intellectual Property Administration on Oct. 28, 2021, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.
The present disclosure relates to the field of remaining useful life (RUL) prediction for aero-engines, and in particular to a method for predicting a RUL of an aero-engine based on an automatic differential learning deep neural network (ADLDNN).
An aero-engine is a highly complex and precise thermal machine that provides an aircraft with the power necessary for flight. Its complex internal structure and harsh operating environment make it particularly susceptible to faults. Hence, accurate prediction of the RUL of an aero-engine is of great significance to its operation and maintenance.
With the development of science and technology, the long short-term memory (LSTM) network and the convolutional neural network (CNN) have been widely applied to predict the RUL of rotary machines. However, existing neural networks all process data in a uniform mode, cannot mine different levels of feature information with different feature extraction modes, and therefore achieve poor prediction accuracy.
An objective of the present disclosure is to provide a method for predicting a RUL of an aero-engine based on an ADLDNN, which can be used to predict the RUL of the aero-engine.
The objective of the present disclosure is implemented with the following technical solutions. A method for predicting a RUL of an aero-engine based on an ADLDNN includes the following specific steps:
1) data acquisition: acquiring multidimensional degradation parameters of an aero-engine to be predicted, analyzing a stable trend, and selecting a plurality of parameters capable of reflecting degradation performance of the aero-engine to obtain acquired data;
2) data preprocessing: segmenting the acquired data by a sliding window (SW) to obtain preprocessed data;
3) model construction: constructing a RUL prediction model of the aero-engine based on an ADLDNN, the RUL prediction model including a multibranch convolutional neural network (MBCNN) model, a multicellular bidirectional long short-term memory (MCBLSTM) model, a fully connected (FC) layer FC1, and a regression layer;
4) feature extraction: taking the preprocessed data as input data of the MBCNN model, extracting an output of the MBCNN model, taking the output of the MBCNN model and recursive data as input data of the MCBLSTM model, and extracting an output of the MCBLSTM model; and
5) RUL prediction: taking the output of the MCBLSTM model as an input of the FC layer FC1 to obtain an output of the FC layer FC1, and inputting the output of the FC layer FC1 to the regression layer to predict a RUL.
Further, the MBCNN model includes a level division unit, and a spatial feature alienation-extraction unit; and
the MCBLSTM model includes a bidirectional trend-level division unit, and multicellular update units.
Further, the extracting an output of the MBCNN model in step 4) specifically includes:
4-1-1) level division: taking the preprocessed data in step 2) as the input data, inputting input data xt at time t to the level division unit of the MBCNN model for level division, the level division unit including an FC layer FC2 composed of five neurons, and performing softmax normalization on an output Dt of the FC layer FC2 to obtain a level division result D1t:
$$D_t = \tanh(w_{xd} x_t + b_d) \tag{1}$$

$$D_{1t} = \mathrm{softmax}(D_t) = [d_{11t}\; d_{12t}\; d_{13t}\; d_{14t}\; d_{15t}] \tag{2}$$

where in equations (1) and (2), $w_{xd}$ and $b_d$ respectively are a weight and a bias of the FC layer FC2, and $d_{11t}$, $d_{12t}$, $d_{13t}$, $d_{14t}$, and $d_{15t}$ respectively represent the important level, the relatively important level, the general level, the relatively minor level, and the minor level of the input data; and
4-1-2) feature extraction: inputting, according to the level division result $D_{1t}$ of the input data, the input data to different convolution paths of the spatial feature alienation-extraction unit for convolution, and performing automatic differential processing on the input measured values according to the level division result and the five designed convolution paths to obtain a health feature $h_t^1$:
$$
\begin{aligned}
h_{ti}^{1} &= P_{15}(C_{15}(P_{14}(C_{14}(P_{13}(C_{13}(P_{12}(C_{12}(P_{11}(C_{11}(x_t))))))))))\\
h_{tj}^{1} &= P_{24}(C_{24}(P_{23}(C_{23}(P_{22}(C_{22}(P_{21}(C_{21}(x_t))))))))\\
h_{tk}^{1} &= P_{33}(C_{33}(P_{32}(C_{32}(P_{31}(C_{31}(x_t))))))\\
h_{tl}^{1} &= P_{42}(C_{42}(P_{41}(C_{41}(x_t))))\\
h_{tm}^{1} &= P_{51}(C_{51}(x_t))\\
h_{t}^{1} &= D_{1t}\,[h_{ti}^{1}\; h_{tj}^{1}\; h_{tk}^{1}\; h_{tl}^{1}\; h_{tm}^{1}]^{T}
\end{aligned}
\tag{3}
$$
where in equation (3), $C_{ij}$ and $P_{ij}$ respectively represent the jth convolution operation and the jth pooling operation of the ith convolution path, $h_{ti}^1$ is the convolution output of data of the important level, $h_{tj}^1$ is the convolution output of data of the relatively important level, $h_{tk}^1$ is the convolution output of data of the general level, $h_{tl}^1$ is the convolution output of data of the relatively minor level, and $h_{tm}^1$ is the convolution output of data of the minor level.
Further, the extracting an output of the MCBLSTM model in step 4) specifically includes:
4-2-1) trend division: taking an output $h_t^1$ of the MBCNN model at time t and recursive data $h_{t-1}^2$ of the MCBLSTM model at time t−1 as input data of the MCBLSTM at the time t, and inputting the input data to the bidirectional trend-level division unit for trend division, the bidirectional trend-level division unit including an FC layer FC3 and an FC layer FC4 for dividing a trend level of the input data along forward and backward directions, the FC layer FC3 and the FC layer FC4 each including five neurons, and the FC layer FC3 and the FC layer FC4 respectively having an output $\vec{\tilde{D}}_{2t}$ and an output $\overleftarrow{\tilde{D}}_{2t}$:

$$\vec{\tilde{D}}_{2t} = \tanh(\vec{w}_{dx} h_t^1 + \vec{w}_{dh} \vec{h}_{t-1}^2 + \vec{b}_d),\qquad \overleftarrow{\tilde{D}}_{2t} = \tanh(\overleftarrow{w}_{dx} h_t^1 + \overleftarrow{w}_{dh} \overleftarrow{h}_{t-1}^2 + \overleftarrow{b}_d) \tag{4}$$

where in equation (4), $\vec{w}_{dx}$ and $\vec{w}_{dh}$ each are a weight of the FC layer FC3, $\overleftarrow{w}_{dx}$ and $\overleftarrow{w}_{dh}$ each are a weight of the FC layer FC4, and $\vec{b}_d$ and $\overleftarrow{b}_d$ respectively are biases of the FC layer FC3 and the FC layer FC4;
respectively performing a softmax operation on the $\vec{\tilde{D}}_{2t}$ and the $\overleftarrow{\tilde{D}}_{2t}$ to obtain forward and backward trend levels $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$:

$$\vec{D}_{2t} = \mathrm{softmax}(\vec{\tilde{D}}_{2t}) = [\vec{d}_{21t}\; \vec{d}_{22t}\; \vec{d}_{23t}\; \vec{d}_{24t}\; \vec{d}_{25t}],\qquad \overleftarrow{D}_{2t} = \mathrm{softmax}(\overleftarrow{\tilde{D}}_{2t}) = [\overleftarrow{d}_{21t}\; \overleftarrow{d}_{22t}\; \overleftarrow{d}_{23t}\; \overleftarrow{d}_{24t}\; \overleftarrow{d}_{25t}] \tag{5}$$
where in equation (5), $\vec{d}_{21t}$ ($\overleftarrow{d}_{21t}$), $\vec{d}_{22t}$ ($\overleftarrow{d}_{22t}$), $\vec{d}_{23t}$ ($\overleftarrow{d}_{23t}$), $\vec{d}_{24t}$ ($\overleftarrow{d}_{24t}$), and $\vec{d}_{25t}$ ($\overleftarrow{d}_{25t}$) respectively represent a local trend, a medium and short-term trend, a medium-term trend, a medium and long-term trend and a global trend in the bidirectional calculation, and $\vec{d}_{2\max t}$ and $\overleftarrow{d}_{2\max t}$ in $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$ represent the trend levels along the two directions at the time t; and
4-2-2) feature extraction: inputting, according to the trend division results $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$, data of different trends to the multicellular update units $\vec{lc}$ and $\overleftarrow{lc}$, which perform differential learning along the two directions, for update, the $\vec{lc}$ comprising five subunits $\vec{lc}(i)$, $\vec{lc}(j)$, $\vec{lc}(k)$, $\vec{lc}(l)$, and $\vec{lc}(m)$, and the $\overleftarrow{lc}$ comprising five subunits $\overleftarrow{lc}(i)$, $\overleftarrow{lc}(j)$, $\overleftarrow{lc}(k)$, $\overleftarrow{lc}(l)$, and $\overleftarrow{lc}(m)$:
where in equation (6), the arrows → and ← respectively represent forward and backward processes, $\vec{lc}(m)$ and $\overleftarrow{lc}(m)$ are the data update units corresponding to the global trend in the bidirectional calculation, $\vec{lc}(i)$ and $\overleftarrow{lc}(i)$ are the data update units corresponding to the short-term trend, $\vec{lc}(k)$ and $\overleftarrow{lc}(k)$ are the data update units corresponding to the medium-term trend, $\vec{lc}(l)$ and $\overleftarrow{lc}(l)$ are the data update units corresponding to the medium and long-term trend, $\vec{lc}(j)$ and $\overleftarrow{lc}(j)$ are the data update units corresponding to the medium and short-term trend, σ is a sigmoid activation function, $\vec{w}_{ix}$, $\vec{w}_{ih}$, $\overleftarrow{w}_{ix}$ and $\overleftarrow{w}_{ih}$ are weights of the input gates of the MCBLSTM model, $\vec{w}_{fx}$, $\vec{w}_{fh}$, $\overleftarrow{w}_{fx}$ and $\overleftarrow{w}_{fh}$ are weights of the forget gates of the MCBLSTM model, $\vec{w}_{cx}$, $\vec{w}_{ch}$, $\overleftarrow{w}_{cx}$ and $\overleftarrow{w}_{ch}$ are weights of the cell storage units of the MCBLSTM model, $\vec{b}_i$ and $\overleftarrow{b}_i$ are biases of the input gates of the MCBLSTM model, $\vec{b}_f$ and $\overleftarrow{b}_f$ are biases of the forget gates of the MCBLSTM model, $\vec{b}_c$ and $\overleftarrow{b}_c$ are biases of the cell storage units of the MCBLSTM model, ⊙ is a dot product operation, and $s_1$, $s_2$, $s_3$ and $s_4$ each are a mix proportion factor obtained by learning; and
combining the weighted alienation outputs of the daughter-cell units in the multicellular update units according to the update results of the five alienation units and the trend division results $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$ to obtain outputs $\vec{c}_t$ and $\overleftarrow{c}_t$ of the multicellular update units, and controlling output gates $\vec{o}_t$ and $\overleftarrow{o}_t$ of the MCBLSTM model to obtain an output $h_t^2$ of the MCBLSTM model at the time t:
$$
\begin{aligned}
\vec{c}_t &= \vec{D}_{2t}\,[\vec{c}_t(i)\; \vec{c}_t(j)\; \vec{c}_t(k)\; \vec{c}_t(l)\; \vec{c}_t(m)]^T\\
\overleftarrow{c}_t &= \overleftarrow{D}_{2t}\,[\overleftarrow{c}_t(i)\; \overleftarrow{c}_t(j)\; \overleftarrow{c}_t(k)\; \overleftarrow{c}_t(l)\; \overleftarrow{c}_t(m)]^T\\
\vec{o}_t &= \sigma(\vec{w}_{ox} h_t^1 + \vec{w}_{oh} \vec{h}_{t-1}^2 + \vec{b}_o)\\
\overleftarrow{o}_t &= \sigma(\overleftarrow{w}_{ox} h_t^1 + \overleftarrow{w}_{oh} \overleftarrow{h}_{t-1}^2 + \overleftarrow{b}_o)\\
\vec{h}_t^2 &= \vec{o}_t \odot \tanh(\vec{c}_t)\\
\overleftarrow{h}_t^2 &= \overleftarrow{o}_t \odot \tanh(\overleftarrow{c}_t)\\
h_t^2 &= \vec{h}_t^2 \oplus \overleftarrow{h}_t^2
\end{aligned}
\tag{7}
$$
where in equation (7), $\vec{w}_{ox}$, $\vec{w}_{oh}$ and $\overleftarrow{w}_{ox}$, $\overleftarrow{w}_{oh}$ are weights of the output gates of the MCBLSTM model, and σ and tanh each are an activation function.
Further, the predicting a RUL in step 5) specifically includes:
inputting $h_t^2$ to the FC layer FC1, preventing overfitting by Dropout to obtain an output $h_t^3$ of the FC layer FC1, and inputting the $h_t^3$ to the regression layer to obtain a predicted RUL $y_t$:

$$h_t^3 = f(w_h h_t^2 + b_h) \tag{8}$$

$$y_t = w_y h_t^3 + b_y \tag{9}$$

where in equations (8) and (9), $w_h$ and $b_h$ respectively are a weight and a bias of the FC layer FC1, f is an activation function, and $w_y$ and $b_y$ respectively are a weight and a bias of the regression layer.
By adopting the foregoing technical solutions, the present disclosure achieves the following advantages:
1. The present disclosure constructs a deep mining model (the ADLDNN model) according to the different sensitivities of different measured values to mechanical faults in different periods, and automatically screens features through the ADLDNN model in combination with differential learning, thereby improving the accuracy and generalization of RUL prediction.
2. Input data are classified by the level division unit of the MBCNN model. The classified data are input to the MBCNN, in which each branch executes corresponding feature extraction in accordance with the level of its input data. The bidirectional trend-level division unit of the MCBLSTM model is used to classify the output features of the MBCNN into various levels of degradation trends along the forward and backward directions. The multicellular update units are then used to perform corresponding feature learning on the bidirectional trend levels of the input features to output health indexes. The present disclosure can thus better mine different degradation trends in the health state of the aero-engine.
Other advantages, objectives and features of the present disclosure will be illustrated to some degree in the subsequent description, and will be apparent to some degree to those skilled in the art based on study of the following description, or may be taught by practicing the present disclosure. The objectives and other advantages of the present disclosure can be implemented and obtained by the following description and claims.
The present disclosure will be further described below in conjunction with the accompanying drawings and embodiments.
As shown in the accompanying drawings, a method for predicting a RUL of an aero-engine based on an ADLDNN includes the following steps:
1) Data acquisition: Multidimensional degradation parameters of an aero-engine to be predicted are acquired, a stable trend is analyzed, and a plurality of parameters capable of reflecting degradation performance of the aero-engine are selected to obtain acquired data, specifically:
1-1) Degradation data of the aero-engine are simulated by commercial modular aero-propulsion system simulation (C-MAPSS) to acquire the multidimensional degradation parameters of the aero-engine to be predicted, as shown in Table 1:
As shown in Table 2, the C-MAPSS dataset is divided into four sub-datasets according to different operating conditions and fault modes:
Each sub-dataset contains training data, test data and the actual RULs corresponding to the test data. The training data contain complete engine data from a certain health state to the fault, while the test data end before the engine fails. Moreover, the training and test data each contain a certain number of engines with different initial health states.

Due to the different initial health states of the engines, the running cycles of different engines in the same sub-dataset are different. Taking the FD001 dataset as an example, the training dataset includes 100 engines, with a maximum running cycle of 362 and a minimum running cycle of 128. In order to fully prove the superiority of the method, the simplest subset (namely the subset FD001 having a single operating condition and a single fault mode) and the most complex subset (namely the subset FD004 having various operating conditions and various fault modes) are taken as experimental data.
1-2) Some stable trend measurements (the measurement data of sensors 1, 5, 6, 10, 16, 18 and 19) are excluded in advance. These sensors are unsuitable for RUL prediction because their full-life-cycle measurement curves are stable and nearly constant, and thus contain little degradation information of the engine; meanwhile, the operating conditions have a significant impact on the prediction capability of the model. Therefore, the measurements of the remaining 14 sensors and the operating conditions are combined into the original data to obtain the acquired data.
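As an illustration of this screening step, the following minimal sketch (Python/NumPy, not part of the original disclosure) drops the listed near-constant channels from a per-engine measurement matrix; the (cycles × 21 sensors) array layout and the function name are assumptions made for the example.

```python
import numpy as np

def screen_channels(measurements, exclude=(1, 5, 6, 10, 16, 18, 19)):
    """Drop near-constant sensor channels (1-based indices) from a
    (cycles x 21) measurement matrix, keeping the 14 informative ones."""
    keep = [c for c in range(measurements.shape[1]) if (c + 1) not in exclude]
    return measurements[:, keep]
```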
2) Data preprocessing: The acquired data are segmented by an SW to obtain preprocessed data, specifically:
As shown in the accompanying drawings, a sliding window with a size of l slides, with a step size of m, over the full-life-cycle measurement sequence of an engine whose running cycle is T, and each window of l consecutive cycles forms one sample.

When the ith sample is input, the actual RUL is T−l−(i−1)×m.
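A minimal sketch of this segmentation, assuming the measurements of one engine are stored as a (T × features) NumPy array, is given below; the function and variable names are illustrative.

```python
import numpy as np

def sliding_window(seq, l=30, m=1):
    """Segment a (T x features) run-to-failure sequence into an
    (n x l x features) array; the i-th sample (1-based) covers cycles
    (i-1)*m to (i-1)*m + l - 1, so its actual RUL is T - l - (i-1)*m."""
    T = seq.shape[0]
    n = (T - l) // m + 1
    return np.stack([seq[i * m : i * m + l] for i in range(n)])
```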
RUL labels are constructed by a piece-wise linear RUL technology, and are defined as follows:

$$Rul = \begin{cases} Rul_{\max}, & T - l - (i-1)\times m > Rul_{\max} \\ T - l - (i-1)\times m, & \text{otherwise} \end{cases} \tag{10}$$

In Equation (10), $Rul_{\max}$ is a maximum RUL and a preset threshold.
In the example of the present disclosure, for FD001 and FD004, the maximum RUL is 130 cycles and 150 cycles respectively, while the sliding window size l is 30 and the sliding step size m is 1. There are 17,731 and 54,028 training samples for FD001 and FD004. FD001 and FD004 contain 100 and 248 test samples respectively, because only the last measured value of each test engine is used to validate the prediction capability.
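The corresponding piece-wise linear labels of Equation (10) can be sketched as follows; this is an illustrative reading of the label construction, not the original implementation.

```python
import numpy as np

def rul_labels(T, l=30, m=1, rul_max=130):
    """Piece-wise linear RUL targets per Equation (10): the raw label
    T - l - (i-1)*m of the i-th sample is capped at the threshold rul_max."""
    n = (T - l) // m + 1
    raw = T - l - m * np.arange(n, dtype=float)
    return np.minimum(raw, rul_max)
```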
3) Model construction: A RUL prediction model of the aero-engine is constructed based on an ADLDNN, the RUL prediction model including an MBCNN model, an MCBLSTM model, an FC layer FC1, and a regression layer.
The MBCNN model includes a level division unit, and a spatial feature alienation-extraction unit.
The MCBLSTM model includes a bidirectional trend-level division unit, and multicellular update units.
4) Feature extraction: The preprocessed data are taken as input data of the MBCNN model, an output of the MBCNN model is extracted, the output of the MBCNN model and recursive data are taken as input data of the MCBLSTM model, and an output of the MCBLSTM model is extracted, specifically:
4-1) The step of extracting an output of the MBCNN model specifically includes:
4-1-1) Level division: The preprocessed data in Step 2) are taken as the input data, input data xt at time t are input to the level division unit of the MBCNN model for level division, the level division unit including an FC layer FC2 composed of five neurons, and softmax normalization is performed on an output Dt of the FC layer FC2 to obtain a level division result D1t:
$$D_t = \tanh(w_{xd} x_t + b_d) \tag{11}$$

$$D_{1t} = \mathrm{softmax}(D_t) = [d_{11t}\; d_{12t}\; d_{13t}\; d_{14t}\; d_{15t}] \tag{12}$$

In Equations (11) and (12), $w_{xd}$ and $b_d$ respectively are a weight and a bias of the FC layer FC2, and $d_{11t}$, $d_{12t}$, $d_{13t}$, $d_{14t}$, and $d_{15t}$ respectively represent the important level, the relatively important level, the general level, the relatively minor level, and the minor level of the input data.
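A minimal PyTorch sketch of this level division unit follows; treating the input window as a flattened feature vector is an assumption made for the example.

```python
import torch
import torch.nn as nn

class LevelDivision(nn.Module):
    """Level division unit of Equations (11)-(12): the five-neuron FC2,
    a tanh, and a softmax yield the level weights D_1t."""
    def __init__(self, in_features):
        super().__init__()
        self.fc2 = nn.Linear(in_features, 5)

    def forward(self, x_t):                # x_t: (batch, in_features)
        d_t = torch.tanh(self.fc2(x_t))    # Equation (11)
        return torch.softmax(d_t, dim=-1)  # Equation (12): D_1t
```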
4-1-2) Feature extraction: According to the level division result $D_{1t}$ of the input data, the input data are input to different convolution paths of the spatial feature alienation-extraction unit for convolution, and automatic differential processing is performed on the input measured values according to the level division result and the five designed convolution paths to obtain a health feature $h_t^1$:
$$
\begin{aligned}
h_{ti}^{1} &= P_{15}(C_{15}(P_{14}(C_{14}(P_{13}(C_{13}(P_{12}(C_{12}(P_{11}(C_{11}(x_t))))))))))\\
h_{tj}^{1} &= P_{24}(C_{24}(P_{23}(C_{23}(P_{22}(C_{22}(P_{21}(C_{21}(x_t))))))))\\
h_{tk}^{1} &= P_{33}(C_{33}(P_{32}(C_{32}(P_{31}(C_{31}(x_t))))))\\
h_{tl}^{1} &= P_{42}(C_{42}(P_{41}(C_{41}(x_t))))\\
h_{tm}^{1} &= P_{51}(C_{51}(x_t))\\
h_{t}^{1} &= D_{1t}\,[h_{ti}^{1}\; h_{tj}^{1}\; h_{tk}^{1}\; h_{tl}^{1}\; h_{tm}^{1}]^{T}
\end{aligned}
\tag{13}
$$
In Equation (13), $C_{ij}$ and $P_{ij}$ respectively represent the jth convolution operation and the jth pooling operation of the ith convolution path, $h_{ti}^1$ is the convolution output of data of the important level, $h_{tj}^1$ is the convolution output of data of the relatively important level, $h_{tk}^1$ is the convolution output of data of the general level, $h_{tl}^1$ is the convolution output of data of the relatively minor level, and $h_{tm}^1$ is the convolution output of data of the minor level.
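The five-path structure of Equation (13) can be sketched in PyTorch as below; the channel width, kernel sizes, pooling configuration and the final global pooling are illustrative assumptions, since the disclosure specifies these separately in the hyper-parameter listing.

```python
import torch
import torch.nn as nn

class MBCNNBranches(nn.Module):
    """Sketch of the five convolution paths of Equation (13), of depths
    5/4/3/2/1 (each C_ij followed by P_ij), whose pooled outputs are
    weighted by the level-division result D_1t."""
    def __init__(self, in_ch, width=8):
        super().__init__()
        def make_path(depth):
            layers, c = [], in_ch
            for _ in range(depth):
                layers += [nn.Conv1d(c, width, kernel_size=2, padding=1),
                           nn.MaxPool1d(kernel_size=2, ceil_mode=True)]
                c = width
            layers.append(nn.AdaptiveAvgPool1d(1))  # one feature per channel
            return nn.Sequential(*layers)
        self.paths = nn.ModuleList([make_path(d) for d in (5, 4, 3, 2, 1)])

    def forward(self, x_t, d1_t):   # x_t: (batch, in_ch, l); d1_t: (batch, 5)
        h = torch.stack([p(x_t).flatten(1) for p in self.paths], dim=1)
        return (d1_t.unsqueeze(-1) * h).sum(dim=1)   # h_t^1 of Equation (13)
```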
4-2) The step of extracting an output of the MCBLSTM model specifically includes:
4-2-1) Trend division: An output $h_t^1$ of the MBCNN model at time t and recursive data $h_{t-1}^2$ of the MCBLSTM model at time t−1 are taken as input data of the MCBLSTM at the time t, and are input to the bidirectional trend-level division unit for trend division. The bidirectional trend-level division unit includes an FC layer FC3 and an FC layer FC4 for dividing a trend level of the input data along the forward and backward directions, the FC layer FC3 and the FC layer FC4 each include five neurons, and the FC layer FC3 and the FC layer FC4 respectively have an output $\vec{\tilde{D}}_{2t}$ and an output $\overleftarrow{\tilde{D}}_{2t}$:

$$\vec{\tilde{D}}_{2t} = \tanh(\vec{w}_{dx} h_t^1 + \vec{w}_{dh} \vec{h}_{t-1}^2 + \vec{b}_d),\qquad \overleftarrow{\tilde{D}}_{2t} = \tanh(\overleftarrow{w}_{dx} h_t^1 + \overleftarrow{w}_{dh} \overleftarrow{h}_{t-1}^2 + \overleftarrow{b}_d) \tag{14}$$

In Equation (14), $\vec{w}_{dx}$ and $\vec{w}_{dh}$ each are a weight of the FC layer FC3, $\overleftarrow{w}_{dx}$ and $\overleftarrow{w}_{dh}$ each are a weight of the FC layer FC4, and $\vec{b}_d$ and $\overleftarrow{b}_d$ respectively are biases of the FC layer FC3 and the FC layer FC4.
A softmax operation is respectively performed on the $\vec{\tilde{D}}_{2t}$ and the $\overleftarrow{\tilde{D}}_{2t}$ to obtain forward and backward trend levels $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$:

$$\vec{D}_{2t} = \mathrm{softmax}(\vec{\tilde{D}}_{2t}) = [\vec{d}_{21t}\; \vec{d}_{22t}\; \vec{d}_{23t}\; \vec{d}_{24t}\; \vec{d}_{25t}],\qquad \overleftarrow{D}_{2t} = \mathrm{softmax}(\overleftarrow{\tilde{D}}_{2t}) = [\overleftarrow{d}_{21t}\; \overleftarrow{d}_{22t}\; \overleftarrow{d}_{23t}\; \overleftarrow{d}_{24t}\; \overleftarrow{d}_{25t}] \tag{15}$$
In Equation (15), $\vec{d}_{21t}$ ($\overleftarrow{d}_{21t}$), $\vec{d}_{22t}$ ($\overleftarrow{d}_{22t}$), $\vec{d}_{23t}$ ($\overleftarrow{d}_{23t}$), $\vec{d}_{24t}$ ($\overleftarrow{d}_{24t}$), and $\vec{d}_{25t}$ ($\overleftarrow{d}_{25t}$) respectively represent a local trend, a medium and short-term trend, a medium-term trend, a medium and long-term trend and a global trend in the bidirectional calculation, and $\vec{d}_{2\max t}$ and $\overleftarrow{d}_{2\max t}$ in $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$ represent the trend levels along the two directions at the time t.
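A minimal sketch of the bidirectional trend-level division follows, using the tanh-then-softmax form of Equations (14)-(15); concatenating $h_t^1$ with the direction-specific recursive state is an assumption made for the example.

```python
import torch
import torch.nn as nn

class TrendDivision(nn.Module):
    """Bidirectional trend-level division of Equations (14)-(15): FC3 and
    FC4 (five neurons each) grade [h_t^1, h_{t-1}^2] along the forward
    and backward directions."""
    def __init__(self, x_dim, h_dim):
        super().__init__()
        self.fc3 = nn.Linear(x_dim + h_dim, 5)   # forward direction
        self.fc4 = nn.Linear(x_dim + h_dim, 5)   # backward direction

    def forward(self, h1_t, h2_prev_fwd, h2_prev_bwd):
        d_fwd = torch.softmax(
            torch.tanh(self.fc3(torch.cat([h1_t, h2_prev_fwd], dim=-1))), dim=-1)
        d_bwd = torch.softmax(
            torch.tanh(self.fc4(torch.cat([h1_t, h2_prev_bwd], dim=-1))), dim=-1)
        return d_fwd, d_bwd   # forward / backward trend levels, Equation (15)
```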
4-2-2) Feature extraction: According to the trend division results $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$, data of different trends are input to the multicellular update units $\vec{lc}$ and $\overleftarrow{lc}$, which perform differential learning along the two directions, for update, the $\vec{lc}$ comprising five subunits $\vec{lc}(i)$, $\vec{lc}(j)$, $\vec{lc}(k)$, $\vec{lc}(l)$, and $\vec{lc}(m)$, and the $\overleftarrow{lc}$ comprising five subunits $\overleftarrow{lc}(i)$, $\overleftarrow{lc}(j)$, $\overleftarrow{lc}(k)$, $\overleftarrow{lc}(l)$, and $\overleftarrow{lc}(m)$:
In Equation (16), the arrows → and ← respectively represent forward and backward processes, $\vec{lc}(m)$ and $\overleftarrow{lc}(m)$ are the data update units corresponding to the global trend in the bidirectional calculation, $\vec{lc}(i)$ and $\overleftarrow{lc}(i)$ are the data update units corresponding to the short-term trend, $\vec{lc}(k)$ and $\overleftarrow{lc}(k)$ are the data update units corresponding to the medium-term trend, $\vec{lc}(l)$ and $\overleftarrow{lc}(l)$ are the data update units corresponding to the medium and long-term trend, $\vec{lc}(j)$ and $\overleftarrow{lc}(j)$ are the data update units corresponding to the medium and short-term trend, σ is a sigmoid activation function, $\vec{w}_{ix}$, $\vec{w}_{ih}$, $\overleftarrow{w}_{ix}$ and $\overleftarrow{w}_{ih}$ are weights of the input gates of the MCBLSTM model, $\vec{w}_{fx}$, $\vec{w}_{fh}$, $\overleftarrow{w}_{fx}$ and $\overleftarrow{w}_{fh}$ are weights of the forget gates of the MCBLSTM model, $\vec{w}_{cx}$, $\vec{w}_{ch}$, $\overleftarrow{w}_{cx}$ and $\overleftarrow{w}_{ch}$ are weights of the cell storage units of the MCBLSTM model, $\vec{b}_i$ and $\overleftarrow{b}_i$ are biases of the input gates of the MCBLSTM model, $\vec{b}_f$ and $\overleftarrow{b}_f$ are biases of the forget gates of the MCBLSTM model, $\vec{b}_c$ and $\overleftarrow{b}_c$ are biases of the cell storage units of the MCBLSTM model, ⊙ is a dot product operation, and $s_1$, $s_2$, $s_3$ and $s_4$ each are a mix proportion factor obtained by learning.
The weighted alienation outputs of the daughter-cell units in the multicellular update units are combined according to the update results of the five alienation units and the trend division results $\vec{D}_{2t}$ and $\overleftarrow{D}_{2t}$ to obtain outputs $\vec{c}_t$ and $\overleftarrow{c}_t$ of the multicellular update units, and output gates $\vec{o}_t$ and $\overleftarrow{o}_t$ of the MCBLSTM model are controlled to obtain an output $h_t^2$ of the MCBLSTM model at the time t:
$$
\begin{aligned}
\vec{c}_t &= \vec{D}_{2t}\,[\vec{c}_t(i)\; \vec{c}_t(j)\; \vec{c}_t(k)\; \vec{c}_t(l)\; \vec{c}_t(m)]^T\\
\overleftarrow{c}_t &= \overleftarrow{D}_{2t}\,[\overleftarrow{c}_t(i)\; \overleftarrow{c}_t(j)\; \overleftarrow{c}_t(k)\; \overleftarrow{c}_t(l)\; \overleftarrow{c}_t(m)]^T\\
\vec{o}_t &= \sigma(\vec{w}_{ox} h_t^1 + \vec{w}_{oh} \vec{h}_{t-1}^2 + \vec{b}_o)\\
\overleftarrow{o}_t &= \sigma(\overleftarrow{w}_{ox} h_t^1 + \overleftarrow{w}_{oh} \overleftarrow{h}_{t-1}^2 + \overleftarrow{b}_o)\\
\vec{h}_t^2 &= \vec{o}_t \odot \tanh(\vec{c}_t)\\
\overleftarrow{h}_t^2 &= \overleftarrow{o}_t \odot \tanh(\overleftarrow{c}_t)\\
h_t^2 &= \vec{h}_t^2 \oplus \overleftarrow{h}_t^2
\end{aligned}
\tag{17}
$$
In Equation (17), $\vec{w}_{ox}$, $\vec{w}_{oh}$ and $\overleftarrow{w}_{ox}$, $\overleftarrow{w}_{oh}$ are weights of the output gates of the MCBLSTM model, and σ and tanh each are an activation function.
In the example of the present disclosure, in order to keep the global trend as long as possible, the cell units $\vec{lc}(m)$ and $\overleftarrow{lc}(m)$ are updated from the state at the previous time. In order to replace the local trend in a timely manner, the units $\vec{lc}(i)$ and $\overleftarrow{lc}(i)$ are updated from the internal state at the current time. According to the conventional cell update mechanism in the BLSTM, the units $\vec{lc}(k)$ and $\overleftarrow{lc}(k)$ in the medium-term trend are updated with the units in the global trend as well as the units in the local trend, the units $\vec{lc}(l)$ and $\overleftarrow{lc}(l)$ in the medium and long-term trend are updated with the units in the global trend as well as the units in the medium-term trend, and the units $\vec{lc}(j)$ and $\overleftarrow{lc}(j)$ in the medium and short-term trend are updated with the units in the medium-term trend as well as the units in the local trend.
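For one direction, the weighted combination and gating of Equation (17) can be sketched as follows; reading the ⊕ of Equation (17) as feature concatenation is an assumption.

```python
import torch

def combine_direction(d2_t, cells, o_t):
    """One direction of Equation (17): weight the five subunit cell states
    (i, j, k, l, m) by the trend levels in d2_t, then gate with o_t."""
    c_t = (d2_t.unsqueeze(-1) * torch.stack(cells, dim=1)).sum(dim=1)
    h_t = o_t * torch.tanh(c_t)   # elementwise (dot) product
    return c_t, h_t

# The bidirectional output then reads, taking ⊕ as concatenation:
# h2_t = torch.cat([h_fwd, h_bwd], dim=-1)
```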
5) RUL prediction: The output of the MCBLSTM model is taken as an input of the FC layer FC1 to obtain an output of the FC layer FC1, and the output of the FC layer FC1 is input to the regression layer to predict a RUL, specifically:
$h_t^2$ is input to the FC layer FC1, and overfitting is prevented by Dropout to obtain an output $h_t^3$ of the FC layer FC1; the $h_t^3$ is then input to the regression layer to obtain a predicted RUL $y_t$:

$$h_t^3 = f(w_h h_t^2 + b_h) \tag{18}$$

$$y_t = w_y h_t^3 + b_y \tag{19}$$

In Equations (18) and (19), $w_h$ and $b_h$ respectively are a weight and a bias of the FC layer FC1, f is an activation function, and $w_y$ and $b_y$ respectively are a weight and a bias of the regression layer.
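A minimal PyTorch sketch of this prediction head follows; the tanh activation of FC1 is an assumption (the disclosure leaves f unspecified), while the 30-neuron width and 0.5 dropout rate follow the hyper-parameters given below.

```python
import torch
import torch.nn as nn

class PredictionHead(nn.Module):
    """FC1 with Dropout and a one-neuron regression layer, following
    Equations (18)-(19)."""
    def __init__(self, in_features, hidden=30, p=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.drop = nn.Dropout(p)
        self.reg = nn.Linear(hidden, 1)

    def forward(self, h2_t):
        h3_t = self.drop(torch.tanh(self.fc1(h2_t)))  # Equation (18)
        return self.reg(h3_t)                          # Equation (19): y_t
```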
In the example of the present disclosure, there are N samples in training. A mean square error (MSE) is defined as the loss function and calculated by:

$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(Rul_i - y_i\right)^2 \tag{20}$$

In Equation (20), $Rul_i$ and $y_i$ respectively are the actual RUL and the predicted RUL of the ith training sample.
Hyper-parameters of the ADLDNN are selected by a grid search method:
C11, C12, C13, C14, C15, C21, C22, C23, C24, C31, C32, C33, C41, C42, and C51 respectively have a kernel size of 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 7, 2, 2, 2, and 9.
P11, P12, P13, P14, P15, P21, P22, P23, P24, P31, P32, P33, P41, P42, and P51 respectively have a maximum pooling size of 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, and 2.
It is assumed that the convolution kernel has a step size of 1, the MCBLSTM has 30 neurons, the FC layer FC1 has 30 neurons, and the regression layer has one neuron. The Dropout is set as 0.5, and the window size and the step size are respectively set as 30 and 1.
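A schematic grid search over assumed hyper-parameter ranges is sketched below; `train_and_validate` is a hypothetical helper standing in for one training run, and the ranges are illustrative, not those of the disclosure.

```python
import random
from itertools import product

def train_and_validate(lr, hidden, dropout):
    """Hypothetical stand-in for training the ADLDNN with the given
    settings and returning the validation RMSE (randomized here
    purely so the sketch runs)."""
    return random.uniform(10, 30)

# Illustrative (assumed) search ranges.
best_rmse, best_cfg = min(
    (train_and_validate(lr, h, p), (lr, h, p))
    for lr, h, p in product([1e-3, 1e-4], [20, 30, 40], [0.3, 0.5])
)
```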
6) Experimental validation:
6-1) Evaluation indexes: The score function of the IEEE PHM data challenge and a root-mean-square error (RMSE) are taken as evaluation indexes to quantitatively characterize the RUL prediction performance. The evaluation indexes can be respectively calculated by:

$$s_i = \begin{cases} e^{-\frac{d_i}{13}} - 1, & d_i < 0 \\ e^{\frac{d_i}{10}} - 1, & d_i \ge 0 \end{cases} \tag{21}$$

$$Score = \sum_{i=1}^{n} s_i \tag{22}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} d_i^2} \tag{23}$$

In Equations (21), (22), and (23), $Rul_i$ and $\widehat{Rul}_i$ respectively are the actual RUL and the predicted RUL of the ith test engine, $d_i = \widehat{Rul}_i - Rul_i$ is the prediction error, and n is the number of test samples.
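These evaluation indexes can be computed as follows; the constants 13 and 10 follow the standard C-MAPSS scoring function, which the reconstruction above assumes.

```python
import numpy as np

def score_and_rmse(rul_true, rul_pred):
    """Equations (21)-(23): asymmetric score (late predictions, d >= 0,
    are penalized more heavily) and RMSE over the n test engines."""
    d = np.asarray(rul_pred, float) - np.asarray(rul_true, float)
    s = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return s.sum(), float(np.sqrt(np.mean(d ** 2)))
```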
6-2) RUL prediction and comparison: The proposed ADLDNN is first trained on the FD001, FD002, FD003 and FD004 training sets, and then tested on the corresponding test sets. The predicted results on the four subsets are shown in the accompanying drawings.
The engine has a relatively simple degradation trend under a single operating condition, and there is a large degree of overlap between the training set and the test set. Hence, the predicted results on FD001 and FD003, obtained under a single operating condition, are superior to those on FD002 and FD004, obtained under various operating conditions. In addition, the predicted result on FD001 is more accurate than that on FD003, and the predicted result on FD002 is more accurate than that on FD004; therefore, the prediction accuracy in the single-fault mode is higher than that in the multi-fault mode. It can further be seen that the predicted result on FD003 is superior to that on FD002, which means that the number of fault modes has less impact on RUL prediction than the number of operating conditions.
In order to further show the superiority of the ADLDNN in RUL prediction, comparisons are made between the proposed method and various typical methods based on statistical models, shallow learning models, classic DL models and several recently published DL models. The scores and RMSEs calculated from the predicted results of all the above methods are shown in Table 3. As can be seen from the table, all methods show the best predictive effect on FD001 and the worst on FD004. This is because FD001 is the simplest subset, while FD004 has the most complex operating conditions and fault types and more test engines than the other subsets. All methods are more accurate on FD003 than on FD002, which further proves that the number of operating conditions and the number of engines have a greater impact on the accuracy of RUL prediction than the fault type.
As can be seen from Table 3, for the simplest subset FD001, the score and the RMSE of the result predicted by the proposed method are smaller than those of all existing methods except the Acyclic Graph Network. For complex datasets such as FD002 and FD004, however, the method shows a stronger prediction capability than the other typical methods. In addition, since the score is more practical than the RMSE in actual engineering, the ADLDNN is considered to be superior to the Acyclic Graph Network on FD003. Compared with existing typical methods, the ADLDNN is more suitable for processing complex datasets involving various operating conditions and fault types. In conclusion, the ADLDNN shows high overall performance, and can be well applied to predict the machine RUL.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may use a form of hardware only embodiments, software only embodiments, or embodiments with a combination of software and hardware. Moreover, the present disclosure may be in a form of a computer program product that is implemented on one or more computer-usable storage media (including but not limited to a magnetic disk memory, a CD-ROM, an optical memory, and the like) that include computer-usable program code.
The present disclosure is described with reference to the flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present disclosure. It should be understood that computer program instructions may be used to implement each process and/or each block in the flowcharts and/or the block diagrams and a combination of a process and/or a block in the flowcharts and/or the block diagrams. These computer program instructions may be provided for a general-purpose computer, a dedicated computer, an embedded processor, or a processor of another programmable data processing device to generate a machine, such that the instructions executed by a computer or a processor of another programmable data processing device generate an apparatus for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.
These computer program instructions may be stored in a computer-readable memory that can instruct the computer or any other programmable data processing device to work in a specific manner, such that the instructions stored in the computer-readable memory generate an artifact that includes an instruction apparatus. The instruction apparatus implements a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.
These computer program instructions may be loaded onto a computer or another programmable data processing device, such that a series of operations and steps are performed on the computer or the another programmable device, thereby generating computer-implemented processing. Therefore, the instructions executed on the computer or the another programmable device provide steps for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.
Finally, it should be noted that the above embodiments are merely intended to describe the technical solutions of the present disclosure, rather than to limit them. Although the present disclosure is described in detail with reference to the above embodiments, it is to be appreciated by those of ordinary skill in the art that modifications or equivalent substitutions may still be made to the specific implementations of the present disclosure, and any modifications or equivalent substitutions made without departing from the spirit and scope of the present disclosure shall fall within the protection scope of the claims of the present disclosure.