METHOD FOR PREDICTING REMAINING USEFUL LIFE (RUL) OF AERO-ENGINE BASED ON AUTOMATIC DIFFERENTIAL LEARNING DEEP NEURAL NETWORK (ADLDNN)

Information

  • Patent Application
  • Publication Number
    20230141864
  • Date Filed
    October 24, 2022
  • Date Published
    May 11, 2023
Abstract
The present disclosure provides a method for predicting a remaining useful life (RUL) of an aero-engine, specifically including: acquiring multidimensional degradation parameters of an aero-engine to be predicted to obtain acquired data; segmenting the acquired data by a sliding window (SW) to obtain preprocessed data; constructing a RUL prediction model of the aero-engine including a multibranch convolutional neural network (MBCNN) model, a multicellular bidirectional long short-term memory (MCBLSTM) model, a fully connected (FC) layer FC1, and a regression layer; taking the preprocessed data as input data of the MBCNN model, extracting an output of the MBCNN model, taking the output of the MBCNN model and recursive data as input data of the MCBLSTM model, and extracting an output of the MCBLSTM model; obtaining an output of the FC layer FC1, and inputting the output of the FC layer FC1 to the regression layer to predict a RUL.
Description
CROSS REFERENCE TO RELATED APPLICATION

This patent application claims the benefit and priority of Chinese Patent Application No. 202111261992.X, filed with the China National Intellectual Property Administration on Oct. 28, 2021, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.


TECHNICAL FIELD

The present disclosure relates to the field of remaining useful life (RUL) prediction for aero-engines, and in particular to a method for predicting a RUL of an aero-engine based on an automatic differential learning deep neural network (ADLDNN).


BACKGROUND

An aero-engine is a highly complex and precise thermal machine that provides an aircraft with the power necessary for flight. Its complex internal structure and harsh operating environment make it susceptible to faults. Hence, accurate prediction of the RUL of an aero-engine is of great significance to its operation and maintenance.


With the development of science and technology, the long short-term memory (LSTM) network and the convolutional neural network (CNN) have been widely applied to predicting the RUL of rotary machines. However, existing neural networks process all data in a uniform mode, cannot mine different levels of feature information with different feature extraction modes, and therefore exhibit poor prediction accuracy.


SUMMARY

An objective of the present disclosure is to provide a method for predicting a RUL of an aero-engine based on an ADLDNN, which can be used to predict the RUL of the aero-engine.


The objective of the present disclosure is implemented with the following technical solutions. A method for predicting a RUL of an aero-engine based on an ADLDNN includes the following specific steps:


1) data acquisition: acquiring multidimensional degradation parameters of an aero-engine to be predicted, analyzing a stable trend, and selecting a plurality of parameters capable of reflecting degradation performance of the aero-engine to obtain acquired data;


2) data preprocessing: segmenting the acquired data by a sliding window (SW) to obtain preprocessed data;


3) model construction: constructing a RUL prediction model of the aero-engine based on an ADLDNN, the RUL prediction model including a multibranch convolutional neural network (MBCNN) model, a multicellular bidirectional long short-term memory (MCBLSTM) model, a fully connected (FC) layer FC1, and a regression layer;


4) feature extraction: taking the preprocessed data as input data of the MBCNN model, extracting an output of the MBCNN model, taking the output of the MBCNN model and recursive data as input data of the MCBLSTM model, and extracting an output of the MCBLSTM model; and


5) RUL prediction: taking the output of the MCBLSTM model as an input of the FC layer FC1 to obtain an output of the FC layer FC1, and inputting the output of the FC layer FC1 to the regression layer to predict a RUL.


Further, the MBCNN model includes a level division unit, and a spatial feature alienation-extraction unit; and


the MCBLSTM model includes a bidirectional trend-level division unit, and multicellular update units.


Further, the extracting an output of the MBCNN model in step 4) specifically includes:


4-1-1) level division: taking the preprocessed data in step 2) as the input data, inputting input data x_t at time t to the level division unit of the MBCNN model for level division, the level division unit including an FC layer FC2 composed of five neurons, and performing softmax normalization on the output D_t of the FC layer FC2 to obtain a level division result D1_t:






D_t = tanh(w_xd1 x_t + b_d1)   (1)

D1_t = softmax(D_t) = [d1_1t, d1_2t, d1_3t, d1_4t, d1_5t]   (2)


where in equations (1) and (2), w_xd1 and b_d1 respectively represent the weight and the bias of the FC layer FC2; d1_1t, d1_2t, d1_3t, d1_4t and d1_5t respectively represent an important level, a relatively important level, a general level, a relatively minor level and a minor level; and the position of the maximal element in D1_t gives the level division result of the present input; and


4-1-2) feature extraction: inputting, according to the level division result D1_t of the input data, the input data to different convolution paths of the spatial feature alienation-extraction unit for convolution, and performing automatic differential processing on the input measured value according to the level division result and the five designed convolution paths to obtain a health feature h_t^1:






h_ti^1 = P15(C15(P14(C14(P13(C13(P12(C12(P11(C11(x_t))))))))))

h_tj^1 = P24(C24(P23(C23(P22(C22(P21(C21(x_t))))))))

h_tk^1 = P33(C33(P32(C32(P31(C31(x_t))))))

h_tl^1 = P42(C42(P41(C41(x_t))))

h_tm^1 = P51(C51(x_t))

h_t^1 = D1_t [h_ti^1, h_tj^1, h_tk^1, h_tl^1, h_tm^1]^T   (3)


where in equation (3), Cij and Pij respectively represent the jth convolution operation and the jth pooling operation of the ith convolution path, h_ti^1 is the convolution output of data of the important level, h_tj^1 of the relatively important level, h_tk^1 of the general level, h_tl^1 of the relatively minor level, and h_tm^1 of the minor level.


Further, the extracting an output of the MCBLSTM model in step 4) specifically includes:


4-2-1) trend division: taking the output h_t^1 of the MBCNN model at time t and the recursive data h_{t-1}^2 of the MCBLSTM model at time t−1 as input data of the MCBLSTM model at time t, and inputting the input data to the bidirectional trend-level division unit for trend division, the bidirectional trend-level division unit including an FC layer FC3 and an FC layer FC4 for dividing a trend level of the input data along the forward and backward directions, the FC layer FC3 and the FC layer FC4 each including five neurons and respectively having outputs →D̃_2t and ←D̃_2t:














→D̃_2t = tanh(h_t^1 →w_xd2 + h_{t-1}^2 →w_hd2 + →b_d2)

←D̃_2t = tanh(h_t^1 ←w_xd2 + h_{t-1}^2 ←w_hd2 + ←b_d2)   (4)







where in equation (4), →w_xd2 and →w_hd2 each are a weight of the FC layer FC3, ←w_xd2 and ←w_hd2 each are a weight of the FC layer FC4, →b_d2 is the bias of the FC layer FC3, and ←b_d2 is the bias of the FC layer FC4, the prefixes → and ← respectively marking quantities of the forward and backward directions; and


respectively performing a softmax operation on →D̃_2t and ←D̃_2t to obtain the forward and backward trend levels →D_2t and ←D_2t:






→D_2t = softmax(→D̃_2t) = [→d_21t, →d_22t, →d_23t, →d_24t, →d_25t]

←D_2t = softmax(←D̃_2t) = [←d_21t, ←d_22t, ←d_23t, ←d_24t, ←d_25t]   (5)


where in equation (5), →d_21t (←d_21t), →d_22t (←d_22t), →d_23t (←d_23t), →d_24t (←d_24t) and →d_25t (←d_25t) respectively represent a local trend, a medium and short-term trend, a medium-term trend, a medium and long-term trend and a global trend in the bidirectional calculation, and the maximal elements →d_2max,t and ←d_2max,t in →D_2t and ←D_2t determine the trend levels along the two directions at time t; and


4-2-2) feature extraction: inputting, according to the trend division results →D_2t and ←D_2t, data of different trends to the multicellular update units →c_t and ←c_t, which perform differential learning along the two directions, for update, →c_t comprising five subunits →c_t(i), →c_t(j), →c_t(k), →c_t(l) and →c_t(m), and ←c_t comprising five subunits ←c_t(i), ←c_t(j), ←c_t(k), ←c_t(l) and ←c_t(m):











→i_t = σ(→w_ih1 →h_t^1 + →w_ih2 →h_{t-1}^2 + →b_i)

→f_t = σ(→w_fh1 →h_t^1 + →w_fh2 →h_{t-1}^2 + →b_f)

←i_t = σ(←w_ih1 ←h_t^1 + ←w_ih2 ←h_{t-1}^2 + ←b_i)

←f_t = σ(←w_fh1 ←h_t^1 + ←w_fh2 ←h_{t-1}^2 + ←b_f)

→c_t(m) = →c_{t-1}

←c_t(m) = ←c_{t-1}

→c_t(i) = →c̃_t = tanh(→W_ch1 →h_t^1 + →W_ch2 →h_{t-1}^2 + →b_c)

←c_t(i) = ←c̃_t = tanh(←W_ch1 ←h_t^1 + ←W_ch2 ←h_{t-1}^2 + ←b_c)

→c_t(k) = →f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t

←c_t(k) = ←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t

→c_t(l) = s1 (→f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t) + (1 − s1) →c_{t-1}

←c_t(l) = s3 (←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t) + (1 − s3) ←c_{t-1}

→c_t(j) = s2 (→f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t) + (1 − s2) →c̃_t

←c_t(j) = s4 (←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t) + (1 − s4) ←c̃_t   (6)








where in equation (6), the arrows → and ← respectively represent the forward and backward processes, →c_t(m) and ←c_t(m) are the data update units of the global trend in the bidirectional calculation, →c_t(i) and ←c_t(i) are the data update units of the local trend, →c_t(k) and ←c_t(k) are the data update units of the medium-term trend, →c_t(l) and ←c_t(l) are the data update units of the medium and long-term trend, →c_t(j) and ←c_t(j) are the data update units of the medium and short-term trend, σ is the sigmoid activation function, →w_ih1, ←w_ih1 and →w_ih2, ←w_ih2 are weights of the input gates of the MCBLSTM model, →w_fh1, ←w_fh1 and →w_fh2, ←w_fh2 are weights of the forget gates of the MCBLSTM model, →W_ch1, ←W_ch1 and →W_ch2, ←W_ch2 are weights of the cell storage units of the MCBLSTM model, →b_i and ←b_i are biases of the input gates, →b_f and ←b_f are biases of the forget gates, →b_c and ←b_c are biases of the cell storage units, ⊙ is the element-wise (dot) product operation, and s1, s2, s3 and s4 each are a mix proportion factor obtained by learning; and


combining the weights of the alienation outputs of the daughter-cell units in the multicellular update units according to the update results of the five alienation units and the trend division results →D_2t and ←D_2t to obtain the outputs →c_t and ←c_t of the multicellular update units, and controlling the output gates →o_t and ←o_t of the MCBLSTM model to obtain the output h_t^2 of the MCBLSTM model at time t:






→c_t = →D_2t [→c_t(i), →c_t(j), →c_t(k), →c_t(l), →c_t(m)]^T

←c_t = ←D_2t [←c_t(i), ←c_t(j), ←c_t(k), ←c_t(l), ←c_t(m)]^T

→o_t = σ(→w_ox →h_t^1 + →w_oh →h_{t-1}^2 + →b_o)

←o_t = σ(←w_ox ←h_t^1 + ←w_oh ←h_{t-1}^2 + ←b_o)

→h_t^2 = →o_t ⊙ tanh(→c_t)

←h_t^2 = ←o_t ⊙ tanh(←c_t)

h_t^2 = →h_t^2 ⊕ ←h_t^2   (7)


where in equation (7), →w_ox, →w_oh and ←w_ox, ←w_oh are weights of the output gates of the MCBLSTM model, σ and tanh each are an activation function, and ⊕ denotes concatenation of the forward output →h_t^2 and the backward output ←h_t^2.


Further, the predicting a RUL in step 5) specifically includes:


inputting h_t^2 to the FC layer FC1, preventing overfitting by dropout to obtain an output h_t^3 of the FC layer FC1, and inputting h_t^3 to the regression layer to obtain a predicted RUL y_t:










h_t^3 = dropout(ReLU(w_h2h3 h_t^2 + b_h3))   (8)

y_t = Linear(w_h3y h_t^3 + b_y)   (9)







where in equations (8) and (9), w_h2h3 is the weight of the FC layer FC1, b_h3 is the bias of the FC layer FC1, w_h3y is the weight of the regression layer, and b_y is the bias of the regression layer.


By adopting the foregoing technical solutions, the present disclosure achieves the following advantages:


1. The present disclosure constructs a deep mining model (the ADLDNN model) according to the different sensitivities of different measured values to mechanical faults in different periods, automatically screens features through the ADLDNN model, and combines this screening with differential learning, thereby improving the accuracy and generalization of RUL prediction.


2. Input data are classified by the level division unit of the MBCNN model. The classified data are input to the MBCNN, in which each branch executes feature extraction appropriate to the level of its input data. The bidirectional trend-level division unit of the MCBLSTM model then classifies the output features of the MBCNN into various levels of degradation trends along the forward and backward directions, and the multicellular update units perform corresponding feature learning on the bidirectional trend levels of the input features to output health indexes. The present disclosure can thus better mine the different degradation trends of the health state of the aero-engine.


Other advantages, objectives and features of the present disclosure will be illustrated to some degree in the subsequent description, will to some degree be apparent to those skilled in the art upon studying the following description, or may be learned by practicing the present disclosure. The objectives and other advantages of the present disclosure can be implemented and obtained through the following description and claims.





BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings of the present disclosure are described as follows:



FIG. 1 is a flowchart of a method for predicting a RUL of an aero-engine according to the present disclosure;



FIG. 2 is a structural view of an ADLDNN model according to the present disclosure;



FIG. 3 is a schematic view of an SW for preprocessing data according to the present disclosure;



FIG. 4 illustrates a predicted result on a subset FD001 according to a prediction method of the present disclosure;



FIG. 5 illustrates a predicted result on a subset FD002 according to a prediction method of the present disclosure;



FIG. 6 illustrates a predicted result on a subset FD003 according to a prediction method of the present disclosure; and



FIG. 7 illustrates a predicted result on a subset FD004 according to a prediction method of the present disclosure.





DETAILED DESCRIPTION OF THE EMBODIMENTS

The present disclosure will be further described below in conjunction with the accompanying drawings and embodiments.


As shown in FIGS. 1-3, a method for predicting a RUL of an aero-engine based on an ADLDNN specifically includes the following steps:


1) Data acquisition: Multidimensional degradation parameters of an aero-engine to be predicted are acquired, a stable trend is analyzed, and a plurality of parameters capable of reflecting degradation performance of the aero-engine are selected to obtain acquired data, specifically:


1-1) Degradation data of the aero-engine are simulated by commercial modular aero-propulsion system simulation (C-MAPSS) to acquire the multidimensional degradation parameters of the aero-engine to be predicted, as shown in Table 1:









TABLE 1
Outputs of 21 sensors in operation of the engine

No.  Symbol     Description                                                  Unit
1    T2         Total temperature at fan inlet                               °R
2    T24        Total temperature at low pressure compressor (LPC) outlet    °R
3    T30        Total temperature at high pressure compressor (HPC) outlet   °R
4    T50        Total temperature at low pressure turbine (LPT) outlet       °R
5    P2         Pressure at fan inlet                                        psia
6    P15        Total pressure in bypass-duct                                psia
7    P30        Total pressure at HPC outlet                                 psia
8    Nf         Physical fan speed                                           rpm
9    Nc         Physical core speed                                          rpm
10   Epr        Engine pressure ratio
11   Ps30       Static pressure at HPC outlet                                psia
12   Phi        Ratio of fuel flow to Ps30                                   pps/psi
13   NRf        Corrected fan speed                                          rpm
14   NRc        Corrected core speed                                         rpm
15   BPR        Bypass ratio
16   farB       Burner fuel-air ratio
17   htBleed    Bleed enthalpy
18   NF_dmd     Demanded fan speed                                           rpm
19   PCNR_dmd   Demanded corrected fan speed                                 rpm
20   W31        High pressure turbine (HPT) coolant bleed                    lbm/s
21   W32        LPT coolant bleed                                            lbm/s










As shown in Table 2, the C-MAPSS dataset is divided into four sub-datasets according to different operating conditions and fault modes:









TABLE 2
C-MAPSS dataset

Subset                  FD001   FD002   FD003   FD004
Number of engines       100     260     100     249
Operating condition     1       6       1       6
Fault mode              1       1       2       2
Maximum running cycle   362     378     525     543
Minimum running cycle   128     128     145     128









Each sub-dataset contains training data, test data, and the actual RUL corresponding to the test data. The training data cover each engine's run from a certain health state all the way to the fault, while the test data end at some time before the fault occurs. Moreover, the training and test data each contain a certain number of engines with different initial health states.


Due to the different initial health states of the engines, the running cycles of different engines in the same database differ. Taking the FD001 dataset as an example, the training dataset includes 100 engines, with a maximum running cycle of 362 and a minimum running cycle of 128. In order to fully demonstrate the superiority of the method, the simplest subset (FD001, having a single operating condition and a single fault mode) and the most complex subset (FD004, having various operating conditions and various fault modes) are taken as experimental data.


1-2) Some stable-trend measurements (the data of sensors 1, 5, 6, 10, 16, 18 and 19) are excluded in advance. These sensors are unsuitable for RUL prediction because their full-life-cycle measurement curves are stable and nearly constant, and therefore contain little degradation information of the engine; meanwhile, the operating conditions have a significant impact on the prediction capability of the model. Therefore, the measurements of the remaining 14 sensors and the operating-condition information are combined into the original data to obtain the acquired data.
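By way of illustration only, near-constant channels of the kind excluded above can also be detected automatically with a simple variance threshold; the disclosed method instead screens sensors by analyzing their trend curves. The following minimal Python sketch assumes this variance proxy, and its function name and threshold are illustrative:

import numpy as np

def screen_sensors(data: np.ndarray, std_eps: float = 1e-6) -> np.ndarray:
    """Keep only channels whose full-life measurement curve actually varies.

    data: array of shape (cycles, channels). A near-zero standard deviation
    marks a stable, nearly constant curve, such as those of C-MAPSS sensors
    1, 5, 6, 10, 16, 18 and 19 noted above.
    """
    keep = data.std(axis=0) > std_eps
    return data[:, keep]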


2) Data preprocessing: The acquired data are segmented by an SW to obtain preprocessed data, specifically:


As shown in FIG. 3, assuming that the full life cycle of the engine is T, the sliding window size is l, and the sliding step size is m, the ith input sample has a size of l×n, n being the sum of the number of selected sensors and the number of dimensions of the operating-condition information.


When the ith sample is input, the actual RUL is T−l−(i−1)×m.


RUL labels are constructed by a piece-wise linear RUL technology, and are defined as follows:









Rul = { Rul,       if Rul ≤ Rul_max
      { Rul_max,   if Rul > Rul_max          (10)







In Equation (10), Rul_max is the maximum RUL, a preset threshold.


In the example of the present disclosure, the maximum RUL is set to 130 cycles for FD001 and 150 cycles for FD004, while the sliding window size l is 30 and the sliding step size m is 1. This yields 17,731 and 54,028 training samples for FD001 and FD004, respectively. FD001 and FD004 contain 100 and 248 test samples, respectively, because only the last measured window of each test engine is used to validate the prediction capability.
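To make the segmentation concrete, the following Python sketch (a minimal reading of the SW scheme and Equation (10); the function name is illustrative) builds the windows and the piece-wise RUL labels for one engine:

import numpy as np

def make_samples(run: np.ndarray, l: int = 30, m: int = 1, rul_max: int = 130):
    """Segment one engine's full-life series into SW samples and RUL labels.

    run: array of shape (T, n) covering the full life cycle of one engine.
    The ith window starts at (i - 1) * m, so its label is T - l - (i - 1) * m,
    clipped at rul_max per the piece-wise linear RUL of Equation (10).
    """
    T = run.shape[0]
    xs, ys = [], []
    for start in range(0, T - l + 1, m):
        xs.append(run[start:start + l])
        ys.append(min(T - l - start, rul_max))
    return np.stack(xs), np.asarray(ys, dtype=float)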


3) Model construction: A RUL prediction model of the aero-engine is constructed based on an ADLDNN, the RUL prediction model including an MBCNN model, an MCBLSTM model, an FC layer FC1, and a regression layer.


The MBCNN model includes a level division unit, and a spatial feature alienation-extraction unit.


The MCBLSTM model includes a bidirectional trend-level division unit, and multicellular update units.


4) Feature extraction: The preprocessed data are taken as input data of the MBCNN model, an output of the MBCNN model is extracted, the output of the MBCNN model and recursive data are taken as input data of the MCBLSTM model, and an output of the MCBLSTM model is extracted, specifically:


4-1) The step of extracting an output of the MBCNN model specifically includes:


4-1-1) Level division: The preprocessed data in Step 2) are taken as the input data, the input data x_t at time t are input to the level division unit of the MBCNN model for level division, the level division unit including an FC layer FC2 composed of five neurons, and softmax normalization is performed on the output D_t of the FC layer FC2 to obtain a level division result D1_t:






D_t = tanh(w_xd1 x_t + b_d1)   (11)

D1_t = softmax(D_t) = [d1_1t, d1_2t, d1_3t, d1_4t, d1_5t]   (12)


In Equations (11) and (12), w_xd1 and b_d1 respectively represent the weight and the bias of the FC layer FC2; d1_1t, d1_2t, d1_3t, d1_4t and d1_5t respectively represent an important level, a relatively important level, a general level, a relatively minor level and a minor level; and the position of the maximal element in D1_t gives the level division result of the present input.
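For illustration only, a minimal PyTorch sketch of this level division unit is given below; the class and variable names are assumptions of the sketch, not of the disclosure, and the l×n window is assumed to be flattened before entering FC2:

import torch
import torch.nn as nn

class LevelDivision(nn.Module):
    """FC layer FC2 (five neurons) with tanh, then softmax, per Equations (11)-(12)."""

    def __init__(self, in_features: int):
        super().__init__()
        self.fc2 = nn.Linear(in_features, 5)   # holds w_xd1 and b_d1

    def forward(self, x_t: torch.Tensor) -> torch.Tensor:
        d_t = torch.tanh(self.fc2(x_t))        # Equation (11)
        return torch.softmax(d_t, dim=-1)      # Equation (12): the five level scores D1_t

The position of the maximal score then routes the present input to the corresponding convolution path.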


4-1-2) Feature extraction: According to the level division result D1_t of the input data, the input data are input to different convolution paths of the spatial feature alienation-extraction unit for convolution, and automatic differential processing is performed on the input measured value according to the level division result and the five designed convolution paths to obtain a health feature h_t^1:






h_ti^1 = P15(C15(P14(C14(P13(C13(P12(C12(P11(C11(x_t))))))))))

h_tj^1 = P24(C24(P23(C23(P22(C22(P21(C21(x_t))))))))

h_tk^1 = P33(C33(P32(C32(P31(C31(x_t))))))

h_tl^1 = P42(C42(P41(C41(x_t))))

h_tm^1 = P51(C51(x_t))

h_t^1 = D1_t [h_ti^1, h_tj^1, h_tk^1, h_tl^1, h_tm^1]^T   (13)


In Equation (13), Cij and Pij respectively represent the jth convolution operation and the jth pooling operation of the ith convolution path, h_ti^1 is the convolution output of data of the important level, h_tj^1 of the relatively important level, h_tk^1 of the general level, h_tl^1 of the relatively minor level, and h_tm^1 of the minor level.
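A possible PyTorch sketch of the spatial feature alienation-extraction unit follows. It assumes each path stacks `depth` conv+pool pairs, with deeper paths for more important inputs, and aligns the branch outputs by global average pooling before mixing them with the level scores; the exact kernel and pooling sizes are given in the hyper-parameter selection below, and everything else here is an assumption of the sketch:

import torch
import torch.nn as nn

def conv_path(depth: int, channels: int) -> nn.Sequential:
    """One convolution path: a stack of conv + pool pairs."""
    layers = []
    for _ in range(depth):
        layers += [nn.Conv1d(channels, channels, kernel_size=2, padding=1),
                   nn.ReLU(),
                   nn.MaxPool1d(kernel_size=2, ceil_mode=True)]
    return nn.Sequential(*layers)

class MBCNN(nn.Module):
    """Five paths of depth 5..1, mixed by the level scores D1_t (Equation (13))."""

    def __init__(self, channels: int):
        super().__init__()
        self.paths = nn.ModuleList([conv_path(d, channels) for d in (5, 4, 3, 2, 1)])

    def forward(self, x_t: torch.Tensor, d1_t: torch.Tensor) -> torch.Tensor:
        # x_t: (batch, channels, window); d1_t: (batch, 5) level scores.
        outs = [p(x_t).mean(dim=-1) for p in self.paths]   # align lengths by averaging
        h = torch.stack(outs, dim=1)                       # (batch, 5, channels)
        return (d1_t.unsqueeze(-1) * h).sum(dim=1)         # weighted combination h_t^1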


4-2) The step of extracting an output of the MCBLSTM model specifically includes:


4-2-1) Trend division: The output h_t^1 of the MBCNN model at time t and the recursive data h_{t-1}^2 of the MCBLSTM model at time t−1 are taken as input data of the MCBLSTM model at time t, and are input to the bidirectional trend-level division unit for trend division, the bidirectional trend-level division unit including an FC layer FC3 and an FC layer FC4 for dividing the trend level of the input data along the forward and backward directions, the FC layer FC3 and the FC layer FC4 each including five neurons and respectively having outputs →D̃_2t and ←D̃_2t:











→D̃_2t = tanh(h_t^1 →w_xd2 + h_{t-1}^2 →w_hd2 + →b_d2)

←D̃_2t = tanh(h_t^1 ←w_xd2 + h_{t-1}^2 ←w_hd2 + ←b_d2)   (14)








In Equation (14), →w_xd2 and →w_hd2 each are a weight of the FC layer FC3, ←w_xd2 and ←w_hd2 each are a weight of the FC layer FC4, →b_d2 is the bias of the FC layer FC3, and ←b_d2 is the bias of the FC layer FC4.


A softmax operation is respectively performed on →D̃_2t and ←D̃_2t to obtain the forward and backward trend levels →D_2t and ←D_2t:






→D_2t = softmax(→D̃_2t) = [→d_21t, →d_22t, →d_23t, →d_24t, →d_25t]

←D_2t = softmax(←D̃_2t) = [←d_21t, ←d_22t, ←d_23t, ←d_24t, ←d_25t]   (15)


In Equation (15), →d_21t (←d_21t), →d_22t (←d_22t), →d_23t (←d_23t), →d_24t (←d_24t) and →d_25t (←d_25t) respectively represent a local trend, a medium and short-term trend, a medium-term trend, a medium and long-term trend and a global trend in the bidirectional calculation, and the maximal elements →d_2max,t and ←d_2max,t in →D_2t and ←D_2t determine the trend levels along the two directions at time t.
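By way of illustration, the bidirectional trend-level division can be sketched in PyTorch as follows; the names and the simple concatenated-input layout are assumptions of the sketch:

import torch
import torch.nn as nn

class TrendDivision(nn.Module):
    """FC3/FC4 (five neurons each) score the forward and backward trend levels."""

    def __init__(self, feat_dim: int, hidden: int):
        super().__init__()
        self.fc3 = nn.Linear(feat_dim + hidden, 5)   # forward weights and bias (→)
        self.fc4 = nn.Linear(feat_dim + hidden, 5)   # backward weights and bias (←)

    def forward(self, h1_t, h2_prev_fwd, h2_prev_bwd):
        # Equation (14) with the two inputs concatenated, then Equation (15).
        d_fwd = torch.tanh(self.fc3(torch.cat([h1_t, h2_prev_fwd], dim=-1)))
        d_bwd = torch.tanh(self.fc4(torch.cat([h1_t, h2_prev_bwd], dim=-1)))
        return torch.softmax(d_fwd, dim=-1), torch.softmax(d_bwd, dim=-1)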


4-2-2) Feature extraction: According to the trend division results →D_2t and ←D_2t, data of different trends are input to the multicellular update units →c_t and ←c_t, which perform differential learning along the two directions, for update, →c_t comprising five subunits →c_t(i), →c_t(j), →c_t(k), →c_t(l) and →c_t(m), and ←c_t comprising five subunits ←c_t(i), ←c_t(j), ←c_t(k), ←c_t(l) and ←c_t(m):











→i_t = σ(→w_ih1 →h_t^1 + →w_ih2 →h_{t-1}^2 + →b_i)

→f_t = σ(→w_fh1 →h_t^1 + →w_fh2 →h_{t-1}^2 + →b_f)

←i_t = σ(←w_ih1 ←h_t^1 + ←w_ih2 ←h_{t-1}^2 + ←b_i)

←f_t = σ(←w_fh1 ←h_t^1 + ←w_fh2 ←h_{t-1}^2 + ←b_f)

→c_t(m) = →c_{t-1}

←c_t(m) = ←c_{t-1}

→c_t(i) = →c̃_t = tanh(→W_ch1 →h_t^1 + →W_ch2 →h_{t-1}^2 + →b_c)

←c_t(i) = ←c̃_t = tanh(←W_ch1 ←h_t^1 + ←W_ch2 ←h_{t-1}^2 + ←b_c)

→c_t(k) = →f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t

←c_t(k) = ←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t

→c_t(l) = s1 (→f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t) + (1 − s1) →c_{t-1}

←c_t(l) = s3 (←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t) + (1 − s3) ←c_{t-1}

→c_t(j) = s2 (→f_t ⊙ →c_{t-1} + →i_t ⊙ →c̃_t) + (1 − s2) →c̃_t

←c_t(j) = s4 (←f_t ⊙ ←c_{t-1} + ←i_t ⊙ ←c̃_t) + (1 − s4) ←c̃_t   (16)








In Equation (16), the arrows → and ← respectively represent the forward and backward processes, →c_t(m) and ←c_t(m) are the data update units of the global trend in the bidirectional calculation, →c_t(i) and ←c_t(i) are the data update units of the local trend, →c_t(k) and ←c_t(k) are the data update units of the medium-term trend, →c_t(l) and ←c_t(l) are the data update units of the medium and long-term trend, →c_t(j) and ←c_t(j) are the data update units of the medium and short-term trend, σ is the sigmoid activation function, →w_ih1, ←w_ih1 and →w_ih2, ←w_ih2 are weights of the input gates of the MCBLSTM model, →w_fh1, ←w_fh1 and →w_fh2, ←w_fh2 are weights of the forget gates of the MCBLSTM model, →W_ch1, ←W_ch1 and →W_ch2, ←W_ch2 are weights of the cell storage units of the MCBLSTM model, →b_i and ←b_i are biases of the input gates, →b_f and ←b_f are biases of the forget gates, →b_c and ←b_c are biases of the cell storage units, ⊙ is the element-wise (dot) product operation, and s1, s2, s3 and s4 each are a mix proportion factor obtained by learning.


The weights of the alienation outputs of the daughter-cell units in the multicellular update units are combined according to the update results of the five alienation units and the trend division results →D_2t and ←D_2t to obtain the outputs →c_t and ←c_t of the multicellular update units, and the output gates →o_t and ←o_t of the MCBLSTM model are controlled to obtain the output h_t^2 of the MCBLSTM model at time t:






→c_t = →D_2t [→c_t(i), →c_t(j), →c_t(k), →c_t(l), →c_t(m)]^T

←c_t = ←D_2t [←c_t(i), ←c_t(j), ←c_t(k), ←c_t(l), ←c_t(m)]^T

→o_t = σ(→w_ox →h_t^1 + →w_oh →h_{t-1}^2 + →b_o)

←o_t = σ(←w_ox ←h_t^1 + ←w_oh ←h_{t-1}^2 + ←b_o)

→h_t^2 = →o_t ⊙ tanh(→c_t)

←h_t^2 = ←o_t ⊙ tanh(←c_t)

h_t^2 = →h_t^2 ⊕ ←h_t^2   (17)


In Equation (17), →w_ox, →w_oh and ←w_ox, ←w_oh are weights of the output gates of the MCBLSTM model, σ and tanh each are an activation function, and ⊕ denotes concatenation of the forward output →h_t^2 and the backward output ←h_t^2.


In the example of the present disclosure, in order to keep the global trend as long as possible, the cell units →c_t(m) and ←c_t(m) are updated from the state at the previous time. In order to replace the local trend in a timely manner, the units →c_t(i) and ←c_t(i) are updated from the internal state at the present time. According to the conventional cell update mechanism of the BLSTM, →c_t(k) and ←c_t(k) in the medium-term trend are updated from both the previous state (as in the global trend) and the present internal state (as in the local trend); the units →c_t(l) and ←c_t(l) in the medium and long-term trend mix the medium-term update with the previous state; and the units →c_t(j) and ←c_t(j) in the medium and short-term trend mix the medium-term update with the present internal state.
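To make these update rules concrete, one direction of the multicellular cell is sketched below in PyTorch. This is a simplified reading of Equations (16)-(17) in which the five candidates share the standard gates; the class and variable names are assumptions of the sketch, and the backward direction would mirror it on the time-reversed sequence, with h_t^2 formed from the two outputs per Equation (17):

import torch
import torch.nn as nn

class MulticellularCell(nn.Module):
    """One direction of the MCBLSTM cell: five trend candidates mixed by D2_t."""

    def __init__(self, feat_dim: int, hidden: int):
        super().__init__()
        self.gate_i = nn.Linear(feat_dim + hidden, hidden)  # input gate  (w_ih1, w_ih2, b_i)
        self.gate_f = nn.Linear(feat_dim + hidden, hidden)  # forget gate (w_fh1, w_fh2, b_f)
        self.cand = nn.Linear(feat_dim + hidden, hidden)    # candidate   (W_ch1, W_ch2, b_c)
        self.gate_o = nn.Linear(feat_dim + hidden, hidden)  # output gate (w_ox, w_oh, b_o)
        self.s = nn.Parameter(torch.full((2,), 0.5))        # learned mix factors (s1, s2)

    def forward(self, h1_t, h2_prev, c_prev, d2_t):
        z = torch.cat([h1_t, h2_prev], dim=-1)
        i_t = torch.sigmoid(self.gate_i(z))
        f_t = torch.sigmoid(self.gate_f(z))
        c_tilde = torch.tanh(self.cand(z))                  # present internal state
        c_k = f_t * c_prev + i_t * c_tilde                  # medium-term,  c_t(k)
        c_m = c_prev                                        # global trend, c_t(m)
        c_i = c_tilde                                       # local trend,  c_t(i)
        c_l = self.s[0] * c_k + (1 - self.s[0]) * c_prev    # medium/long,  c_t(l)
        c_j = self.s[1] * c_k + (1 - self.s[1]) * c_tilde   # medium/short, c_t(j)
        cells = torch.stack([c_i, c_j, c_k, c_l, c_m], dim=1)
        c_t = (d2_t.unsqueeze(-1) * cells).sum(dim=1)       # Equation (17) combination
        h_t = torch.sigmoid(self.gate_o(z)) * torch.tanh(c_t)
        return h_t, c_t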


5) RUL prediction: The output of the MCBLSTM model is taken as an input of the FC layer FC1 to obtain an output of the FC layer FC1, and the output of the FC layer FC1 is input to the regression layer to predict a RUL, specifically:


h_t^2 is input to the FC layer FC1, overfitting is prevented by dropout to obtain the output h_t^3 of the FC layer FC1, and h_t^3 is input to the regression layer to obtain the predicted RUL y_t:










h_t^3 = dropout(ReLU(w_h2h3 h_t^2 + b_h3))   (18)

y_t = Linear(w_h3y h_t^3 + b_y)   (19)







In Equations (18) and (19), w_h2h3 is the weight of the FC layer FC1, b_h3 is the bias of the FC layer FC1, w_h3y is the weight of the regression layer, and b_y is the bias of the regression layer.
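Assembled as a PyTorch head, Equations (18)-(19) amount to the following sketch; the 30 neurons and the dropout rate of 0.5 come from the hyper-parameter selection below, while the doubled input width assumes the concatenated bidirectional output h_t^2:

import torch.nn as nn

hidden = 30  # MCBLSTM width from the hyper-parameter selection below

rul_head = nn.Sequential(
    nn.Linear(2 * hidden, 30),  # FC1 (w_h2h3, b_h3) on the concatenated h_t^2
    nn.ReLU(),                  # Equation (18)
    nn.Dropout(p=0.5),          # dropout against overfitting
    nn.Linear(30, 1),           # regression layer (w_h3y, b_y), Equation (19)
)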


In the example of the present disclosure, there are N samples in training. A mean square error (MSE) is defined as a loss function and calculated by:










MSE_Loss = (1/2) Σ_{i=1}^{N} (¯Rul_i − Rul_i)^2   (20)







In Equation (20), ¯Rul_i and Rul_i are respectively the predicted RUL and the actual RUL of the ith sample. The error gradient of each layer is obtained by back propagation, and the weight parameters of the model are optimized by the Adam optimizer. Dropout is used to prevent overfitting in deep learning (DL).
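A minimal training loop consistent with this setup might look as follows; the model object and data loader are placeholders, and nn.MSELoss averages over the batch rather than using the half-sum of Equation (20), which changes only the scale of the gradients:

import torch
import torch.nn as nn

def train(model: nn.Module, loader, epochs: int = 50, lr: float = 1e-3):
    """Optimize the RUL model with the MSE loss and the Adam optimizer."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for x, rul in loader:            # x: SW samples, rul: piece-wise labels
            optimizer.zero_grad()
            pred = model(x).squeeze(-1)
            loss = loss_fn(pred, rul)    # error gradient via back propagation
            loss.backward()
            optimizer.step()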


Hyper-parameters of the ADLDNN are selected by a grid search method:


C11, C12, C13, C14, C15, C21, C22, C23, C24, C31, C32, C33, C41, C42, and C51 respectively have a kernel size of 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 7, 2, 2, and 9.


P11, P12, P13, P14, P15, P21, P22, P23, P24, P31, P32, P33, P41, P42, and P51 respectively have a maximum pooling size of 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, and 2.


The convolution stride is set to 1, the MCBLSTM has 30 neurons, the FC layer FC1 has 30 neurons, and the regression layer has one neuron. The dropout rate is set to 0.5, and the window size and the step size are respectively set to 30 and 1.


6) Experimental validation:


6-1) Evaluation indexes: A scoring function and the root-mean-square error (RMSE) are taken as evaluation indexes to quantitatively characterize the RUL prediction performance. They are respectively calculated by:










A_i = { exp(−(¯Rul_i − Rul_i)/13) − 1,   if ¯Rul_i < Rul_i
      { exp((¯Rul_i − Rul_i)/10) − 1,    if ¯Rul_i ≥ Rul_i          (21)

Score = Σ_{i=1}^{N} A_i   (22)

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (Rul_i − ¯Rul_i)^2 )   (23)







In Equations (21), (22), and (23), Rul_i and ¯Rul_i are respectively the actual RUL and the predicted RUL of the ith engine, and N is the total number of engines in a subset. The values of these indexes are inversely related to prediction performance: the smaller the values, the better the model. The score imposes a greater penalty on over-prediction than the RMSE does and is thus more suitable for engineering practice. Therefore, when the RMSEs of two models are close, the models are compared primarily by their scores.
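Both indexes follow directly from Equations (21)-(23); a short Python sketch (illustrative function name) is:

import numpy as np

def score_and_rmse(rul_pred: np.ndarray, rul_true: np.ndarray):
    """Scoring function and RMSE of Equations (21)-(23).

    Late predictions (rul_pred > rul_true) are penalized more heavily
    (denominator 10) than early ones (denominator 13).
    """
    d = rul_pred - rul_true
    a = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)  # (21)
    return a.sum(), float(np.sqrt(np.mean(d ** 2)))                       # (22), (23)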


6-2) RUL prediction and comparison: The proposed ADLDNN is trained first by FD001, FD002, FD003 and FD004 training sets, and tested by corresponding test sets. Predicted results on the four subsets are respectively as shown in FIGS. 4-7.


In FIGS. 4-7, the x axis is the index of the test engine and the y axis is the RUL value. The predicted RUL and the actual RUL are respectively drawn as a solid line and a dotted line, and the title of each figure gives the score and RMSE of the predicted result. It can be intuitively seen that the error between the predicted RUL and the actual RUL in FIG. 4 is smaller than in FIGS. 5-7, which means that the proposed ADLDNN performs best on FD001. In addition, the method performs better on FD003 than on FD002, and worst on FD004.


The engine has a relatively simple degradation trend under a single operating condition, and there is a large degree of overlap between the training set and the test set. Hence, the predicted results on FD001 and FD003 (single operating condition) are superior to those on FD002 and FD004 (various operating conditions). In addition, the predicted result on FD001 is more accurate than that on FD003, and the predicted result on FD002 is more accurate than that on FD004; therefore, the prediction accuracy in the single-fault mode is higher than that in the multi-fault mode. It can further be seen that the predicted result on FD003 is superior to that on FD002, which means that the number of fault modes has less impact on RUL prediction than the number of operating conditions.


In order to further show the superiority of the ADLDNN in RUL prediction, comparisons are made between the proposed method and various typical methods based on a statistical model, shallow learning models, classic DL models and several recently published DL models. The scores and RMSEs calculated from the predicted results of all these methods are shown in Table 3. As can be seen from the table, all methods show the best predictive effect on FD001 and the worst on FD004. This is because FD001 is the simplest subset, while FD004 has the most complex operating conditions and fault types and more test engines than the other subsets. All methods are more accurate on FD003 than on FD002, which further proves that the operating conditions and the number of engines have a greater impact on the accuracy of RUL prediction than the fault type.


As can be seen from Table 3, for the simplest subset FD001, the score and RMSE predicted by the method are smaller than those of all existing methods except the Acyclic Graph Network. For complex datasets such as FD002 and FD004, however, the method shows a stronger prediction capability than the other typical methods. In addition, since the score is more practical than the RMSE in actual engineering, the ADLDNN is considered superior to the Acyclic Graph Network on FD003. Compared with existing typical methods, the ADLDNN is better suited to processing complex datasets involving various operating conditions and fault types. In conclusion, the ADLDNN shows high overall performance and can be well applied to predicting machine RUL.









TABLE 3
Quantitative comparisons of different methods in prediction performance on the datasets

                    Statistical
                    method       Shallow learning method                      DL method
Data   Evaluation   Cox's
set    standard     regression   MLP      SVR     RVR     ELM        RF        CNN     LSTM
FD001  RMSE         45.10        37.56    20.96   23.86   17.27      17.91     18.45   16.14
       Score        28616        17972    1381.5  1502.9  523        479.75    1286.7  338
FD002  RMSE         N/A          80.03    41.99   31.29   37.28      29.59     30.29   24.49
       Score        N/A          7802800  58990   17423   498149     70456     17423   4450
FD003  RMSE         N/A          37.38    21.04   22.36   18.9       20.27     19.81   16.18
       Score        N/A          17409    1598.3  1431.6  121414     711.13    1431    852
FD004  RMSE         54.29        77.37    45.35   34.34   38.43      31.12     29.16   28.17
       Score        1164590      5616600  371140  26509   121414.47  46567.63  7886.4  5550

                                                                             Prediction method
                    DL method                                                of the present
Data   Evaluation                      LSTM + attention +   Acyclic   SUR-           disclosure
set    standard     DBN      MONBNE    handcrafted feature  Graph     LSTM   AEQRNN  ADLDNN
                                                            Network
FD001  RMSE         15.21    15.04     14.53                11.96     14.46  N/A     13.19
       Score        417.59   334.23    322.44               229       200    N/A     275
FD002  RMSE         27.12    25.05     N/A                  20.34     21.1   19.10   17.33
       Score        9031.64  5590      N/A                  2730      1383   3220    1149
FD003  RMSE         14.71    12.51     N/A                  12.46     17.16  N/A     13.81
       Score        442.43   422       N/A                  535       370    N/A     334
FD004  RMSE         29.88    28.66     27.08                22.43     22.61  20.6    19.89
       Score        7954.51  6557.62   5649.14              3370      2602   4597    2505










Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may use a form of hardware only embodiments, software only embodiments, or embodiments with a combination of software and hardware. Moreover, the present disclosure may be in a form of a computer program product that is implemented on one or more computer-usable storage media (including but not limited to a magnetic disk memory, a CD-ROM, an optical memory, and the like) that include computer-usable program code.


The present disclosure is described with reference to the flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present disclosure. It should be understood that computer program instructions may be used to implement each process and/or each block in the flowcharts and/or the block diagrams and a combination of a process and/or a block in the flowcharts and/or the block diagrams. These computer program instructions may be provided for a general-purpose computer, a dedicated computer, an embedded processor, or a processor of another programmable data processing device to generate a machine, such that the instructions executed by a computer or a processor of another programmable data processing device generate an apparatus for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.


These computer program instructions may be stored in a computer-readable memory that can instruct the computer or any other programmable data processing device to work in a specific manner, such that the instructions stored in the computer-readable memory generate an artifact that includes an instruction apparatus. The instruction apparatus implements a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.


These computer program instructions may be loaded onto a computer or another programmable data processing device, such that a series of operations and steps are performed on the computer or the another programmable device, thereby generating computer-implemented processing. Therefore, the instructions executed on the computer or the another programmable device provide steps for implementing a specific function in one or more processes in the flowcharts and/or in one or more blocks in the block diagrams.


Finally, it should be noted that the above embodiments are merely intended to describe the technical solutions of the present disclosure, rather than to limit them. Although the present disclosure is described in detail with reference to the above embodiments, it is to be appreciated by those of ordinary skill in the art that modifications or equivalent substitutions may still be made to the specific implementations of the present disclosure, and any modifications or equivalent substitutions made without departing from the spirit and scope of the present disclosure shall fall within the protection scope of the claims of the present disclosure.

Claims
  • 1. A method for predicting a remaining useful life (RUL) of an aero-engine based on an automatic differential learning deep neural network (ADLDNN), specifically comprising the following steps:
  • 1) data acquisition: acquiring multidimensional degradation parameters of an aero-engine to be predicted, analyzing a stable trend, and selecting a plurality of parameters capable of reflecting degradation performance of the aero-engine to obtain acquired data;
  • 2) data preprocessing: segmenting the acquired data by a sliding window (SW) to obtain preprocessed data;
  • 3) model construction: constructing a RUL prediction model of the aero-engine based on an ADLDNN, the RUL prediction model comprising a multibranch convolutional neural network (MBCNN) model, a multicellular bidirectional long short-term memory (MCBLSTM) model, a fully connected (FC) layer FC1, and a regression layer;
  • 4) feature extraction: taking the preprocessed data as input data of the MBCNN model, extracting an output of the MBCNN model, taking the output of the MBCNN model and recursive data as input data of the MCBLSTM model, and extracting an output of the MCBLSTM model; and
  • 5) RUL prediction: taking the output of the MCBLSTM model as an input of the FC layer FC1 to obtain an output of the FC layer FC1, and inputting the output of the FC layer FC1 to the regression layer to predict a RUL.
  • 2. The method for predicting a RUL of an aero-engine based on an ADLDNN according to claim 1, wherein the MBCNN model comprises a level division unit, and a spatial feature alienation-extraction unit; and the MCBLSTM model comprises a bidirectional trend-level division unit, and multicellular update units.
  • 3. The method for predicting a RUL of an aero-engine based on an ADLDNN according to claim 2, wherein the extracting an output of the MBCNN model in step 4) specifically comprises: 4-1-1) level division: taking the preprocessed data in step 2) as the input data, inputting input data x_t at time t to the level division unit of the MBCNN model for level division, the level division unit comprising an FC layer FC2 composed of five neurons, and performing softmax normalization on an output D_t of the FC layer FC2 to obtain a level division result D1_t: D_t = tanh(w_xd1 x_t + b_d1) (1); D1_t = softmax(D_t) = [d1_1t, d1_2t, d1_3t, d1_4t, d1_5t] (2), wherein in equations (1) and (2), w_xd1 and b_d1 respectively represent the weight and the bias of the FC layer FC2, d1_1t, d1_2t, d1_3t, d1_4t and d1_5t respectively represent an important level, a relatively important level, a general level, a relatively minor level and a minor level, and the position of the maximal element in D1_t represents the level division result of the present input; and 4-1-2) feature extraction: inputting, according to the level division result D1_t of the input data, the input data to different convolution paths of the spatial feature alienation-extraction unit for convolution, and performing automatic differential processing on the input measured value according to the level division result and five designed convolution paths to obtain a health feature h_t^1: h_ti^1 = P15(C15(P14(C14(P13(C13(P12(C12(P11(C11(x_t)))))))))); h_tj^1 = P24(C24(P23(C23(P22(C22(P21(C21(x_t)))))))); h_tk^1 = P33(C33(P32(C32(P31(C31(x_t)))))); h_tl^1 = P42(C42(P41(C41(x_t)))); h_tm^1 = P51(C51(x_t)); h_t^1 = D1_t [h_ti^1, h_tj^1, h_tk^1, h_tl^1, h_tm^1]^T (3), wherein in equation (3), Cij and Pij respectively represent the jth convolution operation and the jth pooling operation of the ith convolution path, h_ti^1 is the convolution output of data of the important level, h_tj^1 of the relatively important level, h_tk^1 of the general level, h_tl^1 of the relatively minor level, and h_tm^1 of the minor level.
  • 4. The method for predicting a RUL of an aero-engine based on an ADLDNN according to claim 2, wherein the extracting an output of the MCBLSTM model in step 4) specifically comprises: 4-2-1) trend division: taking an output h_t^1 of the MBCNN model at time t and recursive data h_{t-1}^2 of the MCBLSTM model at time t−1 as input data of the MCBLSTM model at time t, and inputting the input data to the bidirectional trend-level division unit for trend division, the bidirectional trend-level division unit comprising an FC layer FC3 and an FC layer FC4 for dividing a trend level of the input data along forward and backward directions, the FC layer FC3 and the FC layer FC4 each comprising five neurons, and the FC layer FC3 and the FC layer FC4 respectively having an output →D̃_2t and an output ←D̃_2t:
  • 5. The method for predicting a RUL of an aero-engine based on an ADLDNN according to claim 1, wherein the predicting a RUL in step 5) specifically comprises: inputting h_t^2 to the FC layer FC1, preventing overfitting by dropout to obtain an output h_t^3 of the FC layer FC1, and inputting h_t^3 to the regression layer to obtain a predicted RUL y_t.
Priority Claims (1)
Number Date Country Kind
202111261992.X Oct 2021 CN national