SYSTEM AND METHOD FOR DYNAMIC-MODULAR-NEURAL-NETWORK-BASED MUNICIPAL SOLID WASTE INCINERATION NITROGEN OXIDES EMISSION PREDICTION

Information

  • Patent Application
  • Publication Number
    20240078410
  • Date Filed
    August 16, 2023
  • Date Published
    March 07, 2024
Abstract
A dynamic modular neural network (DMNN) for NOx emission prediction in the municipal solid waste incineration (MSWI) process is provided. First, the input variables are smoothed and normalized. Then, a feature extraction method based on principal component analysis (PCA) is designed to realize the dynamic division of complex conditions, and the prediction task is decomposed into sub-tasks under different conditions. In addition, for each sub-task, a long short-term memory (LSTM)-based sub-network is constructed to achieve accurate prediction of NOx emissions under various working conditions. Then, a cooperative strategy is used to integrate the outputs of the sub-networks, further improving the accuracy of the prediction model. Finally, the merits of the proposed DMNN are confirmed on a benchmark and real industrial data of an MSWI process. The problem that the NOx emission of the MSWI process is difficult to predict accurately due to sensor limitations is effectively solved.
Description
FIELD

The invention relates in general to municipal solid waste incineration and intelligent modeling field, and in particular, to a system and method for dynamic-modular-neural-network-based municipal solid waste incineration nitrogen oxides emission prediction.


BACKGROUND

MSWI is an effective measure for killing pathogens, reducing waste quantity and recycling resources. Thus, it is a universally accepted strategy for MSW disposal. However, the potential secondary pollution in the MSWI process, such as nitrogen oxides (NOx), is the major reason for the not-in-my-back-yard effect. NOx is formed at high temperature during combustion, causing damage to human health and the environment. Moreover, the existing technology can only measure the NOx emission at the current moment and cannot provide operators with a reference value of the NOx emission at a future moment, which brings about problems such as lagging control actions and excessive NOx emissions. Therefore, accurate prediction of NOx emission is of great significance to improve the efficiency of the denitration system and ensure the safe and stable operation of an MSWI plant.


SUMMARY

The invention provides a prediction method for the MSWI process based on a dynamic modular neural network (DMNN). A prediction model based on the DMNN is established to achieve accurate prediction of future NOx emission. The DMNN tracks the dynamic characteristics of the MSWI process, so accurate prediction of NOx emissions can be achieved.


In one embodiment, a system and method for dynamic-modular-neural-network (DMNN)-based municipal solid waste incineration (MSWI) process nitrogen oxides (NOx) emission prediction is provided. Sensor data associated with an MSWI process is obtained, the sensor data including a data set comprising a plurality of samples. The sensor data is preprocessed to remove those of the samples that comprise noise and to standardize the data set. A task of prediction of a NOx emission associated with an MSWI process is decomposed into a plurality of sub-tasks using principal component analysis, including applying a sliding window of a fixed size to the preprocessed sensor data set and identifying key variables of operating conditions of the MSWI process, each of the key variables associated with one of the sub-tasks. A long short-term memory (LSTM) neural network is constructed, the LSTM neural network including a plurality of sub-networks, wherein each of the sub-networks outputs a value for one of the sub-tasks and a key variable associated with that sub-task serves as an input for that sub-network. A further set of sensor data associated with a further MSWI process is obtained, the further sensor data including further data samples. At least one of the further samples is compared to at least some of the samples in the preprocessed sensor data set. At least some of the sub-networks are activated based on the comparison. The activated sub-networks in the LSTM network are used to predict the NOx emission for the further MSWI process, wherein the steps are performed by at least one suitably-programmed computer and wherein a plant associated with the further MSWI process is operated based on the NOx prediction for the further MSWI process.


The technical scheme and steps of the invention are as follows:


As shown in FIG. 1, the related equipment for NOx emission prediction includes a thermocouple temperature sensor, an air volume sensor, a liquid flow sensor, a continuous emission monitoring system, a distributed control system and an upper computer. The continuous emission monitoring system includes a nitrogen oxide concentration detector. The detection instruments, such as the nitrogen oxide concentration detector, thermocouple temperature sensor, air volume sensor and liquid flow sensor, are connected to the distributed control system through the fieldbus, and the data collected by the sensors is transmitted to the I/O communication template. Through switch gating, an amplifier and an A/D converter, the analog voltage signal is converted into a digital signal that the computer can recognize and is communicated to the upper computer by industrial Ethernet.


The upper computer obtains the data of the MSWI process in real time and stores the collected data in a structured query language (SQL) server database.


To obtain experimental data, the hardware storage device is used to read the historical data, which includes a total of 10 process variables and a NOx value to be predicted: the air flow of the combustion grate (left side 1-1), air flow of the combustion grate (right side 1-1), air flow of the dry grate (left side 1-1), temperature of the primary combustion chamber, left-side temperature of the primary combustion chamber, right-side temperature of the primary combustion chamber, cumulative primary air flow, cumulative secondary air flow, accumulated urea solution flow, accumulated urea solution supply flow, and the NOx emission value. Among these variables, the air flows are detected by the air volume sensor, the temperatures are detected by the thermocouple temperature sensor, and the urea solution flows are detected by the liquid flowmeter.


Since the sensors work in an environment with high temperature and ash content, the original data is often accompanied by noise. To eliminate the influence of noise on the prediction model, the Rajda criterion is adopted to smooth and de-noise the original data. In addition, the Z-score algorithm is used for normalization to eliminate the influence of different dimensions. After data processing, the processed variables and the NOx value are taken as the input and output of the DMNN model, respectively. After off-line training of the model, the real-time data in the server is read online and used as inputs of the DMNN model to predict the NOx value 10 s ahead. The predicted value of NOx can be used as a reference in the denitration control system. If the predicted value is higher than the current value, the operator will increase the urea input to reduce the NOx emission and meet the environmental protection index. On the contrary, if the predicted value of NOx is lower than the current value, it is necessary to reduce the urea supply to meet the economic indicators.
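The preprocessing described above (Rajda-criterion outlier removal followed by Z-score normalization) can be sketched as follows. This is a minimal Python/NumPy illustration rather than the MATLAB implementation used in the experiments; the function name `preprocess` is illustrative.

```python
import numpy as np

def preprocess(X):
    """Remove 3-sigma outlier samples (Rajda criterion, Eq. 37) and
    Z-score normalize each variable (Eq. 38).
    X has shape (n_samples, n_variables)."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    # Keep a sample only if every variable lies within mu +/- 3*sigma
    keep = (np.abs(X - mu) < 3 * sigma).all(axis=1)
    X_smo = X[keep]
    # Z-score normalization per variable
    return (X_smo - X_smo.mean(axis=0)) / X_smo.std(axis=0)
```

After this step, each retained variable has zero mean and unit standard deviation, so variables with different physical units contribute comparably to the model.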


Dynamic Task Decomposition Based on PCA

The original data is read from the distributed control system with a sampling interval of 10 s. A sliding window is used to detect the principal components in the time series. The size of the sliding window is denoted by win_1. Assume that the observation sample matrix in the first sliding window is represented by X_{m×n_1}^{win_1}:

X_{m \times n_1}^{win\_1} = [x_1, x_2, \ldots, x_m]^T = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n_1} \\ x_{21} & x_{22} & \cdots & x_{2n_1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn_1} \end{bmatrix}   (1)









    • where m and n1 are the number of variables and samples in the sliding window win_1, respectively. x1, x2, . . . , xm represent the m variables of the matrix, which are the inputs of the prediction model. For the debutanizer column dataset, x1, x2, . . . , xm denote a total of 13 variables: top temperature, top pressure, flow of reflux, flow to the next process, temperature of the sixth tray at time t, temperature of the sixth tray at t-1, temperature of the sixth tray at t-2, temperature of the sixth tray at t-3, average value of the temperature at the bottom at t, and the butane concentration at t-1, t-2, t-3 and t-4. The size of m is 13 in this case. For the MSWI process, x1, x2, . . . , xm represent a total of 10 variables: air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply. The size of m is 10 in the real industrial data.





The mean vector μ of the sample matrix X_{m×n_1}^{win_1} is denoted as

\mu = [\bar{\mu}_1, \bar{\mu}_2, \ldots, \bar{\mu}_m]^T   (2)

\bar{\mu}_i = \frac{1}{n_1} \sum_{j=1}^{n_1} x_{ij}   (3)










    • where μ̄1, μ̄2, . . . , μ̄m represent the mean value of each row in X_{m×n_1}^{win_1}, that is, the mean value of each variable, which is obtained by Eq. (3). μ̄i denotes the i-th value of the vector μ, i=1, 2, . . . , m, and m is the number of variables. xij denotes the value of the i-th variable in the j-th sample, j=1, 2, . . . , n1, where n1 represents the number of samples in the sliding window with the size of win_1.





All the samples of matrix X_{m×n_1}^{win_1} minus the mean (decentralization) are denoted as

\tilde{X}_{m \times n_1}^{win\_1} = \begin{bmatrix} x_{11}-\bar{\mu}_1 & x_{12}-\bar{\mu}_1 & \cdots & x_{1n_1}-\bar{\mu}_1 \\ x_{21}-\bar{\mu}_2 & x_{22}-\bar{\mu}_2 & \cdots & x_{2n_1}-\bar{\mu}_2 \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1}-\bar{\mu}_m & x_{m2}-\bar{\mu}_m & \cdots & x_{mn_1}-\bar{\mu}_m \end{bmatrix} = \begin{bmatrix} \tilde{x}_{11} & \tilde{x}_{12} & \cdots & \tilde{x}_{1n_1} \\ \tilde{x}_{21} & \tilde{x}_{22} & \cdots & \tilde{x}_{2n_1} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{x}_{m1} & \tilde{x}_{m2} & \cdots & \tilde{x}_{mn_1} \end{bmatrix} = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_m]^T   (4)









    • where X̃_{m×n_1}^{win_1} represents the matrix after decentralization, x̃ij denotes the value of the i-th feature after decentralization in the j-th sample, m represents the number of variables, and n1 is the number of samples contained in the sliding window with the size of win_1.





The covariance matrix H_{m×m}^{win_1} of X̃_{m×n_1}^{win_1} is calculated as:

H_{m \times m}^{win\_1} = \frac{1}{n_1 - 1} \tilde{X}_{m \times n_1}^{win\_1} \cdot (\tilde{X}_{m \times n_1}^{win\_1})^T   (5)









    • where (X̃_{m×n_1}^{win_1})^T is the transpose of X̃_{m×n_1}^{win_1}.





Then, the eigenvalues λ of the covariance matrix H_{m×m}^{win_1} can be calculated as

|H_{m \times m}^{win\_1} - \lambda I| = 0   (6)

I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}   (7)









    • where I denotes the identity matrix. Based on Eq. (6), the eigenvalues of H_{m×m}^{win_1} can be represented as








λ1≥λ2≥ . . . ≥λQ   (8)

    • where Q is the number of eigenvalues. According to Eq. (8), the eigenvector α corresponding to each eigenvalue is calculated as





(H_{m×m}^{win_1} − λ_k I)α_k = 0   (9)

    • where H_{m×m}^{win_1} is the covariance matrix, λk denotes the k-th eigenvalue, I is the identity matrix represented by Eq. (7), and αk is the eigenvector corresponding to the k-th eigenvalue, αk=[α1k, α2k, . . . , αmk]^T, (k=1, 2, . . . , Q).


The threshold of the cumulative variance contribution rate is set as θ, and if the cumulative variance satisfies

\frac{\sum_{k=1}^{Q_0} \lambda_k}{\sum_{k=1}^{Q} \lambda_k} > \theta   (10)







Then the first Q0 principal components are selected for further analysis, where Q0 is the number of principal components (equal to the number of retained eigenvalues) determined by Eq. (10). λk denotes the k-th eigenvalue.


Generally, in most studies, the threshold of cumulative variance contribution rate is selected above 0.8, that is, θ≥0.8. Therefore, the threshold θ is determined as 0.85.
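The eigendecomposition of Eqs. (5)-(8) and the selection of the first Q0 principal components by Eq. (10) can be sketched as follows. A minimal Python/NumPy illustration; it assumes the cumulative contribution rate is normalized by the sum of all eigenvalues, and the function name `num_principal_components` is illustrative.

```python
import numpy as np

def num_principal_components(X_win, theta=0.85):
    """Given one window X_win of shape (m, n1) (variables x samples),
    return Q0, the smallest number of principal components whose
    cumulative variance contribution rate exceeds theta."""
    Xc = X_win - X_win.mean(axis=1, keepdims=True)   # decentralization, Eq. (4)
    H = Xc @ Xc.T / (X_win.shape[1] - 1)             # covariance matrix, Eq. (5)
    lam = np.sort(np.linalg.eigvalsh(H))[::-1]       # eigenvalues, descending, Eq. (8)
    ratio = np.cumsum(lam) / lam.sum()               # cumulative contribution rate
    return int(np.searchsorted(ratio, theta) + 1)    # first index exceeding theta, Eq. (10)
```

Note that `numpy.linalg.eigvalsh` returns eigenvalues in ascending order, hence the explicit reversal to match the descending order of Eq. (8).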


Then, the unit eigenvector α corresponding to Q0 eigenvalues is used as a coefficient for linear transformation to obtain Q0 principal components.





z_k = α_k^T x   (11)

    • where αk=[α1k, α2k, . . . , αmk]T (k=1, 2, . . . , Q0).


Combining with the samples in X_{m×n_1}^{win_1}, the principal components of the n1 samples can be obtained by Eq. (11). The k-th principal component z_{kj} of the j-th sample x_j=[x_{1j}, x_{2j}, . . . , x_{mj}]^T (j=1, 2, . . . , n_1) is calculated as

z_{kj} = [\alpha_{1k}, \alpha_{2k}, \ldots, \alpha_{mk}][x_{1j}, x_{2j}, \ldots, x_{mj}]^T = \sum_{i=1}^{m} \alpha_{ik} x_{ij}   (12)









    • where α1k, α2k, . . . , αmk denote the m elements of the k-th unit eigenvector, and x1j, x2j, . . . , xmj represent the m variables of the j-th sample, respectively. j=1, 2, . . . , n1, i=1, 2, . . . , m, and k=1, 2, . . . , Q0.





According to Eq. (12), the k-th principal component of the n1 samples can be denoted by z_k=[z_{k1}, z_{k2}, . . . , z_{kn_1}]. Therefore, a factor load is defined as the correlation between the k-th principal component z_k and the i-th feature x_i, which is calculated as

\rho(z_k, x_i) = \frac{\sqrt{\lambda_k}\,\alpha_{ik}}{\sqrt{\sigma_{ii}}}   (13)









    • where αik denotes the i-th element of the unit eigenvector αk. σii is the variance of the i-th variable xi, which is also the i-th diagonal entry of the covariance matrix H_{m×m}^{win_1}, k=1, 2, . . . , Q0, i=1, 2, . . . , m. The factor load matrix is expressed as












\rho = \begin{bmatrix} \rho(z_1, x_1) & \rho(z_1, x_2) & \cdots & \rho(z_1, x_m) \\ \rho(z_2, x_1) & \rho(z_2, x_2) & \cdots & \rho(z_2, x_m) \\ \vdots & \vdots & \ddots & \vdots \\ \rho(z_{Q_0}, x_1) & \rho(z_{Q_0}, x_2) & \cdots & \rho(z_{Q_0}, x_m) \end{bmatrix}   (14)







Then, the contribution rate υ_i of the Q0 principal components to the i-th variable x_i (i=1, 2, . . . , m) is

\upsilon_i = \sum_{k=1}^{Q_0} \rho^2(z_k, x_i)   (15)









    • where the contribution rate υi is the sum of squares of factor loads between the Q0 principal components and i-th variable xi. Then, the contribution rate matrix υ of Q0 principal components corresponding to each variable can be expressed as








υ=[υ1, υ2, . . . , υm]  (16)

    • where m represents the number of variables contained in Xm×n1win_1. The importance of variables changes with the fluctuation of complex operating conditions in MSWI furnace, that is, the contribution rate υi of principal components corresponding to each variable will also change. Therefore, the contribution rate υ is reordered in a descending order.





sort(υ)=[υmax, . . . , υmin]  (17)

    • where the function of sort(·) is to sort data in a descending order. υmax and υmin represent the maximum and minimum value of contribution rate, respectively. The key variables are determined by defining a threshold value ψ.










\left\{ \sum_{i=1}^{F} \upsilon_i \right\} > \psi   (18)









    • where the value of ψ is equal to the cumulative variance contribution rate, that is, ψ=0.85. F denotes the number of key variables, which can be determined by ψ. Equation (18) indicates that the first F variables have the greatest correlation with the principal components in the current window. Then, the first F key features are selected as reference vectors for condition identification, as shown in Eq. (19).








con_1=[xnum_1win_1, xnum_2win_1, . . . , xnum_Fwin_1]  (19)

    • where xnum_1win_1, xnum_2win_1, . . . , xnum_Fwin_1 represent the first F variables in Xm×n1win_1. Thereafter, the window moves forward by a certain step, and the key variables are detected successively. Finally, the key variables in each sub-task are stored in the knowledge base for modeling analysis, which is expressed as





condition_library=[con_1,con_2, . . . , con_W]  (20)

    • where con_1,con_2, . . . , con_W represent reference vectors corresponding to different operating conditions, respectively. W denotes the number of operating conditions.


In this invention, the size of the sliding window and the moving step are selected according to the specific data sets. The simulation phase includes a debutanizer column process and real industrial data of the MSWI process. For the debutanizer column process, the sliding window size is 600; considering that the dataset is accompanied by slow fluctuations, the moving step of the sliding window is set to 300. For the MSWI process, the size of the sliding window is 600; considering the complex variation and large fluctuation of the process, the moving step of the sliding window is set to 100.
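The per-window key-variable selection of Eqs. (13)-(19) can be sketched as follows. A Python/NumPy sketch under stated assumptions: factor loads and contribution rates follow Eqs. (13) and (15), and the cumulative sum of Eq. (18) is assumed to be normalized by the total contribution so that ψ=0.85 acts as a fraction; the helper name `key_variables` is illustrative.

```python
import numpy as np

def key_variables(X_win, theta=0.85, psi=0.85):
    """For one window X_win (m variables x n1 samples), rank variables
    by their contribution rate from the first Q0 principal components
    and return the indices of the first F key variables."""
    m, n1 = X_win.shape
    Xc = X_win - X_win.mean(axis=1, keepdims=True)        # Eq. (4)
    H = Xc @ Xc.T / (n1 - 1)                              # covariance, Eq. (5)
    lam, A = np.linalg.eigh(H)                            # eigenpairs, Eqs. (6)-(9)
    order = np.argsort(lam)[::-1]                         # descending, Eq. (8)
    lam, A = lam[order], A[:, order]
    Q0 = int(np.searchsorted(np.cumsum(lam) / lam.sum(), theta) + 1)   # Eq. (10)
    sigma = np.sqrt(np.diag(H))                           # std dev of each variable
    rho = np.sqrt(lam[:Q0])[:, None] * A[:, :Q0].T / sigma[None, :]    # Eq. (13)
    v = (rho ** 2).sum(axis=0)                            # contribution rate, Eq. (15)
    ranked = np.argsort(v)[::-1]                          # descending sort, Eq. (17)
    # keep the first F variables whose (normalized) summed contribution exceeds psi
    F = int(np.searchsorted(np.cumsum(v[ranked]) / v.sum(), psi) + 1)  # Eq. (18)
    return ranked[:F]
```

Sliding this function over the training set with the window sizes and steps given above yields one reference vector per window, which together form the condition library of Eq. (20).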


Construction of the LSTM-Based Sub-Network

The performance of the sub-networks is critical for the whole MNN. For each sub-task, an LSTM neural network driven by the corresponding key variables is explored. The LSTM cell comprises forget, input, cell state and output gates, which can effectively overcome the vanishing gradient problem existing in general networks through the gate operations.


The internal structure of the LSTM cell is shown in FIG. 3. Different gates are marked with different colors. Each gate is calculated as follows.


Forget gate:

f_t = σ(W_f·[h_{t-1}, x_t] + b_f)   (21)

Input gate:

i_t = σ(W_i·[h_{t-1}, x_t] + b_i)   (22)

Cell state gate:

C̃_t = tanh(W_c·[h_{t-1}, x_t] + b_c)   (23)

C_t = f_t ⊗ C_{t-1} + i_t ⊗ C̃_t   (24)

Output gate:

o_t = σ(W_o·[h_{t-1}, x_t] + b_o)   (25)

Using Eqs. (21)-(25), the final output of the LSTM is

ŷ_{NOx}^t = o_t ⊗ tanh(C_t)   (26)

    • where x_t denotes the input of the LSTM neural network at time t, i.e., the air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply at time t. h_{t-1} is the output of the LSTM neural network at time t-1. W_f, W_i, W_c and W_o denote the weight matrices of the forget, input, cell state and output gates, respectively. b_f, b_i, b_c and b_o are the biases of the forget, input, cell state and output gates, respectively. f_t, i_t, C_t and o_t represent the outputs of the forget, input, cell state and output gates, respectively. ŷ_{NOx}^t is the output of the LSTM neural network at time t. σ(·) and tanh(·) are the activation functions, which are calculated as










\sigma(U) = \frac{1}{1 + e^{-U}}   (27)

\tanh(U) = \frac{e^U - e^{-U}}{e^U + e^{-U}}   (28)









    • where U denotes the input of the activation function in each gate:





Forget gate:

U_f = W_f·[h_{t-1}, x_t] + b_f   (29)

Input gate:

U_i = W_i·[h_{t-1}, x_t] + b_i   (30)

Cell state gate:

U_c = W_c·[h_{t-1}, x_t] + b_c   (31)

Output gate:

U_o = W_o·[h_{t-1}, x_t] + b_o   (32)
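One forward step of the LSTM cell of Eqs. (21)-(26) can be sketched as follows. A minimal Python/NumPy illustration with externally supplied parameters, not the trained sub-network of the invention; the dictionary layout of `W` and `b` is an implementation choice of this sketch.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))             # Eq. (27)

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One forward step of the LSTM cell. W and b hold the weights and
    biases of the forget, input, cell-state and output gates under the
    keys 'f', 'i', 'c', 'o'."""
    z = np.concatenate([h_prev, x_t])           # concatenation [h_{t-1}, x_t]
    f = sigmoid(W['f'] @ z + b['f'])            # forget gate, Eq. (21)
    i = sigmoid(W['i'] @ z + b['i'])            # input gate, Eq. (22)
    C_tilde = np.tanh(W['c'] @ z + b['c'])      # candidate state, Eq. (23)
    C = f * C_prev + i * C_tilde                # cell state update, Eq. (24)
    o = sigmoid(W['o'] @ z + b['o'])            # output gate, Eq. (25)
    h = o * np.tanh(C)                          # cell output, Eq. (26)
    return h, C
```

Iterating `lstm_step` over a sequence of inputs and reading out the final `h` corresponds to one sub-network's prediction ŷ_{NOx}^t.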


Cooperation Decision Strategy

During the testing stage, the similarity between the g-th testing sample and the training samples is measured by the Euclidean distance.






d_{g,j}^{test} = dist(x_g^{test}, x_j^{train}), (j=1, 2, . . . , N)   (33)

dist(x_g^{test}, x_j^{train}) = \sqrt{\|x_g^{test\_1} - x_j^{train\_1}\|^2 + \ldots + \|x_g^{test\_m} - x_j^{train\_m}\|^2}   (34)

d_g^{test} = [d_{g,1}^{test}, d_{g,2}^{test}, . . . , d_{g,N}^{test}]   (35)

    • where x_g^{test} is the g-th sample of the testing set. x_g^{test_1} and x_g^{test_m} denote the first and m-th variables of the g-th testing sample, respectively. Similarly, x_j^{train_1} and x_j^{train_m} denote the first and m-th variables of the j-th training sample, respectively. d_{g,1}^{test}, d_{g,2}^{test}, . . . , d_{g,N}^{test} represent the Euclidean distances between the g-th sample of the testing set and the samples of the training set. g=1, 2, . . . , G, j=1, 2, . . . , N. N and G denote the number of samples in the training and testing sets, respectively.


According to Eq. (35), the training sample x_j^{train} which is closest to the testing sample x_g^{test} is selected. Then, the operating condition of x_g^{test} is determined by that of x_j^{train}.


Finally, a cooperative decision strategy is adopted to generate the prediction output of the MNN during the testing phase, which is calculated as

\hat{y}_{NOx} = \frac{\sum_{r=1}^{R} \hat{y}_{NOx}^r}{R}   (36)









    • where ŷ_{NOx} denotes the predicted value of NOx emission, and ŷ_{NOx}^r is the output of the r-th sub-network, r=1, 2, . . . , R, where R represents the number of activated sub-networks.
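The condition identification of Eqs. (33)-(35) and the cooperative decision of Eq. (36) can be sketched together as follows. A Python/NumPy sketch; here the sub-networks that share the operating condition of the nearest training sample are activated and averaged, and all names (`predict_cooperative`, `sub_networks`) are illustrative rather than from the patent.

```python
import numpy as np

def predict_cooperative(x_test, X_train, train_condition, sub_networks):
    """Identify the operating condition of a test sample via its nearest
    training sample (Eqs. 33-35), activate the matching sub-networks,
    and average their outputs (Eq. 36).
    sub_networks maps a condition id to a callable predictor."""
    d = np.linalg.norm(X_train - x_test, axis=1)     # Euclidean distances, Eq. (34)
    cond = train_condition[np.argmin(d)]             # condition of the closest sample
    active = [net for c, net in sub_networks.items() if c == cond]
    return sum(net(x_test) for net in active) / len(active)   # Eq. (36)
```

With one sub-network per condition this reduces to selecting that network; when several sub-networks share a condition, their outputs are averaged as in Eq. (36).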





DMNN-Based Prediction Model

The NOx emission prediction model for the MSWI process based on the DMNN mainly includes four parts: data preprocessing, PCA-based dynamic task decomposition, construction of the sub-networks, and the cooperative decision strategy. As shown in FIG. 4, the original dataset is represented by Xori, and XoriϵRL×m, where L denotes the number of samples and m is the number of variables. First, the original data is preprocessed via smoothing and normalization, and then represented by Xpre={x1i, x2i, . . . , xmi, yNOxi}i=1N. Second, to implement the dynamic task decomposition, a sliding window is applied to the training set to determine the key variables, and the corresponding sub-task is formed in each window. Then, an LSTM-based sub-network is established for each sub-task with its key variables as inputs. During the testing phase, the sub-networks are activated using the similarity between the testing and training samples, which is measured via the Euclidean distance, and the cooperative decision strategy is used to integrate the activated sub-networks to generate the final prediction results of NOx.


Data Preprocessing

Denoising: In the MSWI process, the sensors usually operate in a high-temperature and dusty environment, which brings noise into the original data. To reduce the effect of the noise on data analysis, the Rajda criterion is used to smooth the original data, as shown in Eq. (37).





|x^{ori} − μ^{ori}| ≥ 3σ^{ori}   (37)

    • where x^{ori} denotes an original sample, and μ^{ori} and σ^{ori} denote the mean and standard deviation of the variables, respectively. The samples satisfying Eq. (37) are regarded as outliers and removed from the original data. Then, the dataset after smoothing is expressed as X^{smo}, X^{smo}ϵR^{N×m}. N and m denote the number of samples and variables, respectively.


Normalization: To eliminate the influence of different dimensions among the variables and improve the prediction accuracy, standardization is performed on the dataset using the Z-score method, which is calculated as Eq. (38).

x_i = \frac{x_i^{smo} - \mu_i^{smo}}{\sigma_i^{smo}}   (38)









    • where xi, μismo, and σismo (i=1, 2, . . . , m) are the normalized vector, mean and standard deviation of the i-th dimension variable, respectively. The normalized dataset is represented by XN×mT. N and m denote the number of samples and variables, respectively.





Schematic Diagram of DMNN-Based Prediction Model

In this section, the proposed DMNN-based NOx emission prediction framework for MSWI process (as shown in FIG. 4) is described as follows.


Training Phase

Step 1: Preprocess the original data ori_data=[Xori Yori] based on Eqs. (37) and (38); the dataset is then expressed as dataset=[X Y];


Step 2: Set a sliding window with a fixed length of win; the subset contained in the window is Xwin_1. The key features of Xwin_1 are extracted by Eqs. (1)-(20). Thereafter, the window moves forward by a certain step, and the key variables are detected successively. Finally, the key variables in each sub-task are stored in the knowledge base for modeling analysis;


Step 3: For each sub-task, LSTM is applied to establish the sub-network driven by the corresponding key variables, and the number of hidden neurons is optimized by the trial-and-error method;


Step 4: Move the sliding window in steps and repeat Step 2 and Step 3.


Testing Phase

Step 5: Calculate the similarity between the test sample and the training samples via Eqs. (33)-(35) and generate the outputs of the MNN by activating the corresponding sub-networks.


Step 6: The final prediction result of NOx emission is obtained by integrating the outputs of the sub-networks with the cooperative decision strategy of Eq. (36).


To evaluate the effectiveness of the proposed method, the merits of the DMNN are confirmed on a debutanizer column process and real industrial data of an MSWI process. All the simulations were carried out using MATLAB R2019b on a PC with an Intel® Core™ i7-7700 CPU @ 3.60 GHz and 8.00 GB RAM. Furthermore, the performance of the DMNN was measured by calculating the root mean square error (RMSE), mean absolute percentage error (MAPE), and r-square (R2).










RMSE = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( y_{d\_j}^{NOx} - y_{o\_j}^{NOx} \right)^2}   (39)

MAPE = \frac{100\%}{N} \sum_{j=1}^{N} \left| \frac{y_{d\_j}^{NOx} - y_{o\_j}^{NOx}}{y_{o\_j}^{NOx}} \right|   (40)

R^2 = 1 - \frac{\sum_{j=1}^{N} \left( y_{o\_j}^{NOx} - y_{d\_j}^{NOx} \right)^2}{\sum_{j=1}^{N} \left( y_{o\_j}^{NOx} - \bar{y}_o^{NOx} \right)^2}   (41)









    • where yd_jNOx, yo_jNOx and yoNOx denote the desired, predicted and mean outputs of NOx, respectively.
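The evaluation indices of Eqs. (39)-(41) can be sketched as follows. A minimal Python/NumPy illustration implementing the equations exactly as given (with the predicted output in the MAPE and R2 denominators, as in Eqs. (40) and (41)); the function names are illustrative.

```python
import numpy as np

def rmse(y_d, y_o):
    """Root mean square error, Eq. (39)."""
    return np.sqrt(np.mean((y_d - y_o) ** 2))

def mape(y_d, y_o):
    """Mean absolute percentage error in percent, Eq. (40)."""
    return 100.0 * np.mean(np.abs((y_d - y_o) / y_o))

def r2(y_d, y_o):
    """R-square, Eq. (41): 1 minus the squared-error sum over the
    spread of the predicted outputs around their mean."""
    return 1.0 - np.sum((y_o - y_d) ** 2) / np.sum((y_o - np.mean(y_o)) ** 2)
```

Here `y_d` and `y_o` are the arrays of desired and predicted NOx outputs over the N evaluation samples.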








BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a hardware system in the MSWI process.


FIG. 2 is a flow chart of the MSWI process.


FIG. 3 is the structure of an LSTM neural cell.


FIG. 4 is the DMNN-based model for NOx emission prediction in the MSWI process.


FIG. 5 is a distribution of variable importance.


FIG. 6 is a training result of the DMNN.


FIG. 7 is a testing result of the DMNN.


FIG. 8 is a regression analysis.


FIG. 9 is a prediction error comparison of different algorithms.


FIGS. 10a-10e are prediction error distributions of different algorithms.


FIG. 11 is a distribution of variable importance.


FIG. 12 is a training result of the DMNN.


FIG. 13 is a testing result of the DMNN.


FIG. 14 is a regression analysis.


FIG. 15 is a prediction error comparison of different algorithms.


FIGS. 16a-16e are prediction error distributions of different algorithms.





DETAILED DESCRIPTION

This invention first uses the debutanizer column process to verify the validity of the DMNN method, and then applies it to a real MSWI process to predict the NOx emission concentration.


C4 Prediction Based on the DMNN

The original dataset is composed of 2394 samples with 7 variables. Table 1 gives a detailed description of these variables. Considering the dynamic characteristics of the real process, a set of optimal input variables is selected for C4 prediction, as shown in Eq. (42).










[u_1(t), u_2(t), u_3(t), u_4(t), u_5(t), u_5(t-1), u_5(t-2), u_5(t-3), (u_6(t)+u_7(t))/2, y(t-1), y(t-2), y(t-3), y(t-4)]^T   (42)














TABLE 1

Variable description on debutanizer column

Secondary variables    Description
u1                     Top temperature
u2                     Top pressure
u3                     Flow of reflux
u4                     Flow to the next process
u5                     Temperature of the sixth tray
u6                     Temperature A at bottom
u7                     Temperature B at bottom










The dataset was divided into training and testing sets at a ratio of 7:3. The length and step of the sliding window are 600 and 300, respectively.


Dynamic Task Decomposition Based on PCA

To explore the dynamics in different windows, a total of five sub-tasks are obtained via the dynamic task decomposition based on PCA. The variable importance in each sub-task is shown in FIG. 5.


The results in FIG. 5 illustrate that the distribution of variable importance in each sub-task is different. It can be seen that the dynamic operation characteristics can be described by the variable importance due to the different distribution of samples in each window. To further visualize the importance of each variable, the variables are sorted in descending order according to the cumulative contribution rate, as shown in Table 2.









TABLE 2

Sorting results of variables in different windows

Window number    Order of variables (in descending order of contribution rate)
win_1            x2 x1 x6 x7 x9 x3 x5 x8 x12 x13 x10 x4 x11
win_2            x3 x6 x7 x2 x4 x9 x5 x1 x8 x13 x12 x11 x10
win_3            x2 x1 x3 x6 x7 x9 x5 x4 x8 x13 x12 x10 x11
win_4            x2 x1 x3 x6 x7 x4 x5 x9 x10 x13 x8 x12 x11
win_5            x2 x1 x6 x3 x4 x7 x9 x5 x8 x11 x13 x12 x10









Table 2 shows that the distribution of variables is different, which can be used to characterize different operation conditions. The cumulative contribution rate threshold is determined as ζ=0.85. Then, the variables whose cumulative contribution rate reaches ζ are regarded as the key variables for each sub-task.


C4 Prediction Based on DMNN

For each sub-task, LSTM is used to establish the sub-network driven by the key variables. The training and testing results of C4 prediction for the debutanizer column are shown in FIGS. 6 and 7, which demonstrate that the proposed DMNN has superior approximation capability.


To demonstrate the merits of the proposed method, the performance of DMNN is compared with those of RBF, LSSVM, DBN and LSTM methods, as shown in Table 3.









TABLE 3

Comparison of C4 prediction results on debutanizer column

             Training phase               Testing phase
Methods      RMSE     MAPE     R2         RMSE     MAPE     R2
RBF          0.0202   0.0789   0.9816     0.1575   0.3982   0.2316
LSSVM        0.0154   0.0581   0.9893     0.0538   0.1479   0.9105
DBN          —        —        —          0.1655   1.8736   —
LSTM         —        —        —          0.0736   0.8566   —
DMNN         0.0057   0.0215   0.9986     0.0311   0.1343   0.9701









Compared with RBF, LSSVM and DBN, the LSTM neural network shows significant advantages in terms of lower RMSE and MAPE and higher R2, which illustrates that LSTM is more suitable for tackling the complex task because of its memory properties. On this basis, the PCA-based dynamic task decomposition method further improves the prediction accuracy of C4. In contrast with the other methods, DMNN shows an average improvement of 65.35% in RMSE, 68.48% in MAPE, and 39.91% in R2. Besides, the regression performance of the different methods plotted in FIG. 8 reveals the high approximation ability of the proposed DMNN.


For performance comparison, the prediction errors of each method are visualized in FIGS. 9 and 10a-10e. Compared with the other methods, the prediction error of the DMNN is significantly closer to 0, which indicates the effectiveness of this method.


Prediction of NOx Emission in MSWI Process

The MSWI process is a complex dynamic system. As NOx is one of the important pollutants, accurate prediction of its emission is of great significance to ensure the stable operation of the MSWI plant. The experiment was implemented on real industrial data. A total of 2215 samples were collected from the DCS with a sampling interval of 10 s; 1550 samples are used as the training set to construct the model, and the remaining samples are used to evaluate the proposed method. Combined with the operation characteristics of the MSWI process, 10 variables that are highly related to NOx are used for establishing the prediction model, as shown in Table 4.









TABLE 4

Variable description of NOx prediction model.

Index   Variable                                          Range                  Unit
1       Air flow of combustion grate (left side 1-1)      4~13                   km3N/h
2       Air flow of combustion grate (right side 1-1)     5.5~10                 km3N/h
3       Air flow of dry grate (left side 1-1)             1~5                    km3N/h
4       Primary combustion chamber temperature            900~1040               ° C.
5       Primary combustion chamber temperature (left)     870~1070               ° C.
6       Primary combustion chamber temperature (right)    850~1050               ° C.
7       Accumulation of primary air flow                  980~1383               km3N
8       Accumulation of secondary air flow                55~95                  km3N
9       Accumulation of urea solution                     1370~1876              L
10      Accumulation of urea solvent supply               3.31×10^4~3.37×10^4    L









In this section, the length of sliding window is 600. Considering the frequent changes of MSWI process, the moving step of the window is 100.
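The sliding-window segmentation just described can be sketched as follows. This is a minimal illustration, not the patent's code: the function names are assumptions, and so is the handling of the tail of the series (a final window aligned to the end is kept so no samples are lost, which also matches the 11 sub-tasks reported for the 1550 MSWI training samples with a window of 600 and a step of 100).

```python
def window_starts(n_samples, win=600, step=100):
    """Start indices of the sliding windows used for task decomposition.

    Windows of length `win` advance by `step`; if samples remain after the
    last full step, one extra window aligned to the end of the series is
    kept (an assumption made here so that no data is discarded).
    """
    starts = list(range(0, n_samples - win + 1, step))
    if starts and starts[-1] + win < n_samples:
        starts.append(n_samples - win)
    return starts


def split_windows(X, win=600, step=100):
    """Cut a sequence of samples X into (possibly overlapping) windows."""
    return [X[s:s + win] for s in window_starts(len(X), win, step)]
```

With 1550 training samples this yields 11 windows, one per sub-task.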


Dynamic Task Decomposition Based on PCA

A total of 11 sub-tasks are obtained using the dynamic task decomposition method. The variable importance in each window is shown in FIG. 11.



FIG. 11 reveals that the distribution of variable importance is different in each sub-task, which is closely related to the characteristics of the real MSWI process. Influenced by the feed quantity, composition and operation means, the MSWI process is complex and fluctuant. Thus, the principal components have different contribution rates to each variable. According to the threshold ζ=0.85, the dominant variables are determined for each sub-task, as shown in Table 5.









TABLE 5

Sorting results of variables in different windows

Window number   Order of variables (in descending order of contribution rate)
win_1           x4   x2   x3   x1   x8   x7   x10  x5   x6   x9
win_2           x2   x4   x3   x1   x8   x7   x6   x10  x9   x5
win_3           x1   x3   x2   x4   x7   x10  x8   x6   x5   x9
win_4           x1   x4   x2   x3   x7   x10  x8   x6   x5   x9
win_5           x1   x4   x3   x2   x7   x10  x5   x8   x9   x6
win_6           x1   x7   x4   x3   x5   x2   x6   x9   x8   x10
win_7           x4   x1   x2   x3   x5   x7   x9   x10  x8   x6
win_8           x4   x2   x1   x3   x5   x9   x10  x7   x6   x8
win_9           x4   x1   x7   x5   x3   x2   x6   x10  x8   x9
win_10          x4   x2   x7   x5   x1   x6   x3   x10  x8   x9
win_11          x7   x1   x4   x5   x2   x6   x3   x10  x8   x9









As can be seen from Table 5, the air flow of the combustion grate and the primary combustion chamber temperature play a key role in sub-tasks 1, 2 and 6-11, which indicates that oxygen and temperature have an important impact on NOx emission. Besides, for sub-tasks 3-5, the accumulation of urea solution is also an essential factor that cannot be ignored. From the analysis of the NOx generation and emission mechanism, the coupling relationship between these variables and NOx is different in each sub-task.


NOx Emission Prediction Based on the DMNN

In this section, a corresponding sub-network is developed for each sub-task using LSTM. The training and testing results of NOx emission prediction based on DMNN are shown in FIGS. 12 and 13. The results demonstrate that the predicted values of DMNN are close to the real values in general. Meanwhile, the testing results of the samples distributed in the range of 550˜650 have a large deviation, which can be explained by the violent and frequent fluctuation of the MSWI process. To further demonstrate the merits of the proposed method, the performance of DMNN is compared with those of the RBF, LSSVM, DBN and LSTM neural networks, as shown in Table 6.









TABLE 6

Comparison of NOx emission prediction results on MSWI process

                 Training phase               Testing phase
Methods     RMSE     MAPE     R2         RMSE      MAPE     R2
RBF         6.2630   4.0806   0.9696     12.3938   6.8560   0.7659
LSSVM       4.2520   2.5481   0.9860     10.4851   6.9292   0.8325
DBN         4.8278   2.9096   0.9819     8.0834    5.9306   0.9004
LSTM        4.0860   2.5801   0.9871     8.3332    5.0864   0.8942
DMNN        3.4603   2.0801   0.9890     7.3510    4.4921   0.9177









Table 6 presents the performance comparison of the various methods for NOx emission prediction, wherein the effectiveness of the proposed DMNN is further manifested. Notably, the LSTM neural network still shows significant advantages in processing time-series. In addition, the DMNN with the PCA-based dynamic task decomposition method further improves the prediction accuracy in both the training and testing phases. Compared with the other algorithms, the testing performance of the proposed method is improved by 23.25% (RMSE), 26.4% (MAPE), and 8.65% (R2) on average. FIG. 14 shows the regression performance of the various methods. Evidently, the prediction outputs of DMNN satisfactorily fit the desired outputs.


Accordingly, the prediction errors of the different methods in the testing phase are plotted in FIGS. 15 and 16a-16e, which clearly illustrate that most prediction errors of the proposed method are close to 0.


The reasonability and effectiveness of proposed DMNN were evaluated through an industrial benchmark, and it was then applied for NOx emission prediction in the MSWI process. The following advantages can be summarized based on the above analysis:


(1) A PCA-based dynamic task decomposition method: Different from traditional clustering methods, the proposed method was designed to detect the key variables in each sliding window. Then, the original task with complex dynamics was divided into several sub-tasks, thus reducing the complexity of the task to be processed.


(2) A DMNN-based prediction model for NOx emission: For each sub-task, an LSTM was constructed, driven by the key variables. Then, the nonlinearity between the key variables and the NOx value is learned to guarantee the prediction accuracy. Table 3 and Table 6 show the performance indices of the various algorithms. The experimental results demonstrated the higher generalization of DMNN via the RMSEs, MAPEs and R2s on both the training and testing sets.


The technical scheme and steps above can also be described as follows:


Step 1: Dynamic task decomposition based on PCA;


Aiming to detect the dynamic operating conditions, a sliding window with a fixed size was used to decompose the complex task; Then, the characteristics of the operating conditions can be represented by the key variables in the sliding window;


The algorithm is described as follows:


A sliding window is used to detect the principal components in the time-series; The size of sliding window is denoted by win_1; Assume that the observation sample matrix in the first sliding window is represented by Xm×n1win_1










Xm×n1win_1=[x1 x2 . . . xm]T=
    [x11  x12  . . .  x1n1
     x21  x22  . . .  x2n1
      .    .    .      .
     xm1  xm2  . . .  xmn1]   (1)









    • where m and n1 are the number of variables and samples in the sliding window win_1; x1 x2 . . . xm represent m variables of the matrix, which are inputs of prediction model;





For the debutanizer column dataset, x1 x2 . . . xm denote a total of 13 variables: top temperature, top pressure, flow of reflux, flow to the next process, temperature of the sixth tray at time t, temperature of the sixth tray at t-1, temperature of the sixth tray at t-2, temperature of the sixth tray at t-3, average value of the temperature at the bottom at t, and the butane concentration at t-1, t-2, t-3, and t-4; The size of m is 13 in this case;


For the MSWI process, x1 x2 . . . xm represent a total of 10 variables: air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply; The size of m is 10 in the real industrial data;


The mean vector μ of sample matrix Xm×n1win_1 is denoted as:









μ=[μ1, μ2, . . . , μm]T   (2)

μi=(1/n1)Σj=1n1xij   (3)









    • where μ1, μ2, . . . , μm represent the mean value of each row in Xm×n1win_1, and the mean value of each variable can be obtained by Eq. (3); μi denotes the i-th value of μ; i=1, 2, . . . , m, and m is the number of variables; xij denotes the value of the i-th variable in the j-th sample; j=1, 2, . . . , n1, where n1 represents the number of samples in the sliding window with the size of win_1;





All the samples of matrix Xm×n1win_1 minus the mean (decentralized) are denoted as














{tilde over (X)}m×n1win_1=
    [x11  x12  . . .  x1n1      [μ1  μ1  . . .  μ1
     x21  x22  . . .  x2n1   −   μ2  μ2  . . .  μ2
      .    .    .      .          .   .   .     .
     xm1  xm2  . . .  xmn1]      μm  μm  . . .  μm]
    =[{tilde over (x)}11  {tilde over (x)}12  . . .  {tilde over (x)}1n1
      {tilde over (x)}21  {tilde over (x)}22  . . .  {tilde over (x)}2n1
        .            .         .        .
      {tilde over (x)}m1  {tilde over (x)}m2  . . .  {tilde over (x)}mn1]
    =[{tilde over (x)}1, {tilde over (x)}2, . . . , {tilde over (x)}m]T   (4)









    • where {tilde over (X)}m×n1win_1 represents the matrix after decentralization, {tilde over (x)}ij denotes the value of the i-th variable after decentralization in the j-th sample, m represents the number of variables, and n1 is the number of samples contained in the sliding window with the size of win_1;





The covariance matrix Hm×mwin_1 of {tilde over (X)}m×n1win_1 is calculated as:










Hm×mwin_1=(1/(n1−1))·{tilde over (X)}m×n1win_1·({tilde over (X)}m×n1win_1)T   (5)









    • where {tilde over (X)}m×n1win_1T is the transpose of {tilde over (X)}m×n1win_1;





Then, the eigenvalue λ of covariance matrix Hm×mwin_1 can be calculated as












|Hm×mwin_1−λI|=0   (6)

I=[1  0  . . .  0
   0  1  . . .  0
   .  .   .     .
   0  0  . . .  1]   (7)









    • where I denotes the identity matrix; Based on Eq. (6), the eigenvalues of Hm×mwin_1 can be represented as








λ1≥λ2≥ . . . ≥λQ   (8)

    • where Q is the number of eigenvalues; According to Eq. (8), the eigenvector α corresponding to each eigenvalue is calculated as:





(Hm×mwin_1−λkI)αk=0   (9)

    • where Hm×mwin_1 is covariance matrix; λk denotes the k-th eigenvalue; I is unit matrix, which is represented by Eq. (7); αk is eigenvector corresponding to the k-th eigenvalue; αk=[α1k, α2k, . . . , αmk]T, (k=1, 2, . . . , Q0);


The threshold of cumulative variance contribution rate is set as θ, and if the cumulative variance satisfies













(Σk=1Q0λk)/(Σk=1Qλk)>θ   (10)







Then the first Q0 principal components are selected for further analysis; Q0, which is determined by Eq. (10), is the number of retained eigenvalues and thus the number of principal components; λk denotes the k-th eigenvalue; Furthermore, the threshold θ is selected as 0.85;
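As a concrete illustration of Eqs. (2)-(10), the following NumPy sketch decentralizes one window matrix, forms the covariance matrix, and retains the first Q0 components whose cumulative variance contribution rate exceeds θ=0.85. The function name and return convention are assumptions for illustration, not the patent's code.

```python
import numpy as np

def select_components(X_win, theta=0.85):
    """PCA on one window matrix X_win of shape (m, n1), rows = variables.

    Returns the retained eigenvalues and unit eigenvectors, i.e. the
    first Q0 principal directions per Eqs. (2)-(10).
    """
    Xc = X_win - X_win.mean(axis=1, keepdims=True)   # Eqs. (2)-(4): decentralize
    H = Xc @ Xc.T / (X_win.shape[1] - 1)             # Eq. (5): covariance matrix
    lam, A = np.linalg.eigh(H)                       # Eqs. (6), (9): eigendecomposition
    order = np.argsort(lam)[::-1]                    # Eq. (8): descending eigenvalues
    lam, A = np.clip(lam[order], 0.0, None), A[:, order]
    ratio = np.cumsum(lam) / lam.sum()               # Eq. (10): cumulative contribution
    Q0 = int(np.searchsorted(ratio, theta) + 1)
    return lam[:Q0], A[:, :Q0]
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order, hence the explicit reordering for Eq. (8).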


Then, the unit eigenvector α corresponding to Q0 eigenvalues is used as a coefficient for linear transformation to obtain Q0 principal components:





zk=αkTx   (11)

    • where αk=[α1k, α2k, . . . , αmk]T (k=1, 2, . . . , Q0)


Combining with the samples in Xm×n1win_1, the principal components of n1 samples can be obtained by Eq. (11); The k-th principal component zkj of the j-th sample xj=[x1j, x2j, . . . , xmj]T (j=1, 2, . . . , n1) is













zkj=[α1k, α2k, . . . , αmk][x1j, x2j, . . . , xmj]T=Σi=1mαik·xij   (12)









    • where α1k, α2k, . . . , αmk denote the m values of the k-th unit eigenvector; x1j, x2j, . . . , xmj represent the m variables of the j-th sample, respectively; j=1, 2, . . . , n1, i=1, 2, . . . , m, and k=1, 2, . . . , Q0;





According to Eq. (12), zk containing the k-th principal component of the n1 samples can be denoted by zk=[zk1, zk2, . . . , zkn1]; Therefore, a factor load is defined as the correlation between the k-th principal component zk and the i-th variable xi, which is calculated as










ρ(zk, xi)=(√λk·αik)/√σii   (13)









    • where αik denotes the i-th value of the unit eigenvector αk; σii is the variance of the i-th variable xi, which is also the i-th diagonal entry of the covariance matrix Hm×mwin_1, k=1, 2, . . . , Q0, i=1, 2, . . . , m;





The factor load matrix is expressed as









ρ=[ρ(z1, x1)   ρ(z1, x2)   . . .  ρ(z1, xm)
   ρ(z2, x1)   ρ(z2, x2)   . . .  ρ(z2, xm)
       .           .         .        .
   ρ(zQ0, x1)  ρ(zQ0, x2)  . . .  ρ(zQ0, xm)]   (14)







Then, the contribution rate υi of the Q0 principal components to the i-th variable xi (i=1, 2, . . . , m) is










υi=Σk=1Q0ρ2(zk, xi)   (15)









    • where the contribution rate υi is the sum of squares of factor loads between the Q0 principal components and i-th variable xi; Then, the contribution rate matrix υ of Q0 principal components corresponding to each variable can be expressed as








υ=[υ1, υ2, . . . , υm]  (16)

    • where m represents the number of variables contained in Xm×n1win_1; The importance of variables changes with the fluctuation of complex operating conditions in MSWI furnace, that is, the contribution rate υi of principal components corresponding to each variable will also change; Therefore, the contribution rate υ is reordered in a descending order





sort(υ)=[υmax, . . . , υmin]  (17)

    • where the function of sort(·) is to sort data in a descending order; υmax and υmin represent the maximum and minimum value of contribution rate, respectively; The key variables are determined by defining a threshold value ψ;










Σi=1Fυi>ψ   (18)









    • where the value of ψ is equal to the cumulative variance contribution rate, that is, ψ=0.85; F denotes the number of key variables, which can be determined by ψ; Equation (18) indicates that the first F variables have the greatest correlation with the principal components in the current window; Then, the first F key variables are selected as reference vectors for condition identification, as shown in Eq. (19);








con_1=[xnum_1win_1, xnum_2win_1, . . . , xnum_Fwin_1]  (19)

    • where xnum_1win_1, xnum_2win_1, . . . , xnum_Fwin_1 represent the first F variables in Xm×n1win_1; Thereafter, the window moves forward by a certain step, and the key variables are detected successively; Finally, the key variables in each sub-task are stored in the knowledge base for modeling analysis, which is expressed as





condition_library=[con_1,con_2, . . . , con_W]  (20)

    • where con_1, con_2, . . . , con_W represent reference vectors corresponding to different operating conditions, respectively; W denotes the number of operating conditions;


The size of sliding window and moving step is selected according to specific data sets; The simulation phase includes a debutanizer column process and a real industrial data of MSWI process; For debutanizer column process, the sliding window size is 600; Considering the dataset is accompanied by slow fluctuations, the moving step of sliding window is set to 300; For MSWI process, the size of sliding window is 600; Considering the complex variation and large fluctuation of the process, the moving step of sliding window is set to 100;
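Steps (11)-(19) of the decomposition — projecting onto the retained components, computing factor loads and contribution rates, and keeping the first F key variables — can be sketched as below. This is a hedged NumPy illustration; `key_variables` and its return convention are assumptions, not the patent's implementation.

```python
import numpy as np

def key_variables(X_win, theta=0.85, psi=0.85):
    """Rank the variables of one window by contribution rate, Eqs. (13)-(19).

    X_win has shape (m, n1), rows = variables.  Returns the full ranking
    (Eq. (17)) and the indices of the first F key variables (Eq. (18)).
    """
    Xc = X_win - X_win.mean(axis=1, keepdims=True)
    H = Xc @ Xc.T / (X_win.shape[1] - 1)
    lam, A = np.linalg.eigh(H)
    order = np.argsort(lam)[::-1]
    lam, A = np.clip(lam[order], 0.0, None), A[:, order]
    Q0 = int(np.searchsorted(np.cumsum(lam) / lam.sum(), theta) + 1)
    # Eq. (13): factor load between component k and variable i
    rho = np.sqrt(lam[:Q0])[None, :] * A[:, :Q0] / np.sqrt(np.diag(H))[:, None]
    upsilon = (rho ** 2).sum(axis=1)                  # Eqs. (15)-(16)
    rank = np.argsort(upsilon)[::-1]                  # Eq. (17): descending order
    F = int(np.searchsorted(np.cumsum(upsilon[rank]), psi) + 1)  # Eq. (18)
    return rank, rank[:min(F, len(rank))]
```

Running this over each window position then fills the condition library of Eq. (20) with one reference vector per sub-task.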


Step 2: Construction of the LSTM-based sub-network;


For each sub-task, an LSTM neural network is constructed, driven by the corresponding key variables; An LSTM cell comprises an input, a forget, an output and a cell state gate, and each gate is calculated as follows:


Forget gate:

ft=σ(Wf·[ht-1, xt]+bf)   (21)

Input gate:

it=σ(Wi·[ht-1, xt]+bi)   (22)

Cell state gate:

{tilde over (C)}t=tan h(Wc·[ht-1, xt]+bc)   (23)

Ct=ft⊗Ct-1+it⊗{tilde over (C)}t   (24)

Output gate:

ot=σ(Wo·[ht-1, xt]+bo)   (25)

Using Eqs. (21)-(25), the final output of LSTM is

ŷNOxt=ot⊗tan h(Ct)   (26)

    • where xt denotes the input of the LSTM neural network at time t; For the MSWI process, the inputs are air flow of combustion grate (left side 1-1), air flow of combustion grate (right side 1-1), air flow of dry grate (left side 1-1), primary combustion chamber temperature, primary combustion chamber temperature (left), primary combustion chamber temperature (right), accumulation of primary air flow, accumulation of secondary air flow, accumulation of urea solution, and accumulation of urea solvent supply at time t, respectively; ht-1 is the output of the LSTM neural network at time t-1; Wf, Wi, Wc and Wo denote the weight matrices of the forget, input, cell state and output gates, respectively; bf, bi, bc and bo are the biases of the forget, input, cell state and output gates, respectively; ft, it, Ct and ot represent the outputs of the forget, input, cell state and output gates, respectively; ŷNOxt is the output of the LSTM neural network at time t; σ(·) and tan h(·) are the activation functions, which are calculated as










σ(U)=1/(1+e−U)   (27)

tan h(U)=(eU−e−U)/(eU+e−U)   (28)









    • where U denotes the input of the activation function in each gate, as shown in Eqs. (29)-(32):





Forget gate:

Uf=Wf·[ht-1, xt]+bf   (29)

Input gate:

Ui=Wi·[ht-1, xt]+bi   (30)

Cell state gate:

Uc=Wc·[ht-1, xt]+bc   (31)

Output gate:

Uo=Wo·[ht-1, xt]+bo   (32)
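A single forward step of the LSTM cell of Eqs. (21)-(32) can be sketched in NumPy as follows. The weights are passed in explicitly; in the proposed model each sub-network's weights are learned by training, which is omitted here, and the dictionary layout is an assumption for illustration.

```python
import numpy as np

def sigmoid(u):                                   # Eq. (27)
    return 1.0 / (1.0 + np.exp(-u))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of an LSTM cell, Eqs. (21)-(26).

    W['f'], W['i'], W['c'], W['o'] each multiply the concatenation
    [h_prev, x_t] (Eqs. (29)-(32)); b holds the matching biases.
    """
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W['f'] @ z + b['f'])              # Eq. (21), forget gate
    i = sigmoid(W['i'] @ z + b['i'])              # Eq. (22), input gate
    c_tilde = np.tanh(W['c'] @ z + b['c'])        # Eq. (23), candidate state
    c = f * c_prev + i * c_tilde                  # Eq. (24), cell state
    o = sigmoid(W['o'] @ z + b['o'])              # Eq. (25), output gate
    h = o * np.tanh(c)                            # Eq. (26), cell output
    return h, c
```

With all weights and biases zero, every gate outputs 0.5 and the candidate state is 0, so the cell state simply halves at each step, which makes the recurrence easy to verify by hand.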


Step 3: Cooperation decision strategy;


During testing stage, the similarity between the i-th testing sample and training samples is measured by Euclidean distance:






dg,jtest=dist(xgtest, xjtrain), (j=1, 2, . . . , N)   (33)

dist(xgtest, xjtrain)=√(∥xgtest_1−xjtrain_1∥2+ . . . +∥xgtest_m−xjtrain_m∥2)   (34)

dgtest=[dg,1test, dg,2test, . . . , dg,Ntest]   (35)

    • where xgtest is the g-th sample of the testing set; xgtest_1 and xgtest_m denote the first and m-th variables of the g-th testing sample, respectively; Similarly, xjtrain_1 and xjtrain_m denote the first and m-th variables of the j-th training sample, respectively; dg,1test, dg,2test, . . . , dg,Ntest represent the Euclidean distances between the g-th sample of the testing set and the samples of the training set, respectively; g=1, 2, . . . , G, j=1, 2, . . . , N; N and G denote the number of samples in the training and testing sets; According to Eq. (35), the training sample xjtrain which is closest to the testing sample xgtest is selected; Then, the operating condition of xgtest is determined by that of xjtrain;
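Condition identification by Eqs. (33)-(35) amounts to a nearest-neighbor lookup over the training set; a minimal sketch follows (the helper name and the label encoding are assumptions for illustration):

```python
import numpy as np

def nearest_condition(x_test, X_train, train_conditions):
    """Assign a test sample the operating condition of its closest
    training sample under Euclidean distance, Eqs. (33)-(35).

    X_train has shape (N, m); train_conditions[j] labels the j-th
    training sample with its operating condition (sub-task).
    """
    d = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))   # Eq. (34)
    return train_conditions[int(np.argmin(d))]           # closest sample's condition
```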


Finally, a decision operation strategy is adopted to generate the prediction outputs of MNN during testing phase;











ŷNOx=(Σr=1RŷNOxr)/R   (36)









    • where ŷNOx denotes the predicted value of NOx emission, and ŷNOxr is the output of the r-th activated sub-network; r=1, 2, . . . , R, where R represents the number of activated sub-networks;
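The cooperative decision of Eq. (36) is a plain average over the outputs of the R activated sub-networks:

```python
def cooperative_output(sub_outputs):
    """Integrate the predictions of the R activated sub-networks, Eq. (36)."""
    return sum(sub_outputs) / len(sub_outputs)
```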





Step 4: DMNN-based prediction model for NOx emission;


The NOx emission prediction model for the MSWI process based on DMNN mainly includes four parts: data preprocessing, PCA-based dynamic task decomposition, construction of the sub-networks and the cooperation decision strategy; As shown in FIG. 3, the original dataset is represented by Xori, and XoriϵRL×m, where L denotes the number of samples and m is the number of variables; First, the original data is preprocessed via smoothing and normalization, and then represented by Xpre={x1i, x2i, . . . , xmi, yNOxi}i=1N; Second, to implement the dynamic task decomposition, a sliding window is moved over the training set to determine the key variables; Furthermore, a corresponding sub-task is formed in each window; Then, an LSTM-based sub-network is established for each sub-task with the corresponding key variables as inputs; During the testing phase, the sub-networks are activated using the similarity between the testing and the training samples, which is measured via Euclidean distance; And the cooperative decision strategy is used to integrate each activated sub-network to generate the final prediction results of NOx;


In the MSWI process, the sensors usually operate in a high temperature and dust environment, which brings noise to the original data; To reduce the effect of the noise on data analysis, the Rajda criterion is used to smooth the original data, as shown in Eq. (37);





|xori−μori|≥3σori   (37)

    • where xori denotes an original sample, and μori and σori denote the mean and standard deviation of the variables, respectively; The samples satisfying Eq. (37) are regarded as outliers and removed from the original data; Then, the dataset after smoothing is expressed as Xsmo, and XsmoϵRN×m; N and m denote the number of samples and variables, respectively;


Z-score method is used to perform standardization on the dataset, which is calculated as Eq. (38);










xi=(xismo−μismo)/σismo   (38)









    • where xi, μismo, and σismo (i=1, 2, . . . , m) are the normalized vector, mean and standard deviation of the i-th dimension variable, respectively; The normalized dataset is represented by XN×mT; N and m denote the number of samples and variables, respectively;
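The preprocessing of Eqs. (37)-(38) — 3σ outlier removal followed by z-score standardization — can be sketched as follows. This is a NumPy illustration under the assumption that a sample is discarded when any of its variables satisfies Eq. (37); the function name is illustrative.

```python
import numpy as np

def preprocess(X_ori):
    """Smooth (Eq. (37)) and standardize (Eq. (38)) an (L, m) data set.

    A sample is discarded as an outlier if any of its variables deviates
    from that variable's mean by 3 standard deviations or more; the
    surviving samples are then z-score normalized column-wise.
    """
    mu, sigma = X_ori.mean(axis=0), X_ori.std(axis=0)
    keep = (np.abs(X_ori - mu) < 3.0 * sigma).all(axis=1)    # Eq. (37)
    X_smo = X_ori[keep]
    return (X_smo - X_smo.mean(axis=0)) / X_smo.std(axis=0)  # Eq. (38)
```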





The proposed DMNN-based NOx emission prediction framework for MSWI process (as shown in FIG. 4) is described as follows:


Training Phase





    • 1) Preprocess the original data ori_data=[Xori Yori] based on Eqs. (37), (38), and then the dataset is expressed by dataset=[X Y];

    • 2) Set a sliding window with a fixed length of win, where the subset contained in the window is Xwin_1; The key variables of Xwin_1 are determined by Eqs. (1)-(20); Thereafter, the window moves forward by a certain step, and the key variables are detected successively; Finally, the key variables of each sub-task are stored in the knowledge base for modeling analysis;

    • 3) For each sub-task, LSTM is applied to establish the sub-network driven by the corresponding key variables; And the number of hidden neurons is optimized by the trial-and-error method;

    • 4) Move the sliding window in steps and repeat step 2)-step 3);





Testing Phase





    • 5) Calculate the similarity between the test sample and the training samples via Eqs. (33)-(35) and generate the outputs of the MNN by activating the corresponding sub-networks;

    • 6) The final prediction result of NOx emission is obtained by integrating the outputs of the sub-networks with a cooperation decision strategy by Eq. (36).





While the invention has been particularly shown and described as referenced to the embodiments thereof, those skilled in the art will understand that the foregoing and other changes in form and detail may be made therein without departing from the spirit and scope.

Claims
  • 1. A method for dynamic-modular-neural-network (DMNN)-based municipal solid waste incineration (MSWI) process nitrogen oxides (NOx) emission prediction, comprising steps of: obtaining sensor data associated with an MSWI process, the sensor data comprising a data set comprising a plurality of samples; preprocessing the sensor data to remove those of the samples that comprise noise and standardizing the data set; decomposing a task of prediction of a NOx emission associated with the MSWI process into a plurality of sub-tasks using principal component analysis, comprising applying a sliding window of a fixed size to the preprocessed sensor data set and identifying key variables of operating conditions of the MSWI process, each of the key variables associated with one of the sub-tasks; constructing a long short-term memory (LSTM) neural network, the LSTM neural network comprising a plurality of sub-networks, wherein each of the sub-networks outputs a value for one of the sub-tasks and a key variable associated with that sub-task serves as an input for that sub-network; obtaining a further set of sensor data associated with a further MSWI process, the further sensor data comprising further data samples; comparing at least one of the further samples to at least some of the samples in the preprocessed sensor data set; activating at least some of the sub-networks based on the comparison; and using the activated sub-networks in the LSTM network to predict the NOx emission for the further MSWI process, wherein the steps are performed by at least one suitably-programmed computer and wherein a plant associated with the further MSWI process is operated based on the NOx prediction for the further MSWI process.
  • 2. A method according to claim 1, wherein the comparison comprises finding a similarity between the at least one of the further samples and the at least some of the samples in the preprocessed sensor data set.
  • 3. A method according to claim 2, wherein the similarity is determined using Euclidian distance.
  • 4. A method according to claim 1, wherein each sub-network comprises at least one cell that comprises an input, a forget, an output, and a cell state gate.
  • 5. A method according to claim 1, wherein the sensor data is obtained using one or more sensors, the sensors comprising one or more of a thermocouple temperature sensor, an air volume sensor, a liquid flow sensor, a continuous emission monitoring system, a distributed control system and an upper computer.
  • 6. A method according to claim 1, wherein the sensor data comprises air flow of combustion grate left side 1-1, air flow of dry grate left side 1, temperature of primary combustion chamber, left side temperature of primary combustion chamber, right side temperature of primary combustion chamber, cumulative primary air flow, cumulative secondary air flow, accumulated urea solution flow, accumulated urea solution supply flow and a NOx emission value associated with the MSWI process.
  • 7. A method according to claim 1, wherein the window moves forward along the preprocessed sensor data set by a step and the key variables are determined successively.
  • 8. A method according to claim 1, wherein the NOx emission for the further MSWI process is predicted in accordance with:
  • 9. A method according to claim 1, wherein the at least one suitably-programmed computer receives the further sensor data in real-time.
  • 10. A method according to claim 1, wherein a denitration control system of the plant is controlled based on the NOx prediction for the further MSWI process.
  • 11. A system for dynamic-modular-neural-network (DMNN)-based municipal solid waste incineration (MSWI) process nitrogen oxides (NOx) emission prediction, comprising: at least one computer configured to: obtain sensor data associated with an MSWI process, the sensor data comprising a data set comprising a plurality of samples; preprocess the sensor data to remove those of the samples that comprise noise and standardize the data set; decompose a task of prediction of a NOx emission associated with the MSWI process into a plurality of sub-tasks using principal component analysis, comprising applying a sliding window of a fixed size to the preprocessed sensor data set and identifying key variables of operating conditions of the MSWI process, each of the key variables associated with one of the sub-tasks; construct a long short-term memory (LSTM) neural network, the LSTM neural network comprising a plurality of sub-networks, wherein each of the sub-networks outputs a value for one of the sub-tasks and a key variable associated with that sub-task serves as an input for that sub-network; obtain a further set of sensor data associated with a further MSWI process, the further sensor data comprising further data samples; compare at least one of the further samples to at least some of the samples in the preprocessed sensor data set; activate at least some of the sub-networks based on the comparison; and use the activated sub-networks in the LSTM network to predict the NOx emission for the further MSWI process, wherein a plant associated with the further MSWI process is operated based on the NOx prediction for the further MSWI process.
  • 12. A system according to claim 11, wherein the comparison comprises finding a similarity between the at least one of the further samples and the at least some of the samples in the preprocessed sensor data set.
  • 13. A system according to claim 12, wherein the similarity is determined using Euclidian distance.
  • 14. A system according to claim 11, wherein each sub-network comprises at least one cell that comprises an input, a forget, an output, and a cell state gate.
  • 15. A system according to claim 11, wherein the sensor data is obtained using one or more sensors, the sensors comprising one or more of a thermocouple temperature sensor, an air volume sensor, a liquid flow sensor, a continuous emission monitoring system, a distributed control system and an upper computer.
  • 16. A system according to claim 11, wherein the sensor data comprises air flow of combustion grate left side 1-1, air flow of dry grate left side 1, temperature of primary combustion chamber, left side temperature of primary combustion chamber, right side temperature of primary combustion chamber, cumulative primary air flow, cumulative secondary air flow, accumulated urea solution flow, accumulated urea solution supply flow and a NOx emission value associated with the MSWI process.
  • 17. A system according to claim 11, wherein the window moves forward along the preprocessed sensor data set by a step and the key variables are determined successively.
  • 18. A system according to claim 11, wherein the NOx emission for the further MSWI process is predicted in accordance with:
  • 19. A system according to claim 11, wherein the at least one suitably-programmed computer receives the further sensor data in real-time.
  • 20. A system according to claim 11, wherein a denitration control system of the plant is controlled based on the NOx prediction for the further MSWI process.
Priority Claims (1)
Number Date Country Kind
202210994681.2 Aug 2022 CN national