The present application relates to municipal solid waste incineration, and more particularly to a method for predicting dioxin emission concentration based on hybrid integration of random forest and gradient boosting decision tree.
In China, rapid economic development and the continuous expansion of urbanization have led to a dramatic increase in the generation of municipal solid waste (MSW), especially in economically developed and densely populated districts. More seriously, some cities are suffering from the crisis of garbage siege. Municipal solid waste incineration (MSWI)-based power generation is a typical harmless treatment method for volume reduction and recycling. At present, there are over 300 domestic MSWI power plants, and grate furnace incinerators account for more than ⅔ of them. In view of the particular elemental compositions of Chinese MSW, most of the imported incinerators are operated manually, which leads to the frequent occurrence of “unacclimatization”, as well as the substandard emission of MSWI. Accordingly, how to control the pollution emission of MSWI on the premise of ensuring economic benefits has become the most critical issue. Dioxin (DXN), a highly toxic persistent organic pollutant produced by MSWI that has strong chemical reactivity and thermal stability, is one of the main reasons for the public “Not-In-My-Back-Yard” attitude.
In actual operation, the DXN emission concentration is mainly detected periodically by a combination of online sampling and offline experimental analysis. However, this detection is expensive and time-consuming; moreover, it can hardly support the real-time optimization control of MSWI operating parameters to minimize the DXN emission concentration. Therefore, an online prediction of DXN emission concentration is necessary and plays a vital role in optimizing the MSWI process. Unfortunately, the complex physical and chemical characteristics of the MSWI process make it difficult to establish a model for precisely predicting the DXN emission concentration. Currently, the online detection of DXN is generally performed by first measuring related objects and then performing online prediction through a mapping relationship, which has problems such as expensive equipment, weak adaptability and unsatisfactory prediction precision. Compared to direct offline analysis and related object detection, the soft sensor method can predict difficult-to-measure parameters faster and more economically, and has been widely used in the industrial field. Regarding the MSWI process, a combination of feature selection and a neural network has been reported for constructing a DXN emission concentration prediction model. Due to characteristics of the modeling data, such as small sample size, high dimensionality and collinearity, these methods generally suffer from local minima, overfitting and poor generalization performance.
In view of the limitations of a traditional single prediction model, prediction models based on ensemble learning have attracted a lot of attention. The random forest (RF) algorithm has strong capabilities of noise processing and nonlinear data modeling, and is commonly used for nonlinear regression. An online prediction of biomass moisture content in a fluidized bed based on a combination of electrostatic sensor arrays and the random forest method has been proposed (Zhang, W. B., Cheng, X. F., Hu, Y. H., Yan, Y., 2019. Online prediction of biomass moisture content in a fluidized bed dryer using. Fuel, 239, 437-445). A soft sensor model based on principal component analysis and RF has been constructed for the online prediction of the tensile property of twin-screw extruded polylactide sheet (Mulrennan, K., Donovan, J., Creedon, L., Rogers, I., Lyons, J. G., McAfee, M., 2018. A soft sensor for prediction of mechanical properties of extruded PLA sheet using an instrumented slit die and machine learning algorithms. Polymer Testing, 69, 462-469). Another study (Napier, L. F. A., Aldrich, C., 2017. An IsaMill™ Soft Sensor based on Random Forests and Principal Component Analysis. IFAC-PapersOnLine, 50, 1175-1180) provided a self-monitoring RF model for the online estimation of P80 particle size in the mill. In addition to the RF algorithm, which samples the modeling data for parallel integration, the gradient boosting decision tree (GBDT) is another popular machine learning algorithm; however, the efficiency and scalability of the GBDT are still unsatisfactory in the case of high feature dimension and large data size. One study (Sachdeva, S., Bhatia, T., Verma, A. K., 2020. A novel voting ensemble model for spatial prediction of landslides using GIS. International Journal of Remote Sensing, 41, 929-952) provides an ensemble model integrating logistic regression (LR), GBDT and voting feature interval (VFI) to evaluate landslide susceptibility.
One study (Wang, R., Lu, S. L., Li, Q. P., 2019. Multi-criteria comprehensive study on predictive algorithm of hourly heating energy consumption for residential buildings. Sustainable Cities and Society, 49.) utilized the GBDT to predict energy consumption for residential buildings. Another (Chen, B. B., Lin, R. H., Zou, H., 2018. A Short Term Load Periodic Prediction Model Based on GBDT. 2018 IEEE 18th International Conference on Communication Technology (ICCT), 1402-1406) established a short-term load periodic prediction model based on the GBDT. A further study (Wang, J. D., Li, P., Ran, R., Che, Y. B., Zhou, Y., 2018. A Short-Term Photovoltaic Power Prediction Model Based on the Gradient Boost Decision Tree. Applied Sciences-Basel 8) provided a photovoltaic power prediction model based on GBDT, in which binary trees are integrated through gradient boosting. A wind power quantile regression model based on GBDT embedded with an instance-based transfer learning method has also been reported (Cai, L., Gu, J., Ma, J. H., Jin, Z. J., 2019. Probabilistic Wind Power Forecasting Approach via Instance-Based Transfer Learning Embedded Gradient Boosting Decision Trees. Energies 12). Another study (Liu, X. L., Tan, W. A., Tang, S., 2019. A Bagging-GBDT ensemble learning model for city air pollutant concentration prediction. 4th International Conference on Advances in Energy Resources and Environment Engineering 237) proposed a Bagging-GBDT ensemble learning prediction model. Nevertheless, the above studies mostly use a single RF or GBDT algorithm for modeling, and fail to effectively construct a model for predicting the DXN emission concentration from modeling data with a small sample size and high-dimensional characteristics.
DXN is a highly toxic pollutant produced by the MSWI. In the current industrial process, the DXN emission concentration is mainly measured by collecting flue gas samples on site and then analyzing the samples in a laboratory, which has problems such as time-consuming operation and high cost. In view of the above-mentioned defects in the prior art, an object of this application is to provide a DXN emission concentration prediction model based on hybrid integration of random forest (RF) and gradient boosting decision tree (GBDT), in which a process control system is employed to acquire a process variable in real time.
Technical solutions of this application are described as follows.
This application provides a method for predicting dioxin emission concentration, comprising:
performing random sampling of a training sample and an input feature on a DXN emission concentration prediction modeling data with a small sample size and a high-dimensional characteristic to generate a training subset;
establishing an RF-based DXN sub-model based on the training subset;
performing an iteration on each of the RF-based DXN sub-model I times to establish a GBDT-based DXN sub-model; and
combining a predicted output of the RF-based DXN sub-model and the GBDT-based DXN sub-model by a simple average weighting method to obtain a final output;
wherein the number of the RF-based DXN sub-model is J; and the number of the GBDT-based DXN sub-model is J×I.
Compared to the prior art, the application has the following beneficial effects. In the method provided herein, a DXN emission concentration prediction model based on hybrid integration of RF and GBDT is established, which has an improved online prediction precision for the DXN emission concentration, and can facilitate the optimization of operation parameters of the MSWI and improve the economic benefit.
MSW is transported by a vehicle to a weighbridge to be weighed and is then discharged into a garbage pool. After being biologically fermented and dehydrated for 3-7 days, the MSW is transferred to a garbage hopper by a grab, fed to an incineration grate through a feeder, and subjected to drying, burning and incineration successively. Combustible components of the MSW after drying are burned in the combustion air delivered by a primary air fan. The ash residue generated by burning falls from an end of the grate to a slag conveyor to be transported to a slag pit, and is finally landfilled at a designated location. The temperature of the flue gas produced in the combustion process should be controlled above 850° C. in a first combustor to ensure complete decomposition and combustion of harmful gas. When the flue gas passes through a second combustor, air delivered by a secondary air fan generates turbulence, which ensures that the residence time of the flue gas exceeds 2 s, such that the harmful gas is further decomposed. The flue gas then enters a waste heat boiler, where its heat is absorbed to generate high-temperature steam that drives a turbo-generator set to generate electricity. Subsequently, the flue gas is mixed with lime and activated carbon, and enters a deacidification reactor to undergo a neutralization reaction in which the DXN and heavy metals therein are adsorbed. Then, flue gas particles, neutralization reactants and activated carbon are removed in a bag filter. Part of the gas and ash mixture is mixed with water in a mixer and then transported into the deacidification reactor for repeated treatment. Fly ash produced in the deacidification reactor and the bag filter enters a fly ash tank and needs to be transported for further processing. The final gas, which includes soot, CO, NOx, SO2, HCl, HF, Hg, Cd, DXN and so on, is emitted to the atmosphere by an induced draft fan through a stack.
All sub-models of the DXN emission concentration prediction model based on hybrid integration of RF and GBDT (EnRFGBDT) herein are established with fully grown classification and regression trees (CART). The training subset and the input features of each RF-based DXN sub-model are generated by random sampling, where the number of features of the RF-based DXN sub-model is much smaller than that of the initial modeling data; therefore, the correlation between the CARTs is reduced, and the robustness to outliers and noisy data is improved. Multiple GBDT-based DXN sub-models in series further improve the prediction precision of the CART. As a consequence, the DXN emission concentration prediction model in “parallel+series” is established. The different modules operate as follows.
(1) Random sampling with replacement is performed N times on the training sample set $\{X \in \mathbb{R}^{N \times M},\ y \in \mathbb{R}^{N \times 1}\}$, and a fixed number of input features is randomly selected from the training sample set, by the training sample and input feature random sampling module to generate the training subsets $\{X_j, y_j\}_{j=1}^{J}$.
(2) The RF-based DXN sub-models $\{f_{RF}^{j}(\cdot)\}_{j=1}^{J}$ are established by the random forest (RF)-based DXN sub-model establishing module. The predicted value $\{\hat{y}_j\}_{j=1}^{J}$ of the DXN emission concentration is subtracted from the measured value $\{y_j\}_{j=1}^{J}$ of the DXN emission concentration to obtain the prediction error $\{e_{j,0}\}_{j=1}^{J}$.
(3) I iterations are performed on each of the new training subsets $\{X_j, e_{j,0}\}_{j=1}^{J}$ to build I×J GBDT-based DXN sub-models $\{\{f_{GBDT}^{j,i}(\cdot)\}_{i=1}^{I}\}_{j=1}^{J}$ by the GBDT-based DXN sub-model establishing module, where each new training subset is formed by taking the prediction error $\{e_{j,0}\}_{j=1}^{J}$ as the output data true value and keeping the input data $\{X_j\}_{j=1}^{J}$ of the training subset.
(4) The outputs $\{\hat{y}_{RF}^{j}\}_{j=1}^{J}$ of the RF-based DXN sub-models and the GBDT-based DXN sub-models $\{\{f_{GBDT}^{j,i}(\cdot)\}_{i=1}^{I}\}_{j=1}^{J}$ are subjected to simple averaging by the simple average-based DXN integrated prediction module to establish the DXN emission concentration prediction model $f_{DXN}(\cdot)$.
Accordingly, the modeling steps of the method herein are as follows.
(1) A random sampling with replacement and a random selection of a fixed number of input features are performed on the process variable of the MSWI process to generate J training subsets.
(2) J RF-based DXN sub-models $\{f_{RF}^{j}(\cdot)\}_{j=1}^{J}$ are established.
(3) I iterations are performed to build I×J GBDT-based DXN sub-models $\{\{f_{GBDT}^{j,i}(\cdot)\}_{i=1}^{I}\}_{j=1}^{J}$, where the prediction error $\{e_{j,0}\}_{j=1}^{J}$ of the $\{f_{RF}^{j}(\cdot)\}_{j=1}^{J}$ is used as the output data true value.
(4) The RF-based DXN sub-model and the GBDT-based sub-model are subjected to simple averaging to establish the DXN emission concentration prediction model.
The process variable of the MSWI process is processed by a Bootstrap method and a random subspace method (RSM).
The training subset is extracted by the Bootstrap method, where the number of samples in the training subset is the same as the number of samples in the training sample set.
Then the RSM is introduced to randomly select some features to generate J training subsets including N training samples and Mj input features.
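The sampling step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of this application: the function name `make_training_subsets`, the dictionary layout, the fixed seed and the toy data are all hypothetical, and the same number of features is drawn for every subset.

```python
import random

def make_training_subsets(X, y, n_subsets, n_features, seed=0):
    """Draw J = n_subsets training subsets: Bootstrap rows plus random-subspace columns."""
    rng = random.Random(seed)
    n_samples, n_total = len(X), len(X[0])
    subsets = []
    for _ in range(n_subsets):
        # Bootstrap: N row indices drawn with replacement (duplicates are expected).
        rows = [rng.randrange(n_samples) for _ in range(n_samples)]
        # RSM: M_j distinct feature indices drawn without replacement.
        cols = sorted(rng.sample(range(n_total), n_features))
        X_j = [[X[r][c] for c in cols] for r in rows]
        y_j = [y[r] for r in rows]
        subsets.append({"rows": rows, "cols": cols, "X": X_j, "y": y_j})
    return subsets

# Toy data: 6 samples, 4 features (values are illustrative only).
X = [[float(10 * i + k) for k in range(4)] for i in range(6)]
y = [float(i) for i in range(6)]
subsets = make_training_subsets(X, y, n_subsets=3, n_features=2)
```

Each subset keeps the selected column indices so that the same features can be picked from a new input at prediction time.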
Generation of the training subset is expressed as follows:
where {Xj,yj} is a jth training subset; (xj,M
The RF-based DXN sub-model establishing module is operated through the following steps with the jth training subset {(xj,M
A duplicate sample is removed from the jth training subset {(xj,M
A mth input feature xj,m is taken as a splitting variable, and a value xn
The index of the optimal splitting variable and the value of the splitting point are found based on the following criterion by traversing all input features:
where y1j and y2j are the measured values of DXN emission concentration of the jth training subset in the R1 and the R2, respectively; and C1 and C2 are the mean values of the measured DXN emission concentration in the R1 and the R2, respectively.
The above processes are repeated for R1 and R2, respectively, until the number of training samples in a leaf node is less than a preset threshold ∂RF, so that the input feature space is split into K areas. The K areas are marked as R1, . . . , Rk, . . . , RK, respectively, where K indicates the number of leaf nodes of the CART.
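The split search described above can be illustrated with a short sketch. It assumes the standard least-squares CART criterion, in which the two areas are R1 = {x | x_m ≤ s} and R2 = {x | x_m > s} and the loss is the summed squared deviation from the mean in each area. The function name `best_split`, the `min_leaf` parameter and the toy data are hypothetical.

```python
def best_split(X, y, min_leaf=1):
    """Return (feature index, split value, loss) minimising the summed squared error."""
    def sse(values):
        # Squared deviation from the mean (C1 or C2) of one area.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for m in range(len(X[0])):                      # traverse all input features
        for s in sorted({row[m] for row in X}):     # candidate split values
            left = [t for row, t in zip(X, y) if row[m] <= s]
            right = [t for row, t in zip(X, y) if row[m] > s]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            loss = sse(left) + sse(right)
            if best is None or loss < best[2]:
                best = (m, s, loss)
    return best

X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [1.0, 1.0, 5.0, 5.0]
feature, threshold, loss = best_split(X, y)   # feature 0 at 2.0 separates y perfectly
```

Applying the same search recursively to each resulting area, until the leaf-size threshold stops further splitting, would yield the K leaf areas.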
The RF-based DXN sub-model established by the CART is expressed as follows
where NR
is a nR
A prediction error of the RF-based DXN sub-model established based on the jth training subset {(xj,M
where $(e_{j,0})_n$ is the prediction error of the DXN emission concentration for the nth training sample.
The above processes are repeated to obtain J RF-based DXN sub-models $\{f_{RF}^{j}(\cdot)\}_{j=1}^{J}$ established by the CART. The predicted outputs $\{\hat{y}_{RF}^{j}\}_{j=1}^{J}$ of the J RF-based DXN sub-models are subtracted from the corresponding measured values $\{y_j\}_{j=1}^{J}$ to obtain the prediction errors $\{e_{j,0}\}_{j=1}^{J}$.
Multiple weak learner models “in series” are established, where an input data of a training subset of the multiple weak learner models is unchanged. A true value of output data of a training subset of a first GBDT-based DXN sub-model is an error between the predicted output of the RF-based DXN sub-model and the measured value. And a true value of output data of a training subset of other GBDT-based DXN sub-models is a prediction error of the GBDT-based DXN sub-model iterated in a previous iteration.
The establishment of the GBDT-based DXN sub-models for the jth training subset is taken as an example. Suppose that I GBDT-based DXN sub-models are to be established by the CART.
A first GBDT-based DXN sub-model is established:
where $\hat{y}_{GBDT}^{j,1}$ is the predicted output of the first GBDT-based DXN sub-model.
A loss function of the first GBDT-based DXN sub-model is defined as follows:
where $(\hat{y}_{GBDT}^{j,1})_n$ is the predicted value of the nth sample in the jth training subset.
The output residual $e_{j,1}$ of the first GBDT-based DXN sub-model $f_{GBDT}^{j,1}(\cdot)$ is calculated, which is expressed as follows:
The $e_{j,1}$ is taken as the true value of output data of the training subset of the second GBDT-based DXN sub-model $f_{GBDT}^{j,2}(\cdot)$. The second GBDT-based DXN sub-model is expressed as follows:
where $(e_{j,1})_n$ is the prediction error of the first GBDT-based DXN sub-model for the nth sample.
The above processes are repeated. The ith (i≤I) GBDT-based DXN sub-model is marked as $f_{GBDT}^{j,i}(\cdot)$, where the output residual of the ith GBDT-based DXN sub-model is expressed as follows:
After I−1 iterations, a true value of output data of a training subset of a Ith GBDT-based DXN sub-model is expressed as follows:
where $\hat{y}_{GBDT}^{j,I-1}$ is the predicted output of the (I−1)th GBDT-based DXN sub-model.
The Ith GBDT-based DXN sub-model is expressed as follows:
where $(e_{j,I-1})_n$ is the prediction error of the (I−1)th GBDT-based DXN sub-model for the nth sample.
As a consequence, the I GBDT-based DXN sub-models based on the jth training subset are expressed as $\{f_{GBDT}^{j,i}(\cdot)\}_{i=1}^{I}$, and the outputs of the I GBDT-based DXN sub-models are expressed as $\{\hat{y}_{GBDT}^{j,i}\}_{i=1}^{I}$.
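The serial construction described above, in which each sub-model is trained on the residual left by the previous one while the input data stays unchanged, can be sketched as follows. This is an illustration under assumptions rather than the method of this application: a one-split regression stump stands in for a full CART, the mean of y is only a placeholder for the RF sub-model output, and the names `fit_stump` and `fit_residual_chain` are hypothetical.

```python
def fit_stump(X, t):
    """Fit a one-split regression tree (a stand-in for a full CART) to targets t."""
    best = None
    for m in range(len(X[0])):
        for s in sorted({row[m] for row in X}):
            left = [v for row, v in zip(X, t) if row[m] <= s]
            right = [v for row, v in zip(X, t) if row[m] > s]
            if not left or not right:
                continue
            c1, c2 = sum(left) / len(left), sum(right) / len(right)
            loss = sum((v - c1) ** 2 for v in left) + sum((v - c2) ** 2 for v in right)
            if best is None or loss < best[0]:
                best = (loss, m, s, c1, c2)
    if best is None:                      # no valid split: predict the mean
        mean = sum(t) / len(t)
        return lambda row: mean
    _, m, s, c1, c2 = best
    return lambda row: c1 if row[m] <= s else c2

def fit_residual_chain(X, e0, n_iter):
    """Fit n_iter sub-models in series; each one targets the previous residual."""
    models, residual = [], list(e0)
    for _ in range(n_iter):
        model = fit_stump(X, residual)
        models.append(model)
        # New residual: previous residual minus this sub-model's prediction.
        residual = [r - model(row) for row, r in zip(X, residual)]
    return models, residual

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
base = sum(y) / len(y)                    # placeholder for the RF sub-model output
e0 = [v - base for v in y]                # measured value minus predicted value
models, e_final = fit_residual_chain(X, e0, n_iter=3)
sse_before = sum(r ** 2 for r in e0)
sse_after = sum(r ** 2 for r in e_final)
```

By construction, the placeholder output plus the sum of the chained predictions plus the final residual reproduces each measured value, which is why the sub-model outputs for one training subset are summed rather than averaged.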
J RF-based DXN sub-models established in parallel are denoted as $\{f_{RF}^{j}(\cdot)\}_{j=1}^{J}$. J×I GBDT-based DXN sub-models established in series and parallel simultaneously are denoted as $\{\{f_{GBDT}^{j,i}(\cdot)\}_{i=1}^{I}\}_{j=1}^{J}$.
For the jth training subset, one RF-based DXN sub-model and I GBDT-based DXN sub-models are established in series. The sum of the predicted outputs of the one RF-based DXN sub-model and the I GBDT-based DXN sub-models is taken as the total output of the jth training subset, which is expressed as follows:
Since the J training subsets are parallel, the outputs obtained from the J training subsets, each combining one RF-based DXN sub-model with its I GBDT-based DXN sub-models, are combined through the simple average weighting method, where the prediction model $f_{DXN}(\cdot)$ is expressed as follows:
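The relations described in this section can be collected in one place. The following is a reconstruction from the surrounding definitions and not a reproduction of the original typeset equations; in particular, the symbol $\hat{y}^{j}$ for the total output of the jth training subset is introduced here only for illustration.

```latex
\begin{aligned}
e_{j,0} &= y_j - \hat{y}_{RF}^{j}, \\
e_{j,i} &= e_{j,i-1} - \hat{y}_{GBDT}^{j,i}, \qquad i = 1, \dots, I, \\
\hat{y}^{j} &= f_{RF}^{j}(\cdot) + \sum_{i=1}^{I} f_{GBDT}^{j,i}(\cdot), \\
f_{DXN}(\cdot) &= \frac{1}{J} \sum_{j=1}^{J} \Big( f_{RF}^{j}(\cdot) + \sum_{i=1}^{I} f_{GBDT}^{j,i}(\cdot) \Big).
\end{aligned}
```

As a quick count, with J = 5 and I = 5 the model would contain 5 RF-based sub-models and J×I = 25 GBDT-based sub-models.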
The DXN emission concentration prediction model is established based on the training sample and input feature random sampling module, the RF-based DXN sub-model establishing module, the GBDT-based DXN sub-model establishing module and the simple average-based DXN integrated prediction module.
The process variables of the MSWI process, including furnace temperature, activated carbon injection amount, stack emission gas concentration, grate speed, primary air flow and secondary air flow, are taken as the input $\{x \mid x_1, \dots, x_m, \dots, x_M\}$ of the DXN emission concentration prediction model. The input is processed successively by the RF-based DXN sub-model establishing module, the GBDT-based DXN sub-model establishing module and the simple average-based DXN integrated prediction module, and the resulting output is taken as the predicted value of the current DXN emission concentration of the MSWI process.
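The prediction step described above can be sketched as follows, assuming that each of the J groups stores the feature indices chosen for it together with its RF-based sub-model and its chain of GBDT-based sub-models. The function name `predict_dxn`, the dictionary keys and the toy lambdas are hypothetical stand-ins for trained sub-models.

```python
def predict_dxn(x, ensemble):
    """Average, over the J groups, the RF output plus the sum of its GBDT corrections."""
    total = 0.0
    for group in ensemble:
        x_j = [x[c] for c in group["cols"]]   # keep only this group's input features
        total += group["rf"](x_j) + sum(g(x_j) for g in group["gbdt"])
    return total / len(ensemble)

# Two toy groups (J = 2) with hand-written stand-ins for trained sub-models.
ensemble = [
    {"cols": [0, 2], "rf": lambda v: v[0] + v[1], "gbdt": [lambda v: 0.5, lambda v: -0.25]},
    {"cols": [1], "rf": lambda v: 2.0 * v[0], "gbdt": [lambda v: 0.25]},
]
prediction = predict_dxn([1.0, 2.0, 3.0], ensemble)
```

Within a group the outputs are summed because the GBDT-based sub-models correct the residual of the RF-based sub-model; across groups they are averaged because the groups are trained in parallel on different subsets.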
The modeling data herein are the inspection data of incinerator 1# and incinerator 2# of an MSWI power plant in Beijing over the past 6 years, including the process variables as the input data and the measured values of the DXN emission concentration as the output data. The process variable is obtained from 53 power generation systems, 115 public electrical systems, 14 waste heat boiler systems, 79 incineration systems, 20 flue gas treatment systems and 6 terminal detection systems. The DXN emission concentration is obtained by online collection and offline analysis, and the unit of the DXN emission concentration is ng/Nm³. ⅔ of the 67 samples (45 samples) are used as training data and ⅓ (22 samples) are used as testing data.
The squared error is taken as the loss function in both the RF method and the GBDT method. The number of training samples is 45. The candidate numbers of input features are [10,20,30,40,50,60,70,80,90,100]. The candidate numbers of GBDT iterations are [1,2,3,4,5,6,7,8,9]. The minimum number of training samples included in a leaf node of the CART is 3. The out-of-bag (OOB) data left out by the Bootstrap sampling are used for model testing, with the root-mean-square error (RMSE) as the evaluation index.
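The evaluation set-up described above can be illustrated with a short sketch: samples never drawn by the Bootstrap form the out-of-bag set, and the RMSE is computed on them. The function names, the fixed seed, the synthetic measured values and the placeholder model that predicts the in-bag mean are all assumptions made for illustration.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def bootstrap_with_oob(n_samples, rng):
    """Return (in-bag indices drawn with replacement, out-of-bag indices)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    oob = [i for i in range(n_samples) if i not in drawn]
    return in_bag, oob

rng = random.Random(0)
in_bag, oob = bootstrap_with_oob(45, rng)     # 45 training samples, as in the text

# Synthetic measured values and a placeholder model that predicts the in-bag mean.
y = [0.01 * (i % 7) for i in range(45)]
mean_in_bag = sum(y[i] for i in in_bag) / len(in_bag)
oob_error = rmse([y[i] for i in oob], [mean_in_bag] * len(oob)) if oob else float("nan")
```

Repeating this over many random draws and averaging the OOB error is one way to obtain results that are averaged over repeated experiments.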
For the DXN emission concentration prediction model based on RF, the relationship between the number of input features and the OOB error is shown in Table 1, where the number of CARTs is 5 and each result is the average of 50 experiments.
As shown in Table 1, the OOB error is minimum in the case of 10 input features. The relationship between the number of CARTs in the DXN emission concentration prediction model based on RF and the OOB error is shown in Table 2, where the number of input features is unchanged. Each experimental result is the average of 50 experiments.
As shown in Table 2, the OOB error of the DXN emission concentration prediction model based on RF is minimum in the case of 40 CARTs, and this minimum is slightly higher than the minimum in Table 1. Therefore, both the number of CARTs and the number of input features need to be optimized to obtain a better prediction performance.
For the DXN emission concentration prediction model based on GBDT, the relationship between the squared-error loss function and the number of iterations is shown in Table 3.
As shown in Table 3, the value of the loss function gradually decreases as the number of iterations increases. The decrease of the squared error slows down once the number of iterations reaches 5. Accordingly, an appropriate number of iterations should be chosen to limit the computational cost.
Therefore, the preferable parameters herein are as follows: the number of input features is 10; the number of CARTs is 5; and the number of GBDT-based DXN sub-models (number of iterations) is 5. Statistical results on the training set and the testing set for the different methods are shown in Table 4.
As shown in Table 4,
To address the technical problem of detecting the DXN emission concentration in real time, a DXN emission concentration prediction model based on hybrid integration of RF and GBDT is provided, which takes the process variables of the MSWI process as input. The novelty of the prediction model herein is that the first DXN sub-model is established based on RF and the other multiple DXN sub-models are established based on GBDT, so that the dimension and the prediction error of the prediction model are reduced simultaneously. The results of simulation experiments based on real data of the MSWI process indicate that the prediction method herein has an outstanding prediction performance compared to prediction models based merely on RF or GBDT.
Number | Date | Country | Kind |
---|---|---|---|
202010083784.4 | Feb 2020 | CN | national |
This application is a continuation of International Patent Application No. PCT/CN2020/080528, filed on Mar. 21, 2020, which claims the benefit of priority from Chinese Patent Application No. 202010083784.4, filed on Feb. 10, 2020. The content of the aforementioned applications, including any intervening amendments thereto, is incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2020/080528 | Mar 2020 | US |
Child | 17544213 | US |