Monitoring of MSW Incinerator Leachate Using Electronic Nose Combined with Manifold Learning and Ensemble Methods

Abstract: Waste incineration is regarded as an ideal method for municipal solid waste (MSW) disposal, with the advantages of waste-to-energy conversion, lower secondary pollution, and greenhouse gas emission mitigation. For incineration leachate, the information in the headspace gas, which varies across treatment stages and might be useful for chemical analysis, is usually ignored. This study applied a novel electronic nose (EN) to mine the information from leachate headspace gas. Manifold learning methods (principal component analysis (PCA), isometric feature mapping (ISOMAP), and uniform manifold approximation and projection (UMAP)) were combined with ensemble techniques (light gradient boosting machine (lightGBM) and extreme gradient boosting (XGBT)). The EN based on the UMAP-XGBT model had the best classification performance, with a 99.95% accuracy rate in the training set and a 95.83% accuracy rate in the testing set. The UMAP-XGBT model also showed the best prediction ability for leachate chemical parameters (pH, chemical oxygen demand, biochemical oxygen demand, ammonia, and total phosphorus), with R² higher than 0.99 in both the training and testing sets. This is the first study of the EN application for leachate monitoring, offering an easier and quicker detection method than traditional instrumental measurements for the enforcement and implementation of effective monitoring programs.


Introduction
The world generates 2.01 billion tons of municipal solid waste (MSW) annually, and the waste generated per person per day averages 0.74 kg. Looking forward, global waste is expected to grow to 3.40 billion tons by 2050 [1]. Because its physicochemical and biological characteristics are aggressive to soil, water resources, fauna, and flora, MSW is difficult to handle for most countries and regions [2]. To date, the main disposal methods for MSW are landfill and incineration. MSW landfill causes several environmental issues, including: (1) high greenhouse gas (GHG) emissions if landfill gas is not properly collected, (2) leachate that damages the ecosystem, and (3) the large space needed to set up the project [3]. Therefore, waste incineration is regarded as an ideal method for MSW disposal [4], with the advantages of waste-to-energy conversion, lower secondary pollution, greenhouse gas emission mitigation, and so on.
However, MSW incineration still raises a considerable number of challenges at different points, including but not limited to leachate processing. For incineration leachate, research has mainly focused on the characteristics of leachate concentrate [5], the molecular transformation of organic matter in leachate [6], and the degradation of refractory organics [7]. All these studies relate, directly or indirectly, to the leachate headspace gas, which hints that the information in the leachate headspace gas can be mined for leachate processing or monitoring. Until now, few in-depth studies have been conducted to extract information from the vast amounts of original data about the varieties, concentrations, and changes of those materials.

Sample Collection
The incineration leachate samples were collected from Wenling Green New Energy Co., Ltd. (Wenling, China), which was invested in by Zheneng Jinjiang Environment Holding Co., Ltd. (Hangzhou, China), a forerunner and leading operator in China's waste-to-energy (WTE) industry. The Wenling incineration power generation plant is located in the northern part of the Eastern New District of Wenling City, next to the East China Sea, with a total area of 7.3 × 10⁵ m². The leachate treatment scale of the incineration plant is 1600 tons/day in two phases, and for now, the treatment scale of the first phase is 800 tons/day (600 tons/day of domestic waste and 200 tons/day of dry sludge). The leachate treatment process of the incineration power plant is shown in Figure 1.
Leachate samples from six water outlets were collected on 15 July 2022. In Figure 2, the samples from the six water outlets were named leachate raw water (LRW), leachate effluent (LE), internal circulation reactor effluent (ICRE), aerobic effluent (AeroE), anaerobic effluent (ANE), and MBR effluent (MBRE). The samples were preserved in a refrigerator at a temperature lower than 4 °C and were forwarded to the laboratory.

Chemical Parameters Detection for Incinerator Leachate
In general, incinerator leachate is tested by conventional parameters, including pH, chemical oxygen demand (COD), biochemical oxygen demand after 5 days (BOD5), ammonia (NH4+-N), and total phosphorus (TP). The values of those conventional parameters differ considerably due to variations in composition and moisture content, as well as seasonal factors and incinerator location. The pH was measured by the electrode method [15]. For COD detection, the dichromate method is not suitable for water samples in which the chloride ion concentration is higher than 1000 mg/L, so the chlorine emendation method was applied to determine the COD of the incinerator leachate samples [16]. The concentration of ammonia nitrogen was measured by Nessler's reagent spectrophotometry [17]. The alkaline potassium persulfate digestion UV spectrophotometric method [18] was used to determine the total nitrogen (TN) content. The ammonium molybdate spectrophotometric method [19] was applied to determine the TP content.

E-Nose Detection
The EN mainly consists of two parts: the sensor array, which senses the information in the sample's headspace, and the software, which handles the information received from the sensors. In terms of sensing materials, metal-oxide-semiconductor (MOS), quartz crystal microbalance, and surface acoustic wave sensors are the most widely applied in EN systems [20]. MOS gas sensors are most sensitive to hydrogen and to unsaturated hydrocarbons or solvent vapors containing hydrogen atoms [21]. The headspace gas from incinerator leachate is mainly composed of volatile organic compounds (VOCs), including hydrogen sulphide, methyl mercaptan, acetylene, propylene, and ethylene, and also varies according to source, season, incinerator site, and so on [22].
A commercial PEN2 electronic nose (Airsense Analytics GmbH, Schwerin, Germany) was applied to detect the headspace gas from incinerator leachate samples at different processing stages. MOS sensors are the core part of the PEN2, and their details are presented in Table 1. The MOS sensors convert the information about gas types and concentrations into an electrochemical signal. The EN signal was expressed as G/G0, where G and G0 represent the resistance of a sensor in sample headspace gas and in clean air, respectively. As the sensors are cross-sensitive to a class of gas compounds, the EN does not give specific information about one material but offers complementary information about the headspace gas. During the EN detection, 5 mL of incinerator leachate was placed into a 500 mL beaker. The beaker was sealed with plastic wrap and kept still for 30 min to equilibrate the headspace gas generated from the incinerator leachate. Two holes were made in the wrap, one for EN detection and the other to maintain a steady stream of gas during detection. The EN detection time was set to 80 s, after which the gas path and sensor chamber were cleaned with clean air. The gas flow rate was 200 mL/min, and one signal per second was collected. Leachate was collected from six water outlets with 24 samples each; thus, 144 samples (24 samples × 6 procedures) were prepared. All the detection was accomplished on the sample collection day.
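The G/G0 normalization and the use of a single reading per sensor can be illustrated with a minimal sketch. The function names, the toy traces, and the baseline values below are hypothetical and only assume one reading per second over the 80 s detection window, as described above; this is not the instrument's own software.

```python
def response_ratio(g_sample, g_clean_air):
    """Return the normalized sensor response G/G0 for one reading."""
    if g_clean_air == 0:
        raise ValueError("baseline G0 must be non-zero")
    return g_sample / g_clean_air


def feature_vector(traces, baselines, second=80):
    """Take the reading at `second` (1-based) from each sensor trace and normalize it.

    traces:    one list of raw readings per sensor, sampled once per second
    baselines: one clean-air reading G0 per sensor
    """
    return [
        response_ratio(trace[second - 1], g0)
        for trace, g0 in zip(traces, baselines)
    ]


# Hypothetical toy data for two sensors (not measured values).
traces = [[1.0 + 0.01 * t for t in range(80)], [2.0] * 80]
baselines = [1.0, 4.0]
features = feature_vector(traces, baselines)
```

With ten sensors, the same call would return a ten-element feature vector per sample.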

Principal Component Analysis
As a multivariate technique, principal component analysis (PCA) is applied to analyze a data set consisting of several inter-correlated quantitative dependent variables [23]. By calculating the eigenvalues and eigenvectors of the covariance matrix of the original data set, new orthogonal variables are derived, usually called principal components (PCs). When the cumulative contribution rate of the retained PCs exceeds 85% of the total variance, the PCA is considered to have extracted the main information of the original data.
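The 85% rule can be sketched as a small helper that takes the eigenvalues of the covariance matrix and returns how many PCs are needed. The eigenvalues below are hypothetical, and the helper name is an assumption for illustration only.

```python
def n_components_for(eigenvalues, threshold=0.85):
    """Return (k, rate): the smallest number of leading components whose
    cumulative contribution rate reaches `threshold`, and that rate."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k, cumulative / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of a covariance matrix (not from the study).
eigenvalues = [5.0, 2.5, 1.5, 0.6, 0.4]
k, rate = n_components_for(eigenvalues)
```

Each eigenvalue divided by the sum of all eigenvalues is the contribution of the corresponding PC, so the cumulative rate is simply a running sum of these shares.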

Isometric Feature Mapping
As a nonlinear dimensionality reduction technique, isometric feature mapping (ISOMAP) maintains the essential geometric structure of nonlinear data [24]. ISOMAP combines multidimensional scaling with geodesic distance to reduce the dimensionality of data sampled from a smooth manifold. ISOMAP solves a shortest-path problem to obtain geodesic distances that preserve the characteristics of the high-dimensional data structure as much as possible. Multidimensional scaling is then used to calculate the coordinates of each data point in the low-dimensional space, so that the original high-dimensional data are embedded in the low-dimensional set.
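The geodesic-distance step can be sketched in plain Python: build a nearest-neighbour graph and run a shortest-path search over it. The neighbour count, the Floyd-Warshall search, and the toy points are assumptions chosen for brevity; the multidimensional scaling step that follows is not shown.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def geodesic_distances(points, n_neighbors=2):
    """Approximate geodesic distances along a nearest-neighbour graph."""
    n = len(points)
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    # Connect every point to its nearest neighbours (edges are symmetric).
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: euclidean(points[i], points[j]),
        )
        for j in ranked[:n_neighbors]:
            d = euclidean(points[i], points[j])
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: shortest path between every pair of points.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical points lying along an L-shaped curve.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
geo = geodesic_distances(points)
straight = euclidean(points[0], points[4])
```

For the two end points, the path along the curve is longer than the straight line between them, which is the structure ISOMAP tries to preserve.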

Uniform Manifold Approximation and Projection
As a nonlinear dimensionality reduction technique, uniform manifold approximation and projection (UMAP) was developed for the analysis of any type of high-dimensional data [25]. From a theoretical framework based in Riemannian geometry and algebraic topology, UMAP learns the data representation between points in high-dimensional space and maps to low dimensions by calculating the joint probability density between high-dimensional sample points. Spectral clustering analysis is used to initialize the low-dimensional data and then project it into the low-dimensional space. Adjustable parameters are used in the joint probability density to control the change of conditional probability and ensure the symmetry of the data. The low-dimensional representation also provides two parameters to adjust the aggregation of mapped data so that the low-dimensional data can better fit the high-dimensional spatial data.
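For orientation, the quantities described above are commonly written as follows in the UMAP literature. The symbols are assumptions taken from that formulation rather than from this text: d is the chosen distance, ρ_i is the distance from x_i to its nearest neighbour, σ_i is a per-point scale, y_i is the low-dimensional image of x_i, and a and b are the two parameters that control how tightly mapped points aggregate.

```latex
\begin{aligned}
p_{j\mid i} &= \exp\!\left(-\frac{\max\bigl(0,\; d(x_i, x_j) - \rho_i\bigr)}{\sigma_i}\right), \\
p_{ij} &= p_{j\mid i} + p_{i\mid j} - p_{j\mid i}\,p_{i\mid j}, \\
q_{ij} &= \left(1 + a\,\lVert y_i - y_j\rVert^{2b}\right)^{-1}.
\end{aligned}
```

The second line is the symmetrization step, and the embedding is obtained by making the low-dimensional similarities q_ij agree with the high-dimensional similarities p_ij.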

Classification and Regression Tree
Classification and Regression Tree (CART) selects features by minimizing the Gini coefficient to generate a binary tree. Pre-pruning based on empirical judgment removes useless attributes. After the tree is constructed, cutting off the branches that carry a small proportion of the information helps the algorithm resist overfitting and gives it better generalization ability.
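A Gini-based binary split on a single feature can be sketched as below. The helper names and the toy sensor values are hypothetical; a full CART implementation would repeat this search over all features and recurse on both branches.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted Gini) of the best split `value <= threshold`."""
    best = (None, float("inf"))
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical responses of one sensor for two sample classes.
values = [0.9, 1.0, 1.1, 2.0, 2.1, 2.2]
labels = ["LRW", "LRW", "LRW", "LE", "LE", "LE"]
threshold, impurity = best_threshold(values, labels)
```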
In addition, the shortcomings of CART (difficulty handling large amounts of data, underfitting, and overfitting) can be overcome by integrating multiple CART classifiers into a single ensemble model using the ideas of bagging and boosting. This study selects boosting algorithms, which have relatively stable generalization performance. Boosting is a greedy optimization strategy: for a fixed loss function (the objective function), each step is chosen to reduce the loss as much as possible, with the aim of reaching the minimum loss. Examples include eXtreme Gradient Boosting (XGBT) and Light Gradient Boosting Machine (lightGBM).

eXtreme Gradient Boosting
eXtreme Gradient Boosting (XGBT) uses both the first and second partial derivatives of the loss function; the second derivatives help the gradient descend faster and more accurately. By using a Taylor expansion of the loss up to the second-order term, the leaf-splitting optimization can be carried out relying only on values computed from the input data, without fixing the specific form of the loss function, which essentially separates the choice of loss function from the optimization of the model algorithm and parameter selection [26]. The algorithm proceeds as follows: (1) initialize the model with a constant value; (2) for m = 1 to M, (a) compute the so-called pseudo-residuals, (b) fit a weak learner (CART) to the pseudo-residuals, and (c) compute its multiplier by solving a one-dimensional optimization problem, then update the model; (3) output the final model.
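The role of the first and second derivatives can be sketched for a squared-error loss, where the gradient is (prediction − target) and the second derivative is 1. The formulas follow the usual second-order formulation of gradient boosting: the optimal leaf value is −G/(H + λ) and a split is scored by the change in G²/(H + λ). The regularization symbols λ and γ, the function names, and the toy data are assumptions for illustration, not the study's settings.

```python
def grad_hess_squared_error(y_true, y_pred):
    """First and second derivatives of 0.5 * (pred - y)^2 with respect to pred."""
    grads = [p - y for y, p in zip(y_true, y_pred)]
    hess = [1.0] * len(y_true)
    return grads, hess


def leaf_weight(grads, hess, lam=1.0):
    """Optimal leaf value under the second-order approximation."""
    return -sum(grads) / (sum(hess) + lam)


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Loss reduction obtained by splitting one leaf into two."""
    def score(g, h):
        return sum(g) ** 2 / (sum(h) + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


# Hypothetical targets and a constant initial prediction.
y_true = [1.0, 1.0, 3.0, 3.0]
y_pred = [2.0, 2.0, 2.0, 2.0]
grads, hess = grad_hess_squared_error(y_true, y_pred)
gain = split_gain(grads[:2], hess[:2], grads[2:], hess[2:], lam=0.0)
w_left = leaf_weight(grads[:2], hess[:2], lam=0.0)
w_right = leaf_weight(grads[2:], hess[2:], lam=0.0)
```

Only the gradient and second-derivative sums enter the calculation, so a different loss function changes `grad_hess_squared_error` but not the split search.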

Light Gradient Boosting Machine
Light Gradient Boosting Machine (LightGBM) grows trees leaf-wise instead of level-wise, splitting the leaf that yields the largest loss decrease. LightGBM implements a highly optimized histogram-based decision tree learning algorithm, which greatly improves efficiency and reduces memory consumption [27]. The sampling procedure goes as follows: (1) sort the sample points in descending order of the absolute value of their gradients; (2) select the first a × 100% of the sorted samples to form a subset of large-gradient sample points; (3) from the remaining (1 − a) × 100% of the samples, randomly select b(1 − a) × 100% sample points to form a set of small-gradient sample points; (4) merge the large-gradient samples with the sampled small-gradient samples; (5) multiply the small-gradient samples by a weight coefficient; (6) learn a new weak learner (CART) from the sampled data; (7) repeat steps (1) to (6) until the specified number of iterations or convergence is reached.
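Steps (1) to (5) can be sketched as a sampling helper. The weight 1/b used here is an assumption, chosen so that the sampled small-gradient points stand in for the whole remainder; the original LightGBM formulation instead samples b × 100% of all points and uses the factor (1 − a)/b. The function name, the values of a and b, and the toy gradients are hypothetical.

```python
import random


def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return (indices, weights) after gradient-based one-side sampling."""
    n = len(gradients)
    # (1) sort by absolute gradient, largest first
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    # (2) keep the top a * 100% of the samples
    top_n = round(a * n)
    large = order[:top_n]
    # (3) randomly draw a fraction b of the remaining samples
    rest = order[top_n:]
    rng = random.Random(seed)
    small = rng.sample(rest, round(b * len(rest)))
    # (4) merge, (5) up-weight the small-gradient samples (assumed factor 1 / b)
    indices = large + small
    weights = [1.0] * len(large) + [1.0 / b] * len(small)
    return indices, weights


# Hypothetical gradients for 100 training samples.
gradients = [(-1) ** i * i / 100 for i in range(100)]
indices, weights = goss_sample(gradients)
```

The returned indices and weights would then be passed to the weak learner in step (6).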

The Evaluation of Data Processing
To evaluate the accuracy and precision of the established models, 100 samples were set as the training data, and the rest, 44 samples, were set as the testing data.
The receiver operating characteristic (ROC) curve was used as a performance indicator for the classification models. The true positive rate and the true negative rate are the most commonly used measures for evaluating the performance of classification tests. The higher these two rates are, the better the judgment of the model [28].
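For a multi-class problem, the two rates can be computed per class in a one-vs-rest fashion, as in the minimal sketch below. The one-vs-rest treatment, the helper name, and the toy predictions are assumptions for illustration.

```python
def one_vs_rest_rates(y_true, y_pred, positive):
    """Return (true positive rate, true negative rate) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr


# Hypothetical labels: one LE sample is predicted as ICRE.
y_true = ["LRW", "LRW", "LE", "LE", "ICRE", "ICRE"]
y_pred = ["LRW", "LRW", "LE", "ICRE", "ICRE", "ICRE"]
tpr_le, tnr_le = one_vs_rest_rates(y_true, y_pred, "LE")
tpr_icre, tnr_icre = one_vs_rest_rates(y_true, y_pred, "ICRE")
```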
The coefficient of determination (R²) and the root mean square error (RMSE) were selected as the evaluation parameters for the prediction models. The higher the R² and the lower the RMSE, the more accurate the prediction of the model.
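Both measures follow directly from their standard definitions; a minimal sketch with hypothetical values is given below.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the target."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical measured and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
r2 = r_squared(y_true, y_pred)
error = rmse(y_true, y_pred)
```

Because RMSE carries the unit of the target, it is only comparable between models that predict the same parameter.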

The Chemical Parameter Changes of Leachate
The composition of leachate is highly variable and heterogeneous. In general, incinerator leachate is tested using conventional parameters, including pH, COD, BOD5, ammonia, and TP. The changes in the chemical parameters of the leachate samples are shown in Table 2. There were statistically significant differences (Tukey HSD, p < 0.05) in the contents of COD, BOD5, ammonia, TN, and TP. It was noteworthy that the values of COD decreased significantly at each processing stage. The changes were also noticeable for BOD5, ammonia, TN, and TP. Unlike the other five chemical parameters, the value of pH changed considerably at the last processing stage. All the chemical parameters were up to standard when the treated leachate was discharged to the municipal pipe network.

The Result of EN Detection
The EN was used to analyze the headspace gas of the leachate samples at different processing stages. A typical response of the EN sensor array during exposure to sample gas, randomly selected from the 144 samples, is depicted in Figure 3a. Drawing the sample gas from the beaker into the sensing chamber took 5 s, after which the sensors could react with the gas. The sensor signals changed significantly from 5 to 35 s and then reached a dynamic equilibrium. The various signal values (maximum or minimum), the shifts, the response areas, and so on indicated that the sensors offered unique and abundant characteristics of the headspace gas of the leachate samples. To simplify data processing, the sensor signals at the 80th second were selected as the input data of the analysis models. To fully understand the sensor signals, the average values of the 10 sensors were calculated and are shown in Figure 3b. The overall signals (at the 80th second) varied considerably over the first three processing stages (LRW, LE, and ICRE), whereas for the ANE, AeroE, and MBRE samples the signals changed less remarkably. To further analyze the behavior of the sensors, a radar fingerprint chart of the EN signals is shown in Figure 3c: S2, the most sensitive sensor, showed the biggest variance, and S10 stayed almost still. Leachate headspace gas affected the responses of S8, S6, S9, and S5 to different degrees, while its effects on S7 and S4 were not so obvious.

Data Reduction Based on Manifold Learning
Data reduction helps transform an abundant and disordered original data set into a simplified and ordered form. PCA is a popular technique for dimensionality reduction and is flexible, fast, and easily interpretable. However, PCA does not perform well when there are nonlinear relationships within the data, and for high-dimensional data it is difficult to determine whether the EN data are linear or nonlinear. ISOMAP and UMAP, two manifold learning methods, were therefore applied and compared with PCA.
The best description of differences in the original data can be found by calculating the eigen-decomposition of positive semi-definite matrices and the singular value decomposition of rectangular matrices. The PCs are ranked according to their contribution (eigenvalue). In Figure 4a, the contributions of the PCs are displayed, and the first three PCs extracted most of the information (more than 85%) from the original EN data at the 80th second. The distribution of the 144 samples based on the first three PCs (PC1, PC2, and PC3) is shown in Figure 4b. LRW, LE, and ICRE can be easily classified, but ANE, AeroE, and MBRE overlap in the three-dimensional space. The result is similar in some ways to that shown in Figure 3b.
ISOMAP constructs the geodesic distance graph from the original EN data and uses the eigenvalue decomposition of MDS on the geodesic distance matrix to obtain low-dimensional embeddings. The ICs are ranked by contribution and displayed in Figure 5a. The first three ICs also extracted most of the information (more than 85%) from the original EN data at the 80th second, which is very similar to Figure 4a. This is because both PCA and ISOMAP relied on eigen-decomposition and eigenvalues in this study, although there were minuscule differences in the data. The sample distributions based on the first three ICs are shown in Figure 5b. As in Figure 4b, LRW, LE, and ICRE can be easily classified, but ANE, AeroE, and MBRE overlapped.
UMAP preserves the local and global data structures and offers short run times based on Riemannian geometry and algebraic topology. Calculating the distance between embedding spaces is an approximate measure of how sensitive the topology of the canonical embedding space is, which is taken as the feature importance. Figure 6a provides a careful look at the feature importance of the 10 UCs, which is quite different from Figures 4a and 5a. The first three UCs clearly did not extract more than 85% of the information from the original EN data, but this does not mean that UMAP would do badly in the later classification and prediction. In Figure 6b, most of the samples seem to be clustered narrowly. LRW, LE, and ICRE are classified clearly and can be easily distinguished in the three-dimensional space. ANE and AeroE overlap, along with two MBRE samples. In general, in three-dimensional space, UMAP outperformed PCA and ISOMAP (Figures 4b and 5b).

Classification Result Based on LightGBM
ROC graphs are used to organize classifiers and visualize their results. As can be seen from the ROC curves of the lightGBM models in Figure 7, the classification accuracy on the original, PCA, ISOMAP, and UMAP data differs considerably in the training set. Among the lightGBM models, PCA best retains the majority of the information of the original data set according to the ROC curve, and the overall AUC reaches 100%. From the ROC results, the performance of UMAP was better than that of ISOMAP, with partial errors only for class 6. In the ISOMAP-lightGBM model, samples from classes 4 and 6 were misclassified. Class S1 refers to LRW samples, class S2 refers to LE, class S3 refers to ICRE, class S4 refers to AeroE, class S5 refers to ANE, and class S6 refers to MBRE.
To further explore the models based on different data sets, the testing data sets were applied to verify the classification results. Moreover, to reduce run-to-run volatility, each model was run 20 times, and the average results are displayed in Table 3. The best classification performance is the UMAP-XGBT model, with a 99.95% accuracy rate in the training set and a 97.36% accuracy rate in the testing set, indicating that UMAP-XGBT is the most robust. For PCA-lightGBM, the classification results were worse than those of the original-lightGBM model in the testing set. ISOMAP-lightGBM has a 100% average accuracy rate in the training set and a 96.81% average accuracy rate in the testing set.
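The idea of averaging accuracy over 20 runs can be sketched as follows. A nearest-centroid classifier stands in for the boosted models, the 100/44 split is redrawn at random in every run, and the data are synthetic; all three choices are assumptions made to keep the example self-contained and are not the study's procedure.

```python
import random


def fit_centroids(X, y):
    """Mean feature vector per class (stand-in for a trained model)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, v in enumerate(row):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def predict(centroids, row):
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )


def accuracy(centroids, X, y):
    hits = sum(1 for row, label in zip(X, y) if predict(centroids, row) == label)
    return hits / len(y)


def repeated_accuracy(X, y, n_train=100, n_runs=20, seed=0):
    """Average training and testing accuracy over repeated random splits."""
    rng = random.Random(seed)
    train_scores, test_scores = [], []
    for _ in range(n_runs):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        X_te, y_te = [X[i] for i in test], [y[i] for i in test]
        model = fit_centroids(X_tr, y_tr)
        train_scores.append(accuracy(model, X_tr, y_tr))
        test_scores.append(accuracy(model, X_te, y_te))
    return sum(train_scores) / n_runs, sum(test_scores) / n_runs


# Synthetic data: 6 well-separated classes with 24 samples each (144 in total).
data_rng = random.Random(1)
X, y = [], []
for cls in range(6):
    for _ in range(24):
        X.append([cls * 10 + data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)])
        y.append(cls)
mean_train, mean_test = repeated_accuracy(X, y)
```

Reporting the mean over runs, as in Table 3, dampens the effect of any single favourable or unfavourable split.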

Classification Result Based on XGBT
According to the ROC results in Figure 8, the best-performing classification model is PCA-XGBT, with only two misclassified cases. In the UMAP-XGBT model, samples of LE and AeroE were misclassified, but the overall performance was better than that of the original-XGBT model. The ISOMAP-XGBT model, which had the worst classification performance in the training set, also misclassified samples of LE and AeroE.
For the XGBT models, training was repeated 20 times to reduce instability, and the results are shown in Table 4. The accuracy rates of the XGBT models were lower than those of the lightGBM-based models. From Table 4, it can be concluded that the UMAP-XGBT model had the best classification performance, with a 99.95% accuracy rate in the training set and a 95.83% accuracy rate in the testing set.
From Tables 3 and 4, the original-XGBT, PCA-XGBT, and ISOMAP-XGBT models always reached a 100% accuracy rate in the training set over the 20 runs, while they fell short in the testing set. This might be because the models were overfitted during modeling, so the results in the testing set were not very good.

Prediction Results Based on LightGBM
As an ensemble learning program, lightGBM aims to build a comprehensive model by parallelizing and serializing weak learners (CART). For lightGBM models, a histogram-based algorithm and tree leaf-wise growth strategy with a maximum depth limit are adopted to increase the training speed. For lightGBM, the max-features was set 4, and the tree leaf-wise was set 3 to simplify the lightGBM model in preliminary work. Then, the number of decision trees was the most important parameter in the later modeling. The number of CARTs was optimized, and the optimization procedure is carried out in Figure 9 (taking COD prediction as an example) according to the R 2 s and EMSEs in the training set and testing set. The lightGBM was 20 times for each number of CARTs to minimize the contingency. As shown in Figure 9, the results in the training set have stable precision with the increasing decision tree numbers. While the result precision in the testing set was not stable regardless of which data set was applied. All in all, the lightGBM model with the UMAP data set had a slight advantage (not so obvious) when compared to the other three data sets. When the number of CARTs was 25, the lightGBM model showed a satisfactory result. The best number of CARTs for the lightGBM model was decided. As an ensemble learning program, lightGBM aims to build a comprehensive model by parallelizing and serializing weak learners (CART). For lightGBM models, a histogram-based algorithm and tree leaf-wise growth strategy with a maximum depth limit are adopted to increase the training speed. For lightGBM, the max-features was set 4, and the tree leaf-wise was set 3 to simplify the lightGBM model in preliminary work. Then, the number of decision trees was the most important parameter in the later modeling. The number of CARTs was optimized, and the optimization procedure is carried out in Figure  9 (taking COD prediction as an example) according to the R 2 s and EMSEs in the training set and testing set. 
EN signals captured the overall information on leachate headspace gas, which mainly consisted of volatile organic compounds (VOCs), including hydrogen sulphide, methyl mercaptan, acetylene, and so on. The materials in the headspace gas were most closely correlated with the value of COD. The EN combined with the data mining method could predict the COD content of leachate samples. Table 5 summarizes the prediction results based on the different data reductions for the chemical parameters (pH, COD, BOD5, AN, TN, and TP). According to the R²s and RMSEs in the training set and testing set, the PCA process had no effect on the lightGBM models compared with models based on the original data. LightGBM models based on ISOMAP and UMAP showed satisfactory outcomes. The UMAP-lightGBM model showed the best prediction performance for these chemical parameters, with an R² higher than 0.98 in both the training set and testing set (RMSEs are not comparable across different parameters and units).
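For reference, the two figures of merit used above can be written out as follows. These are the standard textbook definitions, given here as an assumption because this section does not spell them out; the symbols are introduced only for illustration: y_i is the measured value of a chemical parameter for sample i, ŷ_i the model prediction, ȳ the mean of the measured values, and n the number of samples in the set.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2},
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```

Under these definitions, RMSE carries the units of the predicted quantity (pH units versus a concentration for COD, for example), so it can only be compared within one parameter, whereas R² is dimensionless and can be compared across parameters.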

Prediction Results Based on XGBT
As with lightGBM, the parameters of the XGBT models, including the number of trees, maximal depth, and minimum rows, were optimized. Finally, max-features was set to 4, max-depth was set to 2, and min-split was set to 2 to simplify the XGBT model. The number of decision trees (CART) was the most important parameter in the later modeling. The optimization procedure is shown in Figure 10 (taking COD prediction as an example) according to the R²s and RMSEs in the training set and testing set, with 20 runs for each modeling step to reduce the influence of chance. As shown in Figure 10, the overall prediction performance of the XGBT models was much better than that of the lightGBM models in Figure 9. The XGBT models were very robust and stable, particularly the ISOMAP-XGBT models in Figure 10c. With 25 decision trees, the XGBT models gave a relatively satisfactory result while remaining small, as with the lightGBM models. Table 6 summarizes the prediction results based on the different data reductions for the chemical parameters. Overall, the XGBT models achieved better prediction results than the lightGBM models. Unlike the PCA-lightGBM models, the PCA-XGBT models had a slight edge over the (original data)-XGBT models in the testing set. The UMAP-XGBT model showed the best prediction performance for the chemical parameters, with an R² higher than 0.99 in both the training set and testing set.
For the prediction models, the overall performance of the XGBT models is better than that of the lightGBM models, and the data set based on UMAP reduction has a slight advantage in both the training set and the testing set compared with the other three data sets.
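The tuning protocol described for both ensemble methods (sweep the number of trees, repeat each setting 20 times, and average R² and RMSE on the training and testing sets) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy gradient-boosted regressor built from one-split stumps stands in for lightGBM and XGBT, the data are synthetic, and the 75/25 split, the learning rate, and all function names (`fit_boosted`, `sweep_tree_counts`, and so on) are hypothetical.

```python
import math
import random


def _mean(values):
    return sum(values) / len(values)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = _mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of y."""
    return math.sqrt(_mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


def fit_stump(x, residual):
    """Best single-threshold split of 1-D inputs under squared error."""
    best = None
    for thr in sorted(set(x))[:-1]:  # splits that leave both sides non-empty
        left = [r for xi, r in zip(x, residual) if xi <= thr]
        right = [r for xi, r in zip(x, residual) if xi > thr]
        lm, rm = _mean(left), _mean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def fit_boosted(x, y, n_trees, lr=0.3):
    """Toy gradient boosting for squared loss (stand-in for lightGBM/XGBT)."""
    base = _mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residual = [t - p for t, p in zip(y, pred)]
        thr, lm, rm = fit_stump(x, residual)
        stumps.append((thr, lm, rm))
        pred = [p + lr * (lm if xi <= thr else rm) for xi, p in zip(x, pred)]
    return base, lr, stumps


def predict(model, x):
    base, lr, stumps = model
    return [base + lr * sum(lm if xi <= thr else rm for thr, lm, rm in stumps)
            for xi in x]


def sweep_tree_counts(tree_counts, n_repeats=20, n_samples=40, seed=0):
    """Mean (train R2, train RMSE, test R2, test RMSE) per tree count."""
    data_rng = random.Random(seed)
    # Synthetic 1-D data; a real study would use the reduced EN features.
    x = [data_rng.uniform(0.0, 10.0) for _ in range(n_samples)]
    y = [xi + 3.0 * math.sin(xi) + data_rng.gauss(0.0, 0.2) for xi in x]
    cut = int(0.75 * n_samples)  # assumed 75/25 train/test split
    results = {}
    for n_trees in tree_counts:
        runs = []
        for rep in range(n_repeats):
            # Same sequence of random splits for every tree count.
            idx = list(range(n_samples))
            random.Random(seed + 1 + rep).shuffle(idx)
            tr, te = idx[:cut], idx[cut:]
            x_tr, y_tr = [x[i] for i in tr], [y[i] for i in tr]
            x_te, y_te = [x[i] for i in te], [y[i] for i in te]
            model = fit_boosted(x_tr, y_tr, n_trees)
            p_tr, p_te = predict(model, x_tr), predict(model, x_te)
            runs.append((r2_score(y_tr, p_tr), rmse(y_tr, p_tr),
                         r2_score(y_te, p_te), rmse(y_te, p_te)))
        results[n_trees] = tuple(_mean(col) for col in zip(*runs))
    return results


if __name__ == "__main__":
    for n, (r2_tr, rmse_tr, r2_te, rmse_te) in sweep_tree_counts([5, 10, 15, 20, 25]).items():
        print(f"{n:>2} trees  train R2={r2_tr:.3f} RMSE={rmse_tr:.3f}  "
              f"test R2={r2_te:.3f} RMSE={rmse_te:.3f}")
```

Averaging over repeated random splits is what makes the comparison between tree counts less sensitive to one lucky or unlucky partition; with a real library, only `fit_boosted` and `predict` would need to be swapped for the corresponding model calls.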

Conclusions
MSW incineration is regarded as an ideal method for MSW disposal, with many advantages. This study applied an EN detection method, combined with manifold learning and ensemble methods, to monitor MSW incinerator leachate. The following conclusions can be drawn: (1) COD, BOD5, ammonia, TN, and TP of leachate changed significantly during the processing procedure, especially COD; (2) EN sensors offered unique and abundant characteristics of leachate samples in the headspace gas. The signals at the 80th second varied considerably across the first three process periods (LRW, LE, and ICRE), whereas for ANE, AeroE, and MBRE samples the signals changed less markedly; (3) Manifold learning methods (PCA, ISOMAP, and UMAP) were applied to extract the information hidden in the headspace gas of leachate detected by EN. The first three PCs and ICs extracted most of the information from the original data (>85%), and samples of LPW, LE, and ICRE could be easily classified in the three-dimensional space, while the remaining samples were separated less satisfactorily. UMAP outperformed PCA and ISOMAP; (4) Ensemble methods (lightGBM and XGBT) were applied, combined with PCA, ISOMAP, and UMAP, to mine the relationship between EN signals of leachate headspace gas and chemical parameter changes. The UMAP-XGBT model had the best classification performance, with a 99.95% accuracy rate in the training set and a 95.83% accuracy rate in the testing set. The UMAP-XGBT model also showed the best prediction ability for the leachate chemical parameters, with R² higher than 0.99 in both the training and testing sets.
To date, few in-depth studies have extracted information from the headspace gas of leachate samples. This is the first study of an EN application for leachate monitoring based on manifold learning and ensemble methods, offering an easier and quicker monitoring method than traditional instrumental measurements. Future work will focus on the potential relationship between microorganisms and headspace gas in the leachate, based on EN technology, to fully understand the changes in the chemical parameters of MSW incineration leachate, which is important for leachate disposal.