Article

Retrieval and Evaluation of NOX Emissions Based on a Machine Learning Model in Shandong

1
Key Laboratory of Photoelectric Conversion and Utilization of Solar Energy, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao 266101, China
2
Extended Energy Big Data and Strategy Research Center, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao 266101, China
3
Shandong Energy Institute, Qingdao 266101, China
*
Author to whom correspondence should be addressed.
Sustainability 2025, 17(13), 6100; https://doi.org/10.3390/su17136100
Submission received: 17 May 2025 / Revised: 25 June 2025 / Accepted: 27 June 2025 / Published: 3 July 2025
(This article belongs to the Section Air, Climate Change and Sustainability)

Abstract

Nitrogen oxides (NOX) are important precursors of ozone and secondary aerosols. Accurate and timely NOX emission estimates are essential for formulating measures to mitigate haze and ozone pollution. Bottom–up and satellite–constrained top–down methods are commonly used for emission inventory compilation; however, they have limitations of time lag and high computational demands. Here, we propose a machine learning model, WOA-XGBoost (Whale Optimization Algorithm–Extreme Gradient Boosting), to retrieve NOX emissions. We constructed a dataset incorporating satellite observations and conducted model training and validation in the Shandong region with severe NOX pollution to retrieve high spatiotemporal resolution of NOX emission rates. The 10–fold cross–validation coefficient of determination (R2) for the NOX emission retrieval model was 0.99, indicating that WOA-XGBoost has high accuracy. Validation of the model for the other year (2019) showed high agreement with MEIC (Multi–resolution Emission Inventory for China), confirming its strong robustness and good temporal transferability. The retrieved NOX emissions for 2021–2022 revealed that emission rate hotspots were located in areas with heavy traffic flow. Among 16 prefecture–level cities in Shandong, Zibo exhibited the highest NOX rate (>1 μg/m2/s), explaining its high NO2 pollution levels. In the future, priority areas for emission reduction should focus on heavy industry clusters such as Zibo and high traffic urban centers.

1. Introduction

Nitrogen oxides (NOX = NO + NO2) are important gaseous pollutants in the troposphere [1,2]. They actively participate in the formation of tropospheric ozone (O3) and secondary aerosols, causing significant impacts on human health and the atmospheric environment. Accurate and timely information on NOX emissions is essential for predicting air quality, adjusting environmental policies, and mitigating air pollution [3]. However, both bottom–up and top–down methods for compiling emission inventories have some limitations, such as time lags and large computational costs. Therefore, there is an imperative need to develop an efficient, low cost, and sufficiently accurate method for compiling grid–based emission inventories.
The bottom–up approach calculates emissions based on sectoral activity levels and emission factors [4,5]. Because this method requires extensive data collection and measurement, the compiled emission inventory cannot be updated promptly to reflect current emission profiles, which is a particular drawback for air quality forecasting [6,7]. The top–down approach is another established method for estimating emissions, which constrains emission estimates with ground–based monitoring data and satellite observations [8]. Compared with ground observation stations, satellite remote sensing has the advantages of wide spatial coverage, high spatial resolution, and daily monitoring [9,10], providing reliable data support for top–down emission constraints [11]. The primary challenge for emission retrieval with the satellite–based top–down approach lies in accurately quantifying the nonlinear relationship between ambient NO2 concentration and NOX emissions. This relationship is complicated by coupled atmospheric transport, chemical transformation, and physical processes [12]. Atmospheric chemical transport models (CTMs) [13,14,15] have been widely adopted to establish emission–concentration relationships, with data assimilation techniques like Kalman filtering [16] and 4D-Var [17] enhancing their precision. However, CTMs exhibit inherent limitations, including systematic biases [18,19] and high computational cost [20]. Additionally, the statistical exponentially modified Gaussian model, combined with meteorological elements, can be used to convert NO2 VCD to NOX emissions, but most of these models are only applicable to isolated point source emissions [21,22,23].
To overcome the limitations of the model in emission retrieval, Qin et al. [24] and Li et al. [20] developed a new model–free NOX inversion estimation approach based on the mass conservation equation. This approach quantifies daily NOX emissions and their uncertainties through linear regression of three key parameters, including temporal rate of change in column loadings, first–order chemical loss of NOX, and gradient transport of NOX. Notably, although this approach was free from the limitations of the CTMs, it constrains the nonlinear NOX chemical loss and transport to first–order processes [25].
As a transformative tool in interdisciplinary science, machine learning has been widely used in the field of atmospheric science, which can automatically extract key features of input data and capture the behavior of target data [26,27,28]. Machine learning can capture the nonlinear relationship between emissions and pollutant concentrations well, with high computational efficiency [7]. For example, Wei et al. [28,29] used machine learning to construct near surface PM, SO2, CO, and NO2 concentrations based on a variety of data, including ground observations, satellites, and reanalysis. Relevant studies on estimating NOX emissions using machine learning have been conducted.
Huang et al. [7] proposed a novel emission inventory estimation framework (neural network, NN) based on dual learning. The framework first employs a neural network model to investigate the complex relationships between emissions and atmospheric pollution as a substitute for CTMs, followed by implementing a backpropagation–based optimization algorithm to iteratively update the emission inventory. Xing et al. [12] and Chen et al. [8] proposed a physically informed variational autoencoder (VAE) and an ensemble backpropagation neural network (eBPNN), respectively, for NOX emission retrieval. However, both methods still required CTMs to generate training datasets. Although He et al. [30] did not use CTM results in training the model built with a convolutional neural network (CNN) and long short–term memory (LSTM), the retrieved NOX emissions results lacked sufficient refinement, exhibiting a spatial resolution of only 1.1° × 1.1°. Table 1 summarizes relevant studies using machine learning algorithms, including models, contributions, and limitations.
To address the shortcomings of current emission inventory compilation methods, we propose a new method for retrieving NOX emissions using machine learning techniques. We established a nonlinear relationship between multi–source data and NOX emissions using extreme gradient boosting (XGBoost), with hyperparameters optimized via the whale optimization algorithm (WOA) to improve prediction accuracy. The method can provide near–real–time NOX emission characteristics at a low cost. The training dataset, constructed independently of CTM results, enables high spatial resolution (0.05° × 0.05°) for NOX emission retrieval. This method provides efficient and accurate technical support for atmospheric pollution monitoring and governance. This paper is structured as follows. Section 2 outlines the multi–source data, data processing methods, and machine learning model. Section 3 presents the long-term spatial and temporal variation of NO2 VCD in Shandong Province, the validation of the NOX emission retrieval model, and the prediction of NOX emissions. Section 4 draws the main conclusions of this study.

2. Data and Methods

2.1. Study Area

Shandong Province is located on the coast of East China and downstream of the Yellow River, with most areas below 50 m in elevation, and it has a warm–temperate monsoon climate. Southeast winds dominate in spring and summer, while northwest winds dominate in fall and winter. As shown in Figure 1, it comprises 16 prefecture–level cities, which are geographically categorized into Northwest Shandong region (Liaocheng, Dezhou, Binzhou, and Dongying), Central Shandong region (Jinan, Taian, Zibo, and Weifang), South Shandong region (Heze, Jining, Zaozhuang, Linyi, and Rizhao), and Jiaodong Peninsula region (Qingdao, Yantai, and Weihai). Due to the developed heavy industry and high consumption of fossil fuels in Shandong Province, it has become one of the areas with the most severe air pollution in China. Establishing an accurate and timely NOX emission inventory is crucial for implementing differentiated pollution control and emission reduction strategies. In this study, we explore the spatiotemporal variation of NO2 pollution in Shandong Province and carry out NOX emission retrieval research.

2.2. NO2 Tropospheric Vertical Column Measurements

NO2 tropospheric vertical column density (VCD) measurements were derived from OMI and TROPOMI. OMI, onboard the Aura satellite, was launched by NASA on 15 July 2004, with a local equator overpass time of 13:45 ± 0:15 [31]. As a nadir-viewing imaging spectrometer, it measures direct and atmospheric backscattered sunlight in the UV/vis ranges from 270 to 500 nm. The daily NO2 VCD data (OMI-Aura_L3-OMNO2d) were acquired from NASA Goddard Earth Sciences Data & Information Services center (GES-DISC), featuring a spatial resolution of 0.25° × 0.25°.
TROPOMI, onboard the Sentinel-5 Precursor (S5P) satellite, was launched by the European Space Agency (ESA) on 13 October 2017, with a local equatorial overpass time of 13:30. Its NO2 observations achieve a spatial resolution of 3.5 km × 7 km [32], representing a significant improvement over OMI’s capabilities. This improved resolution enables finer scale spatial analysis. The daily tropospheric NO2 VCD data were obtained from POMINO-TROPOMI v1.2.2, which was developed for the Asian region by Liu et al. [33]. POMINO-TROPOMI v1.2.2 has a spatial resolution of 0.05° × 0.05° and has undergone stringent quality control, including but not limited to qa_value ≥ 0.5, CF (cloud fraction) ≤ 0.5, and AOD (aerosol optical depth) < 5.
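The quality-control thresholds quoted above can be read as a simple per-retrieval filter. The following is a minimal sketch under stated assumptions: the `Retrieval` record and its field names are hypothetical and are not the variable names of the POMINO-TROPOMI product, and only the three thresholds named in the text are applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Retrieval:
    """One gridded NO2 retrieval. Field names are illustrative only."""
    no2_vcd: float         # tropospheric NO2 VCD, molec/cm^2
    qa_value: float        # retrieval quality flag, 0 to 1
    cloud_fraction: float  # CF, 0 to 1
    aod: float             # aerosol optical depth


def passes_quality_control(r: Retrieval) -> bool:
    """Apply the three thresholds quoted in the text."""
    return r.qa_value >= 0.5 and r.cloud_fraction <= 0.5 and r.aod < 5.0


def filter_retrievals(retrievals: List[Retrieval]) -> List[Retrieval]:
    """Keep only retrievals that pass all three checks."""
    return [r for r in retrievals if passes_quality_control(r)]


# Synthetic values, for illustration only.
samples = [
    Retrieval(1.2e16, qa_value=0.9, cloud_fraction=0.1, aod=0.4),  # kept
    Retrieval(0.8e16, qa_value=0.4, cloud_fraction=0.1, aod=0.4),  # qa_value too low
    Retrieval(1.5e16, qa_value=0.8, cloud_fraction=0.7, aod=0.4),  # too cloudy
    Retrieval(2.0e16, qa_value=0.8, cloud_fraction=0.2, aod=6.0),  # AOD too high
]
kept = filter_retrievals(samples)
```

Grid cells whose retrievals are rejected would simply be missing for that day, which is consistent with the blank areas reported for the retrieved emission maps.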

2.3. Meteorological Reanalysis Data

ERA5 is the latest meteorological reanalysis dataset from the European Centre for Medium-Range Weather Forecasts (ECMWF) (https://cds.climate.copernicus.eu/, accessed on 14 May 2025). It provides hourly data of multiple meteorological variables on a global scale with a spatial resolution of 0.25° × 0.25°. The meteorological elements used in this study include 2 m temperature, 2 m dewpoint temperature, BLH (Boundary Layer Height), SNDSF (Surface Net Downward Shortwave Flux), 10 m U wind, and 10 m V wind. The meteorological elements were spatially resampled onto 0.05° × 0.05° grids via bilinear interpolation to ensure consistency with other datasets.
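To make the resampling step concrete, the sketch below evaluates a coarse field at an arbitrary point by bilinear interpolation. It is an illustration rather than the authors' processing code: the function name, the row and column ordering, and the clamping at the grid edge are assumptions.

```python
import math
from typing import Sequence


def bilinear_interpolate(field: Sequence[Sequence[float]],
                         lat0: float, lon0: float, step: float,
                         lat: float, lon: float) -> float:
    """Interpolate a regular lat-lon field at (lat, lon).

    field[i][j] is the value at latitude lat0 + i*step and
    longitude lon0 + j*step (assumed layout).
    """
    n_lat, n_lon = len(field), len(field[0])
    # Fractional index of the target point on the coarse grid.
    fi = (lat - lat0) / step
    fj = (lon - lon0) / step
    # Lower-left corner of the enclosing cell, clamped so i+1 and j+1 exist.
    i = min(max(math.floor(fi), 0), n_lat - 2)
    j = min(max(math.floor(fj), 0), n_lon - 2)
    # Weights inside the cell, clamped to [0, 1] at the grid edge.
    ti = min(max(fi - i, 0.0), 1.0)
    tj = min(max(fj - j, 0.0), 1.0)
    south = field[i][j] * (1.0 - tj) + field[i][j + 1] * tj
    north = field[i + 1][j] * (1.0 - tj) + field[i + 1][j + 1] * tj
    return south * (1.0 - ti) + north * ti


# Synthetic 2 x 2 coarse field on a 0.25 degree grid, for illustration only.
coarse = [[0.0, 1.0],
          [2.0, 3.0]]
centre_value = bilinear_interpolate(coarse, 36.0, 117.0, 0.25, 36.125, 117.125)
```

Calling the function once per 0.05° cell centre yields the fine grid; each value is a distance-weighted mix of the four surrounding 0.25° values, so no new extremes are introduced.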

2.4. Prior Emissions

The Multi–resolution Emission Inventory model for Climate and air pollution research (MEIC) provides bottom–up pollutant emission inventories for five sectors (agriculture, industry, power, residential, transportation) in mainland China [34,35]. MEIC provides monthly total emissions on 0.25° × 0.25° grids (http://meicmodel.org.cn, accessed on 14 May 2025), which were temporally disaggregated to daily totals using sector–specific emission time allocation coefficients. For spatial refinement, we employed the nighttime light remote sensing dataset from Harvard University [36] to downscale emissions from 0.25° to 0.05° grids. Grid cells contributing less than 2.5% of total emissions were excluded based on established thresholds [24]. The resulting NOX emission rate was calculated in units of μg/m2/s and served as a prior value for daily emissions.
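The downscaling and unit conversion described above can be sketched as follows. This is a sketch under several assumptions that the text does not confirm: emissions in one coarse cell are split in proportion to nighttime light intensity, the daily total is expressed in tonnes, and the cell area is computed on a spherical Earth. All function names are hypothetical.

```python
import math
from typing import List

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def cell_area_m2(lat_south: float, lat_north: float, dlon_deg: float) -> float:
    """Area of a latitude-longitude cell on a sphere, in m^2."""
    return (EARTH_RADIUS_M ** 2 * math.radians(dlon_deg)
            * (math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))))


def downscale_by_weights(coarse_total: float,
                         weights: List[List[float]]) -> List[List[float]]:
    """Split one coarse-cell total over fine cells in proportion to weights.

    Here the weights stand for nighttime light intensity (assumed proxy).
    If all weights are zero, the total is spread evenly.
    """
    total_weight = sum(sum(row) for row in weights)
    n_cells = sum(len(row) for row in weights)
    if total_weight <= 0.0:
        return [[coarse_total / n_cells for _ in row] for row in weights]
    return [[coarse_total * w / total_weight for w in row] for row in weights]


def to_rate_ug_m2_s(tonnes_per_day: float, area_m2: float) -> float:
    """Convert a daily mass (tonnes) to an emission rate in ug/m^2/s."""
    micrograms = tonnes_per_day * 1e12   # 1 t = 1e12 ug
    return micrograms / 86_400.0 / area_m2


# Synthetic example: one coarse cell split over a 2 x 2 block of fine cells.
fine = downscale_by_weights(8.0, [[1.0, 3.0], [0.0, 4.0]])
area = cell_area_m2(-0.025, 0.025, 0.05)
```

Proportional weighting conserves the coarse-cell total by construction, so the refinement changes only where emissions are placed, not how much is emitted.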

2.5. Machine Learning (ML) Model and Input Variables

The ML model used in this study was built based on the Whale Optimization Algorithm (WOA) and Extreme Gradient Boosting (XGBoost) algorithm.

2.5.1. WOA

WOA is a metaheuristic optimization algorithm proposed by Mirjalili and Lewis [37], inspired by the hunting behavior of humpback whales. WOA mainly simulates the bubble net predation behavior of humpback whales, and its main components include representation of whale populations, whale behavior and interaction, and adaptability evaluation. WOA has many advantages, such as fast convergence speed, strong global search ability, and easy implementation [38]. It incorporates three steps: encircling prey, bubble net attack, and searching for prey. It is widely used in support vector machines, artificial neural networks, complex function optimization, and feature selection [39].
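For reference, the three steps can be written in the standard form given by Mirjalili and Lewis [37]. The symbols below are not defined in this article and follow the usual WOA notation: $\vec{X}(t)$ is a whale position (here, one set of hyperparameters), $\vec{X}^{*}(t)$ is the best position found so far, $\vec{r}$ is a random vector in $[0, 1]$, $a$ decreases linearly from 2 to 0 over the iterations, $b$ is a constant defining the spiral shape, $l$ is a random number in $[-1, 1]$, and $p$ is a random number in $[0, 1]$.

```latex
% Encircling prey (used when |A| < 1)
\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right|, \qquad
\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}, \qquad
\vec{A} = 2a\,\vec{r} - a, \quad \vec{C} = 2\,\vec{r}

% Bubble-net attack (spiral update), chosen with probability 0.5
\vec{X}(t+1) = \vec{D}' \, e^{bl} \cos(2\pi l) + \vec{X}^{*}(t), \qquad
\vec{D}' = \left| \vec{X}^{*}(t) - \vec{X}(t) \right|

% Searching for prey (used when |A| \geq 1): a random whale replaces the best one
\vec{D} = \left| \vec{C} \cdot \vec{X}_{\mathrm{rand}} - \vec{X}(t) \right|, \qquad
\vec{X}(t+1) = \vec{X}_{\mathrm{rand}} - \vec{A} \cdot \vec{D}
```

At each iteration a whale applies the spiral update if $p \geq 0.5$; otherwise it encircles the best position or explores around a random whale, depending on $|\vec{A}|$. Because $a$ shrinks over time, exploration dominates early and exploitation dominates late.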

2.5.2. XGBoost

XGBoost is a powerful machine learning algorithm that improves on the GBDT (Gradient Boosting Decision Tree) boosting algorithm. The algorithm integrates multiple Classification and Regression Trees (CARTs) to compensate for the insufficient prediction accuracy of a single CART, and the prediction result is equal to the sum of the scores of all CARTs [40]. It has the advantage of fast training speed, and its algorithm core is as follows:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in F, \quad i = 1, 2, 3, \ldots, n$$
where $n$ is the number of samples; $\hat{y}_i$ is the predicted value for the $i$th sample; $K$ is the number of decision trees; $f_k$ is the prediction of the $k$th decision tree; $F$ is the function space of CARTs; and $x_i$ is the feature vector of the $i$th sample. The objective function of XGBoost includes a loss function and a regularization term, and the formula is as follows:
$$O^{k} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{k}\right) + \sum_{k=1}^{K} \Omega(f_k)$$
where $l\left(y_i, \hat{y}_i^{k}\right)$ is the loss function, representing the difference between $y_i$ and $\hat{y}_i^{k}$; $y_i$ is the true value of sample $i$; $\hat{y}_i^{k}$ is the predicted value of sample $i$ at the $k$th decision tree; and $\Omega(f_k)$ is the regularization term, composed of the number of nodes and the weights of each node. XGBoost performs a second-order Taylor expansion of the loss function. By using the first- and second-order derivatives, the model converges faster on the training set, which effectively increases the training speed. In addition, adding regularization terms to the loss function reduces the complexity of the model and the risk of overfitting. XGBoost is widely used in the field of atmospheric science, for example in the prediction of PM2.5 mass concentration [41,42].
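The second-order expansion and the regularization term mentioned above are not written out in the text. In the standard XGBoost formulation, which is assumed here rather than taken from this article, the objective at the $k$th tree is approximated as shown below. The symbols $g_i$, $h_i$, $T$, $w_j$, $\gamma$, $\lambda$, and $I_j$ are introduced for illustration: $T$ is the number of leaf nodes, $w_j$ is the weight of leaf $j$, $I_j$ is the set of samples falling in leaf $j$, and $\gamma$ and $\lambda$ are regularization coefficients.

```latex
O^{k} \approx \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{\,k-1}\right)
      + g_i\, f_k(x_i) + \tfrac{1}{2}\, h_i\, f_k(x_i)^{2} \right] + \Omega(f_k)

g_i = \frac{\partial\, l\left(y_i, \hat{y}_i^{\,k-1}\right)}{\partial\, \hat{y}_i^{\,k-1}}, \qquad
h_i = \frac{\partial^{2} l\left(y_i, \hat{y}_i^{\,k-1}\right)}{\partial \left(\hat{y}_i^{\,k-1}\right)^{2}}

\Omega(f_k) = \gamma\, T + \tfrac{1}{2}\, \lambda \sum_{j=1}^{T} w_j^{2}

w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
```

The last line gives the leaf weight that minimizes the approximated objective for a fixed tree structure. Larger $\lambda$ shrinks leaf weights toward zero and larger $\gamma$ penalizes additional leaves, which is how the regularization term limits model complexity.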
The retrieval of NOX emissions based on satellite remote sensing is inherently a nonlinear regression process. As an efficient ensemble learning machine learning model, XGBoost demonstrates high precision in addressing nonlinear problems and is widely applied to tasks such as classification, regression, ranking, anomaly detection, and model interpretation. XGBoost has been extensively utilized in energy systems [43], including forecasting energy consumption or demand in buildings or microgrids, predicting renewable energy generation, and evaluating the reliability and durability of energy infrastructure components [44,45]. These studies highlight the advantages of the XGBoost model in solving similar regression and prediction tasks.

2.5.3. WOA-XGBoost Modeling Procedure

The hyperparameters for the XGBoost algorithm were optimized by WOA to improve the accuracy of the model in simulating the NOX emission rate. Seven important hyperparameters were selected for optimization in the XGBoost model (see Table 2). These parameters are all core parameters of XGBoost, and their selection was based on prior literature, such as Song et al. [46] and Qian et al. [47]. Some research results have confirmed that the simulation results of the WOA-XGBoost were better than the combination with other optimization algorithms, such as gray wolf optimization, Bayesian optimization, and butterfly optimization [48,49]. The initial parameters of the WOA algorithm are set as follows. The population size (number of whales) is set to 5, and the maximum number of iterations is set to 50. The iteration terminates when the maximum number of iterations is reached or when the best fitness value of the whale population shows no improvement for 10 consecutive iterations. At this point, the parameters obtained are considered the optimal parameters.
The schematic of the WOA-XGBoost hybrid model is shown in Figure 2. WOA-XGBoost modeling includes the following five steps [47].
(1)
The population and iteration times are initialized in the WOA algorithm, and the optimization range for each hyperparameter of XGBoost is set.
(2)
The training data is inputted into the XGBoost model, and the fitness of each whale individual in the population is computed based on the score of the respective XGBoost model regarding 10–fold cross–validation (CV) on the training dataset.
(3)
The position of each individual whale is updated, which provides a new set of hyperparameters.
(4)
Steps (2) and (3) are repeated iteratively until the termination criteria are satisfied. At the end of the iteration, WOA outputs the optimal whale position, which is the optimal hyperparameter for the XGBoost model.
(5)
The optimal hyperparameters are inputted into the XGBoost model for simulation, and its performance is evaluated.
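The five steps above can be sketched as a small search loop. This is a simplified illustration, not the authors' implementation. The population size (5), iteration limit (50), and early stop after 10 iterations without improvement follow the text. Everything else is an assumption: the function names, the use of scalar rather than per-dimension coefficients, the spiral constant b = 1, and above all `placeholder_cv_error`, a hypothetical stand-in for the 10–fold CV error of an XGBoost model with two illustrative hyperparameters.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def woa_minimize(fitness: Callable[[Sequence[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_whales: int = 5, max_iter: int = 50,
                 patience: int = 10, seed: int = 0) -> Tuple[List[float], float]:
    """Minimize `fitness` over box `bounds` with a simplified WOA."""
    rng = random.Random(seed)

    def clip(x: Sequence[float]) -> List[float]:
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Step 1: initialize the population inside the search ranges.
    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    scores = [fitness(w) for w in whales]
    best_idx = min(range(n_whales), key=scores.__getitem__)
    best, best_score = list(whales[best_idx]), scores[best_idx]
    stale = 0

    for it in range(max_iter):
        a = 2.0 * (1.0 - it / max_iter)  # decreases linearly from 2 toward 0
        # Step 3: update every whale position (a new hyperparameter set).
        for k in range(n_whales):
            x = whales[k]
            if rng.random() < 0.5:
                A = 2.0 * a * rng.random() - a
                C = 2.0 * rng.random()
                # |A| < 1: encircle the best whale; otherwise explore around a random one.
                target = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [t - A * abs(C * t - xi) for t, xi in zip(target, x)]
            else:
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)  # spiral constant b = 1
                new = [abs(t - xi) * spiral + t for t, xi in zip(best, x)]
            whales[k] = clip(new)
        # Step 2: evaluate fitness and track the best position.
        improved = False
        for w in whales:
            s = fitness(w)
            if s < best_score:
                best, best_score, improved = list(w), s, True
        # Step 4: stop early after `patience` iterations without improvement.
        stale = 0 if improved else stale + 1
        if stale >= patience:
            break
    return best, best_score


def placeholder_cv_error(params: Sequence[float]) -> float:
    """Hypothetical stand-in for the 10-fold CV error of an XGBoost model."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2


search_ranges = [(0.01, 0.3), (3.0, 10.0)]  # illustrative ranges, not Table 2
best_params, best_error = woa_minimize(placeholder_cv_error, search_ranges)
```

In a real run, the placeholder would train XGBoost with the candidate hyperparameters and return the cross-validated error, and integer-valued hyperparameters such as tree depth would be rounded before use. Step 5 then corresponds to refitting the model once with `best_params`.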

2.5.4. Training Dataset for WOA-XGBoost

In this study, we constructed training data for the NOX emission retrieval model. Variable selection for the training dataset was based on the mass conservation equation [24].
$$dC = E - L + T$$
The temporal change in NOX column concentration ($dC$) in the troposphere results from NOX emissions ($E$), NOX atmospheric chemical/physical losses ($L$), and NOX atmospheric transport processes ($T$). NO2 VCD values for day t and day t − 1 are used to characterize the temporal change in NOX concentration [30]. NOX chemical/physical losses involve wet deposition, photolytic transformation, and heterogeneous reactions, primarily influenced by meteorological elements including relative humidity, temperature, and solar radiation intensity. NOX transport processes include vertical and horizontal diffusion mechanisms, with dominant meteorological elements being wind speed/direction and boundary layer height [50]. Table 3 lists a detailed description of the input variables. As a short–lived species, NOX has a daytime lifetime of ~4 h at low and mid latitudes [21]. Given the weak correlation between prior day meteorological parameters and current day NOX concentrations, we exclusively selected concurrent meteorological parameters as model input variables. To better characterize the advective loss of NOX, we calculated the NO2 horizontal flux divergence using the four grids neighboring the computed grid in the longitudinal and latitudinal directions.
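The flux divergence computed from four neighboring grids can be illustrated with a central-difference stencil. The text does not give the exact discretization, so the scheme below is an assumption, as are the function names, the array layout, and the spherical-Earth grid spacing.

```python
import math
from typing import Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

Grid = Sequence[Sequence[float]]


def grid_spacing_m(lat_deg: float, step_deg: float = 0.05) -> Tuple[float, float]:
    """Return (dx, dy) in metres for a lat-lon grid cell at latitude lat_deg."""
    dy = EARTH_RADIUS_M * math.radians(step_deg)
    dx = dy * math.cos(math.radians(lat_deg))
    return dx, dy


def horizontal_flux_divergence(vcd: Grid, u: Grid, v: Grid,
                               i: int, j: int, dx_m: float, dy_m: float) -> float:
    """Central-difference divergence of (u*VCD, v*VCD) at interior cell (i, j).

    Arrays are indexed [row][col] with rows running south to north and
    columns west to east (assumed layout). With VCD in molec/cm^2 and winds
    in m/s, the result is in molec/cm^2/s.
    """
    flux_east = u[i][j + 1] * vcd[i][j + 1]
    flux_west = u[i][j - 1] * vcd[i][j - 1]
    flux_north = v[i + 1][j] * vcd[i + 1][j]
    flux_south = v[i - 1][j] * vcd[i - 1][j]
    return ((flux_east - flux_west) / (2.0 * dx_m)
            + (flux_north - flux_south) / (2.0 * dy_m))


# Synthetic 3 x 3 example: VCD increases eastward under a uniform westerly wind.
vcd_demo = [[0.0, 1.0, 2.0] for _ in range(3)]
u_demo = [[1.0, 1.0, 1.0] for _ in range(3)]
v_demo = [[0.0, 0.0, 0.0] for _ in range(3)]
div_demo = horizontal_flux_divergence(vcd_demo, u_demo, v_demo, 1, 1, 1.0, 1.0)
```

A positive value means more NO2 leaves the cell than enters it, which is the advective loss that this input variable is meant to capture.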
The true values for WOA-XGBoost training come from MEIC prior emissions, which were matched with the data in Table 3 in time and space to obtain a daily NOX emission retrieval dataset with grids of 0.05° × 0.05°. The 2020 dataset is used to train and validate WOA-XGBoost.

2.6. Model Evaluation Metrics

The WOA-XGBoost model performance was evaluated using the 10–fold cross–validation (CV) method. This method evenly divides the dataset into 10 subsets; in each of 10 rounds, 9 subsets are used as training data and the remaining subset is used as testing data, so that every subset serves once for validation. The WOA-XGBoost model metrics include root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2), and correlation coefficient (r). RMSE and MAE quantify the error between predicted values and true values, with small values indicating high accuracy of the model. R2 denotes the goodness of fit and typically ranges from 0 to 1, with a larger value indicating a better fit. The correlation coefficient r quantifies both the strength and direction of a linear relationship between two variables. The formulas for the evaluation metrics are given below.
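$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - x_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - x_i \right|$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - x_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$
$$r = \frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2} \sqrt{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}}$$
where $n$ is the total amount of evaluated data; $y_i$ is the true value from MEIC; $x_i$ is the predicted value from WOA-XGBoost; $\bar{x}$ is the average of the predicted values; and $\bar{y}$ is the average of the true values.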
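The four metrics translate directly into code. The sketch below uses only the Python standard library. The function names and the sample values are illustrative, with `y_true` standing for MEIC values ($y_i$) and `y_pred` for WOA-XGBoost predictions ($x_i$).

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between true and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(y_true, y_pred)) / n)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between true and predicted values."""
    n = len(y_true)
    return sum(abs(y - x) for y, x in zip(y_true, y_pred)) / n


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - x) ** 2 for y, x in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between predictions and true values."""
    x_bar = sum(y_pred) / len(y_pred)
    y_bar = sum(y_true) / len(y_true)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(y_pred, y_true))
    var_x = sum((x - x_bar) ** 2 for x in y_pred)
    var_y = sum((y - y_bar) ** 2 for y in y_true)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


# Synthetic emission rates in ug/m^2/s, for illustration only.
meic = [0.2, 0.5, 0.8, 1.1]
predicted = [0.25, 0.45, 0.8, 1.0]
metrics = {
    "RMSE": rmse(meic, predicted),
    "MAE": mae(meic, predicted),
    "R2": r_squared(meic, predicted),
    "r": pearson_r(meic, predicted),
}
```

Note that R2 penalizes any systematic offset between predictions and true values, whereas r does not, so a model with a consistent underestimation can show a high r together with a lower R2.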

3. Results and Discussion

3.1. Variation in NO2 VCD and NOX Emissions in Shandong

Figure 3 presents the interannual variation in NO2 VCD and NOX emissions, using data from OMI/Aura observations and MEIC, respectively. The interannual variation in NO2 VCD closely paralleled the NOX emissions. NO2 VCD in Shandong exhibited a trend of initially rising and subsequently declining, reaching its peak in 2011 at 1.78 × 10^16 molec/cm2. A notable transient decrease in NO2 VCD occurred in 2008, attributable to stringent emission control measures implemented by Shandong Province during the Beijing Olympics [51]. After 2011, NO2 VCD decreased at an average rate of 7.26 × 10^14 molec/cm2/yr, with significant decreases in 2014 and 2015, consistent with NOX emissions. This was related to China’s release of the Action Plan for Air Pollution Prevention and Control (APAPP) in September 2013 to improve air quality.
Figure 4 illustrates the seasonal and monthly patterns of NO2 VCD. The NO2 VCD monthly variation exhibits a distinct “U–shaped” profile (Figure 4b), contrasting with the absence of clear seasonal or monthly trends in NOX emissions (Figure S1). NO2 VCD seasonal variation was consistent with observations from other Eastern Chinese provinces [51]. The highest and most variable level of NO2 pollution was observed in winter (Figure 4a), with a VCD of 2.07 × 10^16 molec/cm2, which was attributed to a stable atmospheric structure [52]. Conversely, owing to wet deposition and photochemical loss of NO2 under strong solar radiation [53], NO2 VCD reached its minimum in summer (0.63 × 10^16 molec/cm2), less than one–third of the winter value.
Figure 5 depicts the average spatial distribution of NO2 VCD in Shandong from 2005 to 2023, revealing a clear decline from west to east. On the one hand, this decline was related to the emissions from the heavy industrial centers in Central and Northwestern Shandong (Figure S2). On the other hand, compared with the eastern coastal areas, the atmosphere in the central and western parts of Shandong is relatively stable due to topography and climate, which is not conducive to pollutant diffusion [54]. For example, Huang et al. [55] found that the annual averaged air stagnation days decreased from west to east in Shandong. A persistent NO2 pollution hotspot emerged in Zibo and adjacent Central Shandong, reaching 1.98 × 10^16 molec/cm2. The hotspot has not migrated in the last 20 years, with an annual maximum exceeding 2.5 × 10^16 molec/cm2 during the peak pollution year (Figure S3). Due to its location in the hilly terrain of Central Shandong, Zibo’s urban area is surrounded by mountains, forming a semi–enclosed topography that significantly restricts the horizontal dispersion of pollutants. Furthermore, MEIC emission inventory data reveal that Zibo is one of the high NOX emission centers in Shandong Province (Figure S2), with emissions primarily originating from industrial boilers (steelmaking), building material production (cement and ceramics), and road transportation [56]. Low NO2 VCD was found in Yantai and Weihai in the Jiaodong Peninsula region, below 1.1 × 10^16 molec/cm2. In particular, Weihai had the lowest NO2 pollution, with VCD less than 0.9 × 10^16 molec/cm2, which was related to the favorable dispersion conditions caused by its coastal location.
In terms of seasonal spatial distribution (Figure 6), as mentioned above, Shandong had the highest pollution level of NO2 in winter, with NO2 VCD exceeding 1.9 × 10^16 molec/cm2 across most regions except the Jiaodong Peninsula. In addition to the pollution hotspot in Zibo, most areas in Central and South Shandong also showed high NO2 pollution levels. Autumn maintained a spatial pattern comparable to winter, while the NO2 VCD was about 0.8 × 10^16 molec/cm2 lower than that in winter. In spring, NO2 levels were above 1.1 × 10^16 molec/cm2 in most parts of the province, except for parts of Yantai, Weihai, Rizhao, and Heze. The lowest NO2 pollution level was observed in summer, with NO2 VCD below 0.7 × 10^16 molec/cm2, except for some areas in Central Shandong.

3.2. Evaluations for NOX Emission Rate Retrieval

The main cause of NO2 pollution is NOX emissions. Using the WOA-XGBoost model introduced in Section 2.5, NOX emissions were retrieved on 0.05° × 0.05° grids in Shandong. We trained and validated the XGBoost and WOA-XGBoost models based on the data from 2020. We applied 10–fold CV to evaluate the models’ performance, and the results are given in Table 4. Both the XGBoost and WOA-XGBoost models performed well. Compared with XGBoost, the R2 for the WOA-XGBoost model increased by 0.02, and RMSE and MAE decreased by 0.08 μg/m2/s and 0.053 μg/m2/s, respectively, indicating the improvement in accuracy and performance of the WOA-XGBoost model.
We evaluated the performance of the WOA-XGBoost model using data from 11 November 2020, which was not included in the training, and the result is shown in Figure 7. The predicted results were highly consistent with the MEIC, with almost all samples distributed around the 1:1 line and a correlation coefficient (r) of 0.99. Both RMSE and MAE were extremely low, with neither exceeding 0.004 μg/m2/s. Figure 8 shows a spatial comparison between the predicted results and MEIC, and the blank areas indicate missing retrieval values. NOX emission rates predicted from WOA-XGBoost exhibited high spatial consistency with MEIC, effectively reproducing the high value center of emission rates. Approximately 99.5% of the grid exhibited absolute differences between predicted values and the MEIC inventory within 0.03 μg/m2/s, and the maximum difference was only 0.083 μg/m2/s.
To further evaluate the robustness and temporal transferability of the model, we predicted the NOX emission rate for 2019 and evaluated it by matching with MEIC. As shown in Figure 9, the yellow area represents high data density. The majority of the data were distributed between 0 and 1 μg/m2/s. The correlation coefficient between the predictions from WOA-XGBoost and MEIC was 0.93, which denotes that the WOA-XGBoost model can effectively capture the nonlinear relationship between multi–source fusion data and NOX emission rates. The MAE and RMSE were less than 0.3 μg/m2/s, suggesting that the WOA-XGBoost model successfully reproduces NOX emission rates with strong robustness and good temporal transferability. Notably, as evidenced by the fitting line, the predictions were lower than MEIC, with increasing underestimation at higher emission rates, indicating that the model performs optimally under low to moderate emission scenarios.
Figure 10a compares the monthly mean NOX emission rates between model predictions and MEIC data for 2019. Both datasets exhibited stable monthly averages (~0.8 μg/m2/s), with minor discrepancies (all <0.035 μg/m2/s). The comparison between prediction and MEIC for 16 cities in Shandong is shown in Figure 10b. The WOA-XGBoost predictions demonstrated strong agreement with MEIC estimates in capturing both absolute emission levels and inter–city variations. As mentioned above, systematic underestimation occurred for high emission cities (e.g., Jinan, Qingdao, Zaozhuang, and Zibo), with Qingdao showing the largest discrepancy (0.15 μg/m2/s). Notably, Zibo exhibited the highest predicted NOX emissions, consistent with its observed NO2 VCD. In summary, the proposed model demonstrates robust capability in quantifying NOX emission rates with high precision, thereby facilitating the development of near–real–time high–resolution grid–based emission inventories for dynamic environmental monitoring.

3.3. NOX Emission Rate Retrieval for 2021 and 2022

Based on the trained and validated WOA-XGBoost model integrated with multi–source data, daily NOX emission rates in Shandong Province were quantitatively retrieved for 2021 and 2022, with annual average spatial distributions depicted in Figure 11. The average NOX emission rates for 2021 and 2022 were 0.807 μg/m2/s and 0.806 μg/m2/s, respectively, with a slight decrease in 2022. The high-value centers of NOX emissions were located in areas with high traffic flow, similar to the results from He et al. [57]. Examples include Huaiyin, Shizhong, Lixia, and Changqing in Jinan; Shinan, Shibei, and Licang in Qingdao; and Weicheng and Kuiwen in Weifang. Transportation emissions are the main source of NOX, and previous statistics have found that on-road vehicles contribute 68.9% to total NOX emissions in Shandong [58]. The NOX emission rates in high emission centers such as Jinan and Zaozhuang showed a downward trend in 2022 (Figure 11c), with some grids decreasing by up to 0.1 μg/m2/s. The downward trend results from the combined effects of industrial pollution control, transportation restructuring, and enhanced policy oversight. For example, Jinan achieved precise emission reductions through mobile source regulation and ultra–low emission transformation, while Zaozhuang accomplished a leapfrog reduction by implementing chemical industry cluster remediation and substituting highway freight with railway transportation.
Figure 12 presents the averaged NOX emission rates for the 16 cities in Shandong Province in 2021 and 2022. The mean NOX emission rates of the 16 cities exhibited minimal interannual variation, with differences close to zero. Jinan recorded the most notable reduction in 2022, decreasing by 0.02 μg/m2/s. Consistent with the pattern in 2019, Zibo maintained the highest emission intensity throughout both years (>1 μg/m2/s), while Dongying exhibited the lowest regional values. These results underscore two critical priorities for emission reduction. Firstly, heavy industrial hubs such as Zibo require sector-specific controls targeting the steel and petrochemical industries. Secondly, high traffic urban areas must accelerate the electrification of freight fleets and optimize logistics routes.

4. Conclusions

The long–term (2006–2023) spatiotemporal variations in NO2 VCD in Shandong showed that the interannual variation of NO2 VCD closely paralleled the NOX emissions. The highest and most variable level of NO2 pollution was observed in winter, with a VCD of 2.07 × 10^16 molec/cm2, which was attributed to a stable atmospheric structure. Photochemical loss of NO2 under strong solar radiation resulted in the lowest NO2 VCD in summer, less than one–third of the winter value. Spatial analysis reveals a distinct west–to–east decreasing gradient in NO2 VCD, with pollution hotspots persisting in Zibo and surrounding areas of Central Shandong, where numerous high emission industries cluster and terrain–induced atmospheric stagnation impedes pollutant dispersion.
The NOX emission rate retrieval model (WOA-XGBoost) was trained and evaluated based on multi–source fusion data from 2020. The 10–fold CV results showed that the WOA-XGBoost model has high accuracy, with R2, RMSE, and MAE of 0.99, 0.03 μg/m2/s, and 0.007 μg/m2/s, respectively. The 2019 NOX emissions retrieved by WOA-XGBoost had high consistency with MEIC (r = 0.93, RMSE = 0.28 μg/m2/s, and MAE = 0.21 μg/m2/s), confirming its strong robustness and good temporal transferability. Statistical analysis for 2019 revealed nearly identical monthly mean values between both datasets at approximately 0.8 μg/m2/s. The WOA-XGBoost model also effectively captured intercity emission heterogeneity. However, the model exhibited a systematic underestimation bias relative to MEIC. The underestimation grew with increasing emission rate, indicating that the model predicts most accurately in low–to–moderate emission scenarios.
The NOX emission rates for 2021 and 2022 were predicted using WOA-XGBoost. The average NOX emission rates were 0.807 μg/m2/s in 2021 and 0.806 μg/m2/s in 2022, a slight decrease. Emission rate hotspots were located in areas with high traffic flow. The NOX emission rates in high-emission centers such as Jinan and Zaozhuang showed a downward trend in 2022, owing to the combined effects of industrial pollution control, transportation restructuring, and enhanced policy oversight. As in 2019, the NOX emission rate in Zibo was the highest in 2021 and 2022, above 1 μg/m2/s, which explains its high NO2 VCD. In the future, priority areas for emission reduction should focus on heavy industry clusters such as Zibo and high-traffic urban centers.
MEIC has inherent uncertainties due to systematic errors in activity levels and emission factors. Using MEIC data as the "true" values for model training and evaluation therefore introduces biases into the retrieved NOX emissions. Furthermore, the TROPOMI–derived NO2 VCD used in our retrieval reflects the combined influence of emissions from all sectors. Consequently, our model cannot distinguish the sectoral origin of NOX emissions. In the future, we will collect CEMS (Continuous Emission Monitoring System) data and continue to improve the model to reduce its uncertainties. Sector–specific models will be developed to differentiate emissions from various sectors, and the models will be extended to other regions to support environmental monitoring.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su17136100/s1. Figure S1. Seasonal and monthly variations for NO2 emissions in Shandong; Figure S2. Spatial averaged distribution of NOX emissions in Shandong from 2005 to 2020; Figure S3. Annual spatial distribution of NO2 VCD.

Author Contributions

T.L.: conceptualization, visualization, validation, investigation, writing—original draft, writing—review and editing. J.Z.: data curation, investigation, methodology. R.L.: data curation, investigation. Y.T.: conceptualization, data curation, funding acquisition, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Consultancy Research Projects of Shandong Academy of Chinese Engineering Science and Technology Strategy for Development “Research on Green and Low Carbon Transition Strategy of Shandong Electrical Power” (202301SDZD01) and Tianjin Science and Technology Program Project of Tianjin Science and Technology Bureau (24ZLGCSS00070).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare that they have no competing financial interests or personal relationships that may have influenced the work reported in this study.

References

1. Shah, V.; Jacob, D.J.; Li, K.; Silvern, R.F.; Zhai, S.; Liu, M.; Lin, J.; Zhang, Q. Effect of changing NOX lifetime on the seasonality and long-term trends of satellite-observed tropospheric NO2 columns over China. Atmos. Chem. Phys. 2020, 20, 1483–1495.
2. Sun, W.; Shao, M.; Granier, C.; Liu, Y.; Ye, C.S.; Zheng, J.Y. Long-Term Trends of Anthropogenic SO2, CO, and NMVOCs Emissions in China. Earths Future 2018, 6, 1112–1133.
3. Ayazpour, Z.; Sun, K.; Zhang, R.; Shen, H. Evaluation of the Directional Derivative Approach for Timely and Accurate Satellite-Based Emission Estimation Using Chemical Transport Model Simulation of Nitrogen Oxides. J. Geophys. Res. Atmos. 2025, 130, e2024JD042817.
4. Kang, Y.; Liu, M.; Song, Y.; Huang, X.; Yao, H.; Cai, X.; Zhang, H.; Kang, L.; Liu, X.; Yan, X.; et al. High-resolution ammonia emissions inventories in China from 1980 to 2012. Atmos. Chem. Phys. 2016, 16, 2043–2058.
5. Crippa, M.; Guizzardi, D.; Muntean, M.; Schaaf, E.; Dentener, F.; van Aardenne, J.A.; Monni, S.; Doering, U.; Olivier, J.G.J.; Pagliari, V.; et al. Gridded emissions of air pollutants for the period 1970–2012 within EDGAR v4.3.2. Earth Syst. Sci. Data 2018, 10, 1987–2013.
6. Huang, Z.; Zhong, Z.; Sha, Q.; Xu, Y.; Zhang, Z.; Wu, L.; Wang, Y.; Zhang, L.; Cui, X.; Tang, M.; et al. An updated model-ready emission inventory for Guangdong Province by incorporating big data and mapping onto multiple chemical mechanisms. Sci. Total Environ. 2021, 769, 144535.
7. Huang, L.; Liu, S.; Yang, Z.; Xing, J.; Zhang, J.; Bian, J.; Li, S.; Sahu, S.K.; Wang, S.; Liu, T.-Y. Exploring deep learning for air pollutant emission estimation. Geosci. Model Dev. 2021, 14, 4641–4654.
8. Chen, Y.; Fung, J.C.H.; Yuan, D.; Chen, W.; Fung, T.; Lu, X. Development of an integrated machine-learning and data assimilation framework for NOX emission inversion. Sci. Total Environ. 2023, 871, 161951.
9. Choo, G.-H.; Seo, J.; Yoon, J.; Kim, D.-R.; Lee, D.-W. Analysis of long-term (2005–2018) trends in tropospheric NO2 percentiles over Northeast Asia. Atmos. Pollut. Res. 2020, 11, 1429–1440.
10. Zheng, C.; Zhao, C.; Li, Y.; Wu, X.; Zhang, K.; Gao, J.; Qiao, Q.; Ren, Y.; Zhang, X.; Chai, F. Spatial and temporal distribution of NO2 and SO2 in Inner Mongolia urban agglomeration obtained from satellite remote sensing and ground observations. Atmos. Environ. 2018, 188, 50–59.
11. van der A, R.J.; Ding, J.; Eskes, H. Monitoring European anthropogenic NOX emissions from space. Atmos. Chem. Phys. 2024, 24, 7523–7534.
12. Xing, J.; Li, S.; Zheng, S.; Liu, C.; Wang, X.; Huang, L.; Song, G.; He, Y.; Wang, S.; Sahu, S.K.; et al. Rapid Inference of Nitrogen Oxide Emissions Based on a Top-Down Method with a Physically Informed Variational Autoencoder. Environ. Sci. Technol. 2022, 56, 9903–9914.
13. Yang, Y.; Zhao, Y.; Zhang, L.; Lu, Y. Evaluating the methods and influencing factors of satellite-derived estimates of NOX emissions at regional scale: A case study for Yangtze River Delta, China. Atmos. Environ. 2019, 219, 117051.
14. Opacka, B.; Stavrakou, T.; Müller, J.-F.; De Smedt, I.; van Geffen, J.; Marais, E.A.; Horner, R.P.; Millet, D.B.; Wells, K.C.; Guenther, A.B. Natural emissions of VOC and NOX over Africa constrained by TROPOMI HCHO and NO2 data using the MAGRITTEv1.1 model. Atmos. Chem. Phys. 2025, 25, 2863–2894.
15. Mao, Y.; Wang, H.; Jiang, F.; Feng, S.; Jia, M.; Ju, W. Anthropogenic NOX emissions of China, the U.S. and Europe from 2019 to 2022 inferred from TROPOMI observations. Environ. Res. Lett. 2024, 19, 054024.
16. Wu, H.; Tang, X.; Wang, Z.; Wu, L.; Li, J.; Wang, W.; Yang, W.; Zhu, J. High-spatiotemporal-resolution inverse estimation of CO and NOX emission reductions during emission control periods with a modified ensemble Kalman filter. Atmos. Environ. 2020, 236, 117631.
17. Chai, T.; Carmichael, G.R.; Tang, Y.; Sandu, A.; Heckel, A.; Richter, A.; Burrows, J.P. Regional NOX emission inversion through a four-dimensional variational approach using SCIAMACHY tropospheric NO2 column observations. Atmos. Environ. 2009, 43, 5046–5055.
18. Mao, J.; Li, L.; Li, J.; Sulaymon, I.D.; Xiong, K.; Wang, K.; Zhu, J.; Chen, G.; Ye, F.; Zhang, N.; et al. Evaluation of Long-Term Modeling Fine Particulate Matter and Ozone in China During 2013–2019. Front. Environ. Sci. 2022, 10, 872249.
19. Tessum, C.W.; Hill, J.D.; Marshall, J.D. Twelve-month, 12 km resolution North American WRF-Chem v3.4 air quality simulation: Performance evaluation. Geosci. Model Dev. 2015, 8, 957–973.
20. Li, X.; Cohen, J.B.; Qin, K.; Geng, H.; Wu, X.; Wu, L.; Yang, C.; Zhang, R.; Zhang, L. Remotely sensed and surface measurement-derived mass-conserving inversion of daily NOX emissions and inferred combustion technologies in energy-rich northern China. Atmos. Chem. Phys. 2023, 23, 8001–8019.
21. Beirle, S.; Boersma, K.F.; Platt, U.; Lawrence, M.G.; Wagner, T. Megacity Emissions and Lifetimes of Nitrogen Oxides Probed from Space. Science 2011, 333, 1737–1739.
22. Beirle, S.; Borger, C.; Doerner, S.; Li, A.; Hu, Z.; Liu, F.; Wang, Y.; Wagner, T. Pinpointing nitrogen oxide emissions from space. Sci. Adv. 2019, 5, eaax9800.
23. Zhang, Q.; Boersma, K.F.; van der Laan, C.; Mols, A.; Zhao, B.; Li, S.; Pan, Y. Estimating the variability in NOX emissions from Wuhan with TROPOMI NO2 data during 2018 to 2023. Atmos. Chem. Phys. 2025, 25, 3313–3326.
24. Qin, K.; Lu, L.; Liu, J.; He, Q.; Shi, J.; Deng, W.; Wang, S.; Cohen, J.B. Model-free daily inversion of NOX emissions using TROPOMI (MCMFE-NOX) and its uncertainty: Declining regulated emissions and growth of new sources. Remote Sens. Environ. 2023, 295, 113720.
25. Li, H.; Zheng, B.; Lei, Y.; Hauglustaine, D.; Chen, C.; Lin, X.; Zhang, Y.; Zhang, Q.; He, K. Trends and drivers of anthropogenic NOX emissions in China since 2020. Environ. Sci. Ecotechnol. 2024, 21, 100425.
26. Mittal, V.; Sasetty, S.; Choudhary, R.; Agarwal, A. Deep-Learning Spatiotemporal Prediction Framework for Particulate Matter under Dynamic Monitoring. Transp. Res. Rec. 2022, 2676, 56–73.
27. Xing, J.; Zheng, S.; Ding, D.; Kelly, J.T.; Wang, S.; Li, S.; Qin, T.; Ma, M.; Dong, Z.; Jang, C.; et al. Deep Learning for Prediction of the Air Quality Response to Emission Changes. Environ. Sci. Technol. 2020, 54, 8589–8600.
28. Wei, J.; Li, Z.; Lyapustin, A.; Sun, L.; Peng, Y.; Xue, W.; Su, T.; Cribb, M. Reconstructing 1-km-resolution high-quality PM2.5 data records from 2000 to 2018 in China: Spatiotemporal variations and policy implications. Remote Sens. Environ. 2021, 252, 112136.
29. Wei, J.; Li, Z.; Wang, J.; Li, C.; Gupta, P.; Cribb, M. Ground-level gaseous pollutants (NO2, SO2, and CO) in China: Daily seamless mapping and spatiotemporal variations. Atmos. Chem. Phys. 2023, 23, 1511–1532.
30. He, T.-L.; Jones, D.B.A.; Miyazaki, K.; Bowman, K.W.; Jiang, Z.; Chen, X.; Li, R.; Zhang, Y.; Li, K. Inverse modelling of Chinese NOX emissions using deep learning: Integrating in situ observations with a satellite-based chemical reanalysis. Atmos. Chem. Phys. 2022, 22, 14059–14074.
31. Irie, H.; Kanaya, Y.; Akimoto, H.; Tanimoto, H.; Wang, Z.; Gleason, J.F.; Bucsela, E.J. Validation of OMI tropospheric NO2 column data using MAX-DOAS measurements deep inside the North China Plain in June 2006: Mount Tai Experiment 2006. Atmos. Chem. Phys. 2008, 8, 6577–6586.
32. Kurchaba, S.; Sokolovsky, A.; van Vliet, J.; Verbeek, F.J.; Veenman, C.J. Sensitivity analysis for the detection of NO2 plumes from seagoing ships using TROPOMI data. Remote Sens. Environ. 2024, 304, 114041.
33. Liu, M.; Lin, J.; Boersma, K.F.; Pinardi, G.; Wang, Y.; Chimot, J.; Wagner, T.; Xie, P.; Eskes, H.; Van Roozendael, M.; et al. Improved aerosol correction for OMI tropospheric NO2 retrieval over East Asia: Constraint from CALIOP aerosol vertical profile. Atmos. Meas. Tech. 2019, 12, 1–21.
34. Li, M.; Liu, H.; Geng, G.; Hong, C.; Liu, F.; Song, Y.; Tong, D.; Zheng, B.; Cui, H.; Man, H.; et al. Anthropogenic emission inventories in China: A review. Natl. Sci. Rev. 2017, 4, 834–866.
35. Zheng, B.; Tong, D.; Li, M.; Liu, F.; Hong, C.; Geng, G.; Li, H.; Li, X.; Peng, L.; Qi, J.; et al. Trends in China’s anthropogenic emissions since 2010 as the consequence of clean air actions. Atmos. Chem. Phys. 2018, 18, 14095–14111.
36. Wu, Y.; Shi, K.; Chen, Z.; Liu, S.; Chang, Z. Developing Improved Time-Series DMSP-OLS-Like Data (1992–2019) in China by Integrating DMSP-OLS and SNPP-VIIRS. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4407714.
37. Mirjalili, S.; Lewis, A. The Whale Optimization Algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
38. Mohammed, H.M.; Umar, S.U.; Rashid, T.A. A Systematic and Meta-Analysis Survey of Whale Optimization Algorithm. Comput. Intell. Neurosci. 2019, 2019, 8718571.
39. Rana, N.; Abd Latiff, M.S.; Abdulhamid, S.i.M.; Chiroma, H. Whale optimization algorithm: A systematic review of contemporary applications, modifications and developments. Neural Comput. Appl. 2020, 32, 16245–16277.
40. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
41. Ma, H.; Kong, J.; Zhong, Y.; Jiang, Y.; Zhang, Q.; Wang, L.; Wang, X.; Zhang, J. The optimization of XGBoost model and its application in PM2.5 concentrations estimation based on MODIS data in the Guanzhong region, China. Int. J. Remote Sens. 2023, 45, 6954–6975.
42. Wang, J.; He, L.; Lu, X.; Zhou, L.; Tang, H.; Yan, Y.; Ma, W. A full-coverage estimation of PM2.5 concentrations using a hybrid XGBoost-WD model and WRF-simulated meteorological fields in the Yangtze River Delta Urban Agglomeration, China. Environ. Res. 2022, 203, 111799.
43. Liu, Z.; Guo, H.; Zhang, Y.; Zuo, Z. A Comprehensive Review of Wind Power Prediction Based on Machine Learning: Models, Applications, and Challenges. Energies 2025, 18, 350.
44. Semmelmann, L.; Henni, S.; Weinhardt, C. Load forecasting for energy communities: A novel LSTM-XGBoost hybrid model based on smart meter data. Energy Inform. 2022, 5, 24.
45. Chen, M.; Liu, Q.; Chen, S.; Liu, Y.; Zhang, C.-H.; Liu, R. XGBoost-Based Algorithm Interpretation and Application on Post-Fault Transient Stability Status Prediction of Power System. IEEE Access 2019, 7, 13149–13158.
46. Song, Y.; Li, H.; Xu, P.; Liu, D.; Cheng, S. A Method of Intrusion Detection Based on WOA-XGBoost Algorithm. Discret. Dyn. Nat. Soc. 2022, 2022, 5245622.
47. Qian, C.; Li, W.; Wei, S.; Sun, B.; Ren, Y. Fatigue reliability evaluation for impellers with consideration of multi-source uncertainties using a WOA-XGBoost surrogate model. Qual. Reliab. Eng. Int. 2024, 40, 3193–3211.
48. Qiu, Y.; Zhou, J.; Khandelwal, M.; Yang, H.; Yang, P.; Li, C. Performance evaluation of hybrid WOA-XGBoost, GWO-XGBoost and BO-XGBoost models to predict blast-induced ground vibration. Eng. Comput. 2022, 38, 4145–4162.
49. Tran, T.T.K.; Janizadeh, S.; Bateni, S.M.; Jun, C.; Kim, D.; Trauernicht, C.; Rezaie, F.; Giambelluca, T.W.; Panahi, M. Improving the prediction of wildfire susceptibility on Hawai’i Island, Hawai’i, using explainable hybrid machine learning models. J. Environ. Manag. 2024, 351, 119724.
50. Cordero, J.M.; Hingorani, R.; Jimenez-Relinque, E.; Grande, M.; Borge, R.; Narros, A.; Castellote, M. NOX removal efficiency of urban photocatalytic pavements at pilot scale. Sci. Total Environ. 2020, 719, 137459.
51. Wang, C.; Wang, T.; Wang, P. The Spatial-Temporal Variation of Tropospheric NO2 over China during 2005 to 2018. Atmosphere 2019, 10, 444.
52. Si, Y.; Wang, H.; Cai, K.; Chen, L.; Zhou, Z.; Li, S. Long-term (2006–2015) variations and relations of multiple atmospheric pollutants based on multi-remote sensing data over the North China Plain. Environ. Pollut. 2019, 255, 113323.
53. Ali, M.A.; Assiri, M.E.; Islam, M.N.; Bilal, M.; Ghulam, A.; Huang, Z. Identification of NO2 and SO2 over China: Characterization of polluted and hotspots Provinces. Air Qual. Atmos. Health 2024, 17, 2203–2221.
54. Yao, Y.; He, C.; Li, S.; Ma, W.; Li, S.; Yu, Q.; Mi, N.; Yu, J.; Wang, W.; Yin, L.; et al. Properties of particulate matter and gaseous pollutants in Shandong, China: Daily fluctuation, influencing factors, and spatiotemporal distribution. Sci. Total Environ. 2019, 660, 384–394.
55. Huang, Q.; Cai, X.; Song, Y.; Zhu, T. Air stagnation in China (1985–2014): Climatological mean features and trends. Atmos. Chem. Phys. 2017, 17, 7793–7805.
56. Li, M.; Xu, J.; Liu, H.; Gong, A.; Du, X. PM2.5 source apportionment over Jinan Metropolitan Area. J. Environ. Eng. Technol. 2021, 11, 209–216.
57. He, G.; Jiang, W.; Gao, W.; Lu, C. Unveiling the Spatial-Temporal Characteristics and Driving Factors of Greenhouse Gases and Atmospheric Pollutants Emissions of Energy Consumption in Shandong Province, China. Sustainability 2024, 16, 1304.
58. Jiang, P.; Chen, X.; Li, Q.; Mo, H.; Li, L. High-resolution emission inventory of gaseous and particulate pollutants in Shandong Province, eastern China. J. Clean. Prod. 2020, 259, 120806.
Figure 1. Map of the study area.
Figure 2. Schematic of WOA-XGBoost.
Figure 3. Annual variation of NO2 VCD in Shandong from 2005 to 2023. Black denotes NO2 VCD, the red line denotes the fitted NO2 VCD trend, and the blue line denotes NOX emissions.
Figure 4. Seasonal and monthly variations for NO2 VCD in Shandong. (a) Seasonal variation; (b) Monthly variation.
Figure 5. Spatial averaged distribution of NO2 VCD in Shandong from 2005 to 2023.
Figure 6. Average spatial distribution of NO2 VCD in Shandong for different seasons. (a) Spring (March, April, and May); (b) Summer (June, July, and August); (c) Autumn (September, October, and November); (d) Winter (December, January, and February).
Figure 7. Comparison between predictions from WOA-XGBoost and MEIC on 11 November 2020. The color of each dot denotes normalized data density. n denotes the sample size. The dashed line denotes the 1:1 line, and the solid line denotes the best–fit line from linear regression.
Figure 8. Spatial comparison between predictions from WOA-XGBoost and MEIC on 11 November 2020. (a) True value from MEIC; (b) Prediction from WOA-XGBoost; (c) Difference (prediction minus true value).
Figure 9. Comparison between prediction from WOA-XGBoost and MEIC in 2019.
Figure 10. Comparison between prediction from WOA-XGBoost and MEIC for 2019. (a) Average for different months; (b) Average for different prefecture–level cities.
Figure 11. Spatial distribution of NOX emission rates for 2021 and 2022. (a) 2021; (b) 2022; (c) Difference (2022 minus 2021).
Figure 12. Averaged NOX emission rates for sixteen prefecture–level cities in Shandong Province from 2021 to 2022.
Table 1. Summary of previous studies on machine learning used for NOX emission retrieval.

| Ref. | Model | Contributions | Limitations |
| --- | --- | --- | --- |
| [7] | NN | The updated emissions by the model can improve the accuracy of CTM simulation. | Requires backpropagation algorithm to update emission inventory. |
| [30] | CNN + LSTM | The retrieved total NOX emissions in 2019 were highly consistent with prior emissions. | Insufficient spatial refinement. |
| [12] | VAE | Successfully corrected NOX emission underestimation in rural areas and overestimation in urban areas. | Requires CTM results to generate training datasets. |
| [8] | eBPNN | The proposed framework is sufficiently flexible to correct emissions. | Requires CTM results to generate training datasets. |
Table 2. Hyperparameters optimized using WOA for XGBoost.

| Hyperparameters | Description | Range | Optimization Value |
| --- | --- | --- | --- |
| learning_rate | Boosting learning rate | [0.01, 0.5] | 0.12 |
| max_depth | Maximum tree depth for base learners | [5, 20] | 20 |
| subsample | Subsample ratio of the training instance | [0.01, 1] | 1 |
| colsample_bytree | Subsample ratio of columns when constructing each tree | [0.5, 1] | 1 |
| gamma | Minimum loss reduction required to make a further partition on a leaf | [0, 5] | 0 |
| reg_alpha | L1 regularization term on weights | [0, 5] | 0.006 |
| reg_lambda | L2 regularization term on weights | [0, 5] | 0 |
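To illustrate how a search over ranges like those in Table 2 can work, the following is a minimal sketch of a basic Whale Optimization loop in plain Python, following the general scheme of Mirjalili and Lewis [37]. It is not the authors' implementation. Three assumptions are made: the objective is a toy quadratic standing in for a cross–validated error (its optimum `TOY_OPTIMUM` is hypothetical), the coefficients A and C are drawn once per whale rather than per dimension, and the best solution is updated greedily after each move. Only three of the seven ranges are used, and in a real run `max_depth` would be rounded to an integer before training XGBoost.

```python
import math
import random


def whale_optimize(objective, bounds, n_whales=20, n_iter=100, seed=0):
    """Minimise `objective` over the box `bounds` with a basic WOA loop."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    scores = [objective(w) for w in whales]
    best_idx = min(range(n_whales), key=scores.__getitem__)
    best, best_score = list(whales[best_idx]), scores[best_idx]

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for k in range(n_whales):
            x = whales[k]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best solution (|A| < 1) or explore around a random whale.
                target = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [tg - A * abs(C * tg - xi) for tg, xi in zip(target, x)]
            else:
                # Spiral (bubble-net) move around the best solution, with shape constant b = 1.
                l = rng.uniform(-1.0, 1.0)
                new = [abs(bi - xi) * math.exp(l) * math.cos(2.0 * math.pi * l) + bi
                       for bi, xi in zip(best, x)]
            whales[k] = clip(new)
            score = objective(whales[k])
            if score < best_score:
                best, best_score = list(whales[k]), score
    return best, best_score


# Ranges for learning_rate, max_depth, and subsample, taken from Table 2.
BOUNDS = [(0.01, 0.5), (5.0, 20.0), (0.01, 1.0)]
# Hypothetical optimum of the toy objective (NOT the values reported in Table 2).
TOY_OPTIMUM = [0.1, 12.0, 0.8]


def toy_objective(params):
    """Range-normalised squared distance to TOY_OPTIMUM; a stand-in for CV error."""
    return sum(((p - o) / (hi - lo)) ** 2
               for p, o, (lo, hi) in zip(params, TOY_OPTIMUM, BOUNDS))


best, best_score = whale_optimize(toy_objective, BOUNDS)
```

In practice the objective would train an XGBoost model with the candidate hyperparameters and return a validation error, which makes each evaluation far more expensive than this toy function.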
Table 3. Input variables for WOA-XGBoost.

| WOA-XGBoost Input Variable | Unit | Data Source | Description |
| --- | --- | --- | --- |
| NO2 VCD from days t and t − 1 | 10^15 molec/cm2 | POMINO-TROPOMI | Variation in NOX column concentration |
| Boundary layer height | m | ERA5 | Influencing NOX vertical diffusion |
| Surface net downward shortwave flux | J/m2 | ERA5 | Influencing NOX photolytic transformation |
| 10 m_U wind | m/s | ERA5 | Influencing NOX advective diffusion |
| 10 m_V wind | m/s | ERA5 | Influencing NOX advective diffusion |
| 2 m_temperature | K | ERA5 | Influencing NOX photolytic and heterogeneous reactions |
| 2 m_relative humidity | % | ERA5 | Influencing NOX heterogeneous reactions and wet deposition |
| NO2 horizontal flux divergence | s/m | POMINO-TROPOMI and ERA5 | NOX advective diffusion |
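The last input in Table 3 combines the satellite column with the wind field. As a rough illustration, the horizontal flux divergence of a column V under winds (u, v) can be written as ∂(uV)/∂x + ∂(vV)/∂y and approximated with central differences. The sketch below assumes a regular grid with constant spacing `dx` and `dy` in metres. The discretization, function name, and sample values are assumptions for illustration, since this section does not state how the authors computed the term.

```python
def flux_divergence(vcd, u, v, dx, dy):
    """Central-difference divergence of the flux (u*vcd, v*vcd).

    vcd, u, v are 2-D lists indexed [row][col]; rows run along y, columns along x.
    Edge cells are returned as None because no centred stencil is available there.
    """
    ny, nx = len(vcd), len(vcd[0])
    fx = [[u[j][i] * vcd[j][i] for i in range(nx)] for j in range(ny)]
    fy = [[v[j][i] * vcd[j][i] for i in range(nx)] for j in range(ny)]
    div = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            dfx = (fx[j][i + 1] - fx[j][i - 1]) / (2.0 * dx)
            dfy = (fy[j + 1][i] - fy[j - 1][i]) / (2.0 * dy)
            div[j][i] = dfx + dfy
    return div


# Hypothetical 3 x 3 example: the column increases eastward and the wind is purely zonal.
vcd = [[1.0 + i for i in range(3)] for _ in range(3)]  # arbitrary column units
u = [[2.0] * 3 for _ in range(3)]                      # m/s
v = [[0.0] * 3 for _ in range(3)]                      # m/s
div = flux_divergence(vcd, u, v, dx=1000.0, dy=1000.0)
```

On a latitude–longitude grid the spacing varies with latitude, so `dx` would have to be computed per row rather than passed as a constant.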
Table 4. Ten–fold CV results of XGBoost and WOA-XGBoost.

| Model | R2 | RMSE (μg/m2/s) | MAE (μg/m2/s) |
| --- | --- | --- | --- |
| XGBoost | 0.97 | 0.11 | 0.06 |
| WOA-XGBoost | 0.99 | 0.03 | 0.007 |
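For readers who want to reproduce scores of the kind reported in Table 4, the sketch below computes R2, RMSE, MAE, and the Pearson correlation r from paired reference and predicted values using their standard definitions. It is assumed, not confirmed by this section, that the paper uses these same definitions. The sample values are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, and Pearson r for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": abs_err / n,
        "r": cov / math.sqrt(ss_tot * var_p),
    }


# Hypothetical emission rates in ug/m2/s (illustrative values, not from the study).
reference = [0.2, 0.5, 0.8, 1.1, 1.4]
predicted = [0.25, 0.45, 0.8, 1.0, 1.3]
metrics = regression_metrics(reference, predicted)
```

In a 10–fold setting, the same function would be applied to the held-out predictions of each fold, or to the pooled held-out predictions, depending on the reporting convention.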
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, T.; Zhao, J.; Li, R.; Tian, Y. Retrieval and Evaluation of NOX Emissions Based on a Machine Learning Model in Shandong. Sustainability 2025, 17, 6100. https://doi.org/10.3390/su17136100

