Optimizing CO₂ Concentrations and Emissions Based on the WRF-Chem Model Integrated with the 3DVAR and EAKF Methods

Liu, Wenhao; Ling, Xiaolu; Li, Chenggang; He, Botao

doi:10.3390/rs18010174


School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China


Remote Sens. 2026, 18(1), 174; https://doi.org/10.3390/rs18010174

Submission received: 20 November 2025 / Revised: 24 December 2025 / Accepted: 1 January 2026 / Published: 5 January 2026

(This article belongs to the Special Issue Advanced Remote Sensing Approaches for Multi-Scale Atmospheric Components Monitoring and Impact Assessment)


Highlights

What are the main findings?

Developed an innovative approach to assimilate multi-source carbon satellite fused datasets in the WRF-Chem model.
Proposed a method to quantify emission errors using concentration differences between simulation and assimilation results.

What are the implications of the main findings?

Assimilation of carbon satellite data improves the accuracy and reliability of CO₂ concentration simulations.
This study provides a methodology for dynamic optimization of regional carbon emission inventories.

Abstract

This study developed a multi-source data assimilation system based on the WRF-Chem model integrated with the 3DVAR and EAKF methods. By assimilating a multi-source satellite fused XCO₂ concentration dataset, the system simultaneously optimized CO₂ concentration fields and emission fluxes over China. During the December 2019 experiment, the system reconstructed high-precision CO₂ concentration fields and dynamically corrected the MEIC inventory through emission error inversion derived from concentration differences before and after assimilation. Comparative analysis with the EDGAR inventory demonstrated the superior performance of the EAKF method, which reduced RMSE by 56% and increased the correlation coefficient to 0.360, while the 3DVAR method achieved a 9% RMSE reduction and improved the correlation coefficient to 0.294. In terms of total emissions, 3DVAR and EAKF increased national emissions by 13.6% and 5.1%, respectively, but reduced emissions in Xinjiang by 3.24 MT and 7.99 MT, respectively. A comparison of three simulation scenarios (prior emissions, 3DVAR-optimized, and EAKF-optimized) showed significant improvement over the EGG4 dataset, with systematic bias decreasing by approximately 75% and RMSE reduced by about 49%. The assimilation algorithm developed in this study provides reliable methodological support for regional carbon monitoring and can be extended to multi-pollutant emissions and high-resolution satellite data integration.

Keywords:

WRF-Chem; CO₂ concentration; CO₂ emissions; data assimilation; atmospheric model

1. Introduction

Carbon dioxide (CO₂) is the most significant greenhouse gas influencing the global climate. Although an individual CO₂ molecule traps less heat than methane and other gases, its high atmospheric concentration makes it the largest contributor to the greenhouse effect. Since the Industrial Revolution, human activities have significantly increased greenhouse gas emissions, which are now the main driver of global warming [1,2]. Climate change has led to more frequent extreme weather events, increased pressure on ecosystems, accelerated glacier melt, and rising sea levels, posing threats to coastal regions. Therefore, establishing an accurate greenhouse gas monitoring and research system is essential to track emission sources and trends, providing a scientific basis for developing response strategies [3,4].

Currently, CO₂ concentration monitoring technologies have evolved into an integrated system combining ground-based observations, remote sensing, and numerical modeling [5]. However, each monitoring method has its limitations. Ground-based observations offer high accuracy but suffer from limited spatial coverage, while remote sensing provides broad coverage yet is susceptible to atmospheric interference [6]. In contrast, atmospheric chemistry models, by integrating observational data with physical process simulations, can achieve higher spatial resolution and capture CO₂ distribution characteristics at different altitudes [7,8]. Atmospheric chemistry models are mainly divided into global-scale and regional-scale categories. As a representative model within the Weather Research and Forecasting (WRF) framework that couples chemical processes online, WRF-Chem has been widely applied in recent years [9,10,11,12,13]. Ballav et al. [14] used the WRF-Chem model to analyze drivers of CO₂ concentration variations across different time scales. They found that diurnal variations are mainly influenced by terrestrial carbon flux and boundary layer height, while weather-scale changes are primarily driven by the spatial distribution of surface fluxes and wind direction. Liu et al. [15] simulated atmospheric CO₂ concentrations in the Beijing-Tianjin-Hebei region in 2015 using WRF-Chem. Comparison with GOSAT satellite data showed that the model systematically overestimated concentrations by 2–3 ppm, which may be due to overestimation of tropospheric CO₂ in the model. Dong et al. [16] applied a coupled WRF-Chem and VPRM to study atmospheric CO₂ concentrations across China from 2016 to 2018. They found that spatiotemporal variations were mainly influenced by anthropogenic emissions, while seasonal fluctuations were primarily governed by terrestrial fluxes and the CO₂ background field. Seo et al. [17] simulated atmospheric CO₂ concentrations over East Asia from 2009 to 2018 using WRF-Chem at a high spatial resolution of 9 km. Based on the simulations, they constructed a comprehensive dataset that includes multi-source emissions and biospheric CO₂ fluxes. In summary, thanks to its advantages such as online coupling of meteorological fields and chemical processes, as well as adjustable spatial resolution, the WRF-Chem model has been widely applied in regional atmospheric CO₂ concentration simulation studies. Particularly in East Asia and China, a series of refined modeling practices and multi-level observational data validations have demonstrated that the model can reliably capture the spatiotemporal distribution characteristics of CO₂ concentrations.

To more accurately simulate CO₂ concentrations, data assimilation techniques that integrate observational data with model simulations have emerged [18,19,20,21]. In recent years, data assimilation studies based on the WRF-Chem model have significantly improved the estimation accuracy of carbon concentrations at regional scales. Seo et al. [22] used the WRF-Chem model with three-dimensional variational data assimilation (3DVAR) to investigate the impact of meteorological data assimilation on high-resolution CO₂ simulations in East Asia. Subsequently, Seo et al. [23] applied the WRF-Chem/DART coupled system with the Ensemble Adjustment Kalman Filter (EAKF) algorithm to assimilate surface CO₂ concentration observations, simulating atmospheric CO₂ concentrations in East Asia during January (winter) and July (summer) of 2019, and systematically analyzed the spatiotemporal distribution characteristics of CO₂ in different seasons. Zhang et al. [24] extended the Data Assimilation Research Testbed (DART) system, integrated it with the WRF-Chem model, and used the EAKF method to assimilate OCO-2 satellite XCO₂ data, estimating atmospheric CO₂ concentrations in the Midwestern United States. Building on Zhang et al. [24], Jin et al. [25] coupled OCO-2 satellite observations with the WRF-Chem/DART atmospheric transport model to assimilate and invert CO₂ concentrations and fluxes in Lisbon, Portugal. However, existing studies still face significant challenges. Whether based on surface observations or single-satellite platforms (e.g., OCO-2) providing XCO₂ data, currently available assimilation data sources remain limited in spatiotemporal coverage, suffer from severe data scarcity, and often exhibit spatial discontinuity and temporal sparsity. 
Furthermore, the specific application scenarios and practical utility of CO₂ concentration datasets generated through such assimilation methods require further exploration, as does the scientific value and supporting role of related achievements in CO₂ emission research.

Emission inventories serve as critical input data for air quality modeling and pollution control, and their accuracy directly impacts the effectiveness of numerical forecasting and the development of emission reduction strategies [26,27]. Currently, the “bottom-up” approach is the mainstream method for constructing emission inventories. This method aggregates activity data and emission factors by sector and process to build annual or monthly total emissions at regional or national scales [28]. However, when disaggregating these macro-level totals into hourly gridded data required by models, significant uncertainties arise due to insufficient energy statistics, limited representativeness of emission factors, and imperfect temporal profile construction [29]. These uncertainties become a key source of error affecting the accuracy of air quality models. To overcome the inherent limitations of the “bottom-up” approach, “top-down” data assimilation techniques have been increasingly applied in recent years to optimize emission inventories of atmospheric pollutants. This category of methods uses chemical transport models as a bridge to integrate multi-source observational data (e.g., ground-based monitoring and satellite remote sensing) with model simulations. By establishing the response relationship between concentration and emissions, it dynamically constrains and optimizes emission fluxes [30,31]. This study improves upon existing emission calculation methods for atmospheric pollutants and successfully extends their application to atmospheric CO₂, achieving more accurate inversion of regional carbon emissions.

Therefore, this study employs the WRF-Chem regional atmospheric chemistry model to assimilate multi-platform satellite-derived CO₂ concentration data, establishing an assimilation system capable of simultaneously optimizing CO₂ concentrations and emission sources. The system enables hourly frequency optimization updates, effectively enhancing the spatiotemporal continuity of assimilation results. By integrating multi-source data, it significantly reduces uncertainties associated with assimilating single observational datasets. Furthermore, the system inverts concentration changes during the assimilation process into emission fluxes, enabling the correction of existing carbon emission inventories and providing a new technical approach for carbon emission monitoring. In this paper, we first used the WRF-Chem model to simulate the concentration distribution for December 2019 as a baseline experiment. Subsequently, we conducted two parallel assimilation experiments (3DVAR and EAKF), assimilating multi-source satellite observations into the model at 6 h intervals to reconstruct a regional CO₂ concentration field. Based on this reconstruction, we inverted the systematic discrepancies between simulated and assimilated concentrations to derive CO₂ emission errors, ultimately achieving dynamic optimization and precise correction of existing emission inventories. This system not only improves the accuracy of regional CO₂ concentration simulations but also provides crucial methodological support for dynamically assessing and optimizing carbon emission inventories.

2. Materials and Methods

2.1. Study Area

This experiment selects the region 16°N–52°N, 81°E–123°E as the study area, as shown in Figure 1a. It covers the entire territory of China and the surrounding East Asian areas. Characterized by complex topography and diverse ecosystem types, the region also serves as a major source of global anthropogenic CO₂ emissions, exhibiting significant spatial heterogeneity and strong emission dynamics. Key target areas, including the Beijing-Tianjin-Hebei (BTH) region, the Yangtze River Delta (YRD), the Pearl River Delta (PRD), Central China (CC), Xinjiang (XJ), and Northeast China (NEC), are designated as focal zones to assess their carbon emission characteristics and to validate the applicability of the assimilation method under different emission backgrounds.

2.2. Data

2.2.1. Input Data for WRF-Chem

Meteorological data were obtained from the Final Operational Global Analysis (FNL) dataset provided by the National Centers for Environmental Prediction (NCEP). This dataset is a high-resolution meteorological repository offering global coverage of atmospheric parameters and state variables. Derived from numerical weather prediction model outputs, the FNL data include multiple variables such as temperature, humidity, wind speed, and precipitation, and are widely used in climate research, weather forecasting, and environmental monitoring. The FNL data used in this study have a spatial resolution of 1° × 1° and a temporal resolution of 6 h. Data were accessed from https://gdex.ucar.edu/datasets/d083002/ (accessed on 24 October 2025).

Anthropogenic CO₂ emissions were taken from the Multi-resolution Emission Inventory for China (MEIC) developed by Tsinghua University. This inventory provides high-resolution data on greenhouse gases and air pollutants, with modeling and analysis specifically tailored to China’s emission characteristics. The MEIC CO₂ emission data used here have a spatial resolution of 0.25° × 0.25° and a monthly temporal resolution. The MEIC reports anthropogenic emissions from sources in five sectors (power, industry, residential, transportation, and agriculture). The meic2wrf tool was utilized in the study to perform the spatial interpolation and allocation of CO₂ emissions from the inventory onto the WRF-Chem model grid. Data were obtained from http://meicmodel.org.cn (accessed on 24 October 2025). The meic2wrf tool was sourced from https://github.com/jinfan0931/meic2wrf (accessed on 24 October 2025). Figure 1b shows the MEIC emission for December 2019.

Biogenic CO₂ emissions were calculated using the Vegetation Photosynthesis and Respiration Model (VPRM) [32]. The VPRM parameterizes the ecophysiological characteristics of various vegetation types—such as forests, grasslands, and croplands—to accurately simulate vegetation photosynthesis (CO₂ uptake) and respiration (CO₂ release). By accounting for seasonal variations, climatic conditions (e.g., temperature, precipitation, and solar radiation), and land use types, the model dynamically captures the spatiotemporal variations in biogenic CO₂ fluxes.

The oceanic CO₂ emission fluxes were derived from the Japan Meteorological Agency (JMA) Ocean Map, which provides high-resolution, data-constrained estimates of air–sea CO₂ exchange. Specifically, the regional version of the dataset was used, offering a fine spatial resolution of 0.25° × 0.25° over the western North Pacific. The JMA data were collected from https://www.data.jma.go.jp/kaiyou/english/co2_flux/co2_flux_data_en.html (accessed on 24 October 2025).

Initial and boundary conditions for CO₂ were derived from the CarbonTracker 2022 dataset. CarbonTracker is an open-data product released by the National Oceanic and Atmospheric Administration (NOAA) Global Monitoring Laboratory (GML). Based on observations from the NOAA GML greenhouse gas monitoring network and its partner institutions, the dataset provides three-dimensional CO₂ concentrations with a temporal resolution of 3 h and a spatial resolution of 3° × 2°. Data were downloaded from http://carbontracker.noaa.gov (accessed on 24 October 2025).

2.2.2. XCO₂ Fusion Dataset from Satellite Observation

The satellite remote sensing data used in this study were obtained from the multi-source satellite fused XCO₂ concentration dataset developed by Jin et al. [33]. This dataset integrates global XCO₂ observations from multiple satellites (including GOSAT/GOSAT-2 and OCO-2/3) and generates a high-resolution, high-temporal-frequency global XCO₂ product through data fusion and interpolation methods. Covering the period from January 2003 to August 2020, the dataset has a spatial resolution of 0.5° × 0.5° and a temporal resolution of 3 h. By applying the Maximum Likelihood Estimation (MLE) and Optimal Interpolation (OI) methods to fuse multi-source XCO₂ data, the merged product combines the accuracy and coverage advantages of different satellite algorithm products, achieving an average daily coverage rate of 27.49% and enabling more refined capture of global XCO₂ spatiotemporal variations. The dataset demonstrates high accuracy against TCCON (Total Carbon Column Observing Network) data, achieving an R of 0.96, RMSE of 2.62 ppm, and MAE of 1.53 ppm.

Since the fused XCO₂ data used for assimilation represent column concentrations, their relationship with the CO₂ concentration at each layer in WRF-Chem is calculated as follows [34]:

XCO_{2} = \frac{\sum_{i = 1}^{n} (CO_{2, i} \cdot \Delta p_{i})}{p_{bottom} - p_{top}}

(1)

where n is the total number of vertical layers in WRF-Chem, i denotes the ith model layer, CO_{2, i} is the CO₂ concentration in the ith layer, \Delta p_{i} represents the pressure thickness of the ith layer, and p_{bottom} and p_{top} indicate the bottom-level and top-level pressures of the model, respectively. The calculation method defined in Equation (1) serves as the standard conversion for all transformations between column and layered concentrations throughout this study.
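As an illustration of Equation (1), the following minimal Python sketch computes a pressure-weighted column average from layered concentrations. The function name `column_average_xco2` and the choice of layer-interface pressures as input are assumptions made for this example and are not taken from the WRF-Chem or satellite processing code; averaging kernels are not considered.

```python
def column_average_xco2(co2_layers, p_interfaces):
    """Pressure-weighted column average following Equation (1).

    co2_layers   : CO2 mole fraction (ppm) in each model layer, bottom to top.
    p_interfaces : pressures at the layer interfaces, bottom to top
                   (one more value than there are layers, any consistent unit).
    """
    if len(p_interfaces) != len(co2_layers) + 1:
        raise ValueError("need one more interface pressure than layers")
    # Pressure thickness of each layer (positive when pressure decreases upward).
    dp = [p_interfaces[i] - p_interfaces[i + 1] for i in range(len(co2_layers))]
    weighted_sum = sum(c * d for c, d in zip(co2_layers, dp))
    return weighted_sum / (p_interfaces[0] - p_interfaces[-1])


# Hypothetical three-layer profile (values chosen only for illustration).
profile_ppm = [415.0, 411.0, 409.0]
interfaces_hpa = [1000.0, 700.0, 300.0, 50.0]
xco2 = column_average_xco2(profile_ppm, interfaces_hpa)
print(f"XCO2 = {xco2:.2f} ppm")
```

Because the weights are pressure thicknesses, a vertically uniform profile returns its own value, and thick layers dominate the column average.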

2.2.3. Observations for Validation

The data used in this study for validating CO₂ concentration simulations and assimilation results were obtained from continuous measurements at multiple sites of TCCON, a global high-precision observation network, as well as extensive historical and real-time observational data provided by the World Data Centre for Greenhouse Gases (WDCGG) global data integration and sharing platform.

The study area includes two TCCON sites—Hefei (31.91°N, 117.17°E) and Xianghe (39.8°N, 116.96°E)—and two WDCGG stations—WLG (Mt. Waliguan, 36.29°N, 100.90°E) and HKO (King’s Park, 22.31°N, 114.17°E). Detailed information is provided in Table 1. For validation, daily XCO₂ data from each station in December 2019 were compared with model-simulated XCO₂ values derived from layered CO₂ concentration outputs. The TCCON data were accessed from https://tccondata.org (accessed on 24 October 2025), and the WDCGG data were obtained from https://gaw.kishou.go.jp (accessed on 24 October 2025).

2.3. Methods

2.3.1. WRF-Chem Model

WRF-Chem is an atmospheric modeling system developed by NOAA’s Forecast Systems Laboratory that simultaneously simulates meteorological and chemical processes [35]. This study used version 4.2.2, configured with a single 27 km domain that encompasses nearly all of China. The configuration included 29 vertical layers from the surface to 50 hPa, a 40 s time step, and a Lambert conformal projection centered at 35°N, 102°E. CO₂ simulations employed chem_opt = 16 with hourly output. Physical and chemical parameterization schemes are detailed in Table 2 [36].

2.3.2. 3DVAR Method

This study employs the WRF-Chem model with 3DVAR data assimilation to optimize CO₂ concentration fields by integrating observations with model simulations. The method estimates optimal CO₂ states by minimizing the following cost function [44]:

J (x) = \frac{1}{2} {(x - x^{b})}^{T} B^{- 1} (x - x^{b}) + \frac{1}{2} {(H (x) - y^{o})}^{T} R^{- 1} (H (x) - y^{o})

(2)

where x represents the CO₂ state variable, x^{b} is the background field, B denotes the background error covariance matrix, y^{o} corresponds to the CO₂ observations, H is the observation operator that maps model states to observation space, and R signifies the observation error covariance matrix.

The observation operator H in this study uses bilinear interpolation to map the 27 km model CO₂ fields to the 0.5° × 0.5° satellite observation locations, and then performs pressure-weighted vertical integration of the interpolated layered concentrations into a column-averaged dry-air mole fraction, consistent with the satellite product definition. The observation error covariance matrix R is constructed based on the reported uncertainties of the multi-source fused XCO₂ dataset [33]. Specifically, we used the per-pixel uncertainty estimates provided in the dataset, which account for retrieval errors under varying conditions (e.g., aerosol loading, surface albedo). These uncertainties are treated as uncorrelated in space for simplicity, resulting in a diagonal R matrix.
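The horizontal part of the observation operator and the diagonal R can be sketched as follows. This is a minimal Python illustration that assumes a regular grid addressed by fractional indices; `bilinear` and `diagonal_r` are hypothetical helper names, and the sample numbers are invented.

```python
import math


def bilinear(field, x, y):
    """Bilinearly interpolate a 2-D field (list of rows, at least 2 x 2)
    at fractional grid indices x (column direction) and y (row direction)."""
    i0, j0 = int(math.floor(x)), int(math.floor(y))
    # Clamp so that points on the upper edge still have a valid neighbour.
    i0 = min(max(i0, 0), len(field[0]) - 2)
    j0 = min(max(j0, 0), len(field) - 2)
    tx, ty = x - i0, y - j0
    bottom = (1 - tx) * field[j0][i0] + tx * field[j0][i0 + 1]
    top = (1 - tx) * field[j0 + 1][i0] + tx * field[j0 + 1][i0 + 1]
    return (1 - ty) * bottom + ty * top


def diagonal_r(sigmas):
    """Diagonal of R: the variance (sigma squared) of each observation,
    with no correlation between different observations."""
    return [s * s for s in sigmas]


# Hypothetical 2 x 2 patch of model XCO2 (ppm) and one observation location.
patch = [[410.0, 412.0],
         [411.0, 415.0]]
model_equivalent = bilinear(patch, 0.5, 0.5)
r_diag = diagonal_r([1.0, 2.0])
```

Storing only the diagonal is what the uncorrelated-error assumption buys: applying R⁻¹ reduces to an element-wise division by the variances.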

The background error covariance matrix B is constructed using the NMC method, which estimates error structures from differences between forecasts that are valid at the same time but initialized at different times. To solve the cost function efficiently, a control variable ν is introduced with the transformation δx = Uν, where δx = x - x^{b} is the analysis increment and U satisfies B = UU^{T}. U is decomposed into horizontal (U_{h}), vertical (U_{v}), and physical (U_{p}) components to handle error correlations across dimensions. In this study, the control variable corresponds to CO₂ concentration increments. Through the control variable transformation, the cost function can be reformulated as [44]

J (ν) = \frac{1}{2} ν^{T} ν + \frac{1}{2} {(H U ν - d)}^{T} R^{- 1} (H U ν - d)

(3)

where d = y^{o} - H(x^{b}) represents the observation increment. The optimal control variable is obtained by minimizing J(ν), which is then used to update the CO₂ analysis field.
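To make Equation (3) concrete, the sketch below minimizes J(ν) for a tiny problem by plain gradient descent. It is only an illustration under stated assumptions: the matrix `G` stands for the product HU, R is diagonal, the numbers are invented, and the minimizer actually used by the 3DVAR system is not described in this section.

```python
def cost_and_gradient(nu, G, r_diag, d):
    """Return J(nu) from Equation (3) and its gradient.

    G      : the product H U as a list of rows (observations x controls).
    r_diag : diagonal of R (observation error variances).
    d      : observation increment y^o - H(x^b).
    """
    # Residual H U nu - d for every observation.
    resid = [sum(g * v for g, v in zip(row, nu)) - dk for row, dk in zip(G, d)]
    j_b = 0.5 * sum(v * v for v in nu)
    j_o = 0.5 * sum(r * r / var for r, var in zip(resid, r_diag))
    # Gradient: nu + (H U)^T R^{-1} (H U nu - d).
    grad = [
        nu[k] + sum(G[m][k] * resid[m] / r_diag[m] for m in range(len(G)))
        for k in range(len(nu))
    ]
    return j_b + j_o, grad


def minimize(G, r_diag, d, step=0.4, iterations=200):
    """Fixed-step gradient descent from nu = 0 (step size tuned for this toy case)."""
    nu = [0.0] * len(G[0])
    for _ in range(iterations):
        _, grad = cost_and_gradient(nu, G, r_diag, d)
        nu = [v - step * g for v, g in zip(nu, grad)]
    return nu


# One observation, two control variables (illustrative values only).
G = [[1.0, 0.5]]
nu_opt = minimize(G, r_diag=[1.0], d=[2.0])
# The analysis increment in model space would then be delta_x = U nu_opt.
```

The background term reduces to a simple sum of squares because the transformation absorbs B into U, which is the practical reason for working in control-variable space.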

In this study, a 60 h pre-processing phase is implemented prior to model initiation to eliminate the spin-up effect of the initial field. The CO₂ concentration assimilation workflow is illustrated in Figure 2. The system performs analysis cycles every 6 h, continuously integrating the latest observational data to dynamically optimize the regional CO₂ concentration distribution. This process provides high-resolution data support for carbon emission assessment and dynamic monitoring.

The assimilation process requires mapping the influence of observations onto the model’s higher-resolution grid. In the 3DVAR assimilation framework, the influence of observational data on model grids is governed by a predefined background error covariance matrix. This matrix, constructed via the NMC method, determines the spatial range and strength of observation impacts, such as a typical horizontal influence radius of 150 km. When satellite column concentration data are assimilated, an adjoint operator works with this covariance matrix to distribute correction signals from observation locations to surrounding model grids according to preset spatial correlation functions. The multivariate covariance scheme adopted in this study employs a physically reasonable control variable transformation, ensuring that information propagation from coarse resolution observations to finer model grids aligns with atmospheric dynamics in both horizontal and vertical dimensions. This approach effectively addresses the technical challenge of constraining a 27 km resolution model grid using approximately 0.5° observational data.
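The way a static covariance spreads a single observation can be illustrated with the textbook single-observation case, in which the analysis increment at grid point j equals B_{jk} / (B_{kk} + σ_o²) times the observation increment at the observed point k. The sketch assumes a Gaussian correlation shape and treats the 150 km value as its length scale; both are assumptions for illustration, since the actual NMC-derived statistics are not given here.

```python
import math


def single_obs_increment(distance_km, innovation, sigma_b, sigma_o,
                         length_scale_km=150.0):
    """Analysis increment at a grid point `distance_km` away from a single
    observation, for a homogeneous Gaussian background error correlation."""
    correlation = math.exp(-0.5 * (distance_km / length_scale_km) ** 2)
    b_jk = sigma_b ** 2 * correlation   # covariance between grid point and obs point
    b_kk = sigma_b ** 2                 # background variance at the obs point
    return b_jk / (b_kk + sigma_o ** 2) * innovation


# Observation increment of +2 ppm with equal background and observation
# error standard deviations (made-up values).
for r in (0.0, 27.0, 150.0, 300.0):
    inc = single_obs_increment(r, innovation=2.0, sigma_b=1.0, sigma_o=1.0)
    print(f"{r:6.1f} km -> {inc:.3f} ppm")
```

The correction is largest at the observation location and decays smoothly with distance, so neighbouring 27 km grid cells receive similar but not identical adjustments from one 0.5° observation.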

2.3.3. EAKF Method

The Ensemble Adjustment Kalman Filter (EAKF), as a key member of the Ensemble Kalman Filter (EnKF) family, employs a unique deterministic “adjustment” strategy for state updating. Its core concept involves updating the CO₂ concentration field through deterministic transformation of ensemble members rather than stochastic perturbation.

The forecast phase remains consistent with the traditional EnKF approach, where each ensemble member is integrated forward using the WRF-Chem model to form the forecast ensemble of CO₂ concentrations [45]:

x_{f}^{i} = M (x_{a, i}^{t - 1}) + η^{i}, i = 1, \dots, N

(4)

where x_{f}^{i} represents the forecast CO₂ concentration field of the ith ensemble member, M denotes the WRF-Chem model, x_{a, i}^{t - 1} is the analyzed CO₂ concentration field from the previous time step, η^{i} is an optional model error term, and N is the ensemble size.

The analysis phase employs a serial observation processing strategy. First, the mean {\bar{x}}_{f} and covariance P_{f} of the forecast ensemble are calculated to characterize the probability distribution of the forecast CO₂ concentration field [45]:

{\bar{x}}_{f} = \frac{1}{N} \sum_{i = 1}^{N} x_{f}^{i}

(5)

P_{f} = \frac{1}{N - 1} \sum_{i = 1}^{N} (x_{f}^{i} - {\bar{x}}_{f}) {(x_{f}^{i} - {\bar{x}}_{f})}^{T}

(6)

The process then proceeds to the observation assimilation phase. Based on Bayes’ theorem, the EAKF optimally combines CO₂ observational data with model forecasts. It adopts a sequential processing approach, assimilating observations one by one to significantly reduce computational complexity. First, the state space is mapped to the observation space via the observation operator H, yielding the forecast ensemble in observation space [45]:

y_{f}^{i} = H x_{f}^{i}

(7)

Subsequently, the ensemble mean in observation space {\bar{y}}_{f}, the state-observation covariance P_{x y}, and the observation space covariance P_{y y} that includes observation error R are calculated [45]:

{\bar{y}}_{f} = \frac{1}{N} \sum_{i = 1}^{N} y_{f}^{i}

(8)

P_{x y} = \frac{1}{N - 1} \sum_{i = 1}^{N} (x_{f}^{i} - {\bar{x}}_{f}) {(y_{f}^{i} - {\bar{y}}_{f})}^{T}

(9)

P_{y y} = \frac{1}{N - 1} \sum_{i = 1}^{N} (y_{f}^{i} - {\bar{y}}_{f}) {(y_{f}^{i} - {\bar{y}}_{f})}^{T} + R

(10)

The Kalman gain is calculated as follows [45]:

K = \frac{P_{x y}}{P_{y y}}

(11)

This gain quantifies the sensitivity of the CO₂ concentration state to the observations. Finally, an ensemble adjustment is applied to each member [45]:

x_{a}^{i} = x_{f}^{i} + K (y_{0} - y_{f}^{i})

(12)

Unlike traditional stochastic perturbation methods, the EAKF does not add random noise to observations. Instead, it performs deterministic adjustments to each ensemble member: the CO₂ concentration field is corrected using the member’s forecast observation residual y_{0} - y_{f}^{i} and the Kalman gain K. This approach effectively avoids errors introduced by random sampling and is particularly suitable for small to medium-sized ensembles.
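The analysis step described by Equations (5)–(12) can be transcribed almost literally for one scalar observation. The sketch below follows the equations as written in this section and is not the DART implementation; the function name, the choice of observation operator (selecting one state element), and all numbers are assumptions.

```python
def assimilate_one_observation(ensemble, obs_index, y_obs, r_var):
    """Update an ensemble with one scalar observation, Equations (5)-(12).

    ensemble  : list of N members, each a list of state values.
    obs_index : state element observed (H simply selects this element).
    y_obs     : observed value; r_var : observation error variance R.
    """
    n = len(ensemble)
    n_state = len(ensemble[0])
    # Equation (7): forecast ensemble in observation space.
    y_f = [member[obs_index] for member in ensemble]
    # Equations (5) and (8): ensemble means.
    x_mean = [sum(m[k] for m in ensemble) / n for k in range(n_state)]
    y_mean = sum(y_f) / n
    # Equations (9) and (10): covariances (P_yy includes R).
    p_xy = [
        sum((m[k] - x_mean[k]) * (y - y_mean) for m, y in zip(ensemble, y_f)) / (n - 1)
        for k in range(n_state)
    ]
    p_yy = sum((y - y_mean) ** 2 for y in y_f) / (n - 1) + r_var
    # Equation (11): gain for a scalar observation.
    gain = [c / p_yy for c in p_xy]
    # Equation (12): adjust every member.
    return [
        [m[k] + gain[k] * (y_obs - y) for k in range(n_state)]
        for m, y in zip(ensemble, y_f)
    ]


# Three members, two state elements (ppm); only the first element is observed.
forecast = [[400.0, 410.0], [402.0, 412.0], [404.0, 414.0]]
analysis = assimilate_one_observation(forecast, obs_index=0, y_obs=406.0, r_var=4.0)
```

Serial processing then amounts to calling this function once per observation, passing each analysis ensemble on as the forecast ensemble for the next observation. The unobserved second element is updated purely through its ensemble covariance with the observed one.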

In this study, the EAKF algorithm achieves adaptive localization through sequential processing of CO₂ observations, effectively suppressing spurious correlations caused by sparse observations and model errors. Its deterministic adjustment mechanism maintains ensemble statistical consistency while eliminating the need for post-processing. Building upon the EnKF’s capability to handle nonlinearity, the algorithm significantly reduces sampling error and successfully resolves the computational bottleneck associated with covariance matrices in high-dimensional WRF-Chem systems.

Unlike 3DVAR, which relies on static statistical relationships, the EAKF method utilizes flow-dependent ensemble forecast error covariance to dynamically determine observational impacts on model grids. By running multiple parallel model forecasts as an ensemble, real-time estimation of dynamic statistical relationships between model states and observations is achieved. These relationships emerge naturally from atmospheric physical processes such as advection and diffusion during forecasting, accurately reflecting interactions between different spatial points under specific weather conditions. During analysis, the system automatically calculates the contribution weight of each satellite observation point to upstream, downstream, and surrounding model grids based on these dynamic relationships. Differences between observations and model forecasts are then intelligently interpolated and allocated back to higher resolution model grids according to these weather-dependent weights. This method provides an implicit flow-dependent adjoint capability, enabling rational and optimized spatiotemporal allocation of observational information under complex meteorological conditions without requiring explicit adjoint model coding.

2.3.4. Calculation of CO₂ Emission

Building upon the SO₂ emission inventory optimization method proposed by Hu et al. [46], we systematically refined and improved the inversion and correction approach for anthropogenic carbon emission inventories in WRF-Chem, adapting it to the physico-chemical characteristics of CO₂ and adjusting the core calculation logic from mass concentration-based computation applied to SO₂ to molar concentration-based computation tailored for CO₂.

Forecast errors of CO₂ concentrations in WRF-Chem primarily stem from uncertainties in emission sources, meteorological transport errors, and inaccuracies in the initial field. Given the chemical inertness and long atmospheric lifetime of CO₂, errors induced by chemical processes are negligible. Let the emission source error at the initial time (t = 0) be δE^{0} and the CO₂ concentration forecast error at the next time step (t = 1) be δx^{1}. In a 1 h forecast, if the initial field is relatively accurate through assimilation and errors induced by meteorological transport are minor, the grid-scale CO₂ concentration forecast error δx^{1} during this period can be primarily attributed to the emission source error δE^{0} from the preceding time step.

Based on this assumption, the CO₂ concentration forecast error can be translated into an estimation of emission source error. The 3DVAR-based optimization process for CO₂ emissions consists of six key steps:
(1) assimilating observations to optimize the background concentration field x^{0} at t = 0 into the analysis field x^{0 a};
(2) performing a one-hour WRF-Chem forecast initialized with x^{0 a} to obtain the forecast field x^{1} at t = 1;
(3) using x^{1} as the new background at t = 1 and applying 3DVAR with contemporaneous observations to produce the analysis field x^{1 a};
(4) calculating the difference between x^{1 a} and x^{1} to derive the CO₂ concentration forecast error (increment field) δx^{1};
(5) attributing δx^{1} primarily to the emission source error δE^{0} from t = 0 (i.e., δx^{1} \approx f(δE^{0})) under the assumption that other errors such as meteorological transport are negligible; and
(6) inverting δx^{1} to estimate the emission source error as δE^{0} \approx f^{-1}(δx^{1}), which is then added to the background emissions E^{0} to obtain the optimized emission source.
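The six steps above can be sketched as a single function. The three callables (`assimilate`, `forecast_1h`, `invert_increment`) are hypothetical placeholders for the 3DVAR analysis, the one-hour WRF-Chem forecast, and the inverse mapping f⁻¹; the scalar toy functions in the usage lines exist only to make the example runnable and do not represent the real system.

```python
def optimize_emissions_once(x0, e0, obs0, obs1,
                            assimilate, forecast_1h, invert_increment):
    """One pass through steps (1)-(6) for a single emission update."""
    x0_a = assimilate(x0, obs0)          # (1) analysis at t = 0
    x1 = forecast_1h(x0_a, e0)           # (2) one-hour forecast to t = 1
    x1_a = assimilate(x1, obs1)          # (3) analysis at t = 1
    dx1 = x1_a - x1                      # (4) forecast error (increment)
    de0 = invert_increment(dx1)          # (5)-(6) attribute to emissions and invert
    return e0 + de0                      # optimized emissions


# Toy scalar stand-ins (not the real model or assimilation system).
def toy_assimilate(background, obs):
    return 0.5 * (background + obs)      # equal weights, for illustration


def toy_forecast(x, emission):
    return x + 2.0 * emission            # concentration rises with emissions


def toy_invert(dx):
    return dx / 2.0                      # inverse of the toy forecast response


e_opt = optimize_emissions_once(400.0, 1.0, 402.0, 405.0,
                                toy_assimilate, toy_forecast, toy_invert)
```

In the toy case the forecast at t = 1 falls below the second observation, so the increment is positive and the prior emission is revised upward.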

Given the physicochemical characteristics of CO₂, its concentration forecast error can be converted into emission flux error through the following mass balance equation:

δ x^{1} \approx f (δ E^{0}) = \frac{M_{air} \times Δ t}{ρ \times Δ z} \times δ E^{0}

(13)

The calculation formula for the emission source error is

δ E^{0} \approx f^{- 1} (δ x^{1}) = \frac{ρ \times Δ z}{M_{air} \times Δ t} \times δ x^{1}

(14)

In the formula, M_{air} = 0.029 kg/mol represents the molar mass of air, ρ is the actual air density (unit: kg/m³), ρ_{air} = 1.29 kg/m³ is the air density under standard conditions, Δz refers to the height of the bottom model layer (unit: m), and Δt is the time step (unit: s). CO₂ concentration is expressed in ppm (μmol/mol), which denotes a mole fraction ratio.

The CO₂ forecast error δx^{1} is calculated as δx^{1} = x_{a}^{1} - x_{b}^{1}, where x_{b}^{1} is the forecast concentration field after one hour (x^{1} in the steps above), and x_{a}^{1} is the analysis field (x^{1 a} in the steps above) obtained by assimilating observational data with the background field x_{b}^{1}. The term x_{a}^{1} - x_{b}^{1} also represents the CO₂ concentration increment generated during the assimilation process. As a three-dimensional increment field derived from the assimilation of the background field, δx^{1} characterizes the one-hour forecast error of CO₂ concentration in the model. Therefore, the CO₂ concentration forecast error field corresponds to the hourly assimilation increment field, which can be used to further invert emission source errors and ultimately achieve an optimized estimation of emission sources.

2.3.5. Statistical Evaluation Indicators

In this study, three statistical metrics are employed to evaluate the model’s performance: the correlation coefficient (R), root mean square error (RMSE), and bias. R measures the strength and direction of the linear relationship between the model predictions and observations, indicating how well the simulated values track the observed variations. RMSE quantifies the magnitude of errors between predicted and observed values, while bias represents the average difference between the predictions and observations, serving as an indicator of systematic error. The formulas for these metrics are as follows [47,48,49]:

R = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2} \sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}

(15)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - x_{i})}^{2}}

(16)

Bias = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - x_{i})

(17)

where

x_{i}

denotes the ith observed value,

y_{i}

represents the ith model-predicted value,

\bar{x}

and

\bar{y}

are the mean values of the observations and model predictions, respectively, and n is the number of samples.
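Equations (15)–(17) can be sketched in a few lines of plain Python. The function name `evaluate` and the sample values are hypothetical and serve only to illustrate the sign convention (prediction minus observation).

```python
import math


def evaluate(obs, pred):
    """Return (R, RMSE, Bias) following Eqs. (15)-(17).

    obs  : observed values x_i
    pred : model-predicted values y_i
    """
    n = len(obs)
    x_bar = sum(obs) / n
    y_bar = sum(pred) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(obs, pred))
    var_x = sum((x - x_bar) ** 2 for x in obs)
    var_y = sum((y - y_bar) ** 2 for y in pred)
    r = cov / math.sqrt(var_x * var_y)
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(obs, pred)) / n)
    bias = sum(y - x for x, y in zip(obs, pred)) / n
    return r, rmse, bias


# Hypothetical sample: the model is uniformly 1 ppm too high
obs = [410.0, 412.0, 414.0]
pred = [411.0, 413.0, 415.0]
r, rmse, bias = evaluate(obs, pred)
print(r, rmse, bias)
```

A uniform offset leaves R at 1 while RMSE and bias both equal the offset, which shows why the three metrics are reported together.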

2.4. Experimental Design

This study employs 3DVAR and EAKF systems to optimize and conduct a comparative analysis of anthropogenic CO₂ emissions over China in December 2019. The inversion performance and applicability of the two methods are systematically evaluated based on forecast results using their respective optimized emission inventories. Key configurations of the CO₂ data assimilation emission experiments are detailed in Table 3. The accuracy of CO₂ concentration simulation and assimilation is fundamental to ensuring the reliability of optimized emission inventory results. The assimilation methods employed in this study have undergone systematic accuracy validation in prior work based on long-term observational data over China [50,51], thereby providing reliable technical support for the current emission inversion.

Prior to conducting the CO₂ concentration assimilation experiments, we performed a preliminary meteorological data assimilation to optimize the atmospheric state, following the approach described in our previous studies [50,51]. Meteorological variables, including wind, temperature, pressure, and relative humidity, were assimilated using observational data from surface stations and radiosondes. The assimilated meteorological fields exhibit high consistency with independent observations: R for wind speed ranges from 0.6 to 0.8, for temperature exceeds 0.96, for pressure falls between 0.99 and 1.0, and for relative humidity is above 0.85. This ensures that meteorological uncertainties exert minimal influence on the subsequent CO₂ inversion experiments. Following the meteorological assimilation, a control experiment (CTRL) using the MEIC inventory as prior emissions was first executed. The WRF-Chem model simulated the entire month of December 2019 to provide physical and chemical initial fields for subsequent assimilation, ensuring consistency in meteorological and chemical parameters.

The 3DVAR assimilation experiment was performed every 6 h, sequentially assimilating multi-source satellite remote sensing fused XCO₂ concentration data. The first assimilation window covered 1200–1800 UTC on 3 December. After assimilating observations within this period, the optimized analysis field at 1200 UTC was output as the updated concentration initial field, while the optimized emissions for 1200–1700 UTC were inverted. Subsequent windows advanced sequentially—for example, observations from 1800 to 2400 UTC were used to optimize the concentration field at 1800 UTC and emissions for 1800–2300 UTC. Prior emissions from 0000 UTC on 1 December to 1200 UTC on 3 December were sourced from the MEIC inventory. After 1200 UTC on 3 December, the prior emissions were dynamically updated using the previous day’s 3DVAR-optimized emissions.
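The cycling described above can be sketched as a simple schedule generator. The function name and the tuple layout are hypothetical; only the 6 h window length, the first analysis time, and the six emission hours per window come from the text.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)  # assimilation window length from the text


def assimilation_windows(first_analysis, n_cycles):
    """Return a list of (analysis_time, obs_window_end, emission_hours)."""
    cycles = []
    for k in range(n_cycles):
        analysis = first_analysis + k * WINDOW
        obs_end = analysis + WINDOW
        # Emissions are optimized for the six hourly slots starting at analysis time
        emission_hours = [analysis + timedelta(hours=h) for h in range(6)]
        cycles.append((analysis, obs_end, emission_hours))
    return cycles


first = datetime(2019, 12, 3, 12)  # 1200 UTC on 3 December 2019
for analysis, obs_end, hours in assimilation_windows(first, 3):
    print(
        analysis.strftime("%d %b %H%M"), "->", obs_end.strftime("%d %b %H%M"),
        "| emissions", hours[0].strftime("%H%M"), "to", hours[-1].strftime("%H%M"),
    )
```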

The EAKF assimilation experiment first generated 30 ensemble members by applying Gaussian perturbations with zero mean and 20% standard deviation to the prior CO₂ emissions to characterize emission uncertainty. A 60 h ensemble spin-up forecast (30 November–2 December) was then conducted to allow the system to reach a dynamic-chemical equilibrium state. Starting from 1200 UTC on 3 December, a 6 h cycling assimilation was initiated, using the EAKF to assimilate multi-source satellite XCO₂ concentration data. This process simultaneously updated both the CO₂ concentration field at the current analysis time and the emission fluxes within the preceding 6 h window. The cycling continued until 31 December, ultimately producing an optimized CO₂ emission inventory with a 6 h temporal resolution. The EAKF system applied the updated emissions from each assimilation cycle to the subsequent ensemble forecast, establishing a bidirectional feedback mechanism between emissions and concentrations. To ensure comparability, both the 3DVAR and EAKF experiments used identical initial conditions and WRF-Chem physical/chemical parameterization schemes.
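The ensemble generation step can be illustrated with a minimal sketch. The text specifies only the ensemble size (30), the zero mean, and the 20% standard deviation. The multiplicative form, the independence between grid cells, the clipping of negative values to zero, and the function name are all assumptions made here for illustration.

```python
import random


def perturb_emissions(prior, n_members=30, rel_std=0.20, seed=0):
    """Return n_members perturbed copies of a prior emission vector.

    Each cell is scaled by (1 + eps), eps ~ N(0, rel_std**2).
    Independent per-cell noise and clipping at zero are assumptions.
    """
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        member = [max(0.0, e * (1.0 + rng.gauss(0.0, rel_std))) for e in prior]
        members.append(member)
    return members


# Hypothetical prior emissions for four grid cells (arbitrary units)
prior = [0.0, 10.0, 250.0, 1200.0]
ensemble = perturb_emissions(prior)
ens_mean = [sum(m[i] for m in ensemble) / len(ensemble) for i in range(len(prior))]
print(len(ensemble), ens_mean)
```

With multiplicative noise, cells with zero prior emissions stay at zero in every member, so the ensemble carries no spread there. An operational system would typically also impose spatial correlation on the perturbations, which this sketch omits.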

3. Results

3.1. Sensitivity Experiments

3.1.1. Evaluation of Hourly Carbon Emission Correction Through Assimilation

To evaluate the sustained impact of each initial field assimilation on CO₂ emission correction, a sensitivity experiment focusing on initial field assimilation was conducted prior to the formal experiments. The model underwent a 60 h spin-up period starting from 0000 UTC on 1 December 2019, to establish a reasonable atmospheric initial field. Subsequently, multi-source satellite remote sensing fused CO₂ concentration data were assimilated every 6 h, with a 12 h simulation conducted after each assimilation event, starting from 1200 UTC on 3 December 2019. By analyzing the hourly simulation results after assimilation and comparing them with the control experiment without assimilation, the corresponding emission corrections were calculated using Equations (5) and (6) to quantitatively assess the temporal evolution of the assimilation effects. Figure 3a–e display the spatial distribution of emission correction means at 3 h, 6 h, 9 h, and 12 h after assimilation, as well as the mean from 0 h to 12 h after assimilation, respectively. Figure 3f shows the time series of the average CO₂ concentration differences before and after assimilation and the average emission correction amounts over the study region. The results indicate that carbon emission sources were underestimated in most parts of central and eastern China, while they were overestimated in regions such as Yunnan and Guizhou. The magnitude of carbon emission correction gradually diminished over time after assimilation, consistent with the weakening of the carbon assimilation effect. The average correction over the 0 h–12 h period was found to be closer to the distribution observed at 6 h after assimilation, and the correction response remained relatively stable around 6 h after assimilation, indicating that the mid-term assimilation effect is representative.

3.1.2. Evaluation of Hourly Carbon Concentration Through Simulations

To better evaluate the distinct impacts of initial field assimilation and the application of a new emission inventory on CO₂ concentration simulations, we conducted sensitivity experiments using the optimized emission inventory. Starting from 1200 UTC on 3 December, simulation experiments (SIM_N) using this inventory were conducted every 6 h, with each run lasting 12 h. To assess the optimization effect, the SIM_N experiments were compared with assimilation experiments (ASS) that incorporated multi-source satellite remote sensing fused data and simulation experiments using the MEIC inventory (SIM), with a focus on analyzing differences in simulated CO₂ concentrations over every 12 h period.

Figure 4 shows the monthly average spatial distribution of CO₂ concentrations from the ASS experiment at 3, 6, 9, and 12 h after the start of assimilation, as well as the differences with the SIM and SIM_N experiments at corresponding times. The results indicate that the CO₂ concentrations simulated by the SIM_N experiment show smaller deviations from the ASS assimilation experiment. As the assimilation time increases, the deviations of both the SIM and SIM_N experiments from the ASS experiment become more similar, further reflecting the gradual weakening of the assimilation effect from 6 to 12 h.

As shown in Table 4, the SIM_N experiment significantly outperforms the SIM experiment across all evaluation metrics for CO₂ concentration simulation, demonstrating the positive impact of updating the emission source on improving simulation performance. The Bias of the SIM_N experiment is lower than that of the SIM experiment at all times. Specifically, the 3 h forecast bias decreased from 1.67 ppm to 0.33 ppm, a reduction of 80.2%, and the 6 h forecast bias decreased from 1.53 ppm to 0.39 ppm, a reduction of 74.6%. This indicates that the new emission source mitigates the issue of systematic overestimation. The RMSE of the SIM_N experiment also shows improvement. The 3 h forecast RMSE decreased from 3.10 ppm to 1.09 ppm, a reduction of 64.8%, and the 6 h forecast RMSE decreased from 2.83 ppm to 0.99 ppm, a reduction of 65.0%. The biases of the new emission source experiment are consistently lower than the average biases of the control experiment.
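The relative reductions quoted above follow from a one-line formula, sketched below with the rounded bias and RMSE values given in the text. Recomputing from rounded values can differ from the quoted percentages in the last digit, since the published figures may have been derived from unrounded data.

```python
def percent_reduction(before, after):
    """Relative reduction of a metric, in percent."""
    return (before - after) / before * 100.0


# (before, after) pairs in ppm, as quoted in the text for SIM -> SIM_N
cases = {
    "bias, 3 h": (1.67, 0.33),
    "bias, 6 h": (1.53, 0.39),
    "RMSE, 3 h": (3.10, 1.09),
    "RMSE, 6 h": (2.83, 0.99),
}
reductions = {name: percent_reduction(b, a) for name, (b, a) in cases.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}%")
```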

In summary, the assimilation results and emission source optimization at 1800 UTC demonstrate relatively better performance and tend to stabilize, reaffirming that the 6 h assimilation effect is optimal. Based on the results of the two sensitivity experiments, the 6 h post-assimilation results were selected for subsequent calculation of carbon emission corrections.

3.2. Spatial Changes in Emissions

Figure 5 displays the spatial distribution of the optimized mean emissions for December 2019 and their differences from the MEIC emission inventory. Compared with the MEIC emissions (Figure 1b), both optimized emission results (Figure 5a,b) exhibit spatial patterns largely consistent with MEIC, sharing similar high- and low-value areas. This indicates that the spatial distributions of the optimized emission inventories are reasonable and reliable. An analysis of the differences between the optimized inventories and the MEIC inventory (Figure 5c,d) reveals minor adjustments in carbon emissions over the northeastern and western regions, with slight overestimations observed in western China and coastal areas of Shandong and Jiangsu provinces. In contrast, the MEIC inventory shows slight underestimations in the economically developed eastern regions, such as NEC, BTH, YRD, and PRD, where the optimized emissions increase noticeably. Although certain local discrepancies exist, the differences in most areas remain within a range of ±3 × 10⁴ mol km⁻² h⁻¹, indicating a high overall consistency in the optimization results.

The optimized simulation results of CO₂ emissions in China and its major regions for December 2019 are presented in Table 5. Compared to the MEIC baseline emissions, both the 3DVAR and EAKF assimilation methods indicate an increase in national CO₂ emissions, with increments of 13.06% and 7.51%, respectively. This reflects that anthropogenic emission activities during the simulation period were generally higher than the inventory estimates. Spatially, emission changes exhibit significant heterogeneity across regions.

In the economically developed and industrially intensive YRD region, the CO₂ emissions simulated by both optimization methods are relatively close to the MEIC inventory, only slightly higher than the baseline. This suggests a relatively high representativeness of the inventory in this region, which may also be related to its industrial restructuring and continuously implemented emission reduction measures. The BTH and PRD regions show moderate increases in emissions, with increments ranging between 2% and 10%, potentially linked to increased heating demand in winter and the still high intensity of industrial production activities.

In the NEC region, emission growth is more pronounced, particularly with a 12.62% increase under the 3DVAR method. Major cities in the region, such as Harbin, Changchun, and Shenyang, show significant increasing trends in emissions, while the changes in surrounding suburban and rural areas are relatively minor. This spatial disparity may stem from increased coal heating demand in winter and the influence of urban energy consumption structures. The CC region exhibits the most significant absolute increase in emissions, with the 3DVAR inversion results showing a 20.08% increase compared to MEIC, while EAKF also indicates an 8.23% rise. As a traditional industrial base, this region has a high emission intensity, and its changes may be influenced by both the post-pandemic resumption of work and production and the regulatory effects of environmental policies.

The XJ region is the only area showing a significant decline in emissions, with a decrease of over 10% particularly in the EAKF results. This aligns with the region’s sparse distribution of emission sources and low intensity of human activities, as the optimization algorithms better capture the weaker emission signals in remote areas.

In terms of methodological comparison, both optimization schemes show consistent directions of change in most regions. However, the emission estimates of the 3DVAR method are generally higher than those of EAKF, with significant differences particularly in Northeast China and Central China. This phenomenon reflects the sensitivity differences in different assimilation algorithms in handling prior error structures and observational information. The 3DVAR method may respond more sensitively to emissions in strong source regions, while EAKF demonstrates stronger constraint stability.

3.3. Temporal Changes in Emissions

Figure 6 illustrates the daily average and hourly average variations of CO₂ emissions in China from 1 to 31 December 2019. It should be noted that the MEIC dataset only provides monthly total emissions and does not include fluctuations at daily or hourly scales. Therefore, daily emissions within the same month are generally assumed to remain constant. In terms of daily average emissions (Figure 6a), the optimized mean emissions from the 3DVAR and EAKF methods are 38.29 MT/day and 36.41 MT/day, respectively, both slightly higher than the MEIC baseline value of 33.86 MT/day. Between 22 and 31 December, both optimized results reached their monthly peaks, with 3DVAR at 42.35 MT/day and EAKF at 38.54 MT/day. This increase can be attributed to multiple factors. Firstly, elevated anthropogenic emissions in December resulted from combined heating demands and industrial activities in northern China [52]. Secondly, stagnant meteorological conditions during this period weakened atmospheric dispersion, leading to CO₂ accumulation. In the inversion system, this suppressed vertical mixing is partially interpreted as an increase in “effective emissions.” Additionally, if systematic underestimation exists in the prior emission inventory, the assimilation process progressively corrects for this bias over time, further contributing to the rising trend in estimated emissions.

On an hourly scale (Figure 6b), the MEIC data exhibit a typical bimodal structure. The emission peaks in the MEIC data occur at 0100 UTC (0900 Beijing Time) and 0900 UTC (1700 Beijing Time), corresponding to the morning and evening commuting rush hours, respectively. This pattern aligns with existing understanding of the temporal characteristics of anthropogenic emissions. However, the two optimization results show significant overestimation during the first peak period and underestimation during the second peak. This discrepancy may be attributed to the superposition of factory emissions and morning traffic flow in the early hours, which elevates actual emissions, whereas social activities decline sharply after the evening peak, leading to a rapid reduction in emission intensity. The optimized emission data generally show higher values between 0100 UTC and 0400 UTC (0900 to 1200 Beijing Time), while lower values are observed from 1600 UTC to 2200 UTC (0000 to 0600 Beijing Time). This indicates sustained strong industrial and transportation emissions from morning to noon, whereas only baseline industrial emissions remain from late night to early morning, reflecting a significant reduction in activity levels.
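The UTC and Beijing Time pairs quoted above follow from the fixed UTC+8 offset. A small helper (the function name is hypothetical) makes the conversion explicit, including the wrap past midnight.

```python
BEIJING_OFFSET_H = 8  # Beijing Time is UTC+8, with no daylight saving time


def utc_to_beijing(hhmm):
    """Convert an 'HHMM' UTC clock time to 'HHMM' Beijing Time."""
    hours, minutes = int(hhmm[:2]), int(hhmm[2:])
    return f"{(hours + BEIJING_OFFSET_H) % 24:02d}{minutes:02d}"


for utc in ("0100", "0400", "0900", "1600", "2200"):
    print(utc, "UTC =", utc_to_beijing(utc), "Beijing Time")
```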

Figure 7 further illustrates the hourly variation characteristics of optimized emissions across six major regions during the study period. Temporal analysis reveals distinct differences in optimized emission patterns among these regions. In northern China (including the NEC and BTH regions), both optimization approaches maintain a bimodal structure, though the second peak is less pronounced. Both optimized results show significant overestimation during 0000–0600 UTC, marked underestimation during 0600–1200 UTC, and close alignment with the MEIC inventory between 1200 and 2300 UTC. During the 0000–1200 UTC period, the 3DVAR optimized results demonstrate smaller adjustments and closer agreement with MEIC, whereas the EAKF method exhibits larger deviations.

The XJ region displays the smallest hourly variation amplitude in optimized emissions among all regions. Except for slightly higher 3DVAR values during 1300–2100 UTC, both optimized results remain below the MEIC inventory throughout the day. This suggests relatively stable emission levels in Xinjiang, potentially attributable to its consistent industrial structure, minimal diurnal fluctuations in human activity, and simpler energy consumption patterns.

In the CC region, both optimized emission trajectories show broadly synchronized temporal variations, remaining significantly higher than MEIC during 0000–0900 UTC and moderately elevated during 1200–2300 UTC. Within the YRD region, EAKF results exceed 3DVAR values from 0000 to 0800 UTC, while the reverse pattern occurs during 0900–1200 UTC. The PRD region shows a single emission peak at 0400 UTC for EAKF compared to 0200 UTC for 3DVAR, with both methods exhibiting gradual declines post-peak. These regional disparities in diurnal emission characteristics likely stem from varying emission source structures, industrial activity patterns, and meteorological dispersion conditions across different areas.

3.4. Evaluation of Posterior Emission Source Simulation Performance

To assess the improvement in CO₂ concentration simulations achieved by posterior emission sources across different regions, this study compares the diurnal variation characteristics of simulated CO₂ concentrations from three experiments (Sim_MEIC, Sim_3DVAR, Sim_EAKF) with observational data for December 2019. WRF-Chem model outputs were evaluated by comparing the simulated surface-layer concentrations (lowest model level) with in situ measurements from the WDCGG stations at WLG and HKO, as shown in Figure 8. Results demonstrate that data assimilation techniques improve the accuracy of CO₂ concentration simulations, though the degree of improvement exhibits regional variations.

At the WLG station, observed CO₂ concentrations show minor fluctuations within the 412–418 ppm range, consistent with its characteristics as a global background station minimally influenced by local anthropogenic emissions. While the Sim_MEIC experiment captures the general background concentration level, it exhibits a systematic positive bias of approximately +3 ppm relative to observations, particularly during early December. Furthermore, it fails to reproduce several minor decreasing fluctuations observed in mid-to-late December. In contrast, both assimilation experiments reduce the systematic overestimation present in Sim_MEIC. The Sim_3DVAR experiment shows the smallest discrepancy from observations, achieving remarkable consistency with measured values after 25 December. The Sim_EAKF experiment, however, maintains a slight overestimation, indicating its assimilation performance is inferior to that of Sim_3DVAR.

At the HKO station, the observed concentrations exhibit oscillations exceeding 30 ppm in amplitude, reflecting the emission characteristics of urban areas influenced by anthropogenic activities such as morning-evening traffic peaks and weekday-weekend variations. Additionally, potential marine-source emissions in the vicinity of the station introduce extra uncertainties to the simulations, resulting in substantial biases in the baseline experiment. The Sim_MEIC baseline experiment performed the poorest, demonstrating a systematic underestimation and failing to capture almost all major concentration peaks. This indicates that the prior MEIC emission inventory underestimates both the intensity of anthropogenic CO₂ emissions and their diurnal variation patterns in this region. The Sim_3DVAR experiment shows improvement over Sim_MEIC, successfully elevating simulated concentration levels and capturing the main variation trends. However, it still exhibits insufficient peak magnitudes and phase lags—for instance, the simulated peaks on 10 and 22 December lag behind observations by half a day to a full day. Compared to Sim_3DVAR, the Sim_EAKF experiment yields lower concentration values and demonstrates relatively weaker performance in reproducing peak features.

XCO₂ concentrations were derived from the WRF-Chem output by applying pressure-weighted averaging to the simulated vertical concentration profiles and were then compared with ground-based observations from the Hefei and Xianghe stations in the TCCON network, as shown in Figure 9. At Hefei station, simulated concentrations fluctuated within a range of approximately 409 ppm to 420 ppm, with a total variation of about 11 ppm. However, the limited observational data indicated that the Sim_3DVAR experiment performed similarly to Sim_MEIC, matching its results on 16 and 26 December but exhibiting underestimation during other periods.
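A minimal sketch of pressure-weighted column averaging is given below. The text states only that pressure weighting was applied. The choice of layer pressure thickness as the weight, the omission of an averaging kernel and of any water vapor correction, the function name, and the sample profile are all assumptions made for illustration.

```python
def xco2_pressure_weighted(co2_ppm, p_edges_hpa):
    """Column-averaged CO2 from layer-mean mole fractions.

    co2_ppm     : layer-mean CO2 (ppm), surface layer first, n values
    p_edges_hpa : pressure at layer interfaces (hPa), n + 1 values,
                  decreasing from the surface to the model top
    Each layer is weighted by its pressure thickness (an assumption).
    """
    if len(p_edges_hpa) != len(co2_ppm) + 1:
        raise ValueError("need one more pressure edge than layers")
    weights = [p_edges_hpa[k] - p_edges_hpa[k + 1] for k in range(len(co2_ppm))]
    return sum(w * c for w, c in zip(weights, co2_ppm)) / sum(weights)


# Hypothetical two-layer profile: enhanced CO2 in a thin near-surface layer
profile = [420.0, 410.0]
edges = [1000.0, 700.0, 100.0]
xco2 = xco2_pressure_weighted(profile, edges)
print(f"XCO2 = {xco2:.2f} ppm")
```

Because the weights follow pressure thickness, a near-surface enhancement that occupies a thin layer shifts the column average far less than it shifts the surface concentration, which is one reason column comparisons respond more weakly to local emission changes than surface comparisons do.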

At Xianghe station, observed XCO₂ concentrations demonstrated fluctuations, ranging from approximately 407 ppm to 416 ppm with a total amplitude of about 9 ppm. This pattern reflects the station’s characteristic as a site influenced by both regional anthropogenic activities and natural processes. The Sim_3DVAR experiment showed improvement over Sim_MEIC by partially correcting the systematic low bias, resulting in better agreement with observations. This was evident during 8–9 and 22 December, when it captured concentration variations. The Sim_EAKF experiment demonstrated a good performance at Xianghe station, with its simulated curve showing the closest agreement with observations throughout the study period. It not only achieved consistency in concentration levels but also captured multiple short-term fluctuations, such as those occurring from 1 to 6 December.

We also compared the results from the three experiments (Sim_MEIC, Sim_3DVAR, and Sim_EAKF) with the CAMS global greenhouse gas reanalysis (EGG4), as shown in Figure 10. The results demonstrate that compared to the Sim_MEIC experiment, both Sim_3DVAR and Sim_EAKF significantly improved the accuracy of CO₂ concentration simulations. Specifically, the bias decreased from 0.566 ppm to 0.140 ppm and 0.169 ppm, respectively, while the RMSE was reduced from 1.177 ppm to 0.599 ppm and 0.626 ppm, representing decreases of approximately 49% and 47%, respectively. The assimilation methods based on 3DVAR and EAKF optimized the simulation results, reducing both systematic bias and random errors, validating the effectiveness of data assimilation in enhancing the accuracy of carbon emission inversion.

4. Discussion

This study developed an atmospheric CO₂ source inversion method based on the WRF-Chem model coupled with 3DVAR/EAKF assimilation frameworks, which enables hourly optimization of emission inventories through the integration of multi-source remote sensing satellite observations of XCO₂ concentrations. However, this methodology relies on two critical assumptions: First, the influence of CO₂ chemical reactions on sources and sinks is negligible within a 1 h time window. Since CO₂ chemical reaction rates are slow under cloud-free and precipitation-free conditions, this requirement can be satisfied by excluding cloudy or precipitating areas and time periods. Second, it assumes that when wind speed remains below 4 m·s⁻¹ and divergence exceeds 10⁻⁴ s⁻¹ within one hour, CO₂ diffusion is confined to grid-cell regions within the boundary layer. This assumption holds reasonable accuracy under stable boundary layer conditions, as CO₂ primarily undergoes intra-grid diffusion. However, in situations with elevated and unstable boundary layers, where the WRF-Chem model exhibits biases in simulating unstable boundary layer heights and upper-level winds are typically stronger, CO₂ may experience excessive diffusion or advective transport beyond grid boundaries. This could lead to an underestimation of emissions and consequently affect inversion accuracy.
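One plausible reading of the wind-speed threshold, not stated explicitly in the text, is that the distance an air parcel travels in one hour should stay below the 27 km model grid spacing used in this study. The sketch below performs that back-of-the-envelope check. The function name is hypothetical, and the interpretation itself is an assumption.

```python
GRID_KM = 27.0  # model grid spacing used in this study


def advective_distance_km(wind_speed_ms, duration_s):
    """Distance travelled at constant wind speed, in km."""
    return wind_speed_ms * duration_s / 1000.0


distance = advective_distance_km(4.0, 3600.0)  # 4 m/s sustained for one hour
print(f"{distance:.1f} km travelled vs {GRID_KM:.0f} km grid spacing")
print("stays within one grid cell:", distance < GRID_KM)
```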

We compared the original and corrected MEIC emission inventories for December 2019 with the EDGAR emission inventory, as shown in Figure 11. Comparative analysis between the MEIC inventory (Figure 1b) and EDGAR inventory (Figure 11a) reveals contrasting spatial characteristics: the EDGAR inventory shows higher emission values in urban high-carbon emission areas but lower values in surrounding regions. In contrast, the MEIC inventory demonstrates opposite characteristics, with less pronounced representation of urban high-emission zones but more significant emission distribution in peripheral areas. In terms of overall emission levels, as shown in Figure 11b, the EDGAR inventory generally exhibits higher values than the MEIC inventory. Further comparison of the Sim_3DVAR and Sim_EAKF experimental results with the EDGAR inventory demonstrates that both correction methods effectively reduce the RMSE. Specifically, the 3DVAR method reduces RMSE by 9% and increases R by 12%, while the EAKF method demonstrates better performance, achieving a 56% reduction in RMSE and a 39% improvement in R. These results indicate that the MEIC and EDGAR inventories differ in the spatial distribution of carbon emissions. The data assimilation methods enhance the performance of the MEIC inventory, with the EAKF method exhibiting better correction capability.

We analyzed the average planetary boundary layer height (PBLH) over different time periods within the study area and plotted the daily mean PBLH variation curve, as shown in Figure 12. The study found that during the 1200 UTC to 2300 UTC period, the PBLH was relatively low, generally ranging between 100 and 300 m, indicating a relatively stable planetary boundary layer structure during this time. This stability favors the accumulation of near-surface pollutants, and the simulation results are less influenced by vertical diffusion. In contrast, during the 0300 UTC to 0800 UTC period, the PBLH increased significantly, and the boundary layer became unstable. The stronger vertical diffusion led to the dilution of near-surface CO₂ concentrations, which may weaken the simulation differences between various emission scenarios. Therefore, this study concludes that the simulation results during the 1200 UTC to 2300 UTC period are more reliable, with relatively lower uncertainty in emission source inversion. On the other hand, during the 0300 UTC to 0800 UTC period, due to the intense turbulent mixing caused by the unstable boundary layer, the current method for estimating emission sources still exhibits considerable uncertainty. Future research should incorporate more observational data and improved vertical mixing parameterization schemes.

In the WRF-Chem simulation of CO₂ concentrations, this study utilized the MEIC emission inventory with an original spatial resolution of 0.25° (approximately 27 km). To minimize the uncertainty introduced by the emission inventory in the simulation, the model grid resolution was also set to 27 km, thereby avoiding potential biases caused by spatial resampling of emission data. Additionally, the MEIC inventory provides monthly average emission data, and when processing it into an hourly scale, the hourly emission factors were configured based on recommendations from the MEIC emission inventory development team. This process inevitably introduces certain errors. Moreover, given the significant differences in emission characteristics across different regions, developing region-specific hourly factors in the future will help further enhance the accuracy of the model in simulating CO₂ concentrations.

Currently, widely used global carbon emission inventories (e.g., EDGAR and ODIAC) provide CO₂ emission data with limited spatial resolution, while regional emission inventories commonly applied in Asia (e.g., MIX and MEIC), though offering regional data products to some extent, still require improvements in temporal resolution. In practical validation processes, the availability of real emission source data for comparison is extremely limited. Therefore, to establish a reliable reference, this study adopts an assimilated CO₂ concentration field as the optimized result. Based on this posterior concentration field, carbon emissions are calculated using a retrieval method, serving as a proxy for the “true” emission values. However, this approach still has certain limitations. On one hand, the inversion results are influenced by factors such as model errors, assimilation algorithms, and the representativeness of observational data, leading to inherent uncertainties. On the other hand, the lack of independent real emission data for validation also restricts the reliability of the inversion results as a benchmark truth. Future efforts should incorporate more ground-based observations and higher-resolution remote sensing data to further enhance the accuracy of emission estimates.

Prior to WRF-Chem assimilation, we implemented rigorous quality control on the multi-source fused XCO₂ dataset from Jin et al. [33] by excluding all data points with uncertainties exceeding 3 ppm, which enhanced the dataset’s accuracy and reliability. Residual assimilation errors nevertheless stem mainly from regionally heterogeneous uncertainties in the dataset construction process. First, variable regional observation conditions disrupt satellite CO₂ spectral detection and leave residual retrieval errors that cannot be fully eliminated via uncertainty-weighted fusion. Second, the uneven global distribution of TCCON validation stations weakens uncertainty quantification in data-sparse regions, where fused XCO₂ uncertainties depend more on model parameterization than on ground-truth constraints, thus introducing uncalibrated errors. Third, the 30-day temporal smoothing and CarbonTracker-based gap-filling add extra uncertainties in sparse-observation regions, where most data are model-simulated and carry biases from CarbonTracker’s representation of regional CO₂ transport and source-sink processes. Fourth, the Maximum Likelihood Estimation and Optimal Interpolation fusion algorithms fail to resolve divergent satellite retrievals in complex terrain and land-ocean transition zones. These uncertainties propagate into the assimilation system and may lead to spatially correlated biases in the optimized fluxes.

In this study, the optimization of the anthropogenic carbon emission inventory is based on a clear and reasonable premise. Short-term variations in regional atmospheric CO₂ concentrations are primarily influenced by anthropogenic carbon emissions. We treat non-anthropogenic sources—including biogenic fluxes simulated by the VPRM, ocean fluxes based on JMA data, and negligible biomass burning emissions during the non-fire season—as deterministic background or secondary contributors. The magnitudes of these non-anthropogenic fluxes are far lower than those of anthropogenic emissions, and their potential errors have statistically and physically negligible impacts on the inversion results. Therefore, the concentration discrepancies identified by the assimilation system between the model and observations can be robustly attributed to spatiotemporal errors in the anthropogenic emission inventory, rather than uncertainties in natural source fluxes. This ensures the reliability of the emission inversion results in accurately reflecting anthropogenic activities.

Potential uncertainties may also arise from the processing of MEIC inventory data, primarily due to the spatiotemporal redistribution performed by the meic2wrf tool. The tool applies empirical parameters to determine emission height distribution coefficients and temporal allocation factors for five major anthropogenic sectors, including power and industry. While the default parameters are suitable for national averages or typical regions, they may not fully capture local industrial structures and energy consumption patterns in certain study areas. Moreover, these parameters do not account for dynamic factors such as seasonal and meteorological variations, which can influence the spatiotemporal characteristics of emissions. Consequently, minor discrepancies may occur between the allocated emissions and actual conditions. Future research could refine this process through regionalization and dynamic adjustments.
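To make the redistribution step concrete, the sketch below spreads a daily sector total over 24 hours and a few vertical layers using normalized allocation factors. It does not reproduce the meic2wrf tool: the function `allocate_emissions`, the diurnal profile, and the three-layer height fractions are hypothetical and chosen only for illustration.

```python
import numpy as np

def allocate_emissions(daily_total, hourly_factors, height_fractions):
    """Distribute a daily emission total over hours and vertical layers.

    Both factor sets are normalized, so the allocation conserves the total.
    Returns an array of shape (24, n_layers).
    """
    hourly = np.asarray(hourly_factors, dtype=float)
    height = np.asarray(height_fractions, dtype=float)
    if hourly.shape != (24,):
        raise ValueError("hourly_factors must contain 24 values")
    hourly = hourly / hourly.sum()
    height = height / height.sum()
    # Outer product gives the share of the total in each (hour, layer) cell.
    return daily_total * np.outer(hourly, height)

# Hypothetical diurnal profile: lower at night, higher in the daytime.
hours = np.arange(24)
hourly_profile = np.where((hours >= 7) & (hours <= 20), 1.2, 0.8)
# Hypothetical split across three model layers for an elevated source.
layer_fractions = [0.2, 0.5, 0.3]

allocated = allocate_emissions(100.0, hourly_profile, layer_fractions)
print(allocated.shape, round(float(allocated.sum()), 6))
```

Because the factors enter only through normalized profiles, swapping in region-specific or season-specific profiles would change the timing and height of emissions without altering the inventory total.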

The initial and boundary conditions used in this study are derived from CarbonTracker 2022 (CT2022). According to a recent systematic validation over China by Ruan et al. [53], CT2022 shows good agreement with ground-based TCCON XCO₂ observations (RMSE = 1.78 ppm, R = 0.92) and performs robustly in comparisons with multiple satellite products (GOSAT/GOSAT-2, OCO-2/OCO-3). This indicates that CT2022 has high accuracy over China and is suitable as the background field for this study. However, any background field carries inherent uncertainties, and CT2022 may still exhibit regional or seasonal systematic biases. Within our inversion framework, such large-scale, systematic concentration biases are partially attributed to surface fluxes and corrected through them during the cost-function minimization process, which reduces their direct impact on the final inversion results. The discrepancies between simulated and observed concentrations in this study therefore primarily reflect uncertainties in the prior anthropogenic emission inventory rather than significant biases in the CT2022 background field. The optimization of fluxes through the assimilation process corrects the concentration simulation biases caused by inaccuracies in the prior emissions.

5. Conclusions

Based on the WRF-Chem model, this study developed a multi-source data assimilation system that integrates multi-platform satellite observations to simultaneously optimize regional CO₂ concentration fields and emission fluxes. Through simulation and assimilation experiments conducted over China during December 2019, the system reconstructed more accurate CO₂ concentration fields. Furthermore, emission errors were inverted from the concentration discrepancies before and after assimilation, enabling dynamic correction and optimization of the carbon emission inventory.

We evaluated the performance of 3DVAR and EAKF methods in optimizing CO₂ emissions and improving forecast accuracy over China during December 2019. The study employed WRF-Chem models with identical configurations and implemented hourly surface CO₂ observation data assimilation for both methods using a multi-source satellite fused CO₂ concentration dataset. The results show that 3DVAR and EAKF produced optimized emissions with similar spatiotemporal distribution patterns across most regions of China, demonstrating both methods’ effectiveness in reducing prior emission inventory uncertainties. Compared with the MEIC emission inventory, the optimized emissions increased by 13.6% and 5.1% for 3DVAR and EAKF, respectively. Nationwide emission increases were observed except in Xinjiang. Specifically, December 2019 saw carbon emission reductions of 3.24 MT and 7.99 MT in Xinjiang under 3DVAR and EAKF, respectively, while central China exhibited increases of 74.5 MT and 30.52 MT.

By designing three simulation scenarios (using prior emissions, 3DVAR-optimized emissions, and EAKF-optimized emissions), this study evaluated the improvement effects of emission optimization on CO₂ forecasting. Comparative validation against ground-based observations from TCCON and WDCGG stations demonstrated that optimized emission inventories enhanced simulation performance. Evaluation against the EGG4 dataset revealed that both 3DVAR and EAKF methods reduced systematic biases and random errors in CO₂ concentration simulations, achieving approximately 75% reduction in bias and 49% decrease in RMSE.
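The evaluation logic summarized above (bias, RMSE, correlation, and the relative reduction between scenarios) can be sketched as follows. This is an illustrative sketch only: the helper names `evaluate` and `percent_reduction`, the sign convention (simulation minus reference), and the sample concentrations are assumptions introduced here and are not values from this study.

```python
import numpy as np

def evaluate(sim, ref):
    """Return mean bias, RMSE, and Pearson correlation of sim against ref."""
    sim = np.asarray(sim, dtype=float)
    ref = np.asarray(ref, dtype=float)
    diff = sim - ref  # assumed convention: simulation minus reference
    return {
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
        "r": float(np.corrcoef(sim, ref)[0, 1]),
    }

def percent_reduction(before, after):
    """Relative reduction (%) in the magnitude of an error metric."""
    return 100.0 * (abs(before) - abs(after)) / abs(before)

# Hypothetical reference and simulated concentrations (ppm).
ref = np.array([410.0, 412.0, 411.0, 413.0])
sim_prior = ref + np.array([2.0, 1.0, 3.0, 2.0])
sim_optimized = ref + np.array([1.0, 0.0, 2.0, 1.0])

prior_stats = evaluate(sim_prior, ref)
optimized_stats = evaluate(sim_optimized, ref)
bias_cut = percent_reduction(prior_stats["bias"], optimized_stats["bias"])
rmse_cut = percent_reduction(prior_stats["rmse"], optimized_stats["rmse"])
print(f"bias reduced by {bias_cut:.1f}%, RMSE reduced by {rmse_cut:.1f}%")
```

Taking absolute values in `percent_reduction` means a bias that changes sign is still compared by magnitude, which is one reasonable choice when reporting a reduction in systematic bias.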

Comparison with the EDGAR dataset shows that both the 3DVAR and EAKF optimization methods improve on the original MEIC inventory. EAKF performs better, reducing RMSE by 56% and increasing the correlation coefficient by 39%, which enhances the ability to represent regional carbon emissions. The CO₂ emission assimilation algorithm developed in this study offers an effective tool for regional carbon monitoring. Future applications could extend it to different climate zones and to multi-pollutant emission optimization, including aerosols, nitrogen oxides, ammonia, and methane, to build a dynamic emission inventory system with broader coverage and higher accuracy.

Author Contributions

Conceptualization, W.L. and X.L.; methodology, W.L.; software, W.L.; validation, W.L., C.L. and B.H.; formal analysis, W.L.; investigation, X.L.; resources, C.L.; data curation, C.L.; writing—original draft preparation, W.L.; writing—review and editing, W.L.; visualization, B.H.; supervision, X.L.; project administration, X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Intergovernmental International Science and Technology Innovation Cooperation Program under the National Key Research and Development Plan [2024YFE0198600] and the National Natural Science Foundation of China (NSFC) [42075114].

Data Availability Statement

The datasets and model codes used in this study are all publicly available. The WRF-Chem model version 4.2.2 was obtained from https://github.com/wrf-model/WRF/releases (accessed on 24 October 2025). CO₂ observations were provided by the Total Carbon Column Observing Network, available at https://doi.org/10.14291/tccon.ggg2020, and by the World Data Centre for Greenhouse Gases at https://gaw.kishou.go.jp (accessed on 24 October 2025). The fused multi-source satellite XCO₂ concentration dataset is accessible at https://doi.org/10.1016/j.atmosres.2022.106385. The FNL Operational Global Analysis data are available at https://doi.org/10.5065/D6M043C6. Carbon emissions data were obtained from the Multi-resolution Emission Inventory model for Climate and air pollution research at https://doi.org/10.1007/s11430-023-1230-3, and initial and boundary conditions were obtained from NOAA’s CarbonTracker CT2022 dataset at https://doi.org/10.25925/z1gj-3254. All datasets are openly accessible for scientific research purposes.

Acknowledgments

The authors acknowledge the CO₂ observations from TCCON (https://tccondata.org/, accessed on 24 October 2025) and WDCGG (https://gaw.kishou.go.jp/, accessed on 24 October 2025), the fused multi-source satellite XCO₂ concentration dataset (https://doi.org/10.1016/j.atmosres.2022.106385, accessed on 24 October 2025), the WRF-Chem model (https://www2.acom.ucar.edu/wrf-chem/, accessed on 24 October 2025) and all the connected input data including FNL (Final) Operational Global Analysis data from NCEP of the NOAA (https://rda.ucar.edu/datasets/d083002/, accessed on 24 October 2025), MEIC dataset from Tsinghua University’s Department of Earth System Science (http://meicmodel.org.cn/, accessed on 24 October 2025) and the CarbonTracker CT2022 dataset from NOAA’s Global Monitoring Laboratory (GML) (https://gml.noaa.gov/ccgg/carbontracker/, accessed on 24 October 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bashir, A.; Ali, M.; Patil, S.; Aljawad, M.S.; Mahmoud, M.; Al-Shehri, D.; Hoteit, H.; Kamal, M.S. Comprehensive review of CO₂ geological storage: Exploring principles, mechanisms, and prospects. Earth-Sci. Rev. 2024, 249, 104672.
2. Ma, B.; Karimi, M.S.; Mohammed, K.S.; Shahzadi, I.; Dai, J. Nexus between climate change, agricultural output, fertilizer use, agriculture soil emissions: Novel implications in the context of environmental management. J. Clean Prod. 2024, 450, 141801.
3. Bhatti, U.A.; Bhatti, M.A.; Tang, H.; Syam, M.S.; Awwad, E.M.; Sharaf, M.; Ghadi, Y.Y. Global production patterns: Understanding the relationship between greenhouse gas emissions, agriculture greening and climate variability. Environ. Res. 2024, 245, 118049.
4. Cheng, W.; Duan, X.; Moore, J.C.; Deng, X.; Luo, Y.; Huang, L.; Wang, Y. Unevenly distributed CO₂ and its impacts on surface energy balance. Atmos. Res. 2022, 274, 106196.
5. Ciais, P.; Dolman, A.J.; Bombelli, A.; Duren, R.; Peregon, A.; Rayner, P.J.; Miller, C.; Gobron, N.; Kinderman, G.; Marland, G.; et al. Current systematic carbon-cycle observations and the need for implementing a policy-relevant carbon observing system. Biogeosciences 2014, 11, 3547–3602.
6. Crisp, D.; Pollock, H.R.; Rosenberg, R.; Chapsky, L.; Lee, R.A.M.; Oyafuso, F.A.; Frankenberg, C.; O’Dell, C.W.; Bruegge, C.J.; Doran, G.B.; et al. The on-orbit performance of the Orbiting Carbon Observatory-2 (OCO-2) instrument and its radiometrically calibrated products. Atmos. Meas. Tech. 2017, 10, 59–81.
7. Zhang, Q.; Li, M.; Wang, M.; Mizzi, A.P.; Huang, Y.; Wei, C.; Jin, J.; Gu, Q. CO₂ Flux over the Contiguous United States in 2016 Inverted by WRF-Chem/DART from OCO-2 XCO₂ Retrievals. Remote Sens. 2021, 13, 2996.
8. Lian, J.; Breon, F.; Broquet, G.; Lauvaux, T.; Zheng, B.; Ramonet, M.; Xueref-Remy, I.; Kotthaus, S.; Haeffelin, M.; Ciais, P. Sensitivity to the sources of uncertainties in the modeling of atmospheric CO₂ concentration within and in the vicinity of Paris. Atmos. Chem. Phys. 2021, 21, 10707–10726.
9. Callewaert, S.; Brioude, J.; Langerock, B.; Duflot, V.; Fonteyn, D.; Muller, J.; Metzger, J.; Hermans, C.; Kumps, N.; Ramonet, M.; et al. Analysis of CO₂, CH₄, and CO surface and column concentrations observed at Reunion Island by assessing WRF-Chem simulations. Atmos. Chem. Phys. 2022, 22, 7763–7792.
10. Nerobelov, G.; Timofeyev, Y.; Foka, S.; Smyshlyaev, S.; Poberovskiy, A.; Sedeeva, M. Complex Validation of Weather Research and Forecasting-Chemistry Modelling of Atmospheric CO₂ in the Coastal Cities of the Gulf of Finland. Remote Sens. 2023, 15, 5757.
11. Zheng, T.; Feng, S.; Davis, K.J.; Pal, S.; Morgui, J. Development and evaluation of CO₂ transport in MPAS-A v6.3. Geosci. Model Dev. 2021, 14, 3037–3066.
12. Sheng, M.; Hou, Y.; Song, H.; Ye, X.; Lei, L.; Ma, P.; Zeng, Z. Estimating anthropogenic CO₂ emissions from China’s Yangtze River Delta using OCO-2 observations and WRF-Chem simulations. Remote Sens. Environ. 2025, 316, 114515.
13. Liang, A.; Gu, J.; Xiang, C. Multi-Source Satellite and WRF-Chem Analyses of Atmospheric Pollution from Fires in Peninsular Southeast Asia. Remote Sens. 2023, 15, 5463.
14. Ballav, S.; Patra, P.K.; Takigawa, M.; Ghosh, S.; De, U.K.; Maksyutov, S.; Murayama, S.; Mukai, H.; Hashimoto, S. Simulation of CO₂ Concentration over East Asia Using the Regional Transport Model WRF-CO₂. J. Meteorol. Soc. Jpn. 2012, 90, 959–976.
15. Liu, Y.; Yue, T.; Zhang, L.; Zhao, N.; Zhao, M.; Liu, Y. Simulation and analysis of XCO₂ in North China based on high accuracy surface modeling. Environ. Sci. Pollut. Res. 2018, 25, 27378–27392.
16. Dong, X.; Yue, M.; Jiang, Y.; Hu, X.; Ma, Q.; Pu, J.; Zhou, G. Analysis of CO₂ spatio-temporal variations in China using a weather-biosphere online coupled model. Atmos. Chem. Phys. 2021, 21, 7217–7233.
17. Seo, M.; Kim, H.M.; Kim, D. High-resolution atmospheric CO₂ concentration data simulated in WRF-Chem over East Asia for 10 years. Geosci. Data J. 2024, 11, 1024–1043.
18. Scholze, M.; Kaminski, T.; Knorr, W.; Vossbeck, M.; Wu, M.; Ferrazzoli, P.; Kerr, Y.; Mialon, A.; Richaume, P.; Rodriguez-Fernandez, N.; et al. Mean European Carbon Sink Over 2010-2015 Estimated by Simultaneous Assimilation of Atmospheric CO₂, Soil Moisture, and Vegetation Optical Depth. Geophys. Res. Lett. 2019, 46, 13796–13803.
19. Tian, X.; Xie, Z.; Cai, Z.; Liu, Y.; Fu, Y.; Zhang, H. The Chinese carbon cycle data-assimilation system (Tan-Tracker). Chin. Sci. Bull. 2014, 59, 1541–1546.
20. Chevallier, F.; Remaud, M.; O’Dell, C.W.; Baker, D.; Peylin, P.; Cozic, A. Objective evaluation of surface- and satellite-driven carbon dioxide atmospheric inversions. Atmos. Chem. Phys. 2019, 19, 14233–14251.
21. He, W.; van der Velde, I.R.; Andrews, A.E.; Sweeney, C.; Miller, J.; Tans, P.; van der Laan-Luijkx, I.T.; Nehrkorn, T.; Mountain, M.; Ju, W.M.; et al. CTDAS-Lagrange v1.0: A high-resolution data assimilation system for regional carbon dioxide observations. Geosci. Model Dev. 2018, 11, 3515–3536.
22. Seo, M.; Kim, H.M. Effect of meteorological data assimilation using 3DVAR on high-resolution simulations of atmospheric CO₂ concentrations in East Asia. Atmos. Pollut. Res. 2023, 14, 101759.
23. Seo, M.G.; Kim, H.M. Evaluation of high-resolution regional CO₂ data assimilation-forecast system in East Asia using observing system simulation experiment and effect of observation network on simulated CO₂ concentrations. Q. J. R. Meteorol. Soc. 2025, 151, e4987.
24. Zhang, Q.W.; Li, M.Q.; Wei, C.; Mizzi, A.P.; Huang, Y.J.; Gu, Q.R. Assimilation of OCO-2 retrievals with WRF-Chem/DART: A case study for the Midwestern United States. Atmos. Environ. 2021, 246, 118106.
25. Jin, J.P.; Huang, Y.J.; Wei, C.; Wang, X.P.; Xu, X.J.; Gu, Q.R.; Wang, M.Q. Analysis of CO₂ Concentration and Fluxes of Lisbon Portugal Using Regional CO₂ Assimilation Method Based on WRF-Chem. Atmosphere 2025, 16, 847.
26. Jiang, F.; He, W.; Ju, W.; Wang, H.; Wu, M.; Wang, J.; Feng, S.; Zhang, L.; Chen, J.M. The status of carbon neutrality of the world’s top 5 CO₂ emitters as seen by carbon satellites. Fundam. Res. 2022, 2, 357–366.
27. Ma, C.; Wang, T.; Jiang, Z.; Wu, H.; Zhao, M.; Zhuang, B.; Li, S.; Xie, M.; Li, M.; Liu, J.; et al. Importance of Bias Correction in Data Assimilation of Multiple Observations Over Eastern China Using WRF-Chem/DART. J. Geophys. Res.-Atmos. 2020, 125, e2019JD031465.
28. Zheng, B.; Tong, D.; Li, M.; Liu, F.; Hong, C.; Geng, G.; Li, H.; Li, X.; Peng, L.; Qi, J.; et al. Trends in China’s anthropogenic emissions since 2010 as the consequence of clean air actions. Atmos. Chem. Phys. 2018, 18, 14095–14111.
29. Gao, D.; Xie, M.; Liu, J.; Wang, T.; Ma, C.; Bai, H.; Chen, X.; Li, M.; Zhuang, B.; Li, S. Ozone variability induced by synoptic weather patterns in warm seasons of 2014-2018 over the Yangtze River Delta region, China. Atmos. Chem. Phys. 2021, 21, 5847–5864.
30. Dai, T.; Cheng, Y.; Suzuki, K.; Goto, D.; Kikuchi, M.; Schutgens, N.A.J.; Yoshida, M.; Zhang, P.; Husi, L.; Shi, G.; et al. Hourly Aerosol Assimilation of Himawari-8 AOT Using the Four-Dimensional Local Ensemble Transform Kalman Filter. J. Adv. Model. Earth Syst. 2019, 11, 680–711.
31. Ma, C.; Wang, T.; Mizzi, A.P.; Anderson, J.L.; Zhuang, B.; Xie, M.; Wu, R. Multiconstituent Data Assimilation With WRF-Chem/DART: Potential for Adjusting Anthropogenic Emissions and Improving Air Quality Forecasts Over Eastern China. J. Geophys. Res.-Atmos. 2019, 124, 7393–7412.
32. Mahadevan, P.; Wofsy, S.C.; Matross, D.M.; Xiao, X.; Dunn, A.L.; Lin, J.C.; Gerbig, C.; Munger, J.W.; Chow, V.Y.; Gottlieb, E.W. A satellite-based biosphere parameterization for net ecosystem CO₂ exchange: Vegetation Photosynthesis and Respiration Model (VPRM). Glob. Biogeochem. Cycle. 2008, 22, 2–17.
33. Jin, C.; Xue, Y.; Jiang, X.; Zhao, L.; Yuan, T.; Sun, Y.; Wu, S.; Wang, X. A long-term global XCO₂ dataset: Ensemble of satellite products. Atmos. Res. 2022, 279, 106385.
34. Connor, B.J.; Boesch, H.; Toon, G.; Sen, B.; Miller, C.; Crisp, D. Orbiting carbon observatory: Inverse method and prospective error analysis. J. Geophys. Res.-Atmos. 2008, 113.
35. Grell, G.A.; Peckham, S.E.; Schmitz, R.; McKeen, S.A.; Frost, G.; Skamarock, W.C.; Eder, B. Fully coupled “online” chemistry within the WRF model. Atmos. Environ. 2005, 39, 6957–6975.
36. Liu, W.; Ling, X.; Xue, Y.; Wu, S.; Gao, J.; Zhao, L.; He, B. Study on the Concentration of Top Air Pollutants in Xuzhou City in Winter 2020 Based on the WRF-Chem and ADMS-Urban Models. Atmosphere 2024, 15, 129.
37. Lin, Y.; Farley, R.D.; Orville, H.D. Bulk parameterization of the snow field in a cloud model. J. Appl. Meteorol. Climatol. 1983, 22, 1065–1092.
38. Mlawer, E.J.; Taubman, S.J.; Brown, P.D.; Iacono, M.J.; Clough, S.A. Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res. Atmos. 1997, 102, 16663–16682.
39. Dudhia, J. Numerical Study of Convection Observed during the Winter Monsoon Experiment Using a Mesoscale Two-Dimensional Model. J. Atmos. Sci. 1989, 46, 3077–3107.
40. Grell, G.A.; Dudhia, J.; Stauffer, D.R. A Description of the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5); University Corporation for Atmospheric Research: Boulder, CO, USA, 1994.
41. Chen, F.; Dudhia, J. Coupling an advanced land surface-hydrology model with the Penn State-NCAR MM5 modeling system. Part I: Model implementation and sensitivity. Mon. Weather Rev. 2001, 129, 569–585.
42. Hong, S.Y.; Noh, Y.; Dudhia, J. A new vertical diffusion package with an explicit treatment of entrainment processes. Mon. Weather Rev. 2006, 134, 2318–2341.
43. Grell, G.A.; Dévényi, D. A generalized approach to parameterizing convection combining ensemble and data assimilation techniques. Geophys. Res. Lett. 2002, 29, 38-1–38-4.
44. Barker, D.; Huang, W.; Guo, Y.; Bourgeois, A. A Three-Dimensional Variational (3DVAR) Data Assimilation System for Use With MM5. NCAR Tech. Note 2002, 68, 1–68.
45. Anderson, J.L. An ensemble adjustment Kalman filter for data assimilation. Mon. Weather Rev. 2001, 129, 2884–2903.
46. Hu, Y.W.; Li, Y.; Ma, X.Y.; Liang, Y.F.; You, W.; Pan, X.B.; Zang, Z.L. The optimization of SO₂ emissions by the 4DVAR and EnKF methods and its application in WRF-Chem. Sci. Total Environ. 2023, 888, 163796.
47. Taylor, R. Interpretation of the Correlation Coefficient: A Basic Review. J. Diagn. Med. Sonog. 1990, 6, 35–39.
48. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)?—Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
49. Chen, J.W.; Dong, H.D.; Wang, X.; Feng, F.L.; Wang, M.; He, X.N. Bias and Debias in Recommender System: A Survey and Future Directions. ACM Trans. Inf. Syst. 2023, 41, 1–39.
50. Liu, W.; Ling, X.; Li, C.; He, B.; Xu, H. Simulation and Assimilation of CO₂ Concentrations Based on the WRF-Chem Model. Processes 2025, 13, 4010.
51. Liu, W.; Xue, Y.; Ling, X.; Li, C.; He, B.; Han, L. Study on CO₂ Concentration Assimilation by Integrating Multi-Source Satellite Fusion Dataset Based on the WRF-Chem Model. Int. J. Remote Sens. 2025; accepted.
52. Cao, W.; Zheng, Y.; Zhang, S.; Klimont, Z.; Wang, X.; Jiang, F.; Qi, Z.; Chen, C.; Feng, Y.; Zhang, Z.; et al. Co-drivers of air pollutant and CO₂ emissions in China from 2000 to 2020. npj Clim. Atmos. Sci. 2025, 8, 250.
53. Ruan, F.; Qin, F.; Li, J.; Mu, W. Evaluation of Multi-Source Satellite XCO₂ Products over China Using the Three-Cornered Hat Method and Multi-Reference Comprehensive Comparisons. Remote Sens. 2025, 17, 3869.

Figure 1. Study area and the MEIC emissions for December 2019. (a) study area; (b) spatial distribution of MEIC emissions.

Figure 2. Schematic Diagram of WRF-Chem Simulations and 3DVAR Assimilation.

Figure 3. Emission correction results based on the differences between assimilation and simulation: (a–e), respectively, show the spatial distribution of average emission corrections at 3 h, 6 h, 9 h, 12 h after assimilation, and the mean over 0 h–12 h; (f) displays the time series of mean CO₂ concentration differences before and after assimilation and mean emission correction over the study region.

Figure 4. Monthly mean spatial distribution of differences in CO₂ concentration between the ASS experiment and the corresponding SIM and SIM_N experiments at the same time points: (a–d) hourly mean concentration differences between the SIM control experiment and the ASS assimilation experiment; (e–h) hourly mean concentration differences between the SIM_N new emission source experiment and the ASS assimilation experiment. The four columns show the results at 3, 6, 9, and 12 h, respectively.

Figure 5. Average emissions over China during December 2019 for (a) 3DVAR-optimized emissions and (b) EAKF-optimized emissions, (c) 3DVAR-optimized emissions minus MEIC and (d) EAKF-optimized emissions minus MEIC (units: mol km⁻² h⁻¹).

Figure 6. (a) Daily CO₂ emissions and (b) average hourly CO₂ emissions over China during December 2019. The blue line represents the MEIC inventory, the red line represents the 3DVAR method, and the green line represents the EAKF method.

Figure 7. Hourly average CO₂ emissions during the study period in the six regions. (a) Beijing–Tianjin–Hebei region, (b) Northeastern China, (c) Xinjiang province, (d) Central China, (e) Yangtze River Delta, and (f) Pearl River Delta. The line styles follow the same conventions as in Figure 6.

Figure 8. Daily CO₂ concentration accuracy for the Sim_MEIC, Sim_3DVAR and Sim_EAKF experiments. (a) WLG station. (b) HKO station.

Figure 9. Daily XCO₂ concentration accuracy for the Sim_MEIC, Sim_3DVAR and Sim_EAKF experiments. (a) Hefei station, (b) Xianghe station.

Figure 10. Comparison of simulated XCO₂ concentrations with EGG4 reanalysis data. (a) comparison between Sim_MEIC simulation results and EGG4 reanalysis data; (b) comparison between Sim_3DVAR simulation results and EGG4 reanalysis data; (c) comparison between Sim_EAKF simulation results and EGG4 reanalysis data.

Figure 11. Comparison of EDGAR Carbon Emission Inventory with Original and Corrected MEIC Carbon Emission Inventory for December 2019. (a) spatial distribution of EDGAR emissions; (b) Comparison between EDGAR and MEIC carbon emission inventories; (c) Comparison between 3DVAR-corrected MEIC and EDGAR carbon emission inventories; (d) Comparison between EAKF-corrected MEIC and EDGAR carbon emission inventories.

Figure 12. Spatiotemporal Characteristics of PBLH: (a) Stable Boundary Layer Period (1200–2300 UTC); (b) Unstable Boundary Layer Period (0300–0800 UTC); (c) Daily Mean Distribution; (d) Diurnal Variation Series of PBLH.

Table 1. Details of ground-based observation stations in this study.

Station Type	Station Name	Latitude (°N)	Longitude (°E)	Location
TCCON	Hefei	31.91	117.17	Hefei
TCCON	Xianghe	39.80	116.96	Xianghe
WDCGG	WLG	36.29	100.90	Mt. Waliguan
WDCGG	HKO	22.31	114.17	King’s Park

Table 2. Physical and chemical options in WRF-Chem.

Scheme	Chosen Option	Input Settings
Microphysical process	Lin scheme [37]	mp_physics = 2
Long-wave Radiation	RRTM scheme [38]	ra_lw_physics = 1
Short-wave Radiation	Dudhia scheme [39]	ra_sw_physics = 1
Surface Layer	MM5 scheme [40]	sf_sfclay_physics = 1
Land Surface Model	Noah scheme [41]	sf_surface_physics = 2
Boundary Layer	YSU scheme [42]	bl_pbl_physics = 1
Cumulus parameterization	Grell3 scheme [43]	cu_physics = 5
Chemical Mechanisms	Greenhouse gas CO₂ only tracers	chem_opt = 16

Table 3. Emission source configurations and study period for all experiments.

Name	Emission (Spin-Up)	Emission (Prior)	Emission (Optimized)	Study Period
CTRL	MEIC			All month
3DVAR	MEIC	MEIC and 3DVAR-optimized	3DVAR-optimized	Every 6 h
EAKF	MEIC	MEIC and EAKF-optimized	EAKF-optimized	Every 6 h
SIM_3DVAR	3DVAR-optimized emissions			All month
SIM_EAKF	EAKF-optimized emissions			All month

Table 4. Correlation analysis of CO₂ concentrations between the SIM experiment and the ASS experiment, and between the SIM_N experiment and the ASS experiment.

Metric	Experiment	3 h	6 h	9 h	12 h
BIAS(ppm)	SIM	1.67	1.53	1.10	0.94
BIAS(ppm)	SIM_N	0.33	0.39	0.55	0.66
RMSE(ppm)	SIM	3.10	2.83	2.62	2.63
RMSE(ppm)	SIM_N	1.09	0.99	0.83	0.84
R	SIM	0.74	0.75	0.78	0.77
R	SIM_N	0.86	0.86	0.82	0.81

Table 5. Comparison of prior and optimized CO₂ emissions and their change ratios across different regions of China (Units: MT day⁻¹).

Region	MEIC	3DVAR	EAKF	(3DVAR − MEIC)/MEIC (%)	(EAKF − MEIC)/MEIC (%)
China	1049.83	1186.99	1128.68	13.06	7.51
BTH	92.91	96.68	95.37	4.06	2.65
NEC	95.19	107.20	101.70	12.62	6.84
XJ	53.89	50.65	45.90	−6.01	−14.82
CC	371.02	445.52	401.54	20.08	8.23
YRD	135.87	145.39	143.59	7.01	5.68
PRD	38.70	42.52	42.56	9.86	9.97
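The change ratios in the last two columns of Table 5 follow directly from the emission columns. The short sketch below recomputes them from the tabulated values; the dictionary name `table5` and the helper `change_pct` are introduced here for illustration, and because the inputs are rounded to two decimals, the recomputed percentages can differ from the tabulated ones in the last digit.

```python
# Values transcribed from Table 5 (MT day^-1): (MEIC, 3DVAR, EAKF).
table5 = {
    "China": (1049.83, 1186.99, 1128.68),
    "BTH": (92.91, 96.68, 95.37),
    "NEC": (95.19, 107.20, 101.70),
    "XJ": (53.89, 50.65, 45.90),
    "CC": (371.02, 445.52, 401.54),
    "YRD": (135.87, 145.39, 143.59),
    "PRD": (38.70, 42.52, 42.56),
}

def change_pct(prior, optimized):
    """Relative change (%) of optimized emissions with respect to the prior."""
    return 100.0 * (optimized - prior) / prior

# Map each region to (3DVAR change %, EAKF change %).
ratios = {
    region: (change_pct(meic, var3d), change_pct(meic, eakf))
    for region, (meic, var3d, eakf) in table5.items()
}

for region, (pct_3dvar, pct_eakf) in ratios.items():
    print(f"{region:6s} 3DVAR {pct_3dvar:+7.2f}%   EAKF {pct_eakf:+7.2f}%")
```

A negative ratio, as for XJ, indicates that the optimized inventory is lower than the MEIC prior in that region.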

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

