Global Scale Inversions from MOPITT CO and MODIS AOD

Abstract: Top-down observational constraints on emissions flux estimates from satellite observations of chemical composition are subject to biases and errors stemming from transport, chemistry and prior emissions estimates. In this context, we developed an ensemble data assimilation system to optimize the initial conditions for carbon monoxide (CO) and aerosols, while also quantifying the respective emission fluxes with a distinct attribution of anthropogenic and wildfire sources. We present the separate assimilation of CO profile v9 retrievals from the Measurements of Pollution in the Troposphere (MOPITT) instrument and Aerosol Optical Depth (AOD), collection 6.1, from the Moderate Resolution Imaging Spectroradiometer (MODIS) instruments. This assimilation system is built on the Data Assimilation Research Testbed (DART) and includes a meteorological ensemble to assimilate weather observations within the online Community Atmosphere Model with Chemistry (CAM-chem). Inversions indicate an underestimation of CO emissions in CAMS-GLOB-ANT_v5.1 in China for 2015 and an overestimation of CO emissions in the Fire INventory from NCAR (FINN) version 2.2, especially in the tropics. These emissions increments are consistent between the MODIS AOD and the MOPITT CO-based inversions. Additional simulations and comparison with in situ observations from the NASA Atmospheric Tomography Mission (ATom) show that biases in hydroxyl radical (OH) chemistry dominate the CO errors.


Introduction
The goal of atmospheric data assimilation (DA) is to accurately determine the state of the atmosphere, including its gas-phase and aerosol chemical composition, by optimally and objectively combining observations and model simulations [1–3]. As pointed out by Bocquet et al. [4], there are few studies of joint meteorological and chemical data assimilation within models that explicitly represent meteorological processes and chemical transformations.
Satellite-based remote-sensing instruments produce large amounts of valuable atmospheric composition data that can be used in assimilation. For instance, the aerosol optical depth (AOD) is available from operating instruments including low Earth orbit satellite sensors, such as the Moderate Resolution Imaging Spectroradiometer (MODIS), the Visible/Infrared Imager Radiometer Suite (VIIRS), and the Multi-Angle Imaging Spectroradiometer (MISR). There have been numerous studies on the feasibility of assimilating aerosol observations to improve the initial conditions for aerosol forecasting [5–7]. The choice of control variables is important since AOD measurements are used to initialize a dozen aerosol mass concentrations for which background error statistics are difficult to characterize (e.g., [8]). Schutgens et al. [9] performed the assimilation of AERONET data using a Local Ensemble Transform Kalman Filter (LETKF), which presents an efficient way to model a background error covariance that varies in space and time. In the context of air quality (AQ) forecasting, both PM2.5 and MODIS AOD have been assimilated in Chemistry Transport Models (CTM) (e.g., [10,11]). Pagowski and Grell [12] compared the 3D-VAR and the ensemble Kalman filter (EnKF) assimilation methods for the assimilation of PM2.5 and found that when meteorology alone was perturbed, the ensemble spread was too small to accurately represent the model errors. This points to the problem of representing model error in chemical data assimilation, since CTMs are sensitive to errors in emissions and aerosol models are simplified for computational reasons. Schwartz et al. [13] introduced a hybrid variational-ensemble DA system and found that an important limitation of the system was the small ensemble spread. To overcome these difficulties and include meteorology ensembles and perturbations in the aerosol emissions sources, one can use adaptive inflation methods to correct the ensemble spread [14,15]. The better quantification of emissions uncertainties was also introduced in the 3D-Var framework by increasing the background error variance of the accumulation and Aitken mode aerosols, which improved the quality of the analyses [16]. The influence of initial condition (IC) optimization on the surface pollutant field fades with time because of emissions and other parameters such as chemical reactivity and deposition velocity [17].
Following the pioneering work of Elbern et al. [18] and Miyazaki et al. [19], more work has been dedicated to simultaneously optimizing the initial conditions and the associated emissions, where the emission increments act as an improved boundary condition during the forecast [20] and the system includes additional constraints from the simultaneous assimilation of different species [21,22]. The latter study [22] pointed out limitations due to the treatment of model error in nonlinear chemical processes and the lack of constraints on volatile organic compounds' (VOCs) ICs and emissions. Ma et al. [23] assimilated surface measurements of gas-phase species (SO2, NO2, O3, CO) and aerosols (PM2.5, PM10), along with MODIS AOD, and found that the emissions update interval may be an important factor for CO in EnKF emission inversion studies. They concluded that the posterior emissions could converge to a relatively stable state and that the updated anthropogenic emissions can improve the model forecasts. This is consistent with Gaubert et al. [24], in which the Measurement of Pollution in the Troposphere (MOPITT) CO assimilation showed improved CO forecasts and analyses when compared to independent aircraft observations when the CO ICs and emissions were jointly optimized.
Global aerosol, chemical, and meteorological reanalyses are consistently produced within the same DA infrastructure [25–28] to better understand the role of aerosols and chemistry in the climate system. A common approach in gas-phase chemical data assimilation systems is to estimate atmospheric states and/or emissions using variational [29–34] and ensemble Kalman filter approaches [19,35–38], often with offline CTMs. Notable exceptions include the use of coupled chemistry-meteorology DA for stratospheric applications [39] and for the upper atmosphere [40]. Within the Monitoring Atmospheric Composition and Climate (MACC) and the Copernicus Atmosphere Monitoring Service (CAMS) European projects, global chemical, aerosols, and meteorological observations are assimilated within the Composition-IFS Integrated Forecasting System (IFS) 4D-Var system [28,41–44]. Despite the assimilation of many observations of different species and satellite platforms, the impact of the data assimilation is very limited near the surface, especially for short-lived species such as NO2 [45]. Recent developments include the assimilation of Tropospheric Monitoring Instrument (TROPOMI) total column CO in the CAMS system, which alleviates the low CO bias in the free troposphere, especially when assimilating clear-sky TROPOMI CO retrievals [46].
Attributing CO sources is complicated by its strong coupling with atmospheric chemistry via the hydroxyl radical (OH), which acts as both a strong source (via methane and VOCs oxidation) and the main chemical sink of CO. An ensemble of 26 global Chemistry Transport Models (CTM) has shown that CO is underestimated against surface observations and satellite retrievals from MOPITT. While all the models were using the same CH4 field and emissions, the comparison showed that the variation in OH explained around 80% of the inter-model variance in CO [47]. Other differences in CO were attributable to different representations of VOC emission sources, while meteorology appeared to be less important for large-scale budgets. Later, Voulgarakis et al. [48] compiled model simulations of OH and CH4 lifetimes from the Atmospheric Chemistry and Climate Modeling Intercomparison Project (ACCMIP) [49] and found large differences in the inter-model estimations of OH and CH4 lifetimes. Furthermore, Naik et al. [50] showed that the ACCMIP multimodel mean OH concentration is overestimated by 5 to 10%, because OH concentrations are overestimated in the northern hemisphere. The POLARCAT Model Inter-Comparison Project (POLMIP) indicated that Arctic OH differences are a larger source of inter-model variability than transport differences [51]. Conversely, assimilating MOPITT in such a coupled CTM reduces this bias [52], while Strode et al. [53] found that removing biases in O3 and water vapor, as well as reducing northern hemisphere NOx, can also effectively reduce the global mean OH by 18%. In response to this, there has been a strong motivation to elucidate the role of individual driving factors controlling the OH budget [48,54].
Observationally derived OH estimates imply that (1) the global and annual average tropospheric methane lifetime (τ) is 9.5 years [55], (2) the interannual variability is small, at 1 to 3% [56], and (3) the northern hemisphere to southern hemisphere (N/S) ratio is close to 1 [57], with a ratio of 0.97 ± 0.12 for the years 2004-2011 [58]. Zhao et al. [59] used the Chemistry-Climate Model Initiative (CCMI)-modeled OH and indeed found large sensitivities of the simulated CH4 to the use of alternative OH fields that have different spatial and temporal distributions and overall magnitude. Gaubert et al. [60] argued that the observed 20% global decline in CO burden during the 2002-2013 period, due to the effective implementation of pollution control technologies and regulations, has also resulted in increased CH4 oxidation by OH, corresponding to a ∼8% shorter CH4 lifetime and higher CO chemical production. Nguyen et al. [61] followed this work with simple chemistry box model calculations and suggested that there can be a 25% bias in methane source estimates when chemical feedbacks through OH are neglected. The comparison of free running simulations with data assimilation and prescribed meteorology through nudging implies that while water vapor plays an important role as a source of OH, and therefore impacts global OH abundance, it does not determine the modeled long-term OH trend [24,60,62]. Zhang et al. [63] optimized the annual hemispheric concentrations of tropospheric OH following the assimilation of GOSAT CH4 and found a reduction in OH concentrations that was mostly in the northern hemisphere, leading to an N/S ratio of 1.02 ± 0.05 in the posterior estimate, compared to 1.16 in the prior estimate. Recently, CH4 inversions have been carried out using an OH that is constrained by precursor observations (CO, CH2O), which resulted in better agreement with the MCF-based OH inversions [64]. This confirms the results from earlier studies that the assimilation of precursor emissions usually improves OH metrics, such as estimates of the CH4 lifetime, and brings the associated N/S ratios closer to 1 [52,65–67]. Müller et al. [68] performed IASI CO inversions with alternative distributions of the N/S ratio and different global OH and showed that the varying hemispheric mean OH level has a strong impact on posterior emission estimates. The CO inversions made with offline chemistry will scale the CO emissions according to the OH distribution, and lower OH, especially in the northern hemisphere, leads to a better CO match with independent observations.
In this context, we developed an ensemble data assimilation framework to optimize the CO initial conditions in an online and interactive atmospheric chemistry model. The goal is to reduce potential biases in OH, transport, and other model-related systematic errors that affect CO and thus the posterior emissions. We presented in a previous study the posterior emission fluxes with a focus on anthropogenic sources in East Asia in May 2016, during the Korea-United States Air Quality (KORUS-AQ) campaign in South Korea [24]. The CO assimilation in an interactive CTM provided constraints on OH, hydroperoxyl radical (HO2), and O3 through the joint optimization of CO initial conditions and emissions.
This paper is structured as follows. We introduce the modeling and assimilation framework as well as the assimilated observations and conducted experiments in Section 2. We present the global-scale results of CO inversion experiments for August 2015 in Section 3, including its evaluation with Fourier transform spectrometer observations of the dry-air tropospheric column-averaged mole fraction of CO from the Network for Detection of Atmospheric Composition Change (NDACC). The results for AOD are presented in Section 4. In Section 5, we investigate the sensitivities in deterministic CAM-chem simulations to scaled OH and posterior emissions, and we further evaluate with MOPITT CO and MODIS AOD level 3 monthly observations. In Section 6, we compare the CAM-chem simulations of CO, OH, and HO2 to the NASA Atmospheric Tomography Mission (ATom). Finally, we discuss our findings in the larger context in Section 7 and the main points are summarized in Section 8.

Community Atmosphere Model with Chemistry
We employ the Community Atmosphere Model with Chemistry (CAM-chem), the atmospheric chemistry constituent of the Community Earth System Model (CESM) version 2.2 [69]. This current version uses the finite volume dynamical core at a spatial resolution of 0.95° in latitude by 1.25° in longitude and with 32 vertical layers [70,71]. The Modal Aerosol Model (MAM4) [72,73] is used to represent aerosols over four modes and includes secondary organic aerosols estimated by the Volatility Basis Set (VBS) [71]. The gas-phase chemistry is represented by the MOZART Tropospheric and Stratospheric (TS1) scheme [74]. Following Gaubert et al. [24], we employ the heterogeneous aerosol uptake coefficient (γ) of 0.1 for HO2, with the reaction producing H2O instead of H2O2. We update the nitric acid trihydrate (NAT) particle number densities from 0.01 to 10⁻⁵ cm⁻³, which increases the denitrification rate in the stratosphere, resulting in the higher catalytic destruction of stratospheric ozone by chlorine [75]. The Community Land Model version 5 (CLM5) model is active and coupled with the atmosphere [76]. The dry deposition of aerosols and gases as well as biogenic emissions are calculated online in CLM5. Specifically, the Model of Emissions of Gases and Aerosols from Nature (MEGAN v2.1) is used for biogenics [77] and the wet deposition of gases and aerosols is estimated from the Neu and Prather [78] parameterization.
The CAMS-GLOB-ANT_v5.1 inventory was released in spring 2021 and is used for anthropogenic emissions [79]. The dataset is derived from the spatial distribution and monthly emissions (2000 to 2015) of version 5 of the Emissions Database for Global Atmospheric Research (EDGAR v5) from the European Commission Joint Research Centre [80]. Values are extrapolated to 2021 using trends calculated from version 2 of the Community Emissions Data System (CEDS) [81]. It also includes updated ship and aircraft emissions. The seasonal cycle of the emissions is derived from monthly adjustments provided by CAMS-GLOB-TEMPO. Daily biomass burning emissions are estimated by the Fire INventory from NCAR [82,83] version 2.2 (FINNv2.2). Among other datasets, the inventory is derived from active fire detections by both VIIRS and MODIS.
We also perform deterministic CAM-chem simulations to assess the differences in using prior and posterior emissions (Table 1). The CAM-chem simulated temperature as well as the zonal and meridional winds are nudged to the Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2) [84], as in recent studies [85,86]. The nudging is performed at every physical time step (30 min) with a Newtonian relaxation time of 6 h, reading input files at the highest possible temporal frequency of 3 h [87]. In all CAM-chem simulations, the sea-ice interfaces and ocean are prescribed from a daily analysis of sea surface temperatures (SST) at 0.25° × 0.25° spatial resolution. The dataset is derived from in situ observations and the Advanced Very High Resolution Radiometer (AVHRR) SST observations [88].
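To make the nudging step concrete, the following minimal sketch applies an explicit Newtonian relaxation with the 30 min time step and 6 h relaxation time quoted above. It is an illustration only and not the CAM-chem implementation; the function name `nudge_step` and the explicit (forward Euler) form of the update are assumptions.

```python
import numpy as np

DT_S = 30 * 60.0      # physical time step quoted in the text (30 min)
TAU_S = 6 * 3600.0    # Newtonian relaxation time quoted in the text (6 h)


def nudge_step(x, x_ref, dt_s=DT_S, tau_s=TAU_S):
    """One explicit relaxation step: x <- x + (dt / tau) * (x_ref - x).

    Hypothetical helper for illustration; x and x_ref can be scalars or arrays
    (e.g., temperature or wind fields on the same grid).
    """
    return x + (dt_s / tau_s) * (x_ref - x)


# Relax a model temperature toward a fixed reference value over 6 h (12 steps).
t_model = np.array([290.0])
t_ref = np.array([280.0])
for _ in range(12):
    t_model = nudge_step(t_model, t_ref)

# Fraction of the initial model-reference misfit that remains after 6 h.
remaining = float((t_model - t_ref)[0] / 10.0)
```

With these settings each step removes 1/12 of the current misfit, so roughly one third of the initial difference is left after one relaxation time.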

Data Assimilation Research Testbed
DART is open-source community software for efficient ensemble data assimilation, developed and maintained at the National Center for Atmospheric Research [89]. Global meteorological reanalyses were performed within the Community Atmosphere Model (CAM) or CAM6 + DART [90]. Our chemical data assimilation system is built upon CAM6 + DART with the inclusion of online chemistry to efficiently assimilate chemical observations [24,52]. Briefly, the first step of the filter is to run a 30 member ensemble of CAM-chem simulations with different initial conditions, forcing files (such as emissions), and model parameters. The analysis step is the assimilation of observations using a deterministic ensemble square root filter, the ensemble adjustment Kalman filter (EAKF) [91,92]. We apply multiplicative covariance inflation to the ensemble before the analysis step to optimally adjust the ensemble spread. The adaptive inflation parameters vary in space and time [93] and are calculated following the updated algorithm from Gharamti [94]. The spatial localization [95] is set to a half width value of ∼600 km (0.1 radians) horizontally and of about 1200 m in the vertical for both meteorological and chemical observations.
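The analysis step can be illustrated with a textbook sketch of the EAKF update for a single scalar observation, combined with a Gaspari-Cohn localization function. This is not DART code: the function names, the toy three-variable state, and the observation values are assumptions, and adaptive inflation is omitted for brevity.

```python
import numpy as np


def gaspari_cohn(dist, half_width):
    """Gaspari-Cohn fifth-order localization weight; zero beyond 2 * half_width."""
    z = np.abs(np.atleast_1d(np.asarray(dist, dtype=float))) / half_width
    w = np.zeros_like(z)
    near = z <= 1.0
    far = (z > 1.0) & (z < 2.0)
    zn, zf = z[near], z[far]
    w[near] = -0.25 * zn**5 + 0.5 * zn**4 + 0.625 * zn**3 - (5.0 / 3.0) * zn**2 + 1.0
    w[far] = (zf**5 / 12.0 - 0.5 * zf**4 + 0.625 * zf**3 + (5.0 / 3.0) * zf**2
              - 5.0 * zf + 4.0 - 2.0 / (3.0 * zf))
    return w


def eakf_update(state, obs_prior, obs_value, obs_err_var, loc):
    """EAKF update for one scalar observation.

    state:     (n_state, n_members) prior ensemble
    obs_prior: (n_members,) prior ensemble mapped to observation space
    loc:       (n_state,) localization weights between each state variable and the observation
    """
    prior_mean = obs_prior.mean()
    prior_var = obs_prior.var(ddof=1)
    # Scalar Bayesian update of the observed quantity (Gaussian prior and likelihood).
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_err_var)
    post_mean = post_var * (prior_mean / prior_var + obs_value / obs_err_var)
    # Deterministic shift and contraction of the observation-space ensemble.
    obs_post = np.sqrt(post_var / prior_var) * (obs_prior - prior_mean) + post_mean
    obs_inc = obs_post - obs_prior
    # Regress the observation increments onto each state variable, with localization.
    state_anom = state - state.mean(axis=1, keepdims=True)
    cov = state_anom @ (obs_prior - prior_mean) / (obs_prior.size - 1)
    gain = loc * cov / prior_var
    return state + gain[:, None] * obs_inc[None, :]


# Toy example: 3 state variables, 30 members, and the first variable is observed directly.
rng = np.random.default_rng(0)
state = rng.normal(1.0, 0.5, size=(3, 30))
obs_prior = state[0].copy()
dist_km = np.array([0.0, 600.0, 1500.0])      # distance of each variable to the observation
loc = gaspari_cohn(dist_km, half_width=600.0)  # half width quoted in the text
posterior = eakf_update(state, obs_prior, obs_value=1.5, obs_err_var=0.1, loc=loc)
```

In this toy setup the variable located 1500 km from the observation is beyond twice the half width, so it is left unchanged, while the observed variable has its ensemble variance reduced to the scalar posterior variance.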

Assimilation Experiments
Our numerical simulations include the assimilation of MOPITT CO (MOPITT-DA), MODIS AOD (MODIS-DA), and baseline meteorological observations (Control-DA). The ensemble consists of 30 CAM-chem runs with different initial conditions and model parameters, including emissions and nudging strength values. The ensemble spinup simulation is initialized from a CAM-chem deterministic simulation on 21 July 2015. During the CAM-chem ensemble spinup period, the temperature and the wind velocity components are nudged towards the CAM6+DART reanalysis ensemble mean [90], at every physical step (30 min). To generate a dynamical spread, we perturb the global strength Q of the nudging so that the ensemble mean nudging strength is Q = 0.4 with a standard deviation of 0.15, which corresponds to relaxation times mostly between 10 and 20 h. For this simulation and all of the following experiments, the global dust emission factor is perturbed by 10% and the sea salt global emission factor is perturbed by 25%. A different noise distribution (estimated as in Gaubert et al. [96]) is applied to CO, VOC, and aerosol emissions for biomass burning (BB), with a decorrelation length of 250 km, and for anthropogenic direct emissions, with a decorrelation length of 500 km. Thus, anthropogenic and BB CO sources are completely uncorrelated in the prior ensemble. Using an ensemble for meteorological data assimilation allows the characterization of errors in the model horizontal and vertical dynamics and physics. The meteorological data assimilation starts on 28 July 2015 and serves as the control run; from that date, the nudging to the CAM6+DART reanalysis is turned off. We initialize the MOPITT-DA and the MODIS-DA assimilation experiments on 1 August 2015 from the control run.
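The relation between the nudging strength Q and the relaxation time can be illustrated with a short sketch. It assumes that Q scales a 6 h base relaxation time, so that the effective relaxation time is 6 h / Q; this mapping, the clipping bounds, and the function name are assumptions made for illustration and are not taken from the model code.

```python
import numpy as np

BASE_TAU_H = 6.0   # assumption: relaxation time corresponding to Q = 1
N_MEMBERS = 30     # ensemble size quoted in the text


def relaxation_time_h(q, base_tau_h=BASE_TAU_H):
    """Effective relaxation time (hours) for nudging strength q (assumed mapping)."""
    return base_tau_h / np.asarray(q, dtype=float)


rng = np.random.default_rng(42)
# Mean 0.4 and standard deviation 0.15 as quoted in the text; the clipping
# bounds are an assumption that keeps Q positive and no larger than 1.
q_members = np.clip(rng.normal(0.4, 0.15, N_MEMBERS), 0.1, 1.0)
tau_members_h = relaxation_time_h(q_members)
```

Under this assumed mapping, Q = 0.4 gives a 15 h relaxation time, and Q values between 0.3 and 0.6 give relaxation times between 20 and 10 h, which is in line with the range quoted above.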
DART is also designed to handle both the spatial and the "variable localization" [97]. The variable localization restricts the impact of an observation to only relevant model fields. For instance, the "variable localization" means that the MOPITT assimilation impacts only the chemical state vector (i.e., CO and CO emissions), with no correction on the meteorological state vector (U, V, T, Q, Ps). Table 2 shows the state vectors used in the different experiments. All experiments assimilate meteorological observations in a consistent manner, including for the AOD and CO assimilation runs. In the MOPITT-DA experiment, the CO initial conditions are optimized and the anthropogenic (SFCO_CAMS) and fire (SFCO_FINN) emissions are updated following the method presented in Gaubert et al. [24]. Briefly, the persistence of the emissions is assumed since there is no model for emissions. However, the relative increments are applied, by source, to the next few days of the input emission flux files with a weight that decreases over time. We use a time scale τ of 5 days for fires and 8 days for the anthropogenic source. The native temporal resolution of the anthropogenic emissions is monthly, so we consider impacts up to 16 days (2τ). To discriminate sources, only fire emissions are updated if their fluxes constitute more than 10% of the CAMS emissions for a given grid cell. Otherwise, only SFCO_CAMS are updated. For aerosol DA, only the black carbon (SFBC) and primary organic matter (SFPOM) emissions are optimized, using the same CO tag ratio to determine which of the CAMS or FINN emission files will be updated for each grid cell. In the MODIS-DA experiment, the total mass mixing ratios of the 4 bins are in the state vector. The 16 different aerosol species and the 4 number variables are scaled after the analysis step.
MOPITT is a radiometer instrument that measures CO in both thermal and short-wave infrared bands [98]. We present here our first assimilation of the latest (V9J) MOPITT multispectral retrievals [99]. V9J retrievals feature a revised cloud detection algorithm that now includes observations over highly polluted scenes, with relatively high aerosol optical depths, that were previously treated as cloudy scenes [100]. The adopted algorithm results in greater retrieval coverage over land, particularly over polluted scenes such as the Amazon Basin during the fire season and the North China Plain. While we follow the same approach as in previous studies, by assimilating daytime-only joint retrieval profiles [24], we now only employ the measurement error covariance as the observation error instead of the total retrieval error covariance. This is because the model is smoothed by the averaging kernel in the forward operator, so the smoothing error (which is included in the retrieval error covariance) is already taken into account [101]. The comparison of the two approaches indicates that only including the measurement error allows for a better fit of the MOPITT observations.
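The emission update described earlier in this section (persistence, source discrimination with a 10% threshold, and relative increments propagated forward with a decreasing weight) can be sketched as follows. The exponential form of the weight, the use of a 2τ cutoff for both sources, and all function names are assumptions made for illustration; the actual method is the one described in Gaubert et al. [24].

```python
import numpy as np

TAU_DAYS = {"fire": 5.0, "anthro": 8.0}   # time scales quoted in the text


def decay_weights(source, n_days):
    """Weight applied to the increment on each future day (day 0 = analysis day)."""
    tau = TAU_DAYS[source]
    days = np.arange(n_days, dtype=float)
    weights = np.exp(-days / tau)          # assumption: exponential decay
    weights[days > 2.0 * tau] = 0.0        # assumption: no impact beyond 2 * tau
    return weights


def updated_source(fire_flux, anthro_flux, threshold=0.10):
    """Select, per grid cell, which emission source receives the increment."""
    return np.where(fire_flux > threshold * anthro_flux, "fire", "anthro")


def propagate_increment(future_flux, rel_increment, source):
    """Apply a relative increment to future daily fluxes of shape (n_days, ...)."""
    w = decay_weights(source, future_flux.shape[0])
    w = w.reshape((-1,) + (1,) * (future_flux.ndim - 1))
    return future_flux * (1.0 + w * rel_increment)


# Two grid cells: fires exceed 10% of the anthropogenic flux only in the first one.
which = updated_source(np.array([0.2, 0.05]), np.array([1.0, 1.0]))
# Three future days of unit fire flux, with relative increments of +50% and -20%.
scaled = propagate_increment(np.ones((3, 2)), np.array([0.5, -0.2]), "fire")
```

On the analysis day the full relative increment is applied, and its influence then decays over the following days until it vanishes beyond the cutoff.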

MODIS
We use the AOD dataset from the MODIS collection 6.1 [102], from both the Terra and Aqua satellites and retrieved with the Deep Blue algorithm. The dataset is processed to achieve low systematic bias and random error for use in data assimilation, following 3 steps: (1) a quality assurance (QA) analysis, (2) an empirical correction, and (3) aggregation [103–105]. The QA procedure includes standard error checks, buddy checks, and QA flag checks. The standard error check is designed to identify cloud-contaminated pixels by evaluating the spatial variation of the MODIS AOD, buddy checks remove isolated pixels, and only data flagged as good and best with a cloud fraction lower than 80% are used. The empirical correction is then applied to further reduce the biases from cloud artifacts, surface wind speed effects, and aerosol microphysics effects. The data are then averaged on a 0.5° × 0.5° grid for a 6-hourly time window.
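The aggregation step can be illustrated with a minimal sketch that averages individual retrievals onto a 0.5° × 0.5° grid within 6-hourly windows. The function name, the window boundaries (0, 6, 12, and 18 UTC), and the toy input values are assumptions; the QA screening and empirical correction steps are not reproduced here.

```python
import numpy as np


def aggregate_aod(lat, lon, hour_utc, aod, res_deg=0.5, window_h=6):
    """Average AOD samples per grid cell and time window.

    Returns (mean, count) arrays of shape (n_windows, n_lat, n_lon);
    cells without data are NaN in `mean`.
    """
    n_lat, n_lon = int(round(180 / res_deg)), int(round(360 / res_deg))
    n_win = int(24 // window_h)
    i_lat = np.clip(np.floor((lat + 90.0) / res_deg).astype(int), 0, n_lat - 1)
    i_lon = np.clip(np.floor((lon + 180.0) / res_deg).astype(int), 0, n_lon - 1)
    i_win = np.clip((hour_utc // window_h).astype(int), 0, n_win - 1)
    total = np.zeros((n_win, n_lat, n_lon))
    count = np.zeros((n_win, n_lat, n_lon))
    # np.add.at accumulates correctly when several samples fall in the same cell.
    np.add.at(total, (i_win, i_lat, i_lon), aod)
    np.add.at(count, (i_win, i_lat, i_lon), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(count > 0, total / count, np.nan)
    return mean, count


# Three toy retrievals: two in the same cell and window, one in a later window.
lat = np.array([10.1, 10.2, 10.1])
lon = np.array([20.1, 20.3, 20.1])
hour = np.array([1.0, 2.0, 7.0])
aod = np.array([0.2, 0.4, 1.0])
mean_aod, n_obs = aggregate_aod(lat, lon, hour, aod)
```

The number of samples per cell is returned alongside the mean because it is a natural input for assigning a representativeness error to each aggregated observation.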
We also use a MODIS AOD level 3 climatology for the evaluation of deterministic CAM-chem simulations. The AOD monthly data are from the Terra MODIS Collection 6.1 Monthly Level 3 (1° × 1°) Combined Dark Target and Deep Blue product [106], which has been regridded to the CAM6 grid via bilinear interpolation.

Network for Detection of Atmospheric Composition Change
To further evaluate the impacts of the CO assimilation, we compare the simulated CO with ground-based Fourier transform infrared spectrometer (FTIR) CO observations at multiple sites across the world [86]. This dataset is part of the NDACC (http://ndacc.org, accessed on 30 September 2023) Infrared Working Group (IRWG, https://www2.acom.ucar.edu/irwg, accessed on 30 September 2023). The sites are located either close to urban areas or in remote locations. The FTIR acquires direct-sun spectra under cloud-free conditions in selected spectral regions through the mid-infrared with a nominal spectral resolution of 0.004 cm⁻¹. For CO, a liquid-nitrogen-cooled InSb detector (∼1850-6000 cm⁻¹) and a narrow bandpass filter are used to yield typical accuracy and precision of about 3% and 1%, respectively. The modeled vertical profiles are smoothed with the FTIR averaging kernels and a priori vertical profiles to account for the FTIR altitude sensitivity. For the comparison, vertical profiles are used to calculate the air mass weighted tropospheric mixing ratio (wVMR) [86].
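The smoothing of a model profile with an averaging kernel and an a priori profile, followed by an air mass weighted tropospheric average, can be sketched as follows. The linear form x_a + A (x_model − x_a) is the standard way to apply an averaging kernel; the function names, the three-layer toy profile, and the tropospheric mask are assumptions made for illustration.

```python
import numpy as np


def smooth_profile(x_model, x_apriori, avk):
    """Apply an averaging kernel matrix: x_smooth = x_a + A @ (x_model - x_a)."""
    return x_apriori + avk @ (x_model - x_apriori)


def weighted_vmr(vmr, air_mass, in_troposphere):
    """Air mass weighted mean mixing ratio over the layers flagged as tropospheric."""
    mask = np.asarray(in_troposphere, dtype=bool)
    return float(np.sum(vmr[mask] * air_mass[mask]) / np.sum(air_mass[mask]))


# Toy three-layer profile (ppb), ordered from the surface upward.
x_model = np.array([100.0, 80.0, 60.0])
x_apriori = np.array([90.0, 85.0, 70.0])
air_mass = np.array([3.0, 2.0, 1.0])             # relative layer air masses
in_troposphere = np.array([True, True, False])

# Limiting cases: a perfect kernel returns the model, a null kernel returns the a priori.
x_perfect = smooth_profile(x_model, x_apriori, np.eye(3))
x_null = smooth_profile(x_model, x_apriori, np.zeros((3, 3)))
wvmr = weighted_vmr(x_perfect, air_mass, in_troposphere)
```

The two limiting cases show why the smoothing matters: where the instrument has little sensitivity, the smoothed model profile falls back to the a priori, so model and retrieval are compared on an equal footing.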

NASA ATom
The NASA Atmospheric Tomography Mission (ATom) is an airborne campaign designed to evaluate the chemical processes and the role of anthropogenic emissions that control the short-lived climate forcing agents such as methane and ozone [107]. The fully instrumented NASA DC-8 aircraft sampled vertical profiles from 0 to ∼13 km of the marine remote troposphere over the Pacific and the Atlantic basin over four deployments. The CO measurements were made by the NOAA Picarro CO instrument [108]. The OH and HO2 data were obtained by the Penn State Airborne Tropospheric Hydrogen Oxides Sensor (ATHOS) instrument [109]. For ATom, a careful evaluation of the OH and HO2 observations indicated an uncertainty of ±35% [110]. All observations and model output were binned into 10° latitude and 100 hPa increments and averaged by ocean basin. The data were then averaged by the ATom mission, excluding data that showed a strong biomass burning impact by dismissing bins where the California Institute of Technology (CIT) Chemical Ionization Mass Spectrometer (CIMS) hydrogen cyanide (HCN) measurements [111] were greater than 600 pptv.
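The binning and biomass burning screening can be sketched as follows. This is one possible reading of the procedure, in which a bin is dismissed when its mean HCN exceeds 600 pptv; the function names, the pressure range of the bins (0 to 1100 hPa), and the toy samples are assumptions, and the separation by ocean basin is omitted.

```python
import numpy as np

LAT_EDGES = np.arange(-90.0, 91.0, 10.0)    # 18 latitude bins of 10 degrees
P_EDGES = np.arange(0.0, 1101.0, 100.0)     # 11 pressure bins of 100 hPa (assumed range)


def bin_mean(lat, pres_hpa, values):
    """Mean of `values` in each (latitude, pressure) bin; empty bins are NaN."""
    n_lat, n_p = LAT_EDGES.size - 1, P_EDGES.size - 1
    i_lat = np.clip(np.digitize(lat, LAT_EDGES) - 1, 0, n_lat - 1)
    i_p = np.clip(np.digitize(pres_hpa, P_EDGES) - 1, 0, n_p - 1)
    total = np.zeros((n_lat, n_p))
    count = np.zeros((n_lat, n_p))
    np.add.at(total, (i_lat, i_p), values)
    np.add.at(count, (i_lat, i_p), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)


def binned_without_fire(lat, pres_hpa, values, hcn_pptv, hcn_max=600.0):
    """Binned means, with bins whose mean HCN exceeds `hcn_max` set to NaN."""
    mean_val = bin_mean(lat, pres_hpa, values)
    mean_hcn = bin_mean(lat, pres_hpa, hcn_pptv)
    with np.errstate(invalid="ignore"):
        fire_bin = mean_hcn > hcn_max
    return np.where(fire_bin, np.nan, mean_val)


# Four toy samples: two clean tropical samples and two fire-influenced mid-latitude samples.
lat = np.array([5.0, 6.0, 45.0, 46.0])
pres = np.array([850.0, 860.0, 450.0, 460.0])
co_ppb = np.array([80.0, 90.0, 120.0, 200.0])
hcn = np.array([200.0, 300.0, 700.0, 900.0])
co_binned = binned_without_fire(lat, pres, co_ppb, hcn)
```

Applying the same function to the observations and to the model output sampled along the flight tracks keeps both on identical bins before they are averaged.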

CO Assimilation Results
Figure 1 shows the time series of the bias for the Control-DA and the MOPITT-DA against the MOPITT column-average dry-air mole fraction of CO (XCO). The Control-DA overestimates MOPITT XCO by around 10 ppb or 13% in the tropics and underestimates MOPITT XCO in the northern hemisphere extratropics by 20 ppb or 23%, with only small changes over time. This suggests systematic differences in the drivers of XCO and their errors between the tropics and the northern extratropics. As expected, in the MOPITT-DA, the bias evolves from the bias of the initial ensemble to the bias of the analysis model states, with residual 6 h forecast errors of less than 10 ppb towards the end of the month. Since the emissions are adjusted but OH is not constrained, this suggests that transport errors or an OH overestimation are responsible for the negative bias during the 6 h forecast. Because the bias becomes stationary over the last 10 days, we assume a spinup period of 15 days.
We show the 15-day MOPITT XCO average in Figure 2. Most of the large XCO enhancements (top panel) are caused by fires, such as in the western United States, Eastern Europe, Siberia, Amazonia, and the Indonesian peninsula, with the largest values found over southern tropical Africa. Large XCO values from mainly anthropogenic sources are also found in East Asia. As a result, the background XCO is large over the Pacific and Atlantic continental outflow north of 30° N. Large values are also found in the tropical Atlantic, Pacific, and Indian oceans, following easterly outflows at the equator from mainly biomass burning sources. A strong underestimation is found over the extratropical northern hemisphere, with slightly larger biases over the continental areas (middle panel). The only exception is found over Northern and Eastern India and close to the fires in Siberia, with slightly positive biases. The background CO is better reproduced on average in the tropics and in the southern hemisphere. However, positive biases, larger than a 20 ppb overestimation of the columns, are found over the biomass burning regions in South America, tropical Africa, and the Indonesian Peninsula. The overestimation extends far from the sources along the equator. As expected from the assimilation, the MOPITT-DA shows considerably reduced biases (bottom panel). The errors remain within the 10 ppb range, which is reasonable considering the observation errors (∼10%) and other sources of model error stemming from transport and chemistry representation. Aside from the equatorial fire plumes, XCO tends to be underestimated. We further evaluate the MOPITT assimilation results by comparing them with the NDACC FTIR observations. Figure 3

Aerosol Optical Depth (AOD) Assimilation Results
The verification of the assimilation experiment is first performed by comparing the 6 h forecast statistics for the MODIS-DA and Control-DA simulations against the MODIS AOD observations. Figure 4 displays the global and daily biases and unbiased root mean square error (RMSE) for both simulations. For the Control-DA, the error remains stable for the entire month, with an unbiased RMSE above 0.7 and a positive bias of around 0.2. The MODIS-DA unbiased RMSE rapidly decreases in the first few days and stabilizes between 0.5 and 0.6, which corresponds to an error reduction of 30%.
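For reference, the bias and the unbiased RMSE used in this verification can be computed as in the following sketch, where the unbiased RMSE is the RMSE of the forecast-minus-observation differences after their mean has been removed. The function name and the toy values are assumptions.

```python
import numpy as np


def forecast_stats(forecast, obs):
    """Return (bias, rmse, unbiased_rmse) of forecast minus observation."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(obs, dtype=float)
    bias = diff.mean()
    rmse = np.sqrt(np.mean(diff**2))
    unbiased_rmse = np.sqrt(np.mean((diff - bias) ** 2))
    return bias, rmse, unbiased_rmse


# Toy 6 h forecast AOD values and collocated observations.
bias, rmse, urmse = forecast_stats([0.5, 0.9, 0.4], [0.3, 0.5, 0.4])
```

The three quantities are linked by rmse² = bias² + unbiased_rmse², so the unbiased RMSE isolates the random part of the error from the systematic offset.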
Figure 5 illustrates the spatial distribution of the assimilated MODIS AOD observations and the bias for the Control-DA and the MODIS-DA averaged over the entire month of August 2015. The largest AODs result from the fires in the tropics and over Siberia, and from anthropogenic sources in Northeast China and over the Indo-Gangetic Plains, in line with previous studies [112]. As for CO, the AOD values over the fire sources and outflows are overestimated, with the largest biases over South America (top panel). The Control-DA underestimates the AOD over most of the northern hemisphere, mostly over land, including the dust source over the Sahara Desert. The assimilation corrects the forecast AOD, with much smaller biases in most places except South America. The magnitude and the spatial extent of the AOD biases are reduced over southern regions and downwind. The AOD biases are similar to those of CO and thus indicate a strong signal from similar emission sources and potentially transport. However, the processes responsible for the total AOD are different from those for CO, reflecting a larger diversity of sources such as sea salt, dust, and sulfate aerosols, and physical and chemical processes including secondary organic aerosol formation. Figure 6 shows the differences in AOD speciation between the MODIS-DA and the Control-DA. It allows the identification of which species are changed following the assimilation and which ones are potentially the dominant sources of errors. The patterns for the primary and secondary organic aerosols and black carbon are similar to the CO bias patterns as well, with an overestimation of tropical fires and an underestimation of fires in the mid-latitudes. The extent of the correction indicates that the emissions are not the only source of errors. The sulfates were increased following the assimilation, with the largest increase over the Arabian Peninsula, Northeast China, and South America. The large sulfate aerosol abundance is responsible for the residual biases over South America and indicates shortcomings in the assimilation, likely due to model errors such as convection. The assimilation also indicates errors in the transport of dust aerosols, with decreases over some source regions such as Eastern Sahara and Australia, but increases downwind over Northwest Africa and the Northern Tropical Atlantic Ocean. The sea salt aerosols were slightly overestimated in the Control-DA simulation.

Impacts of Posterior Emissions and Chemistry
We average the posterior emissions across the ensemble and over the entire month to smooth out the ensemble noise. Figure 7 displays the differences for the absolute fluxes obtained from the inversions. As for the concentrations, the patterns of the emission increments are strikingly similar between the MOPITT-CO and the MODIS-AOD inversions. This confirms the presence of systematic uncertainties, resulting from a combination of mostly emissions and potentially transport contributions to the model errors, since these two factors are common to both CO and AOD. The anthropogenic CO increments cover larger areas, for instance over the United States and Eurasia. Strong positive increments are found over Northeast China and South Asia, while negative increments are found over Northern India. The main difference between CO and POM (and BC) resides in Northern India, where the POM (and BC) emissions are increased but anthropogenic CO emissions are decreased. The differences seen in India could be a misattribution of the FINN fire CO overestimation to anthropogenic sources. Elsewhere, the sign of the fire emission increments is consistent with aerosol emission increments for the AOD and CO inversions, with decreases in South

Table 3 shows various emission estimates at the global scale and for large countries in 2015. It shows overall consistency at the global and annual scale between the various estimates. The posterior anthropogenic CO emissions are increased compared to CAMS-GLOB-ANT_v5.1. The total yearly Chinese emissions for 2015 are 127 Tg a⁻¹ for CAMS-GLOB-ANT_v5.1 (prior), 147 Tg a⁻¹ in our posterior, 160 Tg a⁻¹ for CEDSv2, and 165 Tg a⁻¹ for HTAPv3. Our top-down emission estimate is around 16% higher than CAMS-GLOB-ANT_v5.1 but remains lower than version 3 of the Hemispheric Transport of Air Pollution (HTAPv3) [113] and the Community Emissions Data System (CEDSv2) [81]. It is also lower than the value of 162 Tg a⁻¹ in 2015 reported in Li et al. [114] from the Multi-Resolution Emission Inventory for China (MEIC v1.2). This suggests that the CAMS-GLOB-ANT_v5.1 inventory underestimates anthropogenic CO emissions for China in 2015. CAMS-GLOB-ANT_v5.1, the posterior, and HTAPv3 show good correspondence over Nigeria and Brazil, where CEDSv2 is much lower.

Table 3. Total emissions for 2015. In addition to our prior CAMS-GLOB-ANT_v5.1 (CAMS v5.1), we also show the Community Emissions Data System (CEDSv2) [81] and Hemispheric Transport of Air Pollution (HTAPv3) [113]. We also compare with the Quick-Fire Emissions Dataset or QFED for fire emissions [115]. For both BC and OC, all simulations include the same FINN2.5

As already noted in Wiedinmyer et al. [83], the FINN2.5 emissions are much larger, 777 Tg a⁻¹ in 2015, whereas QFED gives 345 Tg a⁻¹. This supports the reduction estimated from our top-down assimilation, but our totals remain much higher than QFED. The comparison with QFED also indicates much larger FINN emissions in China and India. A longer assimilation run is needed to provide an accurate estimate of biomass burning emissions for fires.
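The relative differences between inventories quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the Chinese anthropogenic CO totals for 2015 stated in the text, and the helper name `relative_difference_percent` is a hypothetical choice, not part of the assimilation system.

```python
# Annual anthropogenic CO emissions for China in 2015 (Tg per year),
# as quoted in the text. The dictionary layout is illustrative.
china_co = {
    "CAMS-GLOB-ANT_v5.1 (prior)": 127.0,
    "posterior": 147.0,
    "CEDSv2": 160.0,
    "HTAPv3": 165.0,
}


def relative_difference_percent(value, reference):
    """Return the difference of `value` from `reference`, in percent of `reference`."""
    return 100.0 * (value - reference) / reference


prior = china_co["CAMS-GLOB-ANT_v5.1 (prior)"]
for name, total in china_co.items():
    diff = relative_difference_percent(total, prior)
    print(f"{name}: {total:.0f} Tg/a ({diff:+.1f}% relative to the prior)")
```

With these totals, the posterior is about 16% above the prior, in line with the figure given in the text.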
For aerosols, only anthropogenic emissions inventories are shown. The slight increase in emissions for both BC and OC brings the posterior closer to the HTAPv3 and CEDSv2 estimates at the global scale and for India and Russia. As opposed to CO, the higher BC and OC emissions in China and India are not supported by the other inventories. The comparison with QFED fire emissions suggests that this could be due to an overestimation of FINN fire emissions in India and China.
As summarized in Table 1, we performed a series of forward simulations to assess the impact of posterior emissions and of a change in the main tropospheric OH source. The CAM-chem simulations were initialized from the same initial conditions on 1 January 2015 and run until 1 January 2019. The resulting relative emissions changes were applied to the original fluxes, i.e., the daily FINN2.2 emissions, and by sector for the monthly CAMS-GLOB-ANT_v5.1. Assuming that the errors were stationary, we applied these increments over all time steps for both emission sources.
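A minimal sketch of how such stationary relative increments could be applied is given below. It is not the code used for the simulations: the function names, the flat list-of-grid-cells layout, and the choice of leaving cells with a zero prior unchanged are all assumptions made for illustration.

```python
def relative_increment(prior_mean, posterior_mean):
    """Per-grid-cell ratio of posterior to prior flux.

    Assumption: cells with a zero prior flux keep a factor of 1.0,
    since a relative change is undefined there.
    """
    return [
        post / pri if pri > 0.0 else 1.0
        for pri, post in zip(prior_mean, posterior_mean)
    ]


def apply_increment(fluxes_by_time, scale):
    """Apply the same scaling factors to every time step (stationary errors)."""
    return [[flux * s for flux, s in zip(step, scale)] for step in fluxes_by_time]


# Hypothetical monthly-mean fluxes for three grid cells (arbitrary units).
scale = relative_increment([2.0, 0.0, 4.0], [3.0, 0.0, 2.0])

# Hypothetical original fluxes at two time steps, rescaled with the same factors.
updated = apply_increment([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]], scale)
print(scale)
print(updated)
```

Scaling by a ratio rather than adding an absolute increment keeps the temporal variability of the original inventory while shifting its magnitude.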
In order to verify the impact of sustained emissions changes across all time steps, we performed deterministic simulations with the posterior CO emissions (CAM-chem-post) and with the posterior BC and POM emissions (CAM-chem-post-aer). Since JO(¹D) plays an important role in OH formation (e.g., [54,116]), we performed another simulation in which we arbitrarily reduced the ozone photolysis rate JO(¹D) by 10% globally (CAM-chem-O1D simulation). This simulation was designed to cut the OH production from its primary source, JO(¹D), which is also the dominant source in remote environments where ATom measurements are made.
Figures 8-10 contrast the results with respect to the control for CAM-chem-post (updated CO emissions), CAM-chem-O1D (reduction in JO(¹D)), and CAM-chem-post-aer (updated aerosol emissions), respectively. They show the differences in RMSE compared to monthly MODIS AOD and MOPITT CO observations. The strongest signals come from East Asian emissions and tropical fires, which show large-scale improvements close to the emission hotspots, with RMSE reductions in XCO of up to 15 ppb. The reduction in RMSE can propagate through continental outflows at large scale and far from the sources. Large biases in tropical fire estimates propagate through the entire southern hemisphere [117]. On average, the XCO improvement can be seen over the entire southern hemisphere. This is driven by the bias reduction during the fire season, from August to October. The bias reduction then propagates through the lower latitudes, spreading over the whole southern hemisphere by December. The performance of the updated simulation is degraded with respect to the reference in the equatorial latitudes of the northern hemisphere, where the bias has the opposite sign to the emission correction. In particular, the fire sources in Southeast Asia were not constrained, while anthropogenic emissions were increased. During the fire season in boreal spring, the FINN2.2 emissions are overestimated and the posterior simulation's bias is increased, impacting the Pacific outflow. This result is consistent with the Southeast Asian overestimation reported in Wiedinmyer et al. [83].
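The RMSE differences discussed above can be illustrated with a short sketch. It assumes that model and observed values have already been collocated into matching lists; the function names and the sample values are hypothetical, and the sign convention (negative means improvement) follows the figure captions.

```python
import math


def rmse(model, obs):
    """Root mean square error between collocated model and observed values."""
    squared_errors = [(m - o) ** 2 for m, o in zip(model, obs)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_change(reference, updated, obs):
    """RMSE of the updated run minus RMSE of the reference run.

    A negative value indicates that the updated simulation is closer
    to the observations than the reference.
    """
    return rmse(updated, obs) - rmse(reference, obs)


# Hypothetical XCO values in ppb for two collocated samples.
obs = [100.0, 100.0]
reference = [110.0, 90.0]
updated = [105.0, 95.0]
print(rmse_change(reference, updated, obs))
```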
The simulated tropospheric and global CH₄ lifetime for 2015 is unchanged between the two sets of CO emissions, at 7.9 years, which is 1.5 years too short. The reduction in fire CO emissions might compensate for the increase in anthropogenic emissions. The lack of differences between the two simulations could also be explained by the fact that an increase in CO emissions can lead to O₃ production [52], which, in turn, produces OH, especially in boreal summer. The CH₄ lifetime increases to 8.3 years in the CAM-chem-O1D simulation.
Figure 9 illustrates this pattern, with a clear reduction in RMSE in the northern midlatitudes and an increase in RMSE in the tropics and in the southern hemisphere. While the dipole can be explained by the biases in emissions that we identified earlier, this global decrease in OH has a strong impact on XCO and confirms the important role of modeled OH in the CO systematic errors. Beyond the role of emissions, the XCO error sensitivity to OH does not show a seasonal pattern.
This emphasizes the strong role played by metrics of inter-hemispheric parity in forward simulations. Indeed, the simulated N/S ratio is 1.42 in 2015 and is only slightly reduced to 1.41 in the CAM-chem-post simulation. While a JO(¹D) change should lead to a linear and proportional reduction in primary OH production of 10% [116], we only found a 4.5% reduction in air-mass-weighted global and annual (2015) tropospheric OH. This suggests strong secondary OH production, which seems to explain the lack of sensitivity to CO emission changes. It would play a role in inverse modeling systems that do not correct for initial conditions in the case of interactive chemistry, or that use a wrongly prescribed OH otherwise.

The impacts on aerosols are more localized because of the shorter lifetime, with improvements close to the source, especially over China, the Congo Basin, and the Amazon (Figure 10). The improvements are better than for CO over India, and the degradation is not as pronounced in Southeast Asia and the Pacific Ocean. The changes in the African tropical outflow show a contrast between large improvements and degradation. These are the result of the convolution of opposing biases between emissions and transport. There is mostly a reduction in emissions in the southern (northern) part of the outflow around 15° S (0° S), where the AODs are underestimated (overestimated), increasing (decreasing) the relative errors, respectively.
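The air-mass-weighted OH change mentioned above (a 4.5% reduction for a 10% reduction in JO(¹D)) relies on a weighted mean over grid boxes. The sketch below shows one way such a diagnostic could be computed; the function names and the two-box example values are hypothetical and are not taken from the simulations.

```python
def mass_weighted_mean(values, air_mass):
    """Mean of `values` weighted by the air mass of each grid box."""
    total_mass = sum(air_mass)
    return sum(v * m for v, m in zip(values, air_mass)) / total_mass


def percent_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


# Hypothetical two-box troposphere: OH in arbitrary units, air mass as weights.
air_mass = [3.0, 1.0]
oh_reference = [1.0, 2.0]
oh_perturbed = [0.9, 2.0]  # OH reduced by 10% in the first box only

change = percent_change(
    mass_weighted_mean(oh_perturbed, air_mass),
    mass_weighted_mean(oh_reference, air_mass),
)
print(f"{change:.1f}%")
```

In this toy case, a 10% local reduction yields a smaller change in the weighted mean because the second box is unaffected, which is analogous in spirit to secondary OH sources damping the response to a JO(¹D) change.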

Evaluation against NASA ATom
Figure 11 shows the average CO over latitudinal bands for the four ATom experiments. The CO underestimation in the remote atmosphere of the northern hemisphere is found for all seasons and is the largest for ATom-3 in fall 2017. The increase in emissions leads to a modest improvement in winter (ATom-2) and spring (ATom-4) and is not particularly noticeable in summer (ATom-1) or fall (ATom-3). In the tropics, the challenge is in representing the biomass burning enhancement. Both simulations tend to overestimate CO for all seasons. Following the reduction in fire emissions, the bias is reduced for ATom-1 and ATom-3, corresponding to the fire season in southern tropical Africa and South America. The profile for ATom-3 indicates a large convective transport of CO to the upper troposphere, a feature not present in the observations. At the very least, the sensitivity to a change in emissions is minuscule in the remote atmosphere. As for the column, this suggests that the background concentration is driven by chemistry, through chemical loss on the one hand and chemical production on the other.
The impact on CO is at least as large as the emission change in most cases. In the southern hemisphere, differences of up to 5 ppb over the entire profile are found for ATom-1 and ATom-3, which correspond to austral winter and spring. Figure 12 shows the vertical OH observations and simulations for both hemispheres. The model simulations and observations show good agreement overall, especially in the northern hemisphere. The simulations tend to overestimate OH in the upper troposphere of the southern hemisphere. There is an apparent bias for ATom-2 and ATom-4, with an overestimation over the entire profiles, which could explain some of the CO bias observed in boreal winter and spring. As expected, the reduction in JO(¹D) reduces OH for all ATom experiments in both hemispheres. HO₂ (Figure 13) is mostly overestimated in the upper troposphere for both hemispheres. This indicates that HOₓ is overestimated in a fairly consistent manner. Since HO₂ tends to be slightly underestimated in the lower troposphere, some direct or indirect CO sources may still be missing, which would reduce OH and increase HO₂, although we cannot rule out the effect of nitrogen and other oxidants' chemistry.

Discussion
Other studies have focused on directly quantifying urban CO emissions from model simulations and satellite data [118][119][120]. With the increased spatial resolution from more recent satellites, such as the Tropospheric Monitoring Instrument (TROPOMI), emissions can be detected at smaller, i.e., sub-city, scales [121] and by sector, such as facility-level industrial sources [122]. Recently, Sun [123] extended such a framework to estimate emissions over the entire contiguous United States (CONUS) on a 0.04° mesh. Recent studies have investigated the use of machine learning algorithms to estimate high-resolution near-surface CO concentrations by combining surface observations and MOPITT observations [124], and directly from satellite radiances augmented by a deep forest (DF) model [125]. However, Tang et al. [126] raised the question of the spatial and temporal representativeness of satellites compared to the surface observations using data assimilation over China. Emission estimates from the various approaches must be compared in future studies.
The impact of meteorological uncertainties through dynamical and physical processes is taken into account in our chemical ensemble data assimilation system. However, biases in the planetary boundary layer height and deep convection are likely to play an important role in predicting the concentrations of gas-phase and aerosol atmospheric constituents. Comparisons of simulated planetary boundary layer heights with LiDAR observations indicate daytime overestimation, implying the excessive dilution of emission fluxes (e.g., [127,128]). Boundary layer mixing is also important in chemistry and can have impacts on wet and dry deposition, affecting global tropospheric OH and O₃ budgets [129]. Refining the grid mesh to a higher resolution should help in representing smaller-scale topography gradients, while increasing the vertical resolution will improve the representation of parameters relevant to vertical transport. An increase in spatial resolution will help to alleviate representation errors due to differences between the urban CO and the synoptic CO background [126]. We will further investigate these issues with currently developed modeling tools such as the Multi-Scale Infrastructure for Chemistry and Aerosols (MUSICA) [130].

Conclusions
The current CESM2.2/CAM-chem model overestimates OH in the northern hemisphere, leading to a short CH₄ lifetime of 7.9 years. Conversely, CO tends to be overestimated in the southern hemisphere and over the tropics. This is illustrated by a northern hemisphere to southern hemisphere (N/S) ratio of 1.42. However, the optimization of the CO initial conditions with data assimilation mitigates the OH- and transport-induced CO biases during the emissions update. The emissions update allows for good forecast performance over the source regions, as opposed to reanalyses with only state assimilation (e.g., [45]).
The assimilation of MODIS AOD and MOPITT CO led to similar posterior emissions increments, indicative of biases stemming from the FINN2.2 and CAMS-GLOB-ANT_v5.1 inventories for organic aerosols, black carbon, and CO. The FINN2.2 emissions are overestimated in the tropics for both CO and aerosols, and this bias has an impact in India and China. Overall, India is the only large source area where CO from CAMS-GLOB-ANT_v5.1 is overestimated, in the Indo-Gangetic Plain, while aerosols are underestimated. The comparison with QFED fire emissions as well as CEDSv2 and HTAPv3 indicates that this could be due to an overestimation of FINN2.2 fire emissions in India. The posterior emissions are well within the range of the bottom-up inventories in China, the top anthropogenic CO emitter. For 2015, our total annual emissions for China are 147 Tg a⁻¹, between our prior, CAMS-GLOB-ANT_v5.1 (127 Tg a⁻¹), and both CEDSv2 (160 Tg a⁻¹) and HTAPv3 (165 Tg a⁻¹). We recommend looking at various emission estimates, including top-down, when assessing health impacts, as done in, e.g., [131].
We applied the relative posterior emission increments, which led to the improvement of the deterministic model simulation close to the source region, and performed a simulation with a reduction in the OH primary source by 10%. Since the fire emissions play an important role in the southern hemisphere and over the tropics and impact CO at remote locations, reducing the emissions bias improved the simulated CO and AOD over large areas of these regions. Our sensitivity experiments for the years 2015 to 2018 suggest that OH is the main driver of the commonly found CO underestimation in the northern hemisphere. The comparison with NASA ATom observations indicates that the remaining biases in remote regions are driven by transport and chemistry, as well as the long-range transport of emissions, which should be considered when performing inversions. Finally, in the case of emissions, flux inversion can remediate many systematic errors. For instance, the reduction in the FINN fire emissions using observations from 2015 improved the simulations for the entire period of 2015 to 2019 close to the sources and in the outflow, for both CO and aerosols. The biases in CO are reduced in the tropics in comparison with ATom in August 2016 and during ATom-3 in October 2017, when and where the fire plumes were sampled over the Atlantic and Pacific basins. Similarly, a reduction in OH consistently improves CO in the northern mid-latitudes.

Funding: This research was funded by the National Oceanic and Atmospheric Administration (grant no. NA18OAR4310283) and the National Aeronautics and Space Administration (grant no. 80GSFC19C0032 and grant no. 80NSSC19K0947).
Figure 3 shows the improvements in the tropospheric wVMR CO columns as measured by 15 NDACC sites for 15 August 2015 to 31 August 2015. The mean absolute bias is reduced from around 19 ppb to 11 ppb and the correlation coefficient increases from 0.74 to 0.94.

Figure 3. Scatter plots and statistics summary of comparison of (A) Control-DA and (B) MOPITT-DA with NDACC FTIR tropospheric weighted volume mixing ratio (wVMR) CO observations. The 80 observations are daily averages across 15 sites.

Figure 4. Daily observation space statistics for unbiased RMSE and Control-DA evaluated with the 6 h forecast.

Figure 5. MODIS AOD (top panel, A), averaged between 1 August 2015 and 31 August 2015. The respective bias is shown for the Control-DA on the middle panel (B) and for MODIS-DA on the bottom panel (C).

Figure 6. Average differences in AOD between the MODIS-DA and the Control-DA for August 2015. The AOD differences are calculated by species, which are integrated over the various modes. BC (A), POM (B), SOA (C), SO4 (D), DUST (E), and SS (F) indicate black carbon, primary organic matter, secondary organic aerosols, sulfates, dust, and sea salt, respectively.

Figure 7. Differences between the posterior and prior emissions flux for August 2015. The top panels (A,B) show the anthropogenic and the fire sources of CO. The bottom panels show the total emissions used in the model simulations, for (C) BC and (D) POM.

Figure 8. Absolute CO changes in XCO RMSE between the CAM-chem-ref and the CAM-chem-post simulations. The RMSE is calculated against the monthly MOPITT CO L3 product. A negative value indicates an improvement in model calculated XCO. The top panel represents the monthly zonal average and the bottom panel is the 2015-2018 average map.

Figure 9. Same as Figure 8, but for the absolute CO changes in XCO RMSE between the CAM-chem-ref and the CAM-chem-O1D simulations.

Figure 10. Relative change in AOD root mean square errors between the CAM-chem-ref and the CAM-chem-post-aer simulations. The RMSE is calculated against the monthly MODIS AOD mean. A negative (positive) value indicates an improvement (deterioration) in model calculated AOD. The top panel represents the monthly zonal average and the bottom panel is the 2015-2018 average map.

Table 2. State vector employed in assimilation experiments.
fire emissions.