Use of Satellite Data for Air Pollution Modeling in Bulgaria

: Air pollution continues to be of concern for Bulgarian cities, mainly due to particulate matter of aerodynamic diameter smaller than 10 µ m (PM 10 ). There is public and expert interest in the improvement of two operational air quality modeling systems: the Bulgarian Chemical Weather Forecast System (BgCWFS) and the Local Air Quality Management System (LAQMS) for the city of Plovdiv. The aim of the study is to investigate the effects of satellite data assimilation in BgCWFS on surface concentrations over Bulgaria (resolution 9 km), to downscale BgCWFS output to LAQMS (resolution 250 m), and to examine effects on PM 10 in Plovdiv. Data from the Global Ozone Monitoring Experiment-2 (GOME-2) (MetOP satellites) for aerosols, nitrogen dioxide (NO 2 ), and sulfur dioxide (SO 2 ) were assimilated in BgCWFS using objective analysis. Simulation experiments with and without satellite data were conducted for a summer and a winter month. The comparison to surface observations in the country showed improvement of results when using satellite data, especially in the summer due to mineral dust events captured by satellites. The decrease in the normalized mean bias (NMB) over the two months was 43% (PM 10 ) and 73% (SO 2 ). The LAQMS estimated background contributions to PM 10 in the city as 32%. The absolute NMB by LAQMS decreased by 38%.


Introduction
Air quality is a key element for the environment, economy, and public health. In Europe, air pollution is today recognized as the most important environmental health risk, with particulate matter (PM), ozone (O 3 ), and nitrogen dioxide (NO 2 ) concentrations exceeding air quality (AQ) standards in many urban areas [1]. In densely populated urban areas, the multiple emission sources in and around the city cause population exposure to polluted air. In most Bulgarian cities, the primary emission sources are traffic and household heating.
In Europe, about 400,000 premature deaths per year are due to population exposure to PM 2.5 (PM with a diameter less than 2.5 µm), while for Bulgaria (with a population of about 7 million), this number was 12,500 in 2018 [1]. This fact ranks Bulgaria in one of the top places among the 41 European countries analyzed in [1]. In 2018, 48% of the urban population in 28 European states were exposed to PM 10 concentrations (PM with a diameter less than 10 µm) exceeding the annual value of 20 µgm −3 , defined as the air quality guideline by the World Health Organization [2]. The percentage of urban Earth 2021, 2 587 population in Bulgaria exposed to PM 10 concentrations above the daily limit value of the European Union (50 µgm −3 ) was 65.4% in 2018 [3].
Despite emission reductions, air pollutant concentrations still remain high, with exceedance leading to long-term urban population exposure to PM [4]. Air quality in Bulgaria continues to be of great concern, posing new requirements and challenges not only to decision-makers, but also to researchers for providing reliable, timely, and detailed information on air pollution levels both at regional and urban scale.
A variety of methods exist for air quality assessment, forecasting, and understanding air pollution processes in different spatio-temporal scales. Analysis of observational data (e.g., ground-based monitoring networks, remote sensing instruments on board of satellites or on the ground), application of statistical methods, application of modeling systems for transport and chemical transformation of pollutants in the atmosphere, and combinations of these are used worldwide. Combining data from different sources is of increasing interest as this leads to more comprehensive assessment of particulate matter and to estimates of effects of emissions reduction scenarios, for example [5,6].
Satellite data have the advantage of covering wide regions in a short time, capturing the transport of natural aerosols (mineral dust, smoke from wildfires). However, they have some drawbacks [7,23]: polar-orbiting satellites provide data for a given region once a day; clouds can lead to gaps in the spatial coverage; and the derived parameters are representative for the columnar abundance and do not necessarily correspond to surface concentrations. One of the important parameters retrieved by instruments on board of satellites is the Aerosol Optical Depth (AOD), which represents the aerosol load in the column from the Earth's surface to the top of the atmosphere.
Chemical transport models (CTM) are powerful tools for providing concentrations of different pollutants with high spatial and temporal resolution. Furthermore, they can estimate the effect of emission scenarios on concentrations and are particularly useful for policy support and abatement measures. CTM results, however, exhibit uncertainties related to, among others, the model's parameterizations and emissions' input [24]. To improve the accuracy of modeled aerosols, data assimilation of satellite-retrieved AOD has been exploited in the recent years, both for the purposes of forecasting [12,13,25] and analyzing aerosol processes in different geographical areas [26,27].
Data assimilation techniques optimally combine theory (model results) and observations. Methods for chemical data assimilations are discussed in [16,28]. The assimilation of satellite AOD in CTM aims to improve model initial conditions using information not contained in the model emissions (e.g., natural sources of aerosols). The approaches for satellite data assimilation in air quality modeling systems have different levels of complexity (simple optimal interpolation (OI) techniques, Kalman filters, variational methods) [16]. The objective analysis scheme with Cressman successive correction in [29] was beneficial for PM 2.5 simulated by a regional CTM over the United States. The application of optimal interpolation led to a decrease in modeled biases by regional scale models over east Asia [23] and the United States [30,31]. State-of-the-art methods are based on variational approaches-3DVAR for assimilation made at a fixed moment [12,13] and 4DVAR for assimilation of a series of observational data [25,32]. Comparable results between 3DVAR and OI are reported in [13].
This study focused on a combination of satellite-retrieved parameters and two air quality modeling systems-a regional one, based on a chemical transport model, and an urban one, with dispersion models for air quality management at city scale.
The Bulgarian Chemical Weather Forecast System, running operationally in Bulgaria since 2012 [33][34][35][36][37], was built upon the WRF-CMAQ model chain (see details in Section 2.2). Evaluation of results upon surface observations indicated underestimations of particulate matter [38]. The local air quality management system (LAQMS) [39,40] was designed and implemented for operational work in the Municipality of Plovdiv (see details in Section 2.5). Plovdiv is the second largest city in Bulgaria (370,000 inhabitants) experiencing high pollution levels and PM exceedances almost during the whole year. Only two regulatory AQ stations are operating in Plovdiv, which does not allow capturing the spatial variability of the air pollution in the city. As there is public and expert concern to enhance the possibilities of the operational modeling systems to better predict PM surface concentrations, both modeling systems were modified to account for satellite-retrieved data in the model's initial conditions. Satellite data on atmospheric chemistry have not been used in Bulgaria in support of air pollution modeling so far. As a first attempt in this direction, we chose an optimal interpolation technique to improve the chemical initial conditions (IC) in the BgCWFS system.
The main objective of this study was to investigate the effects of satellite data assimilation in BgCWFS on the model's performance for surface concentrations over Bulgaria (resolution 9 km) during one summer and one winter month. The additional objective was to downscale the BgCWFS output to the finer scale of the LAQMS (resolution 250 m) and to examine the modeled PM 10 concentrations in Plovdiv determining the contributions from different emission sources.

Satellite Data for AOD, NO 2 , and SO 2
The selection of satellite data to be used in the air pollution system BgCWFS (see Section 2.2) was determined mainly by: (a) the available data for three parameters: aerosol optical properties (aerosol absorbing index AAI, or aerosol optical depth AOD) and the vertical column densities of nitrogen dioxide and sulfur dioxide (VCD_NO 2 and VCD_SO 2 ) over the territory of the Balkan Peninsula; and (b) the spatial resolution of the data appropriate for the modeling domains of BgCWFS [41].
Data from the Global Ozone Monitoring Experiment-2 (GOME-2) instrument on board the polar orbiting satellites MetOp A and MetOp B were used in this study, with a possibility to also include MetOp C data. The footprint size is 40 × 40 km 2 , swath path 960 km (MetOp A, MetOp C) or footprint size 40 × 80 km 2 , swath path 1920 km (MetOp B). The overpass time for the Balkan Peninsula is around 08-09 UTC, the time difference between MetOp A and MetOp B passage is about 1 h. Level-2 products in .hdf5 format (freely available upon registration at [42]) and in .nc format (freely available at [43]) were post-processed to accommodate BgCWFS model domains and to extract the three parameters to be further used in the modeling system. The data vary by date and parameter; the example in Figure 1 shows the available data on 22.08.2017 in the biggest BgCWFS domain (over Europe).
BgCWFS is running in five nested domains: the European region (81 km horizontal resolution), the Balkan Peninsula (27 km), Bulgaria (9 km), and two smaller domains around Sofia: Sofia Province (3 km) and Sofia City (1 km).
The emission data in BgCWFS are based on the emission inventory for Europe prepared by the Netherlands Organization for Applied Scientific Research (TNO) for the year 2010 [47] and on the Bulgarian emission inventory for 2015. A Geographic Information System (GIS)-based web-platform was created to grid European and Bulgarian inventory data over the 5 BgCWFS domains. The gridded data are processed by specially designed pre-processor modules which perform temporal allocation and speciation splitting for the area (AS) and large point (LPS) sources. SMOKE is used to estimate the biogenic emissions and to merge them with the AS and LPS ones.
The initial conditions (ICs) and boundary conditions (BCs) for the meteorological calculations are taken from the Global Forecast System (GFS) (weather forecast model) at the National Centers for Environmental Prediction (NCEP) [48] with grid resolution 1 • × 1 • and temporal resolution of 6 h. The chemical BCs over the outer domain (Europe) are set according to the climatic profiles embedded in the CMAQ software. The presumption is that the possible errors decrease when moving away from the boundaries inside the domain because of dispersion and removal processes, and because of the continuous action of the pollution sources inside the domain. The BCs for each of the other domains are determined from the senior ones.
The vertical structure of CMAQ is determined by fourteen layers of varying thickness between the surface and 100 hPa, with the first layer being between 55 and 60 m thick. The Planetary Boundary Layer (PBL) is presented by the lowest eight levels. The system is set up with well-known and widely accepted parameterization schemes for meteorological processes and chemical mechanisms (see details in [49]). The output of CMAQ contains 3D hourly files for concentrations, depositions, and visibility parameters related to 78 pollutants. Post-processing is carried out to extract the surface concentrations of the main pollutants (e.g., ozone, particulate matter, nitrogen dioxide, etc.) and visualize them as maps on the website of NIMH [50].

AOD Calculation in BgCWFS
AOD is not a default parameter in CMAQ. For its estimation, five different approaches were analyzed and tested [51]. Four of the methods were embedded in CMAQ. Two of them are based on the estimation of the visibility parameters and the extinction coefficients (Mie theory and reconstruction method), while the other two are based on light extinction for different species, and empirical relationships between visibility parameters and aerosol mass concentrations (algorithm based on data from the Interagency Monitoring of Protected Visual Environments (IMPROVE) monitoring network in the United States). The fifth method is the post-processing tool FlexAOD [52,53], which uses mass concentrations of the speciated aerosol (i.e., sulfate, nitrate, ammonium, black carbon, organic carbon, sea salt, soil dust) and the distribution of the relative humidity. Based on the comparative analysis in [51], the FlexAOD software was selected to estimate AOD at 550 nm (denoted further as AOD) from CMAQ output.

Assimilation of Satellite-Retrieved Data in BgCWFS
As this was the first attempt to use satellite-retrieved data in BgCWFS, we chose to implement a simple assimilation technique based on optimal interpolation [16,24] and correction of the initial model profiles. The correction factor was calculated as the ratio between BgCWFS vertical column parameters (e.g., AODm estimated by BgCWFS) and the parameters obtained from satellite data (e.g., AODs), and it was further used to modify the profile concentrations, Figure 2. In the case of AOD assimilation, this involved correction of different PM species (type and size mode). The same correction approach was also applied for the vertical column densities of NO 2 and SO 2 .
eters and aerosol mass concentrations (algorithm based on data from the Interagency Monitoring of Protected Visual Environments (IMPROVE) monitoring network in the United States). The fifth method is the post-processing tool FlexAOD [52,53], which uses mass concentrations of the speciated aerosol (i.e., sulfate, nitrate, ammonium, black carbon, organic carbon, sea salt, soil dust) and the distribution of the relative humidity. Based on the comparative analysis in [51], the FlexAOD software was selected to estimate AOD at 550 nm (denoted further as AOD) from CMAQ output.

Assimilation of Satellite-Retrieved Data in BgCWFS
As this was the first attempt to use satellite-retrieved data in BgCWFS, we chose to implement a simple assimilation technique based on optimal interpolation [16,24] and correction of the initial model profiles. The correction factor was calculated as the ratio between BgCWFS vertical column parameters (e.g., AODm estimated by BgCWFS) and the parameters obtained from satellite data (e.g., AODs), and it was further used to modify the profile concentrations, Figure 2. In the case of AOD assimilation, this involved correction of different PM species (type and size mode). The same correction approach was also applied for the vertical column densities of NO2 and SO2.
The assimilation took place in the first three model domains (Europe, the Balkan Peninsula, and Bulgaria) at the time of the satellite overpass hour Hs (fixed at 09:00 UTC). WRF, the meteorological model, was run once at the beginning of the assimilation. CMAQ was run twice: a normal 24-h run and a run with part-day integration from Hs to the end of the day. The assimilation created new concentrations fields used as initial conditions for the next-day runs.

Local Air Quality Management System (LAQMS)
The LAQMS was developed at NIMH in 2004 as an operational system intended for the authorities of the city of Plovdiv-the second largest city in Bulgaria, experiencing permanent air quality problems in the past decade mainly due to high concentrations of Figure 2. Algorithm for AOD assimilation in BgCWFS, "s"-satellite, "m" -model, "a" -assimilated, CF-correction factor, "PM"-species of particulate matter.
The assimilation took place in the first three model domains (Europe, the Balkan Peninsula, and Bulgaria) at the time of the satellite overpass hour Hs (fixed at 09:00 UTC). WRF, the meteorological model, was run once at the beginning of the assimilation. CMAQ was run twice: a normal 24-h run and a run with part-day integration from Hs to the end of the day. The assimilation created new concentrations fields used as initial conditions for the next-day runs.

Local Air Quality Management System (LAQMS)
The LAQMS was developed at NIMH in 2004 as an operational system intended for the authorities of the city of Plovdiv-the second largest city in Bulgaria, experiencing permanent air quality problems in the past decade mainly due to high concentrations of particulate matter. The design of the system was determined by its main objective-to be in support of local authorities to make both short-term and long-term decisions on air quality management. The system was set up for two domains, Plovdiv region (63 km × 44 km with grid resolution of 1 km) and Plovdiv city (12 km × 11 km with grid resolution of 250 m). The LAQMS has four main modules [40]: • Meteorological pre-processor modules, • Emission modules, • Dispersion modules, and • Post-processing modules and interface for AQ experts (expert module).
The meteorological pre-processor module was modified so as to receive the output of the WRF model of BgCWFS in domain Bulgaria (resolution 9 km). The emission modules accounted the following pollution sources in the city: household heating, traffic, and industry. For this study, a bottom-up approach [54,55] was applied for the emission inventory of household heating and traffic in Plovdiv. The emissions from household Earth 2021, 2 591 heating were estimated using high-resolution data for the distribution of the households in the city. Details for the type of fuel used and the electricity consumption were obtained based on data from 450 questionnaires distributed among the citizens of Plovdiv. For the emissions from the road transport in the city, data from traffic cameras were analyzed, also applying high resolution mapping of the street network. GIS functionalities were used to automatically perform spatial operations. In Figure 3, the estimated emissions of particulate matter from these two sources are shown.


Dispersion modules, and  Post-processing modules and interface for AQ experts (expert module).
The meteorological pre-processor module was modified so as to receive the output of the WRF model of BgCWFS in domain Bulgaria (resolution 9 km). The emission modules accounted the following pollution sources in the city: household heating, traffic, and industry. For this study, a bottom-up approach [54,55] was applied for the emission inventory of household heating and traffic in Plovdiv. The emissions from household heating were estimated using high-resolution data for the distribution of the households in the city. Details for the type of fuel used and the electricity consumption were obtained based on data from 450 questionnaires distributed among the citizens of Plovdiv. For the emissions from the road transport in the city, data from traffic cameras were analyzed, also applying high resolution mapping of the street network. GIS functionalities were used to automatically perform spatial operations. In Figure 3, the estimated emissions of particulate matter from these two sources are shown. The dispersion module incorporates two models: Poltran (a combination of an Eulerian advection scheme and numerical calculation of the turbulent diffusion) [39], used as a main dispersion model, and AUSTAL2000 (a Lagrangian type, the reference dispersion model of the German Environment Agency [56]), used as a complement to Poltran regarding dispersion from point and linear sources during the first hour of the simulations.
The LAQMS separately simulates the dispersion of pollutants released from the different groups of emission sources. Thus, the concentrations include concentrations caused by household heating, traffic, and industry in the city, and background concentrations caused by emission sources located outside the city. The effect of satellite data was introduced in LAQMS through the background concentrations. The parameters calculated by BgCWFS_sat for use in the LAQMS included meteorological variables (wind, temperature, etc.) and the concentrations of eight pollutants (NO2, NO, CO, SO2, O3, NH3, PM10, The dispersion module incorporates two models: Poltran (a combination of an Eulerian advection scheme and numerical calculation of the turbulent diffusion) [39], used as a main dispersion model, and AUSTAL2000 (a Lagrangian type, the reference dispersion model of the German Environment Agency [56]), used as a complement to Poltran regarding dispersion from point and linear sources during the first hour of the simulations.
The LAQMS separately simulates the dispersion of pollutants released from the different groups of emission sources. Thus, the concentrations include concentrations caused by household heating, traffic, and industry in the city, and background concentrations caused by emission sources located outside the city. The effect of satellite data was introduced in LAQMS through the background concentrations. The parameters calculated by BgCWFS_sat for use in the LAQMS included meteorological variables (wind, temperature, etc.) and the concentrations of eight pollutants (NO 2 , NO, CO, SO 2 , O 3 , NH 3 , PM 10 , and PM 2.5 ). These parameters were used to determine the background concentrations for the domains of LAQMS.
The expert module of LAQMS is a post-processing tool intended to visualize the spatial distribution of concentrations, due to different groups of emission sources, and to aggregate the hourly values in different time scales (e.g., daily, annual, etc.). Examples can be found in [57].

Simulations and Evaluation of the Models' Performance
Simulations with BgCWFS and LAQMS were carried out for two months-August 2017 and February 2019. The periods were chosen so as to represent the summer period and the winter period, also accounting for available GOME-2 data for the Balkan Peninsula (BgCWFS domain 2).
August in Bulgaria is one of the hottest and driest months. August 2017 was characterized by temperatures exceeding the climatic norm by +0.5 • C to +3.9 • C and gross monthly precipitation around the norm. The dry weather was favorable for wildfires, mainly in the south-western and south-eastern regions of the country. The AOD over the country was affected by such fires, but also by Saharan dust outbreaks [43,58]. According to satellite data, higher AOD values were observed on about 12 days in this month. February is Earth 2021, 2 592 usually one of the coldest months in Bulgaria with long-lasting anticyclonic conditions leading to elevated air pollution. February 2019, however, featured mild temperatures (deviation from the monthly climatic norm by 2.1 • C) and less precipitation (deviation from the monthly norm by 52%). The average number of days with clear sky was 10 (higher than the climatic norm); therefore, more satellite data than for other winter months were available.
Simulations were carried out by BgCWFS in offline mode in two versions: "mod" (without assimilation of satellite data) and "sat" (with assimilation of satellite data). The local system, LAQMS, was run for the city of Plovdiv using background concentrations as calculated by BgCWFS_mod and BgCWFS_sat.
To evaluate the model's results for domain Bulgaria, surface observations for particulate matter (PM 10 and PM 2.5 ), SO 2 , and NO 2 at regulatory air quality monitoring stations in Bulgaria were used. The data are freely available in [59]. Only background stations with data availability of more than 75% during the two months were included in the analysis, Figure 4. Some widely used statistical metrics (Appendix A) were calculated for the monthly mean values at all stations, while boxplots and kernel density functions were used to visualize the model's performance at selected stations: two urban stations in Sofia and Plovdiv, and one rural/mountain station, Sofia-Kopitoto (1321 m.a.s.l.). and the winter period, also accounting for available GOME-2 data for the Balkan Peninsula (BgCWFS domain 2).
August in Bulgaria is one of the hottest and driest months. August 2017 was characterized by temperatures exceeding the climatic norm by +0.5 °С to +3.9 °С and gross monthly precipitation around the norm. The dry weather was favorable for wildfires, mainly in the south-western and south-eastern regions of the country. The AOD over the country was affected by such fires, but also by Saharan dust outbreaks [43,58]. According to satellite data, higher AOD values were observed on about 12 days in this month. February is usually one of the coldest months in Bulgaria with long-lasting anticyclonic conditions leading to elevated air pollution. February 2019, however, featured mild temperatures (deviation from the monthly climatic norm by 2.1 °C) and less precipitation (deviation from the monthly norm by 52%). The average number of days with clear sky was 10 (higher than the climatic norm); therefore, more satellite data than for other winter months were available.
Simulations were carried out by BgCWFS in offline mode in two versions: "mod" (without assimilation of satellite data) and "sat" (with assimilation of satellite data). The local system, LAQMS, was run for the city of Plovdiv using background concentrations as calculated by BgCWFS_mod and BgCWFS_sat.
To evaluate the model's results for domain Bulgaria, surface observations for particulate matter (PM10 and PM2.5), SO2, and NO2 at regulatory air quality monitoring stations in Bulgaria were used. The data are freely available in [59]. Only background stations with data availability of more than 75% during the two months were included in the analysis,

Results
We discuss results obtained by two versions of BgCWFS: BgCWFS_mod and BgCWFS_sat focusing on the ground-level concentrations for selected pollutants (PM 10 , PM 2.5 , NO 2 , and SO 2 ) and the columnar parameters (AOD and VCD_NO 2 and VCD_SO 2 ). The simulations were carried out for two months. Surface observations from the AQ monitoring network in Bulgaria were used to check BgCWFS performance. At urban scale, for the city of Plovdiv, LAQMS results for PM 10 are shown, analyzing the contribution of BgCWFS background values to the monthly mean concentrations at the two AQ stations in Plovdiv.

BgCWFS_mod vs. BgCWFS_sat
The hourly time evolution of the differences (sat − mod) displayed typical spatial distribution pattern for all pollutants. At the time of satellite overpass hour (fixed at Hs = 09:00 UTC), disturbances that resembled area pollution source were noticed in the fields. These disturbances evolved with time due to atmospheric transport and transformations, changing position and decreasing in magnitude [49]. As examples of the spatial distribution of monthly mean values, in Figure  2019 are shown. The AOD map by BgCWFS_sat indicated higher values, especially in the southern and eastern parts, associated with the frequent Saharan dust outbreaks in August which approached Bulgaria from the south-west and moved towards the north-east. The corresponding maps for PM 10 and PM 2.5 in domain Bulgaria ( Figure S1) featured increased values from BgCWFS_sat in the eastern part of the country and the adjacent Black Sea area. The NO 2 and SO 2 maps ( Figure S2) displayed higher values for big urbanized and industrial areas. The assimilation of satellite data led to noticeable overall increase of SO 2 in the domain with apparently no effect on NO 2 concentrations. distribution pattern for all pollutants. At the time of satellite overpass hour (fixed at Hs = 09:00 UTC), disturbances that resembled area pollution source were noticed in the fields. These disturbances evolved with time due to atmospheric transport and transformations, changing position and decreasing in magnitude [49]. As examples of the spatial distribution of monthly mean values, in Figure 5 AOD in BgCWFS domain Balkan Peninsula for August 2017 and in Figure 6 PM10 in model domain Bulgaria for February 2019 are shown. The AOD map by BgCWFS_sat indicated higher values, especially in the southern and eastern parts, associated with the frequent Saharan dust outbreaks in August which approached Bulgaria from the south-west and moved towards the north-east. The corresponding maps for PM10 and PM2.5 in domain Bulgaria ( Figure S1) featured increased values from BgCWFS_sat in the eastern part of the country and the adjacent Black Sea area. The NO2 and SO2 maps ( Figure S2) displayed higher values for big urbanized and industrial areas. The assimilation of satellite data led to noticeable overall increase of SO2 in the domain with apparently no effect on NO2 concentrations.  The map for PM 10 over Bulgaria in February 2019 from BgCWFS_sat ( Figure 6b) indicated an overall increase in surface PM 10 concentrations compared to the map from BgCWFS_mod, especially in the northern part of the domain. The PM 2.5 distribution ( Figure S3) was similar to the PM 10 distribution, with a noticeable increase in the eastern part of the domain. The NO 2 and SO 2 maps for this winter month ( Figure S4) had the same main features as the maps for the summer month, with a significant increase of SO 2 from BgCWFS_sat over the entire domain.
Earth 2021, 2, FOR PEER REVIEW 9 BgCWFS_mod, especially in the northern part of the domain. The PM2.5 distribution (Figure S3) was similar to the PM10 distribution, with a noticeable increase in the eastern part of the domain. The NO2 and SO2 maps for this winter month ( Figure S4) had the same main features as the maps for the summer month, with a significant increase of SO2 from BgCWFS_sat over the entire domain.  The differences (sat − mod) in the monthly mean values of the surface and columnar parameters averaged over domain Bulgaria are shown in Figure 7. The domain mean concentrations of all parameters from BgCWFS_sat increased (positive differences), except for NO 2 (surface and columnar values). The increase for the domain mean PM 10 was 7-10 µgm −3 , which corresponds to percentage increase from BgCWFS_sat of 117% in August and 42% in February. The pronounced increase in the summer was associated with the frequent Saharan dust intrusions, captured by GOME-2 with good coverage over Bulgaria under low cloudiness conditions. A significant increase of SO 2 and VCD_SO 2 was also observed, more noticeably for February (161% and 335%, respectively). This is more likely due to higher SO 2 emissions in south-eastern Europe in winter resulting from energy production by coal-fired thermal power plants, as indicated by the long-term satellite-derived SO 2 data [60]. The impact on NO 2 was weak. NO 2 is a highly reactive gas originating mainly from transport and combustion sources near the surface. It has a short lifetime (a few hours to one day [61]), it is involved in nonlinear chemistry, and its distribution is characterized by strong horizontal inhomogeneity and well-expressed gradients in the Planetary Boundary Layer (PBL). These peculiarities pose challenges for the assimilation of satellite-retrieved tropospheric NO 2 [16]. The weak impact of the assimilation observed in our NO 2 model results might be attributed to different reasons: coarse satellite footprint not allowing to reproduce local scale NO 2 distribution [62], inappropriate assimilation technique [63], or temporary scale (monthly) analysis that could mask the effects on a daily basis. Results with 4DVAR assimilation in the Integrated Forecasting System (IFS) of the European Center for Medium Range Weather Forecasts (ECMWF) [64] indicate that the small effect of the assimilation of satellite-retrieved tropospheric NO 2 columns is due to the short lifetime of NO 2.

Comparison of BgCWFS Results to Surface Observations
The results from the two versions of BgCWFS in domain Bulgaria were compared to data from the surface AQ monitoring network for PM10, PM2.5, NO2, and SO2. Tables 1 and 2 provide statistical parameters (defined in Appendix A) for the monthly mean values based on all available data, respectively for August 2017 and February 2019, while Figure  8 shows the comparison for the monthly mean values. The assimilation of satellite data led to a decrease in the FGE and NMB scores for all pollutants, except NO2 and PM2.5 in February. For PM10, the NMB decreased in BgCWFS_sat more noticeably in August, by ~50%, than in February, by 34%. For SO2, the sat-version showed a decrease in the NMB by 72% on average for the two months. PM2.5 concentrations from BgCWFS_sat were overestimated, more evidently in February, 34%. It should be noted that only 4 stations provided PM2.5 data for this month. NO2 from BgCWFS_mod was strongly underestimated (by a factor of 3 in August). The possible reason might be associated with the ozone overestimation in the summer period found in previous evaluation of BgCWFS_mod [35]. The NO2 underestimation also indicated a deficit in modeled emissions from the traffic. As highlighted for the domain mean NO2 concentrations, the assimilation of satellite data did not lead to improvement at the stations either.

Comparison of BgCWFS Results to Surface Observations
The results from the two versions of BgCWFS in domain Bulgaria were compared to data from the surface AQ monitoring network for PM 10 , PM 2.5 , NO 2 , and SO 2 . Tables 1 and 2 provide statistical parameters (defined in Appendix A) for the monthly mean values based on all available data, respectively for August 2017 and February 2019, while Figure 8 shows the comparison for the monthly mean values. The assimilation of satellite data led to a decrease in the FGE and NMB scores for all pollutants, except NO 2 and PM 2.5 in February. For PM 10 , the NMB decreased in BgCWFS_sat more noticeably in August, by~50%, than in February, by 34%. For SO 2 , the sat-version showed a decrease in the NMB by 72% on average for the two months. PM 2.5 concentrations from BgCWFS_sat were overestimated, more evidently in February, 34%. It should be noted that only 4 stations provided PM 2.5 data for this month. NO 2 from BgCWFS_mod was strongly underestimated (by a factor of 3 in August). The possible reason might be associated with the ozone overestimation in the summer period found in previous evaluation of BgCWFS_mod [35]. The NO 2 underestimation also indicated a deficit in modeled emissions from the traffic. As highlighted for the domain mean NO 2 concentrations, the assimilation of satellite data did not lead to improvement at the stations either.   In Figures 9 and 10, the comparison for PM10 and SO2 at three selected stations (see Figure 4) for August 2017 is shown. PM10 values from BgCWFS_sat were underestimated at all stations, but to a lesser extent than the values from BgCWFS_mod (average NMB values −31.7% and −64.3%, respectively). The station with the highest underestimation from BgCWFS_sat (NMB −47% for PM10 and −37% for SO2) was Plovdiv-Kamenitza, which In Figures 9 and 10, the comparison for PM 10 and SO 2 at three selected stations (see Figure 4) for August 2017 is shown. PM 10 values from BgCWFS_sat were underestimated at all stations, but to a lesser extent than the values from BgCWFS_mod (average NMB values −31.7% and −64.3%, respectively). The station with the highest underestimation from BgCWFS_sat (NMB −47% for PM 10 and −37% for SO 2 ) was Plovdiv-Kamenitza, which indicates that air pollution at this site was more influenced by local factors, not captured by the system with 9 km resolution. Similar plots for PM 2.5 ( Figure S5) confirm general improvement from BgCWFS_sat, especially at the urban sites. As already shown by the distribution maps, NO 2 was underestimated by both model versions by a factor of 6 on average for the selected stations ( Figure S6). For February 2019, the comparison of PM 10 and PM 2.5 ( Figures S7 and S8) showed better matching from BgCWFS_sat compared to the observations at the Sofia-Hipodruma station with absolute value of the NMB less than 8%. For Plovdiv-Kamenitza station, significant underestimation of PM from BgCWFS_sat was still observed (NMB −55% on average). At Sofia-Kopitoto mountain station, PM concentrations from BgCWFS_sat were overestimated by a factor of about 3. At Sofia-Drujba and Plovdiv-Kamenitza stations, NO 2 ( Figure S9) was underestimated by both model versions with NMB of about −75%. BgCWFS_sat led to clear improvement of SO 2 at Plovdiv-Kamenitza station ( Figure S10) with NMB decreasing from −64% to −36%. For the other two stations (Sofia-Drujba and Sofia-Kopitoto), the assimilation of satellite data resulted in overestimated SO 2 with an average NMB of 55%.

BgCWFS-LAQMS Results for Plovdiv
The LAQMS simulates surface concentrations of pollutants for the city of Plovdiv due to different emission sources in the city (household heating, traffic, industry) and outside of the city (background concentrations). The background concentrations were provided by BgCWFS_mod and BgCWFS_sat. In the LAQMS domain (size about 10 × 10 km 2 ), the background values from BgCWFS showed negligible changes. Thus, for simulations with LAQMS, the background concentrations, and accordingly the satellite corrections were assumed to be constant within the LAQMS model domain.
The PM 10 background concentrations from BgCWFS_sat were significantly higher than those from BgCWFS_mod, as shown by the monthly mean values in Figure 11. The increase in both months was more than 50%.

BgCWFS-LAQMS Results for Plovdiv
The LAQMS simulates surface concentrations of pollutants for the city of Plovdiv due to different emission sources in the city (household heating, traffic, industry) and outside of the city (background concentrations). The background concentrations were provided by BgCWFS_mod and BgCWFS_sat. In the LAQMS domain (size about 10 × 10 km 2 ), the background values from BgCWFS showed negligible changes. Thus, for simulations with LAQMS, the background concentrations, and accordingly the satellite corrections were assumed to be constant within the LAQMS model domain.
The PM10 background concentrations from BgCWFS_sat were significantly higher than those from BgCWFS_mod, as shown by the monthly mean values in Figure 11. The increase in both months was more than 50%. Figure 11. Monthly mean PM10 background concentrations for the city of Plovdiv from BgCWFS_mod and BgCWFS_sat.
In Figures 12 and 13, the monthly mean PM10 concentrations for August 2017 and February 2019 are shown. In the winter month, almost the whole domain featured PM10 above 50 μgm −3 , with some central parts registering even higher than 100 μgm −3 . In Figures 12 and 13, the monthly mean PM 10 concentrations for August 2017 and February 2019 are shown. In the winter month, almost the whole domain featured PM 10 above 50 µgm −3 , with some central parts registering even higher than 100 µgm −3 . Figure 11. Monthly mean PM10 background concentrations for the city of Plovdiv from BgCWFS_mod and BgCWFS_sat.
In Figures 12 and 13, the monthly mean PM10 concentrations for August 2017 and February 2019 are shown. In the winter month, almost the whole domain featured PM10 above 50 μgm −3 , with some central parts registering even higher than 100 μgm −3 . In Figure 13, additional information about the main sources contributing to the PM 10 concentration in February 2019 is provided. The maps indicate zones with higher concentrations due to household heating and zones where traffic is the major contributor to PM 10 . In Figure 13, additional information about the main sources contributing to the PM10 concentration in February 2019 is provided. The maps indicate zones with higher concentrations due to household heating and zones where traffic is the major contributor to PM10.  Tables 3 and 4 show the comparison of modeled to observed PM10 monthly mean values, respectively for August 2017 and February 2019, at the two AQ stations in Plovdiv-Kamenitza (urban background) and Trakia (traffic) (locations marked in Figures 12 and  13). The contribution of BgCWFS_mod background concentrations to the calculated PM10 concentrations for February 2019 was 20% at Kamenitza station and 18.5% at Trakia station. With the use of satellite data, the contribution of BgCWFS_sat background concentration increased to 28% at Kamenitza and to 26% at Trakia. For August 2017, the corresponding numbers using BgCWFS_mod were 24.4% and 21.9%, while for BgCWFS_sat they were, respectively, 38.7% and 35%.
The use of BgCWFS_sat and LAQMS models led to a small overestimation for PM10 increased to 28% at Kamenitza and to 26% at Trakia. For August 2017, the corresponding numbers using BgCWFS_mod were 24.4% and 21.9%, while for BgCWFS_sat they were, respectively, 38.7% and 35%.
The use of BgCWFS_sat and LAQMS models led to a small overestimation for PM 10 at Kamenitza station for the two months (by 8% on average) while at the Trakia traffic station, PM 10 remained underestimated, but with a decrease in the NMB (from −23% to −12%).

Discussion
Satellite-retrieved data for atmospheric chemistry parameters were used for the first time in Bulgaria in air pollution modeling systems-a regional one, based on a chemical transport model (BgCWFS), and an urban one, designed for local air quality management in the city of Plovdiv (LAQMS). To investigate the effect of satellites data assimilation on surface concentrations in the country and in the city of Plovdiv, BgCWFS was set up in offline mode and modified for the calculation of vertical columnar parameters in the chemical transport model of BgCWFS, for the assimilation of GOME-2 derived data for AOD, VCD_NO 2 , and VCD_SO 2 in the three coarser BgCWFS domains (Europe, the Balkan Peninsula, and Bulgaria). A rather simple approach (objective analysis and correction of pollutants profiles) was used for the satellite data assimilation in BgCWFS at satellite overpass time fixed at 09:00 UTC. The output of BgCWFS at country level (horizontal resolution 9 km) was used to estimate the background concentrations for the LAQMS. Furthermore, the emissions in the city of Plovdiv were updated and clarified using a GIS-based bottom-up emission inventory.
The simulation experiments conducted using the two versions of the system (with and without satellite data), BgCWFS_sat and BgCWFS_mod, for two months (August 2017 and February 2019) showed an increase in the domain mean surface concentrations of PM 10 , PM 2.5 , and SO 2 , while the NO 2 values remained almost unchanged. The increase in domain mean PM 10 was highest in the summer month (10 µgm −3 ), where the map for the spatial distribution showed enhanced values especially in the southern and eastern part of Earth 2021, 2 600 the country, corresponding to the mineral dust events with air masses approaching from the south-west and moving towards the north-east. Thus, the assimilation of satellite data provided the possibility to account for emissions not included in BgCWFS. The increase of domain mean SO 2 was highest in the winter month (~6 µgm −3 ) and this is likely due to the relatively high emissions by coal-fired thermal power plants in the Balkan Peninsula. The weak impact of satellites data assimilation on surface and columnar NO 2 values confirmed the results in [60].
The comparison of BgCWFS monthly mean surface concentrations to observations at air quality monitoring sites in Bulgaria suggested improvement when using satellite data. The NMB from BgCWFS_sat, averaged over the two months, was lower than the NMB from BgCWFS_mod by 43% (PM 10 ) and 73% (SO 2 ). The monthly mean PM 10 values simulated by BgCWFS_sat were still underestimating the observed concentrations in most of the air quality stations. This was expected, as model values are representative for a grid of 9 km, while observations represent local conditions. Interestingly, we noticed a tendency for PM 2.5 overestimation which is more likely due to the assimilation approach of correcting profiles of different species. Further analysis is needed to understand this behavior. NO 2 concentrations were strongly underestimated by both BgCWFS versions, suggesting a deficit in model emissions.
The LAQMS benefits from the satellite data assimilation through the background concentrations for the Plovdiv domain provided by BgCWFS_sat. The simulation experiments with LAQMS for the two months estimated the PM 10 concentrations caused by different emission sources in the city (household heating, traffic, industry) and outside the city through the background concentrations. The LAQMS estimated the contribution of the background concentrations to PM 10 at the locations of the two air quality stations in Plovdiv as 32%, averaged for the two months. The absolute value of the NMB decreased from 15.5% (without satellite data) to 9.6% (with satellite data).
The assimilation algorithms developed for GOME-2 retrieved data can be adapted to data from other instruments that have finer spatial resolution (e.g., from the Tropospheric Monitoring Instrument (TROPOMI) on board the Copernicus Sentinel-5 Precursor satellite, TROPOMI-S5p) and could be used for data assimilation in the smaller domains of BgCWFS (e.g., the Sofia region).
Finally, this study is a significant step towards the development of an operational system for assimilation of satellite data over Bulgaria. The presented prototype of the modeling chain BgCWFS-LAQMS is in pre-operative mode. This prototype can be further modified for operational runs, with activities on streaming the satellite dataflow at the overpass time to BgCWFS runs in nested domains and setting up links to LAQMS in near-real time. Acknowledgments: Deep gratitude is due to ESA services providing data from the GOME-2 instrument on board the MetOp-A, B and C satellites. The authors are grateful also to the US EPA for providing WRF-CMAQ models, to the TNO for the emission inventory at European scale used in BgCWFS, and to the European Environment Agency for observational data from the surface air quality monitoring network. We acknowledge the use of data and/or imagery from NASA's Fire Information for Re-source Management System (FIRMS) (https://earthdata.nasa.gov/firms (accessed on 31 August 2021)), part of NASA's Earth Observing System Data and Information System (EOSDIS).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The views expressed herein сan in nоway be taken to reflect the official opinion of the Еurореan Space Agency.