Implications of COVID-19 Restriction Measures in Urban Air Quality of Thessaloniki, Greece: A Machine Learning Approach

Following the rapid spread of COVID-19, a lockdown was imposed in Thessaloniki, Greece, resulting in an abrupt reduction of human activities. To unravel the impact of restrictions on the urban air quality of Thessaloniki, NO2 and O3 observations are compared against the business-as-usual (BAU) concentrations for the lockdown period. BAU conditions are modeled by applying the XGBoost (eXtreme Gradient Boosting) machine learning algorithm to air quality and meteorological surface measurements and reanalysis data. Owing to the restriction policies, NO2 concentrations during the lockdown period are reduced by −24.9 [−26.6, −23.2]% at the AGSOFIA station and by −18.4 [−19.6, −17.1]% at the EGNATIA station. A reverse effect is revealed for O3 concentrations at AGSOFIA, with an increase of 12.7 [10.8, 14.8]%, reflecting the reduced O3 titration by NOx. The implications of COVID-19 lockdowns for the urban air quality of Thessaloniki are in line with the results of several recent studies for other urban areas around the world, highlighting the necessity of more sophisticated emission control strategies for urban air quality management.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for coronavirus disease 2019 (COVID-19) [1], was first reported in Wuhan, China [2] in late December 2019. Its high contagiousness [3] and the substantial early undocumented infections [4] facilitated the rapid worldwide spread of COVID-19, which was confirmed as a pandemic by the World Health Organization (WHO) in March 2020 [5]. In the absence of vaccinations, and to control the COVID-19 outbreak, several social, working, and transportation restrictions were gradually imposed around the world [6], with the strictness and duration of the so-called "lockdowns" varying by country.
The sharp reduction in human activities caused by the restriction measures resulted in a decrease of major air pollutants, such as carbon dioxide (CO2) [6,7] and nitrogen dioxide (NO2) [8,9]. Assessing the impact of the COVID-19-associated decline in anthropogenic emissions on air quality is a challenging task, as weather variability should also be considered [10] before attributing air pollutant concentration changes to COVID-19 restrictions alone. Recently, several studies have investigated the potential NO2 and O3 benefits ascribed to COVID-19-related emission reductions using statistical (machine learning, multivariate regression, difference-in-difference, generalized additive models) [11–14] and/or modeling [15–17] approaches. In particular, Keller et al. [18] and Grange et al. [11] explored the implications of COVID-19 restrictions for NO2 and O3 concentrations around the globe and in Europe, respectively, while other studies were performed for individual countries [17,19–23] and cities [24–26]. Overall, in urban environments, NOx reductions during the COVID-19 lockdowns were followed by O3 increases [27,28]. Apart from reduced O3 titration by NOx, the increasing O3 pollution is also subject to variations of volatile organic compound (VOC) emissions [29].
Thessaloniki, located in the northern part of Greece, is the second largest city of the country, with over one million inhabitants in its metropolitan area. In addition, the city is located within a local pollution hot spot [30]. In accordance with the national restrictions imposed during the COVID-19 outbreak, Thessaloniki's residents strictly limited their social, working, and educational activities. In particular, after the first confirmed COVID-19 case in Thessaloniki (also the first in Greece) on 26 February 2020, the following actions were taken:
• March 2020: the operations of educational institutions, at all levels nationwide, were suspended.
• 14 to 18 March 2020: coffee shops, restaurants, bars, markets, tourism services, and museums were closed, and several social activities were put on hold.
• 23 March 2020: a strict lockdown was applied with significant restrictions on the movement of citizens throughout the territory.
The reverse process (the lifting of restrictions) was applied as follows:
• 5 May 2020: a gradual lifting of restrictions was introduced.
• 11 May 2020: reopening of markets.
• 25 May 2020: reopening of coffee shops, restaurants, and bars.
• 1 July 2020: reopening of tourism services.
As COVID-19 cases increased in late October 2020, and considering the available epidemiological data, the government announced a second strict lockdown in Thessaloniki, which was soon afterwards extended to the whole country.
The present study assesses the impacts of COVID-19 restriction policies on the urban air quality (NO2 and O3) of Thessaloniki, Greece, using NO2 and O3 measurements from two ground-based stations, meteorological observations from one meteorological station, high-resolution meteorological reanalysis data, and the XGBoost (eXtreme Gradient Boosting) machine learning algorithm. Section 2 presents the observational and reanalysis data used in the analysis, and also describes the applied XGBoost algorithm and the overall methodological approach. In Section 3, the main results are presented and discussed in relation to recent research findings. Finally, Section 4 summarizes the key findings of the study.

Data Description
Air quality data from two ground-based stations located in the urban area of Thessaloniki are used in the analysis. NO2 and O3 hourly concentration measurements from the Agia Sofia Square station (hereafter called AGSOFIA) (40.634°N, 22.945°E), from 2018 to 2020, were provided by the Department of Environment and Hydroeconomy of the Region of Central Macedonia. AGSOFIA is a low-altitude (27 m) station located in Thessaloniki city center and is characterized as an urban-traffic station. The NOx (NO-NO2-NOx) analyzer is the HORIBA Ltd. (Kyoto, Japan) model APNA-360, based on the chemiluminescence measurement technique, with a detection limit of 0.5 ppb; the instrument is calibrated according to the technical standards. The O3 analyzer is the HORIBA Ltd. model APOA-360, using the principle of UV absorption, with a detection limit of 0.5 ppb and a calibration according to the technical standards.
In addition, NO2 hourly concentrations from the Egnatia station (hereafter called EGNATIA) (40.638°N, 22.941°E), from 2018 to 2020, were provided by the Environmental Department of the Municipality of Thessaloniki. EGNATIA is an urban-traffic station in the city center of Thessaloniki with an elevation of 11 m. The NOx (NO-NO2-NOx) analyzer is the Envea (environment s.a.) model AC32M, which complies with EN 14211 (2012), ISO 7996, and VDI4202, is U.S. EPA approved (no. RFNA-0202-146), and is certified by the TÜV report N°936/21205818/C. The measurement technique is based on chemiluminescence, with a detection limit of 0.4 ppb. The instrument is calibrated on a monthly basis with the gas phase titration technique and calibration gas cylinders. Descriptive statistics for NO2 and O3 concentrations at AGSOFIA station, and NO2 concentrations at EGNATIA station, from 1 January 2018 to 31 December 2019, are provided in Table S1 of the Supplementary Materials.
Hourly meteorological data from the meteorological station of the Aristotle University of Thessaloniki (hereafter called METEO-AUTH) were also used for the period 2018 to 2020, including temperature at 2 m, mean sea level pressure, wind speed at 10 m, total precipitation, relative humidity, and total radiation. METEO-AUTH is located on the Aristotle University Campus at an altitude of 32 m, in close proximity to the aforementioned air quality stations (∼1 km from AGSOFIA and ∼1.5 km from EGNATIA). The locations of the air quality and meteorological stations are depicted in Figure 1.
Moreover, hourly data of boundary layer height, wind speed at 10 m, temperature at 2 m, mean sea level pressure, and total radiation for the period from 2018 to 2020 were obtained from the state-of-the-art ERA5 reanalysis dataset [31]. ERA5 is issued by the European Centre for Medium-Range Weather Forecasts (ECMWF) and produced with the 4D-Var data assimilation and model forecasts of the ECMWF CY41R2 Integrated Forecast System (IFS), including 137 hybrid sigma/pressure levels in the vertical and a horizontal resolution of 0.25° × 0.25°. The ERA5 data were extracted for the grid cell (22.95°E, 40.65°N) that includes the air quality and meteorological stations.

Business as Usual (BAU) Model
Here, we apply the XGBoost machine learning algorithm [32] to model NO2 and O3 hourly concentrations at the air quality stations, in order to unravel the impact of COVID-19 restrictions on the observed air quality. XGBoost is a supervised learning technique based on an ensemble of gradient-boosted decision trees, which is nowadays widely used in data science due to its scalability, speed, and performance [32]. Detailed information on the XGBoost algorithm, hyperparameters, and implementation can be found at https://xgboost.readthedocs.io/en/latest/index.html (accessed on 11 November 2021). Here, the Python XGBoost library is used.
For each air quality monitoring station and species, we built an XGBoost model using meteorological and time explanatory variables (so-called features) to explain the observed time-varying concentrations of the examined species. A past period, representing a business-as-usual (BAU) situation, was used for the training of the models. Subsequently, we applied the BAU model to the COVID-19 period in order to predict the hourly species concentrations that would be expected during this period under BAU conditions (absence of restriction policies). The residuals between the observations and BAU during the COVID-19 period reveal the impact of COVID-19 restrictions on the observed air quality. The benefit of the applied approach is that natural meteorological variability is considered before extracting the implications for air quality, unlike simply subtracting species concentrations during past periods from the respective concentrations during the COVID-19 period. Similar XGBoost-based machine learning approaches for the same scientific purpose were also applied recently by Keller et al. [18] and Granella et al. [33].
The air quality data (target variables) at both examined stations exhibit a data completeness ≥70% for the training period (1 January 2018 to 31 December 2019). The meteorological features include observations of air temperature at 2 m, total precipitation, mean sea level pressure, relative humidity, wind speed at 10 m, and total radiation, as well as ERA5 boundary layer height. Missing data of air temperature at 2 m (∼0.4%), mean sea level pressure (∼0.4%), total radiation (∼5.3%), and wind speed at 10 m (∼8%) are filled with the ERA5 reanalysis values. The correlation coefficients between observations and ERA5 hourly values for temperature at 2 m, mean sea level pressure, total radiation, and wind speed are 0.97, 0.98, 0.93, and 0.65, respectively. The wind direction was initially included in the BAU model, but as it had no significant contribution, it was excluded. The time explanatory variables include hour of the day, day of week, and month, and were created using the one-hot encoding method.
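The feature preparation described above can be illustrated with a minimal standard-library sketch. It is not the code used in the study: the helper names (one_hot, time_features, fill_from_reanalysis) and the sample values are hypothetical, and missing observations are assumed to be marked as None.

```python
from datetime import datetime

def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def time_features(ts):
    """One-hot encode hour of day (24), day of week (7), and month (12)."""
    return one_hot(ts.hour, 24) + one_hot(ts.weekday(), 7) + one_hot(ts.month - 1, 12)

def fill_from_reanalysis(observed, reanalysis):
    """Replace missing observations (None) with the co-located reanalysis value."""
    return [r if o is None else o for o, r in zip(observed, reanalysis)]

# Hypothetical values, for illustration only.
ts = datetime(2019, 3, 16, 8)  # a Saturday, 08:00
row = time_features(ts)        # 43 binary time features
t2m = fill_from_reanalysis([12.1, None, 13.0], [11.8, 12.4, 12.9])
```

Each hourly record then consists of the gap-filled meteorological values followed by the 43 binary time features.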

Hyperparameter Tuning
Prior to the creation of the XGBoost machine learning model for prediction, two basic processes are essential: tuning the hyperparameters and testing the model performance on unseen data. The hyperparameters are a set of parameters that shape the learning process and, thus, control the overall behavior of the XGBoost model. To this end, we tune the hyperparameters max_depth, n_estimators, min_child_weight, subsample, eta, reg_alpha, reg_lambda, and gamma using the Hyperopt Python library [34], which uses a form of Bayesian optimization to find the hyperparameters that minimize/maximize a specified metric (R² here) for the given model. To make the tuning as fair as possible, and to avoid overfitting, we apply an 8-fold cross-validation with shuffle split to the dataset extending from 1 January 2018 to 31 December 2019. The mean (out of eight cross-validations) R² values during the tuning process for NO2 at AGSOFIA, O3 at AGSOFIA, and NO2 at EGNATIA are 0.67, 0.82, and 0.57, respectively. With the selected hyperparameters (see Table S2 in the Supplementary Materials) applied, the XGBoost model (BAU) is trained over the period from 1 January 2018 to 31 December 2019, applying the "early stopping" technique, which terminates training at the point where performance on the test dataset starts to decrease while performance on the training dataset continues to improve, in order to avoid overfitting.
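The cross-validated R² that serves as the tuning objective can be sketched without any XGBoost or Hyperopt dependency. The example below is only an illustration under stated assumptions: "shuffle split" is interpreted here as shuffling the samples and partitioning them into eight disjoint folds, a least-squares line stands in for the XGBoost model, and all function names and data are hypothetical.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def shuffled_folds(n_samples, n_splits=8, seed=0):
    """Shuffle sample indices and partition them into n_splits disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_splits] for k in range(n_splits)]

def fit_line(x, y):
    """Stand-in model (least-squares line); NOT the XGBoost model of the study."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def cross_val_r2(x, y, n_splits=8, seed=0):
    """Mean out-of-fold R^2: each fold is held out once while the rest train."""
    scores = []
    for test_idx in shuffled_folds(len(x), n_splits, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        pred = [slope * x[i] + intercept for i in test_idx]
        scores.append(r2_score([y[i] for i in test_idx], pred))
    return sum(scores) / len(scores)

# Synthetic data, for illustration only: a linear signal plus alternating noise.
x = [float(i) for i in range(80)]
y = [2.0 * v + 1.0 + (3.0 if i % 2 else -3.0) for i, v in enumerate(x)]
mean_r2 = cross_val_r2(x, y)
```

In a tuning loop, an optimizer would call a function like cross_val_r2 once per candidate hyperparameter set and keep the set with the highest mean score.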

Evaluation of BAU Model
To gain confidence in the use of the BAU model, we evaluate its prediction performance from 1 January to 15 February 2020 (1 January to 10 February 2020 for the EGNATIA station, due to missing data), a period which was not included in the training process and, thus, is suitable for an independent evaluation. For brevity, we present evaluation results only for the AGSOFIA station, while for the EGNATIA station, the respective results are included in the Supplementary Materials (Figure S1). Figure 2a presents the observed and BAU (predicted) time series of NO2 hourly concentrations at the AGSOFIA station, indicating a good agreement between observations and BAU during the test period, with a Pearson correlation coefficient (R) of 0.83, a mean bias (MB) of −0.19 µg m−3, and a normalized root mean squared error (NRMSE) of 0.28. This agreement is also confirmed by the scatter plot of the observed and BAU NO2 hourly concentrations shown in Figure 2b. Noteworthy are the normally distributed NO2 concentration residuals (Figure 2c), revealing that the BAU model over- and underpredicts with a relatively equal probability, underestimating/overestimating high/low observed NO2 concentrations. As for O3, Figure 3a

Estimation of Uncertainty
For a more robust interpretation of the results, an estimate of the uncertainty in the BAU model predictions is essential. To this end, two methodologies are applied to estimate the uncertainty of BAU predictions, following Petetin et al. [21] and Keller et al. [18]. First, the 2018-2019 period was randomly split into eight sub-periods, and subsequently, the BAU model was applied using seven sub-periods for training and the remaining one for testing, in an iterative process of eight steps. The hourly residuals (BAU-OBS) were then calculated and the following two methodologies were applied to estimate the uncertainty.

• Method 1: following Petetin et al. [21], the 5th and 95th percentiles of the hourly residuals are found, forming a fixed asymmetric 90% confidence interval for the hourly predictions. As our results are presented as a 1-month moving average, the 1-month moving average of the hourly residuals is calculated, and the respective 5th and 95th percentiles are used for the 90% confidence interval.
• Method 2: following Keller et al. [18], the standard deviation of the hourly residuals is used as the uncertainty of the hourly predictions. The temporal (monthly) average of the uncertainty, σ, is then calculated from the hourly uncertainties σ_i and the sample size N. Here we use the 2σ interval, corresponding to an approximately 95% confidence interval.
The resulting uncertainties of each BAU model predictions are presented in Table 1.
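As an illustration of Method 1 only, the following standard-library sketch computes the 5th and 95th percentiles of the moving-average residuals. It rests on assumptions not stated in the text: a trailing moving window, linear interpolation between order statistics for the percentiles, and hypothetical function names; the data are synthetic.

```python
def moving_average(values, window):
    """Trailing moving average; returns one value per full window."""
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out

def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def residual_interval(bau, obs, window):
    """5th/95th percentiles of the moving-average residuals (BAU - OBS)."""
    residuals = [b - o for b, o in zip(bau, obs)]
    smoothed = moving_average(residuals, window)
    return percentile(smoothed, 5), percentile(smoothed, 95)

# Synthetic example: residuals 0..100 and a window of one sample for easy checking.
bau = [float(i) for i in range(101)]
obs = [0.0] * 101
lower, upper = residual_interval(bau, obs, window=1)
```

For hourly data, a 1-month window would correspond to roughly 720 samples (30 days of 24 h), which is again an assumption about the window length.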

Results
To unravel the impact of COVID-19 restrictions on the urban air quality of Thessaloniki, we compare the BAU-predicted NO2 and O3 concentrations during the COVID-19 period with the actual observations. To obtain more robust conclusions from the results, the MB of BAU 1-month moving average concentrations during the test period is removed (positive/negative bias is subtracted/added) from the respective BAU values during the lockdown period. This correction is applied only to the differences and percentage differences between observations and BAU reported in the text, while all figures present the original concentrations (without correction). As both uncertainty estimation methods presented in Section 2.2.4 give quite similar intervals, hereafter, only uncertainties obtained by method 2 are reported, as a 95% confidence interval enclosed in brackets. Figure 4a presents the observed and BAU time series of NO2 1-month moving average concentrations for the AGSOFIA station during the year 2020. Until the first lockdown on 16 March 2020, the observed NO2 time series closely followed the BAU NO2 time series, indicating that the observed situation before the lockdown was a BAU one. From 16 March to early May 2020, a distinct decrease of observed NO2 concentrations was seen, in contrast to the respective BAU time series, which remained relatively constant. From early May 2020, as the COVID-19 restrictions were gradually lifted, the observed NO2 concentrations first stabilized and, from early June 2020, increased, approaching the NO2 BAU concentration levels.
The residuals (observation − BAU) of NO2 and O3 1-month moving average concentrations are presented in Figure 4c, indicating a sharp decrease of the NO2 residual concentrations, which, during the lockdown period (16 March to 5 May 2020), ranged from −0.6 to −8.5 µg m−3 with an average value of −5.7 [−6.21, −5.19] µg m−3, reflecting the direct effect of COVID-19 restrictions on NO2 levels at AGSOFIA. The respective 1-month moving average NO2 percentage differences between observations and BAU, depicted in Figure 4d, reveal a similar decrease for the same period, ranging from −2.6 to −39.1% with an average value of −24.9 [−26.6, −23.2]%. Recently, Keller et al. [18] examined observational data from 46 countries, reporting that during the lockdowns, NO2 concentrations were on average 18% lower than BAU. At a European scale, Grange et al. [11] estimated a mean percentage change of −34% for NO2 based on European traffic stations, using Bayesian inference to detect change points and subsequently the lockdown periods. As for Greece, Grivas et al. [24], using in situ observations, found an NO2 decline of 32% compared to the pre-lockdown period in Athens, Greece, while Koukouli et al. [35], using satellite observations and a chemical transport model, attributed a 12% decrease of tropospheric NO2 columns at Thessaloniki during March to emission changes.
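The bias correction and the reported differences can be sketched as follows. This is a hedged illustration, not the study's code: the mean bias is assumed to be defined as BAU minus observation over the test period, the percentage difference is assumed to be taken relative to the bias-corrected BAU, and the function name and sample values are hypothetical.

```python
def bias_corrected_difference(obs, bau, mean_bias):
    """Return (absolute, percentage) differences of observations from BAU.

    mean_bias is assumed to be BAU minus observation, averaged over the test
    period, so a positive bias is subtracted from BAU and a negative one is
    effectively added back.
    """
    corrected = [b - mean_bias for b in bau]
    absolute = [o - c for o, c in zip(obs, corrected)]
    percent = [100.0 * (o - c) / c for o, c in zip(obs, corrected)]
    return absolute, percent

# Hypothetical 1-month moving-average concentrations (µg m-3), for illustration only.
obs = [15.0, 18.0]
bau = [21.0, 21.0]
absolute, percent = bias_corrected_difference(obs, bau, mean_bias=1.0)
```

With these made-up inputs, the corrected BAU is 20 µg m−3 at both time steps, so the observed values lie 25% and 10% below BAU.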
As regards O3, prior to the lockdown, the observed and BAU concentrations varied similarly, while from the lockdown onset, the observed O3 levels started to diverge incrementally from the respective BAU levels up to early May 2020, returning close to BAU levels in mid-June 2020 (Figure 4b). The aforementioned NO2 decline was accompanied by a synchronous increase of O3 residual (observation − BAU) concentrations, with an average value of 7 [6.06, 7.94] µg m−3 (range from 3 to 12.9 µg m−3) (Figure 4c) and percentage differences with an average value of 12.7 [10.8, 14.8]% (range from 7 to 21%) (Figure 4d). The clear anticorrelation (R of −0.74) between NO2 and O3 residuals during the lockdown period (grey shaded areas in Figure 4c,d) reflects the link between NOx and O3 chemistry, with reduced NOx resulting in reduced O3 titration by NO and thus in enhanced O3 levels. Moreover, in a likely VOC-limited environment, a decline in NOx leads to enhanced O3 production. The mean European percentage change of O3 for traffic sites during the lockdowns was reported as 30% by Grange et al. [11], mainly attributed to decreased O3 destruction via the titration cycle. Similar increases in European O3 concentrations during the lockdowns were also reported by Gualtieri et al. [20], Higham et al. [22], Tobías et al. [26], Sicard et al. [27], and Putaud et al. [36]. The same behavior is revealed by the observed and BAU diurnal cycles of NO2 and O3 concentrations at AGSOFIA during the period from 16 March to 31 May 2020, shown in Figure 5. Both BAU and observed NO2 diurnal cycles exhibit two distinct peaks: an early morning peak due to rush-hour vehicle traffic and a late evening peak resulting from both emissions and accumulation, as the absence of sunlight inhibits NO2 photolysis. The respective O3 diurnal cycles are characterized by a broad afternoon maximum owing to photochemical production.
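A diurnal cycle of this kind is obtained by averaging the hourly values separately for each hour of the day. The sketch below shows one way to do this with the standard library; the function name is hypothetical, missing values are assumed to be marked as None, and the data are synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def diurnal_cycle(timestamps, values):
    """Mean value for each hour of day (0-23), skipping missing (None) samples."""
    buckets = defaultdict(list)
    for ts, v in zip(timestamps, values):
        if v is not None:
            buckets[ts.hour].append(v)
    return {hour: sum(vs) / len(vs) for hour, vs in sorted(buckets.items())}

# Synthetic two-day hourly series, for illustration only: the value equals the
# hour of day on the first day and the hour of day plus 2 on the second day.
start = datetime(2020, 3, 16)
timestamps = [start + timedelta(hours=h) for h in range(48)]
values = [float(h % 24) + (2.0 if h >= 24 else 0.0) for h in range(48)]
cycle = diurnal_cycle(timestamps, values)
```

Applying the same function to the observed and BAU series and subtracting the two results hour by hour gives the diurnal cycle of the residuals.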
Lower/higher NO2/O3 concentrations are observed throughout the day and night compared to BAU, as depicted in Figure 5a,b. In particular, the diurnal cycles of NO2 and O3 residuals shown in Figure 5c
Regarding the second lockdown imposed in early November 2020, although a decrease/increase of NO2/O3 levels is seen at AGSOFIA station, as shown in Figure 4, these changes are not as pronounced as in the first lockdown, reflecting the greater intensity of the first lockdown. During the intermediate period between the two lockdowns, the observed NO2 and O3 concentrations are closer to BAU values than during the lockdown periods.
As for the second examined station, Figure 6 presents both the time series and diurnal cycles of observed and BAU NO 2 concentrations at the EGNATIA station. In agreement with the AGSOFIA station, a decrease of observed NO 2 1-month moving average concentrations was seen from early March 2020 relative to BAU concentrations with an average value of −13.1 [−14.2, −12] µg m −3 (Figure 6a), while the percentage difference between observations and BAU during the lockdown (16 March 2020 to 5 May 2020) ranged from −7.3% to −24.1% with an average value of −18.4 [−19.6, −17.1]% (Figure 6b). From early May 2020, and as the restrictions started to relax, the observed NO 2 levels gradually increased, reaching near BAU values from mid-June 2020 onward. Regarding the second lockdown in early November 2020, this was not clearly reflected in the observed and residual NO 2 concentrations. The diurnal cycle of the observed and BAU NO 2 concentrations at the EGNATIA station from 16 March to 31 May 2020, shown in Figure 6c, exhibits the same two-peak shape as that of the AGSOFIA station, revealing a clear decline of NO 2 levels throughout the day and night during the first lockdown period.

Conclusions
This study investigated the impact of COVID-19 restriction policies on the urban air quality of Thessaloniki, Greece, using air quality and meteorological observations, ERA5 reanalysis data, and the XGBoost machine learning algorithm. Modeling NO2 and O3 BAU concentrations and comparing them with observations for the lockdown period allowed the assessment of COVID-19 restriction footprints on the urban air quality of Thessaloniki. With the applied approach, the impact of weather variability was also considered.
Both AGSOFIA and EGNATIA urban-traffic stations exhibited an NO2 drop during the lockdown period (from 16
Consistent with recent studies on other urban areas worldwide, the NO2 decrease at Thessaloniki during the COVID-19 lockdown was associated with an O3 increase. Such real-life modeling scenarios, like the COVID-19 pandemic, provide tangible evidence that better future air quality management in urban environments requires more sophisticated emission control. To mitigate O3 pollution, NOx reductions should be accompanied by VOC-targeted control strategies specifically designed for each urban environment.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/atmos12111500/s1. Table S1: Descriptive statistics (mean, 10th percentile, 25th percentile, median, 75th percentile, 90th percentile, and standard deviation) for NO2 (µg m−3) and O3 (µg m−3) at AGSOFIA station, and NO2 (µg m−3) at EGNATIA station, for the time period from 1 January 2018 to 31 December 2019. Table S2: XGBoost hyperparameters used for the BAU models. Figure S1: (a) Observed (black) and BAU (green) time series of NO2 hourly concentrations at EGNATIA station during the time period from 1 January to 10 February 2020. (b) Scatter plot of observed and BAU NO2 hourly concentrations at EGNATIA station during the time period from 1 January to 10 February 2020. The solid green line represents the regression line of the observed and BAU NO2 concentrations, while the dashed black line is the 1:1 line. (c) Histogram of NO2 residual concentrations (BAU-observation) at EGNATIA station during the time period from 1 January to 10 February 2020. Figure

Data Availability Statement:
The air quality and meteorological observations used in the present study are available from the authors upon request. The ECMWF ERA5 reanalysis data [38] were obtained from the Copernicus Climate Change Service Climate Data Store (CDS) (https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-single-levels?tab=form) (accessed on 11 November 2021).