Estimation of Lower-Stratosphere-to-Troposphere Ozone Profile Using Long Short-Term Memory (LSTM)

Abstract: Climate change and air pollution are emerging topics because of their potentially enormous implications for health and society. In recent years, tropospheric ozone has been recognized as an important greenhouse gas and pollutant that is detrimental to human health, agriculture, and natural ecosystems, and it has attracted increasing interest. Machine-learning-based approaches have been widely applied to the estimation of tropospheric ozone concentrations, but few studies have included tropospheric ozone profiles. This study aimed to predict the Northern Hemisphere distribution of Lower-Stratosphere-to-Troposphere (LST) ozone from a pressure of 100 hPa to the near surface by employing a deep learning Long Short-Term Memory (LSTM) model. We used a history of all the observed parameters (meteorological data from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5), satellite data, and the ozone profiles of the World Ozone and Ultraviolet Data Center (WOUDC)) between 2014 and 2018 to train the predictive models. Model–measurement comparisons for the WOUDC monitoring sites for the period 2019–2020 show that the mean correlation coefficients (R²) in the Northern Hemisphere at high latitude (NH), Northern Hemisphere at middle latitude (NM), and Northern Hemisphere at low latitude (NL) are 0.928, 0.885, and 0.590, respectively, indicating reasonable performance for the LSTM forecasting model. To improve the performance of the model, we applied the LSTM migration models to the Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container (CARIBIC) flights in the Northern Hemisphere from 2018 to 2019 and to three urban agglomerations (the Sichuan Basin (SCB), North China Plain (NCP), and Yangtze River Delta region (YRD)) between 2018 and 2019. The results show that our models performed well on the CARIBIC data set, with a high R² of 0.754.
The daily and monthly surface ozone concentrations for 2018–2019 in the three urban agglomerations were estimated from meteorological and ancillary variables. Our results suggest that the LSTM models can accurately estimate the monthly surface ozone concentrations in the three clusters, with relatively high coefficients of 0.815–0.889, root mean square errors (RMSEs) of 7.769–8.729 ppb, and mean absolute errors (MAEs) of 6.111–6.930 ppb. The daily scale performance was not as high as the monthly scale performance, with R² = 0.636–0.737, RMSE = 14.543–16.916 ppb, and MAE = 11.130–12.687 ppb. In general, the trained module based on LSTM is robust and can capture the variation of the atmospheric ozone distribution. Moreover, it contributes to our understanding of air pollution and of polluted areas. The ozone concentrations in SCB might be affected by the high annual temperature and relative humidity. Overall, our models underestimated the surface ozone concentration compared to the observations. The degree of underestimation largely depends on the number of training samples and the sampling frequency. The distribution of the retrieved lower-stratosphere-to-troposphere ozone concentrations can support the study of ozone transport and pollution in small- and medium-scale regions, which is of great significance for the study of long-term ozone variation and its causes.


Introduction
Ozone (O3) is considered to be a particularly significant trace gas in the Earth's atmosphere, 90% of which is distributed in the stratosphere and 10% in the troposphere [1]. Stratospheric ozone protects the Earth's biota from harmful UV radiation. In the troposphere, ozone is a greenhouse gas [2,3] and a major air pollutant endangering human health, agriculture, and natural ecosystems; it also traps heat in the Earth's atmosphere and plays an important role in atmospheric chemistry, impacting air quality and climate change [4,5]. Mills et al. demonstrated that a long-term ozone level over 40 ppb may result in some loss of crops and ecosystems [6,7]. Ayres et al. and Taylan et al. suggested that hourly ozone concentrations should not exceed 80 ppb and/or 50-60 ppb in a maximum daily eight-hour average (MDA8) [7,8]. The World Health Organization (WHO, 2006 and 2017) recommended that the ozone level of the MDA8 should be within 100 μg/m³ [9]. If the ozone concentration is higher than these values, it will pose a threat to human health. Now, more than ever, incidents of tropospheric ozone pollution are frequently reported and are thus arousing widespread concern in society [10]. Li et al. indicated that the mean ozone concentration over China increased from 87.65 ± 16.74 μg/m³ in 2014 to 98.57 ± 14.86 μg/m³ in 2016 [11]. In some fast-developing regions of China, including Beijing-Tianjin-Hebei, the Yangtze River Delta, and Pearl River Delta regions, much effort has been made to improve the air quality. The primary pollutants (e.g., PM2.5) have decreased as a consequence, but secondary pollutants (e.g., ozone) are on the rise [12,13]. Contrary to the increasing trend of ozone observed in China, the surface ozone in the southeastern United States has been found to have gradually decreased in the last decade [10].
Therefore, monitoring the global distribution of vertical ozone profiles is essential for ozone transport studies, which will further help us to understand the physical and chemical processes in the atmosphere, track stratospheric ozone depletion and tropospheric pollution, and estimate the impact of ozone on climate [4,14].
Currently, ground-based measurement, in situ observation, and spaceborne measurement are recognized as the three main methods for atmospheric ozone concentration monitoring. The World Ozone and Ultraviolet Data Center (WOUDC) (http://www.woudc.org, accessed on 1 February 2021) mostly employs two kinds of instruments: Electrochemical Concentration Cell (ECC) and Brewer Mast (BM) to supply ozone profiles from the surface to the stratosphere with vertical resolution of ∼150 m and accuracy of 5% [15]. The ozone sounding stations are mostly located in Europe and North America with a small number in South America, Asia, and Africa. Therefore, the coverage is still sparse under different observation quality standards. Ozone is also measured in situ by aircraft. In situ measurements from Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container (CARIBIC) are made using a fully automatic scientific device that is packaged in a 1.5 ton container on an airliner to measure ozone concentrations. Although the ground-based and in situ observations benefit from high accuracy, good stability, and continuity, ground-based measurement is a single point observation method and is limited by the number of observation stations [16], and these in situ measurements are also spatially and temporally sparse in terms of the estimation of ozone concentrations. Ozone profile observations with a consistent quality and wide area of coverage are greatly desired. Compared with ground-based and in situ observations, spaceborne measurement, which makes ozone observations from space, can provide continuous observation data at large regional scales. Currently, spaceborne measurement can monitor ozone concentration at a large scale due to its wide spatial coverage and high temporal resolution [17]. The sounders mounted on satellites for ozone observation are mainly thermal infrared observations and ultraviolet observations. 
The sounders of thermal infrared observations include the Atmospheric Infrared Sounder (AIRS) [18], Tropospheric Emission Spectrometer (TES) [19], Infrared Atmospheric Sounding Interferometer (IASI) [20], and Cross-track Infrared Sounder (CrIS) [4]. These thermal infrared instruments are only sensitive to the middle and upper troposphere. The other type consists of the Ozone Monitoring Instrument (OMI) [21], Tropospheric Monitoring Instrument (TROPOMI) [22], and Ozone Mapping and Profiler Suite (OMPS) [23]. Ultraviolet sensors with high precision regarding the ozone columns are affected by the surface reflectance, absorbing dust aerosol, and other factors, causing retrieval error; the vertical distribution information for ultraviolet on ozone is therefore limited.
Recently, satellite data-based, ground-based, and in situ measurements have provided a new way to monitor atmospheric ozone. Ghoneim et al. proposed a new deep-learning-based ozone model that comprehensively considered the correlation between pollution and weather [24]. Based on the meteorological factors and air pollutants affecting ozone, Feng et al. applied a machine learning method to predict the surface ozone in Hangzhou, China, and the results demonstrated that the dewpoint and NO2 were primary factors in surface ozone formation [25]. Zhan et al. developed a random forest model to predict MDA8 ozone concentrations across China, and the resulting ozone dataset is valuable for related epidemiological analyses of ozone pollution [26]. Tropospheric ozone mainly comes from the downward transport of stratospheric ozone and from photochemical reactions in the troposphere [27]. It is assumed that tropospheric ozone is affected by meteorological conditions (temperature, water vapor, cloud, solar radiation, and potential vorticity) [28,29], NOx, and volatile organic compounds (VOCs), making tropospheric ozone concentration difficult to estimate. Machine learning has been utilized in many areas to solve complex problems because it can select and use a large number of factors that affect the prediction of the dependent variable.
Machine learning methods have been used to predict surface ozone concentrations [30][31][32], but most are applied only to the region where the training data are located and are not migrated to other untrained regions [33,34]. In this study, LSTM is applied to estimate the vertical distribution of the tropospheric ozone profile from 100 hPa to the surface. First, the models are trained for different latitudes with satellite radiances of ozone absorption bands, the apparent reflectance, and other pertinent variables related to meteorological conditions; second, the trained models are applied to predict the daily Lower-Stratosphere-to-Troposphere (LST) ozone profile concentrations with a spatial resolution of 25 km × 25 km, with ERA5 reanalysis data (i.e., temperature, water vapor, potential vorticity, and wind) and satellite data as inputs. The remainder of this paper is structured as follows. Section 2 describes the input data of the model and the data used to verify the model. Section 3 introduces the LSTM model in detail, and Section 4 presents the validation and comparison of tropospheric ozone profile estimates against CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) flight data and data from three regions of China. Section 5 concludes this work.

Data
The datasets used in this study include LST ozone data (WOUDC ozonesonde dataset, CARIBIC data, and near-surface ozone data of typical urban agglomerations in China), satellite data from AIRS and OMI, and meteorological data that coincide with the ozone data in time and space.

WOUDC Datasets
Ozonesonde data used in the study were obtained from WOUDC (Figure 1). The ECC and BM ozonesonde types are widely used at present. Stubi et al. demonstrated that there was no significant difference between ECC and BM radiosondes at the 90% confidence level [35]. Logan et al. [36] compared the radiosonde data and the Measurements of Ozone and Water Vapor by In-Service Airbus Aircraft (MOZAIC) data in Frankfurt and Munich from 1999 to 2008, and showed that the average ozone deviation in the lower troposphere (681-580 hPa) was 0.9 ± 2.8 ppb, and the deviation was 1.7 ± 3.8 ppb at 501-430 hPa. In general, the ozone profile data of WOUDC were sufficiently accurate and could be used as the reference for satellite and other observation methods (WOUDC, 2007). In this study, 20 sounding stations were selected across the Northern Hemisphere from 2014 to 2020 (as shown in Table A1).

CARIBIC Flights Data
CARIBIC is a scientific project that studies and monitors the important chemical and physical processes of trace gases and other components in the Earth's atmosphere with a 1 s time resolution. The Northern Hemisphere data from CARIBIC synthesized in 2 minute intervals from January 2014 to December 2020 were chosen in this study. The CARIBIC container is operated monthly on flights from Germany to the Americas, Asia, and Africa. The Southern Hemisphere is probed on only a few flights. Flight data ranging from 2014 to 2019 were collected in different locations within a narrow range of altitudes. Each flight covers a wide range of areas, such as tropical middle tropospheric air or middle and high latitude upper tropospheric air and lower tropospheric air [37]. In the tropics, the plane flies in the free troposphere, whereas in the extratropics, this altitude range corresponds to the tropopause region, and the aircraft frequently encounters stratospheric air masses. The container on the flight includes the equipment for in situ measurements of greenhouse gases (carbon dioxide, nitrogen oxides, and methane) including ozone, water vapor, carbon monoxide, dust particles, and many more. Air sampling is carried out at cruise altitude, and more than 99% of the samples are collected at a typical pressure altitude of 230 ± 60 hPa. Comparisons with a laboratory standard showed that ozone measured with a UV photometer at a time resolution of 4 seconds can achieve a precision of 0.3 ppb and a total uncertainty of ~1.5% [38]. Figure 2 shows the selected CARIBIC flight data from 2014 to 2019.

Near-Surface Ozone Data
China's continuous observation of near-surface ozone concentrations began in 2005 [39]. Hourly near-surface ozone concentrations are reported online in real time by the China National Environmental Monitoring Center (CNEMC, http://www.cnemc.cn, accessed on 1 February 2021), but it is still not possible to obtain the historical ozone data from the network publicly. There may be errors and suspect values in the CNEMC data. Therefore, a quality control test was carried out through a quality assurance program. Near-surface ozone data (1000 hPa is regarded as the surface pressure) in three typical areas of China (Figure 3) that are of great concern to the public, namely the Sichuan Basin (SCB), North China Plain (NCP), and Yangtze River Delta region (YRD), were selected for validating the accuracy and generalizability of the model. Hourly (13:00-14:00) in situ surface ozone observations acquired by CNEMC at monitoring stations in the three areas (139, 240, and 331 uniformly distributed surface ozone monitoring stations in SCB, NCP, and YRD, respectively) were collected from January 2017 to December 2019.

Satellite Data
Satellite data were taken from AIRS and OMI onboard the Aqua and Aura satellites, respectively. The AIRS sounder onboard NASA's Aqua platform provided us with the capability to retrieve daily ozone data over land, ocean, and polar regions during the day and night. This study used the Aqua L2 product AIRS Cloud-Cleared Radiances (CCRs) [40]. The CCRs employed the cloud-clearing method that removed the cloud from an infrared cloudy field of view and derived the cloud-cleared radiances, with a spatial resolution of 50 km. The parameters in the Aqua L2 product are the radiance of seven channels near the absorption band of 9.6 μm, together with geographic information related to the solar azimuth angle, solar zenith angle, satellite zenith, and azimuth angle. OMI is an ozone monitor on the Aura satellite with a spectral range of 0.27-0.5 μm. In this study, the apparent reflectance (ρ) was calculated using 15 channels in the spectral radiance band of 310-340 nm (ozone absorption band) and the average solar spectral irradiance provided by OMI:

$$\rho = \frac{\pi L D^{2}}{ESUN \cdot \cos\theta} \tag{1}$$

where ρ is the apparent reflectance, π is 3.1415, L is the spectral radiance of the satellite sensor entering the top of the atmosphere, D is the distance between the Sun and the Earth, ESUN is the average solar spectral irradiance at the top of the atmosphere, and θ is the solar zenith angle. The spectral radiance and irradiance of OMI were taken from the Aura L1B product, with a spatial resolution of 13 km × 24 km. Both L and ESUN were provided by OMI. According to the temporal and spatial information of ozone data, coincident satellite data were extracted.
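As a minimal sketch of the per-channel reflectance calculation described above, the function below evaluates ρ = πLD²/(ESUN·cos θ), which is the standard top-of-atmosphere form and is assumed here to be the one intended. The function and argument names are hypothetical, D is assumed to be expressed in astronomical units, and L and ESUN are assumed to be given in mutually consistent units.

```python
import math


def apparent_reflectance(radiance, esun, sun_earth_distance_au, solar_zenith_deg):
    """Apparent reflectance rho = pi * L * D^2 / (ESUN * cos(theta)).

    radiance              -- L, spectral radiance at the top of the atmosphere
    esun                  -- ESUN, mean solar spectral irradiance (units consistent with L)
    sun_earth_distance_au -- D, Sun-Earth distance (assumed to be in astronomical units)
    solar_zenith_deg      -- theta, solar zenith angle in degrees
    """
    cos_theta = math.cos(math.radians(solar_zenith_deg))
    if cos_theta <= 0.0:
        # The formula is only meaningful for a sunlit scene (theta < 90 degrees).
        raise ValueError("solar zenith angle must be below 90 degrees")
    return math.pi * radiance * sun_earth_distance_au ** 2 / (esun * cos_theta)
```

In practice the function would be applied to each of the 15 OMI channels in turn, with the radiance and irradiance values read from the L1B product.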

Meteorological Data
The meteorological data were taken from ERA5 reanalysis data. ERA5 is the fifth-generation climate reanalysis dataset of the European Centre for Medium-Range Weather Forecasts (ECMWF) [41,42], with a spatial resolution of 25 km and a temporal resolution of 1 h. Ten meteorological factors on 27 pressure levels between 1000 and 100 hPa, i.e., the divergence (d, unit: s⁻¹), fraction of cloud cover (CC), potential vorticity (PV, unit: K m² kg⁻¹ s⁻¹), relative vorticity (VO, unit: Pa s⁻¹), temperature (T, unit: K), specific humidity (q, unit: kg/kg), vertical velocity (w, unit: Pa s⁻¹), eastward component of wind (U, unit: m s⁻¹), northward component of wind (V, unit: m s⁻¹), and relative humidity (r, unit: %) at a 0.25° × 0.25° resolution were used in this study. In addition, the input data also include the time (year, month, day, and hour) and geographic location information (geo, including latitude and longitude). We extracted the matching ERA5 meteorological data based on the time and space information of the LST ozone data. Table 1 presents detailed information about the selected datasets. Because the datasets used in the study have different spatial and temporal resolutions, all datasets were uniformly resampled to the same spatial grid (0.25° × 0.25°) using bilinear interpolation and to the same time interval. The selected meteorological variables (d, CC, PV, VO, T, q, w, U/V, and r), the radiance of seven channels near the ozone absorption band from AIRS CCRs, the apparent reflectance of 15 channels from OMI, and the time and geographic location information were matched to the daily LST ozone concentrations for each station. All the datasets used were uniformly resampled to the same vertical grid based on the ERA5 pressure levels.
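The bilinear resampling step mentioned above can be illustrated with a small, dependency-free sketch. It interpolates a gridded field to a single target point; the function names and the nested-list grid layout are assumptions made for this example, and a production workflow would more likely rely on an array library.

```python
def _lower_index(axis, value):
    """Index k such that axis[k] <= value <= axis[k + 1] (axis ascending)."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError("target coordinate lies outside the grid")
    for k in range(len(axis) - 2, -1, -1):
        if axis[k] <= value:
            return k
    raise ValueError("axis must contain at least two points")


def bilinear(grid, lats, lons, lat, lon):
    """Bilinearly interpolate grid[i][j], defined at (lats[i], lons[j]), to (lat, lon)."""
    i = _lower_index(lats, lat)
    j = _lower_index(lons, lon)
    # Fractional position of the target inside the enclosing grid cell.
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both bounding latitude rows, then along latitude.
    low_row = grid[i][j] * (1.0 - tx) + grid[i][j + 1] * tx
    high_row = grid[i + 1][j] * (1.0 - tx) + grid[i + 1][j + 1] * tx
    return low_row * (1.0 - ty) + high_row * ty
```

Resampling a coarser product (for example the 50 km AIRS radiances) onto the 0.25° ERA5 grid would amount to calling such a function for every target grid point.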

Dataset Used and Processing
Table 1 (excerpt). Variable and unit: relative humidity (r), %; pressure (P), hPa.

Variable Analysis
LST ozone is affected by many factors. Strong solar radiation and a long duration of sunshine are generally assumed to lead to the photochemical generation of ozone [43]. In addition to these factors, the pressure (P), which is closely related to the atmospheric circulation and synoptic-scale meteorological patterns, is also recognized as a main driving force for the ozone concentration over the Northern Hemisphere [44]. The U/V-component of wind (U/V) is widely used to capture the influence of wind on air pollutants over a certain period of time. The ozone concentration also depends on temperature. As depicted in many studies, a high ozone concentration correlates with high temperature [45]. Relative humidity (RH) affects heterogeneous reactions of ozone and particles [46,47]. Potential vorticity (PV) reflects the stratosphere-troposphere exchange, which is mainly caused by changes in the tropopause. The vertical velocity (W) at different pressure levels can provide information on the ability of low-pressure systems to transport air masses vertically by convection [48]. Relative vorticity (VO) is a measure of the rotation of horizontal air around a vertical axis relative to a fixed point on the Earth's surface.
Following previous research on the factors influencing LST ozone, we used the random forest [49] method to analyze the importance of the factors affecting LST ozone. Recently, the variable importance measures yielded by random forests have also been suggested for the selection of relevant predictor variables in the analysis of microarray data and other applications. The "mean decrease accuracy" method of random forest [50] was applied in this study. The method determines the variable importance by directly measuring the influence of each feature on the prediction accuracy of the model. The basic idea is to add random noise to the values of a certain feature and then observe the degree to which the prediction accuracy is reduced. For unimportant features, this has little influence on the prediction accuracy of the model, but for important features, it greatly reduces the prediction accuracy. The data used were meteorological data, satellite data, latitude, longitude, and time matched with WOUDC LST ozone data from 2014 to 2020. The specific steps are as follows: (1) Determine the benchmark value of the prediction accuracy of the trained model. (2) Add random noise to a variable X (temperature, humidity, and so on) and recalculate the prediction accuracy of the model. If the prediction accuracy is greatly reduced after adding the noise, the variable is of high importance. (3) Repeat step (2) for all variables to calculate the variable importance.
Because ozone differs markedly with latitude and season, in this study we divided the Northern Hemisphere into three regions according to latitude for the feature importance analysis: Northern Hemisphere at low latitude (0-30°N, NL), Northern Hemisphere at middle latitude (30°N-60°N, NM), and Northern Hemisphere at high latitude (60°N-90°N, NH). The variable importance of the input parameters is shown in Figure 4. In the three regions, the meteorological variables collectively account for more than 50% of the overall relative importance. Temperature, specific humidity, relative humidity, divergence, vertical velocity, vorticity, and the U/V-component of wind are the predominant variables. The high importance of meteorology for tropospheric ozone has also been found in several studies [20,51]. Following the meteorological factors, UV and TIR serve as the main factors for predicting the Lower-Stratosphere-to-Troposphere ozone values. Although the variable importance results show that time (year, month, day, and hour) and geographic information (latitude and longitude) are not the most important factors affecting LST ozone concentrations, geographical and seasonal changes are still indispensable factors affecting LST ozone concentrations [52].
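The noise-based importance procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for the trained random forest, accuracy is measured here by the mean squared error, and the noise is Gaussian with a user-chosen standard deviation.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of model predictions against targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def noise_importance(model, rows, targets, noise_std=1.0, repeats=20, seed=0):
    """Importance of each feature = mean increase in MSE after adding noise to it."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)  # benchmark accuracy of the trained model
    scores = []
    for j in range(len(rows[0])):
        increase = 0.0
        for _ in range(repeats):
            # Perturb only feature j; every other feature is left untouched.
            noisy = [r[:j] + [r[j] + rng.gauss(0.0, noise_std)] + r[j + 1:] for r in rows]
            increase += mse(model, noisy, targets) - baseline
        scores.append(increase / repeats)
    return scores


def toy_model(row):
    """Hypothetical trained model: feature 0 matters far more than feature 1."""
    return 3.0 * row[0] + 0.1 * row[1]


data_rng = random.Random(1)
rows = [[data_rng.uniform(-1.0, 1.0), data_rng.uniform(-1.0, 1.0)] for _ in range(200)]
targets = [toy_model(r) for r in rows]
scores = noise_importance(toy_model, rows, targets)
```

Because the toy model weights the first feature thirty times more strongly than the second, perturbing the first feature degrades the predictions far more, so it receives the larger score. Normalizing the scores to sum to one would give relative importances comparable across regions.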
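A minimal sketch of the latitude partition used for the regional models is given below. The function name is hypothetical, and the assignment of the boundary latitudes (30°N to NM, 60°N to NH) is an assumption, since the text does not state which band the boundaries belong to.

```python
def latitude_band(lat_deg):
    """Map a Northern Hemisphere latitude in degrees to 'NL', 'NM', or 'NH'."""
    if not 0.0 <= lat_deg <= 90.0:
        raise ValueError("latitude must lie in the Northern Hemisphere (0-90 degrees N)")
    if lat_deg < 30.0:
        return "NL"  # low latitude, 0-30 N
    if lat_deg < 60.0:
        return "NM"  # middle latitude, 30-60 N (boundary handling is an assumption)
    return "NH"      # high latitude, 60-90 N
```

Each station or flight sample would be routed to the model of its band before training or prediction.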

LSTM Model
LSTM addresses the problems of gradient vanishing and gradient explosion in Recurrent Neural Networks (RNNs), and is an effectively optimized network with the ability to memorize the sequence of data and to deal with sequential pattern recognition problems [53]. The basic unit of a common LSTM is a memory cell composed of three gates: an input gate, an output gate, and a forget gate. An adaptive "forgetting gate" enables the LSTM network to learn automatically and judge whether to store memory information. The cell state carries all the previous state information, and it is adaptively adjusted with the new states by discarding old information or adding information. Figure 5 shows the LSTM neurons, which include the input gate it, forgetting gate ft, unit Ct, output gate Ot, and output response ht. The input gate and forgetting gate control the inflow and outflow of information. The output gate controls the amount of information from the unit to the output ht. Supposing W is the weight vector of a gate and b is the bias value, then the gates can be expressed as Equation (2):

$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right), \qquad f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right), \qquad O_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \tag{2}$$

where σ is the activation function, x_t is the input at time t, h_{t-1} is the output response at the previous time step, and bi, bf, and bo are the bias values of it, ft, and Ot, respectively. Figure 5 is the structure of our model used to predict the LST ozone. In a series of experiments, three layer types were applied to predict LST ozone: the input layer with LSTM, the hidden layers (the first two hidden layers were LSTM layers and the third hidden layer was a dense layer), and the output layer with a dense layer. The input layer of the module accepted three kinds of datasets: meteorological variables, satellite radiances/apparent reflectance parameters, and spatial-temporal information. Because of the different samples in different regions of the Northern Hemisphere, the number of neurons in each layer was set manually, and the best composition was problem-specific. The numbers of the input layer's neurons at NH, NM, and NL were 40, 60, and 40, respectively.
The numbers of neurons in the hidden layers of NH, NM, and NL were 20-10-10, 20-10-10, and 30-15-15, respectively. The output is the LST ozone profile concentrations. In order to improve the training accuracy and speed up the convergence of the module, the z-score standardization method was used to transform the input data to a mean value of 0 and a standard deviation of 1. The z-score function [54] is given as follows:

$$z = \frac{x - \mu}{\sigma} \tag{6}$$

where x is the input data, and μ and σ are the mean value and standard deviation of x, respectively. The model adopted the default activation function Tanh in the input layer. Tanh is a smooth zero-centered function whose range lies from −1 to 1. The hidden layers used the activation function ReLU, which is able to speed up the learning convergence [55]. The activation functions Tanh and ReLU are given in Formulas (7) and (8), respectively:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{7}$$

$$\mathrm{ReLU}(x) = \max(0, x) \tag{8}$$

The learning rate, as an important hyperparameter in deep learning, determines whether the objective function converges to a local minimum and the speed at which it converges. A proper learning rate can make the objective function converge to a local minimum in a reasonable time. In this study, we employed the LearningRateScheduler [56] function, which can automatically adjust the learning rate according to the number of epochs. At the beginning of training, a high learning rate was used to increase the convergence and training speed; we then gradually reduced the learning rate to reduce overfitting and improve the training accuracy. The learning rate was halved every 100 epochs. When training the model, we used the RMSprop [57] optimizer with a batch size of 72 to minimize the cost function. The output layer, a dense layer, produced the LST ozone profiles.
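To make the gate mechanics concrete, the following is a minimal sketch of one time step of a standard LSTM cell in plain Python. It follows the textbook formulation (sigmoid gates acting on the concatenation of the previous output and the current input, and a tanh candidate state), which is assumed here to match the cell used in the model; the names `lstm_step`, `affine`, and the `params` layout are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return W @ vector + b for a row-major weight matrix W."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a standard LSTM cell.

    params maps 'i', 'f', 'o', 'c' to a (W, b) pair; each W acts on [h_prev, x_t].
    Returns the new output response h_t and the new cell state c_t.
    """
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(a) for a in affine(*params["i"], z)]      # input gate
    f_t = [sigmoid(a) for a in affine(*params["f"], z)]      # forget gate
    o_t = [sigmoid(a) for a in affine(*params["o"], z)]      # output gate
    c_hat = [math.tanh(a) for a in affine(*params["c"], z)]  # candidate cell state
    # Keep part of the old state (forget gate) and add part of the candidate (input gate).
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5, so the cell state is simply halved at each step, which gives a quick sanity check of the update rule.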
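The input standardization and the stepwise learning-rate reduction described above can be sketched in a few lines. This is an illustration only: the initial learning rate of 0.001 is a hypothetical value (the text does not state it), the function names are made up for this example, and the population standard deviation is assumed for the z-score.

```python
import math


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mean) / std for v in values]


def step_decay(epoch, *, initial_lr=0.001, drop=0.5, epochs_per_drop=100):
    """Learning rate that is multiplied by `drop` once every `epochs_per_drop` epochs.

    initial_lr is a hypothetical starting value chosen only for illustration.
    """
    return initial_lr * drop ** (epoch // epochs_per_drop)
```

A schedule of this shape, mapping the epoch index to a learning rate, is the kind of function that an epoch-based scheduler callback evaluates at the start of each epoch. In a real pipeline the mean and standard deviation would be computed on the training set only and reused for the validation and test sets.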
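We used the correlation coefficient (R²), root mean square error (RMSE), and mean absolute error (MAE) to evaluate the performance of the model. The implementation of the model was based on Keras, which is a high-level neural network application programming interface written in Python.

$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(y_i^{*} - \bar{y}^{*}\right)\right]^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}\,\sum_{i=1}^{n}\left(y_i^{*} - \bar{y}^{*}\right)^{2}} \tag{9}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_i^{*}\right)^{2}} \tag{10}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - y_i^{*}\right| \tag{11}$$

where y_i is the predicted value, ȳ is the average value of the predicted values, y_i* is the observation value, ȳ* is the average value of the observation values, and n is the number of samples.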
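The three evaluation metrics can be computed with a few lines of plain Python. The sketch below assumes that R² denotes the squared Pearson correlation between predictions and observations, which is consistent with the means of both series appearing in the definitions above; the function names are hypothetical.

```python
import math


def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    mean_p = sum(pred) / len(pred)
    mean_o = sum(obs) / len(obs)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov ** 2 / (var_p * var_o)


def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(pred)
```

Note that a squared correlation can be high even when predictions are systematically biased, which is why it is reported together with RMSE and MAE. The MAE can never exceed the RMSE on the same samples, which is a useful consistency check on reported values.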

Results
In this section, Section 4.1 describes the trained models at NH, NM, and NL, and the trained models are applied to the same data source (WOUDC) at different times. In order to further prove the generalizability of the model, the trained models were applied to different data sources (CARIBIC and CNEMC), as shown in Section 4.2.

Model Training
In this section, the models trained on the LST ozone data from WOUDC are presented. The module was trained, validated, and tested in three different regions using a history of all observed parameters (ERA5, satellite data, and WOUDC ozone profiles) from 2014 to 2020: the data from 2014 to December 2017 were used for training (80% of total data), the data from January 2018 to December 2018 were used for validation (10% of total data), and the data from January 2019 to December 2020 were used as the test set (10% of total data). The WOUDC datasets were divided into three parts because the training sets were used to train the LSTM modules, the validation sets were used to adjust hyperparameters during training, and the testing sets were used to objectively evaluate the performance of the model. Table 2 shows the statistical results of training, validation, and testing for ozone concentrations at all pressures. As shown in Table 2, the number of samples is smaller than the corresponding number of profiles multiplied by 27 because of missing values at some pressures. To analyze the overall performance of the model, we computed the R², root mean square error (RMSE), and mean absolute error (MAE). Figure 6 shows the mean RMSE, R², and relative error (RD) of LST ozone at each pressure level for the test sets from 2019 to 2020 at different latitudes (NH, NM, and NL). RD is defined by Formula (12). The RMSE in NH and NM increased with altitude, reaching a maximum at 100 hPa, whereas the RMSE in NL showed little change with altitude, and the RMSE at each pressure was less than 50 ppb. In NH, the maximum R² over all layers occurred at 250 hPa and was greater than 0.7. The R² values of the layers in NL were mostly in the range of 0.3-0.6. The R² values of the layers in NM were mostly in the range of 0.36-0.85, and the maximum R² occurred at 225 hPa.
The RD values on the test sets were larger at 850-1000 hPa in the three latitude regions. The mean RD over all pressures from 100 to 1000 hPa in NH, NM, and NL was 0.217, 0.23, and 0.278, respectively. Figure 7 shows the mean vertical concentrations of LST ozone on the test sets at eight WOUDC stations since 2019. The predicted LST ozone profiles are consistent with the observations, although the predicted values are generally lower than the observations. Figure 8 shows the RMSE, R², and relative error at different pressures for the WOUDC stations since 2019. The RMSE of these stations in NH and NM increased with altitude, while the RMSE of Hong Kong in NL showed little change at different pressures. It is seen from Figure 8 that the correlations between the prediction results and observations above 400 hPa at most stations in NH and NL regions are greater than those below 400 hPa. This is because the factors influencing tropospheric ozone differ between pressure layers. Above 400 hPa, the LST ozone concentrations may be affected to a large extent by meteorological factors, whereas below 400 hPa they are also affected by photochemical reactions involving precursors, which make ozone changes more complex; the ozone precursors are not included in the training of the model. This is also the reason why the results of the model in the middle and lower troposphere are generally worse than those in the middle and upper troposphere. The R² of Hong Kong in NL changes little at different pressures, with R² = 0.2-0.4. The performance at the Hong Kong station differs from that at the stations in NH and NM; this may be due to the tropospheric ozone in Hong Kong being affected by the increase in photochemical production and the increase in transboundary transport [58].
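The chronological train/validation/test split described in this section can be sketched as below. The record layout (a list of `(date, profile)` pairs) and the function name are assumptions made for illustration; the date boundaries are taken from the text.

```python
from datetime import date

START = date(2014, 1, 1)
TRAIN_END = date(2017, 12, 31)   # training: 2014 to December 2017
VALID_END = date(2018, 12, 31)   # validation: January to December 2018
TEST_END = date(2020, 12, 31)    # test: January 2019 to December 2020


def split_by_date(records):
    """Split (date, profile) records chronologically; out-of-range records are dropped."""
    parts = {"train": [], "validation": [], "test": []}
    for day, profile in records:
        if day < START or day > TEST_END:
            continue
        if day <= TRAIN_END:
            parts["train"].append((day, profile))
        elif day <= VALID_END:
            parts["validation"].append((day, profile))
        else:
            parts["test"].append((day, profile))
    return parts
```

Splitting by date rather than at random keeps all test profiles strictly later than the training profiles, so the test score reflects performance on a period the model has not seen.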

Model Evaluation
In order to improve the model's accuracy, we migrated the trained model to different regions and verified it with ozone data from different data sources. Different data sources have invisible characteristic information such as region and special climate. The larger the information gap between data sources, the greater the difference of these invisible characteristics, and the greater the difference of ozone distribution. In this case, applying models that were pretrained on other data sources may lead to the inapplicability of feature information. Therefore, the model needed to be migrated in order to learn some implicit features of the new data. In this section, we present the fine-tuning of the LSTM models presented in Section 4.1.

Applied to CARIBIC
The CARIBIC flight data, with a typical pressure of 230 ± 60 hPa, were matched to the ERA5 pressure levels. Most of the matched data fell at 200, 225, 250, and 300 hPa. The selected CARIBIC flight data in NH, NM, and NL, spanning January 2014 to February 2019, were divided into a pretraining part (2014-2017) and a fine-tuning part (January 2018-February 2019). Table A2 shows the prediction performance of the migration models on CARIBIC flight data from January 2018 to February 2019 with different frozen hidden layers [59] in NH, NM, and NL. Freezing a layer means excluding it from the training process. Transfer training starts from the weight parameters of the models trained in Section 4.1 and keeps the weights of the frozen layers unchanged while the migration model is trained. We compared the predictive performance of the model with different sets of frozen hidden layers ((0), (1), (1,2), and (1,2,3)). The transfer models of NH and NM with frozen hidden layers (1,2) achieved a higher R² (0.774 and 0.443, respectively) and a lower RMSE (77.410 and 109.334 ppb, respectively) and MAE (92.978 and 72.932 ppb, respectively). The performance of the NL transfer model changed little across the frozen-layer settings, with R² = 0.359, RMSE = 17.972 ppb, and MAE = 17.061 ppb for frozen hidden layers (1). The R², RMSE, and MAE in Table A2 were calculated using the ozone concentrations from all pressures in the pretraining part. Figure 9 compares LSTM-derived ozone with aircraft measurements for the CARIBIC flight data from January 2018 to February 2019 at pressures of 200-300 hPa. N in Figure 9 is the number of samples used to evaluate the tropospheric ozone estimation performance.
The R², RMSE, and MAE in Figure 9 were calculated using the LST ozone profiles at each pressure in NH, NM, and NL of the fine-tuning part. Overall, the LST ozone derived from the migration model agrees well with the CARIBIC measurements; the model gives good results in the Northern Hemisphere, with a high R² = 0.754. The performance of the models in NH and NM was better overall (e.g., R² ≈ 0.770) than in NL (R² = 0.359). The factors affecting tropospheric ozone in NL are complex; the weaker performance there may be due to the redistribution of ozone caused by the thermal and dynamic forcing of the atmospheric circulation in NL.
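The layer-freezing idea used for the migration models can be sketched in a framework-free way. The snippet below is only an illustration under stated assumptions: the model is reduced to a dictionary of per-layer weight lists, the gradients are placeholder values, and the setting "(0)" is assumed to mean that no hidden layer is frozen. The names `fine_tune_step` and `FROZEN_CONFIGS` are hypothetical and do not come from the study.

```python
# Frozen-layer settings compared in the text; "(0)" is assumed here to mean
# that no hidden layer is frozen.
FROZEN_CONFIGS = [(), (1,), (1, 2), (1, 2, 3)]


def fine_tune_step(weights, grads, frozen, lr=0.01):
    """Apply one gradient-descent update, skipping the frozen layers.

    weights, grads: {layer_index: [float, ...]} with matching shapes.
    frozen: collection of layer indices whose weights must stay unchanged.
    """
    updated = {}
    for layer, layer_weights in weights.items():
        if layer in frozen:
            # Frozen layer: keep the pretrained weights as they are.
            updated[layer] = list(layer_weights)
        else:
            # Trainable layer: ordinary gradient-descent update.
            updated[layer] = [w - lr * g
                              for w, g in zip(layer_weights, grads[layer])]
    return updated


# Placeholder "pretrained" weights and gradients for three hidden layers.
pretrained = {1: [0.5, -0.2], 2: [0.1, 0.3], 3: [0.7, 0.0]}
grads = {1: [1.0, 1.0], 2: [1.0, 1.0], 3: [1.0, 1.0]}

for frozen in FROZEN_CONFIGS:
    tuned = fine_tune_step(pretrained, grads, frozen)
    trainable = [k for k in tuned if tuned[k] != pretrained[k]]
    print("frozen:", frozen, "updated layers:", trainable)
```

In a deep learning framework the same effect is usually obtained by marking the parameters of the chosen layers as non-trainable before fine-tuning starts; the sketch only makes explicit which weights move and which stay fixed.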

Applied to CNEMC
This part evaluates the predictive ability of the trained models in three urban agglomerations of China. For this purpose, the predicted surface ozone in three typical areas (Figure 3) was validated against data provided by CNEMC. Hourly (13:00-14:00) in situ surface ozone observations at monitoring stations in the three areas from January 2017 to December 2019 were collected and averaged to obtain daily mean ozone measurements. The matched CNEMC data for the three urban agglomerations were divided into a pretraining part (2017) and a fine-tuning part (2018-2019). The transfer model for SCB was based on the NL model and performed better with frozen hidden layers (1,2); the best results are shown in Figure 10a. The transfer model for NCP was based on the NM model and performed better with frozen hidden layers (1,2); the best results are shown in Figure 10b. The transfer model for YRD was based on the NL model and performed better with frozen hidden layers (1,2,3); the best results are shown in Figure 10c. Figure 10 shows the predicted and observed daily surface ozone distributions in SCB, NCP, and YRD. There were 86,594; 124,863; and 216,120 daily samples collected from surface ozone monitoring stations in SCB, NCP, and YRD, respectively. From 2018 to 2019, the daily estimated ozone concentrations in SCB, NCP, and YRD were consistent with the surface measurements (R² = 0.652−0.737), with overall estimation uncertainties of RMSE = 14.543−16.916 ppb and MAE = 11.130−12.687 ppb. The performance of the LSTM differed slightly between 2018 and 2019 in the three urban agglomerations. As shown in Table 3, the lowest R² value was in SCB, which might be attributable to meteorological factors.
The variation of surface ozone concentration in SCB was strongly affected by the high annual temperature, the seasonal cycle, low wind speeds with mostly calm conditions, short sunshine duration, and a pronounced seasonal heat island effect; meteorological conditions such as high temperature, low humidity, low wind speed, and long sunshine duration [33] have a greater combined effect on high ozone concentrations. We also found outliers with high surface ozone concentrations that were underestimated by the model. The underestimation largely depended on the number, geographical distribution, and sampling frequency of the training samples, which did not cover mainland China (apart from the Hong Kong and Taiwan sites), and on the sparsity of training samples near the surface. Building on the daily-scale predictions of near-surface ozone in the three urban agglomerations for 2018-2019 presented above, we carried out a statistical comparison at the monthly scale. For the monthly scale, only stations with >15% valid daily surface ozone measurements in a month were used in the calculations. Figure 11 shows the cross-validation results for monthly surface ozone estimates from 2018 to 2019 in China. Only sites with at least eight valid monthly averages during 2018-2019 were counted. Figure 11 shows that the predicted surface ozone values were highly correlated with the observations. In NCP, R², RMSE, and MAE were 0.889, 8.729 ppb, and 6.930 ppb, respectively. The monthly ozone estimates were also better than the daily estimates in the SCB and YRD regions (R² = 0.815, MAE = 6.111 ppb, and RMSE = 7.769 ppb in SCB; R² = 0.831, MAE = 6.276 ppb, and RMSE = 8.177 ppb in YRD). Overall, despite some differences among the three clusters, the LSTM model showed good prediction accuracy for monthly mean surface ozone concentrations at the regional scale.
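The monthly screening described above (more than 15% valid days in a month, and at least eight valid months per site) can be sketched as follows. This is a minimal illustration, assuming daily values are stored per site in a dictionary keyed by date with `None` marking missing days; the function names `monthly_means` and `has_enough_months` are hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import date


def monthly_means(daily, min_valid_fraction=0.15):
    """Average daily ozone values (ppb) into monthly means for one site.

    daily: {datetime.date: float or None}, where None marks a missing day.
    A month is kept only if its fraction of valid days exceeds
    min_valid_fraction (the >15% rule quoted in the text).
    """
    by_month = defaultdict(list)
    for day, value in daily.items():
        if value is not None:
            by_month[(day.year, day.month)].append(value)

    result = {}
    for (year, month), values in by_month.items():
        days_in_month = calendar.monthrange(year, month)[1]
        if len(values) / days_in_month > min_valid_fraction:
            result[(year, month)] = sum(values) / len(values)
    return result


def has_enough_months(monthly, min_months=8):
    """Return True if the site has at least min_months valid monthly means."""
    return len(monthly) >= min_months
```

For example, a month with 5 valid days out of 31 (about 16%) is kept, whereas a February with 4 valid days out of 28 (about 14%) is dropped.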
Figure 12 shows the annual spatial distributions of surface ozone across SCB for 2018 and 2019. The model in SCB performs well in areas with low ozone values but underestimates areas with high ozone concentrations. In Figure 12, the surface ozone concentrations in eastern Sichuan and western Chongqing are higher than in other parts of SCB. The annual mean surface ozone concentrations in north-central Hebei, most of Shandong, and southern Shanxi were clearly higher in 2019 than in 2018, which is consistent with the trend of the ground-based observations in NCP (Figure 13). The model in YRD performs well in areas with low ozone values, such as Jiangxi Province and southern Zhejiang Province (Figure 14). The surface ozone concentrations in northern YRD were also higher in 2019 than in 2018. However, the model performs poorly in the south of Jiangsu Province, and the ozone concentrations in northern YRD are higher than in other parts of YRD.

Discussion
In this study, we considered the meteorological and radiance factors that affect LST ozone. Because LST ozone is also produced by photochemical reactions, it is affected by NO2, HCHO, CHOCHO, and other precursors as well. At present, however, there is almost no profile information for ozone precursors, so the influence of gaseous precursors was not considered in this study. The models trained in this study were applied to three typical urban agglomerations in China to predict surface ozone concentrations, and the predictions were generally underestimated. This may be because the ozone at 1000 hPa at the WOUDC stations is lower than that at 1000 hPa over the Chinese regions, and because LST ozone is not fully representative of surface ozone. Figures 12-14 show the spatial distributions obtained from the migration models applied in SCB, NCP, and YRD for 2018-2019. Based on the model inputs, we analyzed the factors influencing high ozone values. Figures 15-17 show the input parameters that are highly correlated with ozone. The surface ozone concentrations in eastern Sichuan and western Chongqing are higher than in other parts of SCB, which may be caused by high temperature and low humidity [60]. Figure 15 shows the correlation of surface ozone concentrations with temperature and RH in eastern Sichuan and western Chongqing from 2018 to 2019. Ozone concentrations are positively correlated with temperature and negatively correlated with relative humidity. Moreover, the absolute values of the correlations of surface ozone with temperature and RH are higher in 2019 than in 2018. This also indicates that high temperature and low humidity are factors affecting ozone concentrations. The ozone concentrations in southern NCP are higher than in other parts of NCP. Figure 16 shows the correlation of surface ozone concentrations with temperature and u in southern NCP (36°-38°N, 114°-118°E) from 2018 to 2019.
Ozone concentrations are positively correlated with temperature and negatively correlated with u. The ozone concentrations in the northern part of YRD are higher than in other parts of YRD. Figure 17 shows the correlation of surface ozone concentrations with CC and RH in the northern part of YRD (32°-34°N, 116°-119°E) from 2018 to 2019. Ozone concentrations are negatively correlated with CC and RH. Cloud cover and low humidity are also the main factors affecting ozone concentrations in YRD.
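The sign-of-correlation analysis in this section can be reproduced with a plain Pearson correlation. The sketch below assumes that this is the correlation measure behind Figures 15-17, which the text does not state explicitly; the function name `pearson_r` is hypothetical, and the numbers in the usage lines are invented placeholders rather than data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Invented placeholder series (not measurements from the study).
ozone_ppb = [40.0, 55.0, 62.0, 70.0, 85.0]
temperature_c = [18.0, 24.0, 27.0, 30.0, 35.0]
relative_humidity = [80.0, 70.0, 66.0, 60.0, 45.0]

print("ozone vs temperature:", pearson_r(ozone_ppb, temperature_c))
print("ozone vs RH:", pearson_r(ozone_ppb, relative_humidity))
```

A positive coefficient for temperature and a negative one for relative humidity would match the pattern described for eastern Sichuan and western Chongqing.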

Conclusions
With the increase in atmospheric ozone pollution in recent years, a large number of studies on estimating tropospheric ozone have been conducted. Traditional methods face great challenges in estimating tropospheric ozone because the retrieval is influenced by numerous factors and therefore carries large uncertainties. To tackle these challenges, we proposed LSTM models to estimate lower-stratosphere-to-troposphere ozone concentrations from 100 hPa to the surface. In this study, three models were trained to estimate the ozone concentrations in the Northern Hemisphere. The trained models in NH, NM, and NL perform well, giving high R² values of 0.978, 0.905, and 0.695, respectively. The model performance was evaluated using WOUDC data from 2019 to 2020 at different latitudes, CARIBIC flight data from 2018 to 2019, and in situ surface ozone observations at monitoring stations in three typical areas of China (SCB, NCP, and YRD) from January 2018 to December 2019.
When the models were applied to the WOUDC validation sets for 2019-2020, the results in NH and NM were good (R² = 0.928 and 0.885, respectively), while the R² = 0.590 in NL was lower than in NH and NM. However, the RMSE of NL was smaller than in the other regions, probably because tropospheric ozone in the NL region has a small range. The LSTM models were also applied to the CARIBIC flight data, with a high precision of R² = 0.881 and RMSE = 52.402 ppb. Finally, the models were applied to estimate the tropospheric ozone of the three typical urban agglomerations of SCB, NCP, and YRD from 2018 to 2019. Our results suggest that the LSTM models gave a good estimate of the monthly surface ozone concentrations in all three clusters, with a relatively high R² of 0.815−0.889, RMSE of 7.769−8.729 ppb, and MAE of 6.111−6.930 ppb. The daily-scale performance was lower than the monthly-scale performance, with R² = 0.636−0.737, RMSE = 14.543−16.916 ppb, and MAE = 11.130−12.687 ppb. The ozone concentrations in SCB might be affected by the high annual temperature and relative humidity. Overall, our models underestimated the surface ozone concentrations compared to the observations. This underestimation largely depends on the number of training samples and the sampling frequency. The distribution of the retrieved lower-stratosphere-to-troposphere ozone concentrations can support the study of ozone transport and pollution in small- and medium-scale regions, which is of great significance for the study of long-term ozone variation and its causes.
Acknowledgments: We thank the China National Environmental Monitoring Center for the in situ ground-based ozone measurements used in this study, NASA for the OMI and AIRS Level 1 data (https://disc.gsfc.nasa.gov/datasets, accessed on 1 October 2020), ECMWF for the ERA5 reanalysis data, and the providers of the WOUDC data (https://woudc.org/data/, accessed on 1 October 2020) and the CARIBIC flight data (https://www.ecmwf.int/, accessed on 20 October 2020).

Conflicts of Interest:
The authors declare no conflict of interest.