A Deep-Neural-Network-Based Aerosol Optical Depth (AOD) Retrieval from Landsat-8 Top of Atmosphere Data

: The 30 m resolution Landsat data have been used for high resolution aerosol optical depth (AOD) retrieval based on radiative transfer models. In this paper, a Landsat-8 AOD retrieval algorithm is proposed based on the deep neural network (DNN). A total of 6390 samples were obtained for model training and validation by collocating 8 years of Landsat-8 top of atmosphere (TOA) data and aerosol robotic network (AERONET) AOD data acquired from 329 AERONET stations over 30 ◦ W–160 ◦ E and 60 ◦ N–60 ◦ S. The Google Earth Engine (GEE) cloud-computing platform is used for the collocation to avoid a large download volume of Landsat data. Seventeen predictor variables were used to estimate AOD at 500 nm, including the seven bands TOA reﬂectance, two bands TOA brightness (BT), solar and viewing zenith and azimuth angles, scattering angle, digital elevation model (DEM), and the meteorological reanalysis total columnar water vapor and ozone concentration. The leave-one-station-out cross-validation showed that the estimated AOD agreed well with AERONET AOD with a correlation coefﬁcient of 0.83, root-mean-square error of 0.15, and approximately 61% AOD retrievals within 0.05 + 20% of the AERONET AOD. Theoretical comparisons with the physical-based methods and the adaptation of the developed DNN method to Sentinel-2 TOA data with a different spectral band conﬁguration are discussed.


Introduction
Aerosols play an important role in the Earth's radiation balance and are considered to be one of the largest sources of uncertainty in global radiative forcing estimation [1]. Aerosol optical depth (AOD) is one of the most important parameters for aerosol research. Over the past several decades, numerous satellite/sensor observations have been used to obtain spatially explicit and frequent AOD, such as MODerate-resolution Imaging Spectroradiometer (MODIS) [2][3][4], Multi-angle Imaging SpectroRadiometer (MISR) [5], Advanced Along Track Scanning Radiometer (AATSR) [6], MEdium Resolution Imaging Spectrometer (MERIS) [7] and Himawari-8 Advanced Himawari Imager (AHI) [8]. The observed top of atmosphere (TOA) signal is often a result of the interactions between the reflection of the Earth surface and the scattering and absorption of the pristine atmosphere and aerosol. The interactions are complicated, but they can be simulated using radiative transfer models. Since both aerosol and surface optical properties are unknown, in the classical AOD retrieval algorithms, some a priori assumptions are made on surface reflectance. The dark target (DT) [9,10] method assumes fixed ratios between the visible and 2.1 µm shortwave infrared (SWIR) band surface reflectance over dense dark vegetation.  10 10. 60-11.19 Thermal Infrared Sensor (TIRS) 1 100 11 11.50-12.51 TIRS 2 100 In 2017, USGS started to use 'Collection' to manage the data, software and algorithm updates, and reprocessed all the Landsat archive, including Landsat-8 data into Collection-1 format [28]. The Collection-1 data include the TOA reflectance and BT bands, a quality assessment band indicating for each 30 m the presence of cloud, cloud shadow, water and snow/ice, and scene level metadata, including acquisition time. In December 2020, the USGS reprocessed all the Landsat archive into Collection-2 format. In addition to including the Collection-1 data content, the Collection-2 also includes the atmospherically corrected reflectance (i.e., surface surface), per-pixel solar and viewing geometries (zeniths and azimuths) and the quality assessment band with improved accuracy. The Collection-2 data also has a better geolocation accuracy [43]. In recognition of the wide application of Landsat data, the Google Earth Engine (GEE) also holds a copy of the whole archive of the Collection-1 data and at the time of writing this paper, the Collection-2 data was not yet copied over. The GEE is an online cloud-computing platform that holds massive satellite images, including Landsat, Sentinel, and MODIS data.
In this study, the Landsat-8 data acquired over the AERONET stations between 30 • W-160 • E and 60 • N-60 • S during 2013-2020 were used. The data includes the Collection-Remote Sens. 2022, 14, 1411 4 of 16 1 TOA reflectance from bands 1-7 and TOA BT from the two TIRS bands and the Collection-2 per-pixel quality assessment band, and solar and viewing geometry and scene metadata ( Table 1). The panchromatic band is not used as its information is contained in its spectrally overlapping blue, green, and red bands. The cirrus band is not used as it is designed to detect the cirrus cloud and has little information about land surface and aerosols. The TOA reflectance and BT data used the Collection-1 data available on the GEE to avoid the massive volume data download. The corresponding Collection-2 sun-sensor geometry data as well as the quality assessment (QA) band data were downloaded using the USGS Machine-to-Machine (M2M) Application Programing Interface (API) as these data have much smaller volume (<20 MB/scene) and they (Collection-2 data) are not yet available on GEE platform at the time of the paper's writing.

Auxiliary Data: Atmospheric Reanalysis and Digital Elevation Data
The effect of gas absorption must be considered in aerosol estimation as the absorption by water vapor and columnar ozone affects the visible band TOA reflectance. The total columnar water vapor (kg m −2 ) and total columnar ozone (kg m −2 ) from the ERA5 hourly data were used. The ERA5 (ECMWF Reanalysis v5) is the fifth generation European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis of the global climate. ERA5 provides estimates of atmosphere variables for each hour with a spatial resolution of 0.25 • × 0.25 • (~27 km at Equator). In addition, digital elevation is related to the scattering and absorption of sunlight by atmospheric molecules. The digital elevation data from the Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) with a 90 m spatial resolution were used in this study.

Training and Validation Sample Collection by Collocating Landsat-8 and AERONET Observations
A total of 6390 training and validation samples were extracted from the collocated Landsat-8 TOA and the AERONET observations. Spatially, the observations from the Landsat-8 30 m pixel closest to the AERONET station were used. Temporally, the AERONET 500 nm AOD measurements around the Landsat-8 acquisition time (±30 min) were averaged as the AOD value. At least 3 AERONET observations were required to obtain a robust average value, and only AERONET data with a positive 500 nm AOD were used. In addition, Landsat-8 observations identified as cloud, cloud shadow, cirrus, dilated cloud or fill data were discarded based on the QA band of Landsat data. In addition, observations with blue band TOA reflectance >0.4 were discarded to further exclude cloudy pixels that may be missed by the Landsat-8 cloud mask. The AERONET measurements and Landsat-8 observations between 30 • W-160 • E and 60 • N-60 • S during 2013-2020 were used.
Each sample contains a temporally averaged AERONET AOD and seventeen predictor variables derived from the Landsat observations and auxiliary data. The seventeen predictor variables include the seven band (bands 1-7) OLI TOA reflectance, the two band TIRS TOA BT, the solar and viewing zenith, the solar and viewing azimuth, the scattering angle, SRTM DEM, and ERA5 hourly total columnar water vapor and total columnar ozone. The Collection-1 TOA reflectance and BT without any QA layer filtering were first collocated with the AERONET AOD using the GEE platform to avoid a large volume Landsat image download. The collocated results were saved in a two-dimensional table with rows storing collocated samples and columns storing the AEROENT AOD, latitude, longitude, Landsat-8 acquisition time, TOA reflectance, and TOA BT and the Landsat-8 Scene ID. The Collection-2 Landsat-8 solar and viewing geometry and QA bands for the Landsat-8 scenes in the table were then downloaded and their pixel values added to the  table columns. Finally, the atmospheric reanalysis and digital elevation data were added to  the table columns

Deep Neural Network AOD Retrieval and Validation
The deep neural network (DNN) was constructed by advancing the three layer (one input, one hidden and one output layer) artificial neural network (ANN) with multiple hidden layers [44]. In this study, the 17 predictors were used as the input layer and the 500 nm AOD (one single value) as the output layer. It should be noted that the different DNN structures have small performance differences given sufficient training samples, despite that optimal DNN structure being usually unknown in deep learning applications [45]. This study used the same structure we developed for Himawari-8 AHI AOD retrieval [35]. Specifically, three hidden layers were used with 256, 512 and 512 neurons. Other deeper structures (with 6 layers and 8 layers) were tested with little improvement in performance. The state-of-the-art practices for DNN training were used. The mini-batch gradient descent search method was used [46] with a batch size of 256 and 200 epochs to ensure a stable and robust solution. The weight and bias were carefully initialized following He et al. [47]. To be specific, the bias was initialized as 0, while the weights were initialized with random values drawn from a normal distribution with 0 mean and with standard deviation determined by the neuron number of the previous layer. Batch normalization [48] was used as regularization to avoid over-fitting. A dynamic learning rate was used where the training starts with a large learning rate for fast coverage and then deceases the learning rate to avoid loss function oscillation around the minimum when the loss approaches the minimum. The learning rate was initialized as 0.1 and decreased to 0.01, 0.001 and 0.0001 at the 80, 120 and 160 epochs, respectively [49]. The DNN is implemented using the Google TensorFlow application programing interface (API) [50].
The DNN retrieved AOD was compared with the reference AERONET AOD at 500 nm using the correlation coefficient (R), root-mean-square error (RMSE), mean absolute error (MAE), linear regression slopes, and fraction within the uncertainty of ± (0.05 + 0.20 *AOD AERONET ) [12]. Two cross-validation methods were used following She et al. [35]: (i) k-fold cross-validation and (ii) leave-one-station-out cross-validation. Cross-validation is achieved by training and applying DNN multiple times, each with different training and validation samples. Consequently, each sample is used as a validation sample so that a total of 6390 samples are validated. The 6390 samples are randomly partitioned into k equally sized subsamples. A single subsample is used to validate the model trained by the other k − 1 subsamples. The training and validation process is repeated k times so that every single sample is used as an independent validation sample [51]. The k-fold cross-validation has been used in the validation of PM 2.5 estimation using machine learning methods [52,53]. However, in the k-fold cross-validation, the training and validation samples may come from the same AERONET station for each DNN validation, which could overestimate the accuracy for machine learning models [35]. In this study, k was set at 329 so that the numbers of training samples in (i) and (ii) are comparable.
In the (ii) leave-one-station-out cross-validation, the 6390 samples were partitioned into 329 subsamples (there are 329 AERONET stations) and all samples in each subsample were from the same unique AERONET station. A single subsample was used to validate the model trained by the other 328 subsamples. The training and validation process was repeated 329 times so that every single sample was used as an independent testing sample and the validation samples came from different locations than the training samples. This corresponds to the real situation where the AERONET stations are distributed sparsely over the globe and the DNN model application locations usually does not have AERONET observations. This has been used in the validation of PM 2.5 estimation using machine learning methods [54,55].

Descriptive Statistics of the Collocted Training and Validation Samples
The mean, standard deviation, maximum, and minimum values of the AOD and the 17 predictor variables in the 6390 data records are presented in Table 2. The AOD ranges from 0.003 (occurred in station Albergue_UGR) to 2.735 (occurred in station Chi-ang_Mai_Met_Sta) with a mean value of 0.234, which is comparable to the global multi-year mean MODIS AOD value of 0.19 reported in Remer et al. [56]. As expected, the four visible bands (bands 1-4) have very similar TOA reflectance (0.143-0.164). The mean NIR band TOA reflectance (0.258) is higher than the visible band because of vegetation presence in the AERONET stations. The two mean SWIR bands reflectance (0.241 and 0.178) is lower than NIR band due to the water absorption in soil and vegetation. The BT covers a range of temperatures for the two thermal bands, e.g., the thermal band 10 BT ranges from 262.0 K (−11.1 • C) to 327.7 K (54.6 • C) with a mean value of 298.7 K (25.5 • C). The mean viewing zenith angle is 3.6 • , which is expected considering that the viewing zenith angle ranges from 0 • to~8 • for Landsat-8 image acquired in nadir operations [57]. The 17.48 • maximum viewing angle was observed for an off-nadir pointing scene that was acquired on 13 May 2013 for WRS2 path 130 row 30. The minimal solar zenith angle is 20.2 • , which is slightly higher than the global minimal Landsat-5 and 7 solar zenith angle 22.14 • [58] because of its~30 min later overpass time.   Figure 2 shows the average AOD for the collocated samples for each AERONET station. The high AOD value stations are located mainly in Asian countries due to heavy pollution (e.g., the highest mean AOD value is 1. 44           urban in the MODIS land cover product. There are about a fourth of samples (1682) from the identified 89 urban AERONET stations. There is little difference between the accuracies for the urban samples ( Figure 3c) and for all samples (Figure 3b). station-out cross-validation. This is because the k-fold cross-validation may boost the validation accuracy as the validation and training samples could come from the same AER-ONET station. The trained model application to locations without any training data is actually unknown for the k-fold cross-validation and the model is usually less representative to non-AERONET station locations due to the different surface reflectance characteristics and aerosol optical properties. The leave-one-station-out cross-validation has training and validation samples that are forced to come from different stations and thus reflects the accuracy for large-area applications where most of the pixel locations have no AERO-NET observations for DNN training. For the following analysis, only the results for the leave-one-station-out cross-validation are shown.  The leave-one-station-out cross-validation strategy is recommended for use despite the fact that the accuracy of k-fold cross-validation is better than that of the leave-one-stationout cross-validation. This is because the k-fold cross-validation may boost the validation accuracy as the validation and training samples could come from the same AERONET station. The trained model application to locations without any training data is actually unknown for the k-fold cross-validation and the model is usually less representative to non-AERONET station locations due to the different surface reflectance characteristics and aerosol optical properties. The leave-one-station-out cross-validation has training and validation samples that are forced to come from different stations and thus reflects the accuracy for large-area applications where most of the pixel locations have no AERONET observations for DNN training. For the following analysis, only the results for the leaveone-station-out cross-validation are shown. Figure 4 shows the RMSE map for the DNN leave-one-station-out AOD validation with RMSE derived for each AERONET station. There are 170 out of 329 stations with RMSE smaller than 0.

Discussion
In this study, the AERONET AOD measurements observed within a relatively large temporal window (±30 min) centered at the Landsat-8 acquisition time were averaged as the AOD training data (Section 3.1). To examine whether a smaller temporal window has

Discussion
In this study, the AERONET AOD measurements observed within a relatively large temporal window (±30 min) centered at the Landsat-8 acquisition time were averaged as the AOD training data (Section 3.1). To examine whether a smaller temporal window has a better accuracy, we collocated samples using ±5 min and ±15 min temporal windows. To be specific, the AERONET AOD within ±5 min and ±15 min of the Landsat-8 acquisition time were averaged as the AOD and their leave-one-station-out cross-validation results are shown in Figure 6a,b, respectively. As expected, the smaller temporal window generated a smaller number of collocated samples. There is little accuracy difference between the ±15 min ( Figure 6b) and ±30 min (Figure 3b) temporal collocation window results. The validation accuracy of the ±5 min window results is slightly inferior to the validation accuracy of the ±15 min and ±30 min window results. This may be because of the smaller number of collocated samples (4608), resulting in insufficient DNN training. Consequently, a relatively large (±30 min) window is used in this study to gain more training samples. Such large temporal window has been applied in AOD validation studies [37,59].

Discussion
In this study, the AERONET AOD measurements observed within a relatively large temporal window (±30 min) centered at the Landsat-8 acquisition time were averaged as the AOD training data (Section 3.1). To examine whether a smaller temporal window has a better accuracy, we collocated samples using ±5 min and ±15 min temporal windows. To be specific, the AERONET AOD within ±5 min and ±15 min of the Landsat-8 acquisition time were averaged as the AOD and their leave-one-station-out cross-validation results are shown in Figure 6a,b, respectively. As expected, the smaller temporal window generated a smaller number of collocated samples. There is little accuracy difference between the ±15 min ( Figure 6b) and ±30 min (Figure 3b) temporal collocation window results. The validation accuracy of the ±5 min window results is slightly inferior to the validation accuracy of the ±15 min and ±30 min window results. This may be because of the smaller number of collocated samples (4608), resulting in insufficient DNN training. Consequently, a relatively large (±30 min) window is used in this study to gain more training samples. Such large temporal window has been applied in AOD validation studies [37,59]. The DNN method is not to replace the existing radiative transfer methods, but to complement them. As noted in the introduction, the DNN methods try to approximate The DNN method is not to replace the existing radiative transfer methods, but to complement them. As noted in the introduction, the DNN methods try to approximate the complicated physical relationship between AOD and satellite TOA observations without any a priori assumption on aerosol model and surface reflectance. Despite that the current DNN models have very powerful capability to represent complicated relationships, one main and well-known drawback is that the learned model only represents well the physical relationship in the training data. For example, the learned model in this study is not applicable for ocean surface AOD retrieval, which may need a new DNN model trained (calibrated) using the ocean ground AOD measurement data [60]. The training data representativeness is the key for the successful machine learning application. For example, in Figure 4, there are a few AERONET stations with high RMSE in the south of the Sahara Desert, which is, likely, because this region aerosol model and surface condition are not well represented in the training data, i.e., most AERONET stations are not located in the desert. The dust aerosols are distributed in global desert regions [61,62] with relative less human settlement and thus AERONET stations. To further examine this issue, Figure 7 shows the RMSE of the DNN estimated AOD as a function of Ångström exponent that is derived from the AERONET spectral AOD. The Ångström exponent describes the AOD spectral dependence and is usually related to the aerosol models. In general, a high Ångström exponent value is related to fine particles (e.g., industry air pollution aerosols) and a low Ångström exponent value is related to coarse particles (e.g., dust and sea salt aerosols). The AERONET AOD with a small Ångström exponent tends to have a large bias and there is a general decrease trend of RMSE with an increase in Ångström exponent. This is possibly because the AERONET stations in arid and semi-arid regions with dust aerosols dominate the small Ångström exponent AODs and the bright surface reflectance is relatively sparse in the AERONET training dataset. This could also be simply due to the large AOD values in the region, i.e., the AOD magnitude also decreases with the increase in Ångström exponent, which reflects that the coarse aerosols normally have larger AOD values than fine aerosols. Decoupling the two contribution factors is very difficult, if not impossible, and future studies on this are encouraged to understand the DNN algorithm performance.
in the desert. The dust aerosols are distributed in global desert regions [61,62] with relative less human settlement and thus AERONET stations. To further examine this issue, Figure  7 shows the RMSE of the DNN estimated AOD as a function of Ångström exponent that is derived from the AERONET spectral AOD. The Ångström exponent describes the AOD spectral dependence and is usually related to the aerosol models. In general, a high Ångström exponent value is related to fine particles (e.g., industry air pollution aerosols) and a low Ångström exponent value is related to coarse particles (e.g., dust and sea salt aerosols). The AERONET AOD with a small Ångström exponent tends to have a large bias and there is a general decrease trend of RMSE with an increase in Ångström exponent. This is possibly because the AERONET stations in arid and semi-arid regions with dust aerosols dominate the small Ångström exponent AODs and the bright surface reflectance is relatively sparse in the AERONET training dataset. This could also be simply due to the large AOD values in the region, i.e., the AOD magnitude also decreases with the increase in Ångström exponent, which reflects that the coarse aerosols normally have larger AOD values than fine aerosols. Decoupling the two contribution factors is very difficult, if not impossible, and future studies on this are encouraged to understand the DNN algorithm performance. The developed DNN model for 500 nm AOD estimation can neither provide the physical explanations of the underlying learned relationship nor estimate other aerosolrelevant variables. In particular, most physical based methods can estimate AOD at any wavelength using some physical dependence principle of the aerosol model assumptions. For the DNN-based method, however, as no model assumption is given, additional DNN models need to be trained to estimate AOD at other wavelengths. To test the effectiveness, The developed DNN model for 500 nm AOD estimation can neither provide the physical explanations of the underlying learned relationship nor estimate other aerosolrelevant variables. In particular, most physical based methods can estimate AOD at any wavelength using some physical dependence principle of the aerosol model assumptions. For the DNN-based method, however, as no model assumption is given, additional DNN models need to be trained to estimate AOD at other wavelengths. To test the effectiveness, we trained two new models for AOD estimation at 440 nm and 675 nm, respectively, using the same model structure. The same training samples were used, except that the 500 nm AERONET AOD is replaced with the 440 nm and 675 nm AOD, respectively. The leave-one-station-out cross-validation results are shown in Figure 8. The RMSE of 440 nm (RMSE = 0.17), 500 nm (0.15) and 675 nm (0.13) AOD DNN estimation decreases with the wavelength because the AOD magnitude decreases with the wavelength as the particular sizes of the aerosols are normally smaller than these wavelengths. The correlation coefficient (R) is the same for the AOD estimation at 440 nm (0.83) and 500 nm (0.83) and is smaller for 675 nm AOD estimation (0.77), mainly because the 675 nm AOD has a very smaller range of AOD that may decrease the R-value.
(RMSE = 0.17), 500 nm (0.15) and 675 nm (0.13) AOD DNN estimation decreases with the wavelength because the AOD magnitude decreases with the wavelength as the particular sizes of the aerosols are normally smaller than these wavelengths. The correlation coefficient (R) is the same for the AOD estimation at 440 nm (0.83) and 500 nm (0.83) and is smaller for 675 nm AOD estimation (0.77), mainly because the 675 nm AOD has a very smaller range of AOD that may decrease the R-value. It should be noted that there are no publicly available Landsat-8 AOD products that can be used for inter-comparison with the DNN AOD retrieval in this study. The public available software package Generalized Retrieval of Aerosol and Surface Properties (GRASP) that was developed to jointly characterize aerosol and surface properties for observations, such as POLDER and AERONET [63], may require code adaptations for Landsat-8 observations. The reported accuracy in the literature [25,[31][32][33] on Landsat-8 AOD retrieval has an RMSE ranging from 0.04 to 0.135 and R 2 ranging from 0.670 to 0.997. Their study regions, however, often cover less than two city locations and a meaningful comparison with this study is impossible. We encourage future study to inter-compare the DNN-based AOD retrieval and physical model-based retrieval to understand better their strengthens and limitations. We also encourage the USGS, who now provides Landsat-8 surface reflectance product, to provide also the AOD product derived in the surface reflectance calculation. This will not only benefit users focusing on aerosol studies but also those land application users who rely on the AOD quality to determine the surface reflectance quality.
The developed method can be used for Sentinel-2 data with several advantages over Landsat-8 data. First, Sentinel-2 has more bands in the solar reflective wavelength region that may be useful for aerosol characterization (e.g., the water vapor band). We note Sentinel-2 has no thermal bands and we found, however, that the thermal bands had only small contribution to the Landsat-8 AOD estimation. Figure 9 shows the leave-one-station-out cross-validation results for the 500 nm AOD estimation without the thermal BT as predictors. The accuracy slightly drops compared to Figure 3b results with BT as predictors. Secondly, Sentinel-2 has higher temporal resolution (5-day) than Landsat-8 (16day), which may be helpful for aerosol studies. Thirdly, the Sentinel-2 level-2 product has It should be noted that there are no publicly available Landsat-8 AOD products that can be used for inter-comparison with the DNN AOD retrieval in this study. The public available software package Generalized Retrieval of Aerosol and Surface Properties (GRASP) that was developed to jointly characterize aerosol and surface properties for observations, such as POLDER and AERONET [63], may require code adaptations for Landsat-8 observations. The reported accuracy in the literature [25,[31][32][33] on Landsat-8 AOD retrieval has an RMSE ranging from 0.04 to 0.135 and R 2 ranging from 0.670 to 0.997. Their study regions, however, often cover less than two city locations and a meaningful comparison with this study is impossible. We encourage future study to inter-compare the DNN-based AOD retrieval and physical model-based retrieval to understand better their strengthens and limitations. We also encourage the USGS, who now provides Landsat-8 surface reflectance product, to provide also the AOD product derived in the surface reflectance calculation. This will not only benefit users focusing on aerosol studies but also those land application users who rely on the AOD quality to determine the surface reflectance quality.
The developed method can be used for Sentinel-2 data with several advantages over Landsat-8 data. First, Sentinel-2 has more bands in the solar reflective wavelength region that may be useful for aerosol characterization (e.g., the water vapor band). We note Sentinel-2 has no thermal bands and we found, however, that the thermal bands had only small contribution to the Landsat-8 AOD estimation. Figure 9 shows the leave-one-stationout cross-validation results for the 500 nm AOD estimation without the thermal BT as predictors. The accuracy slightly drops compared to Figure 3b results with BT as predictors. Secondly, Sentinel-2 has higher temporal resolution (5-day) than Landsat-8 (16-day), which may be helpful for aerosol studies. Thirdly, the Sentinel-2 level-2 product has a publicly available AOD that can be used as a baseline to compare the DNN AOD estimation. The Sentinel-2 AOD is generated by the atmospheric correction software known as Sen2Cor [64], based on the DT algorithm. The developed DNN needs to be retrained using Sentinel-2 TOA reflectance data and the adaptation should be straightforward as the data are also available in the GEE. a publicly available AOD that can be used as a baseline to compare the DNN AOD estimation. The Sentinel-2 AOD is generated by the atmospheric correction software known as Sen2Cor [64], based on the DT algorithm. The developed DNN needs to be retrained using Sentinel-2 TOA reflectance data and the adaptation should be straightforward as the data are also available in the GEE. Figure 9. The leave-one-station-out cross-validation of the DNN 500 nm AOD results without using the thermal bands (i.e., same as Figure 3b expect that the thermal bands BT are excluded in the predictors).

Conclusions
In this paper, a DNN-based high spatial resolution AOD retrieval method for Landsat-8 data has been presented. The Landsat-8 TOA reflectance and BT, sun-sensor geometry, and auxiliary data were used as predictors. The AERONET version 3 Level 2.0 observation over 60°S-60°N, 30°W-160°E during 2013-2020 were collocated to obtain 6390 training and validation samples. The k-fold and leave-one-station-out cross-validation were undertaken and compared. The leave-one-station-out cross-validation has a lower validation accuracy, but it is believed to reflect the large area application accuracy as the training and validation samples are from different AERONET stations. For the leave-onestation-out cross-validation, the DNN-estimated AOD agrees well with AERONET AOD measurements (R = 0.83, RMSE = 0.15), as 61% of retrieval fall within the uncertainty of ± (0.05 + 0.20 *AODAERONET). The DNN-estimated AOD has larger errors over small Ångström exponent aerosol conditions. The accuracy of DNN models usually increase with training sample number and there are two ways to boost training sample size. On the one hand, one can use more ground-based AOD measurements from other AERONET stations and non-AERONET stations. For example, the Sun-Sky Radiometer Observation Network (SONET) can provide many more AOD measurements over mainland China than AERONET, especially over northeast China. On the other hand, one can use the Landsat-9 to augment training data. The Landsat-9 carrying OLI-2 and TIRS-2 was launched on 27 September 2021 and the data are now public available.

Conclusions
In this paper, a DNN-based high spatial resolution AOD retrieval method for Landsat-8 data has been presented. The Landsat-8 TOA reflectance and BT, sun-sensor geometry, and auxiliary data were used as predictors. The AERONET version 3 Level 2.0 observation over 60 • S-60 • N, 30 • W-160 • E during 2013-2020 were collocated to obtain 6390 training and validation samples. The k-fold and leave-one-station-out cross-validation were undertaken and compared. The leave-one-station-out cross-validation has a lower validation accuracy, but it is believed to reflect the large area application accuracy as the training and validation samples are from different AERONET stations. For the leave-one-station-out crossvalidation, the DNN-estimated AOD agrees well with AERONET AOD measurements (R = 0.83, RMSE = 0.15), as 61% of retrieval fall within the uncertainty of ± (0.05 + 0.20 *AOD AERONET ). The DNN-estimated AOD has larger errors over small Ångström exponent aerosol conditions. The accuracy of DNN models usually increase with training sample number and there are two ways to boost training sample size. On the one hand, one can use more ground-based AOD measurements from other AERONET stations and non-AERONET stations. For example, the Sun-Sky Radiometer Observation Network (SONET) can provide many more AOD measurements over mainland China than AERONET, especially over northeast China. On the other hand, one can use the Landsat-9 to augment training data. The Landsat-9 carrying OLI-2 and TIRS-2 was launched on 27 September 2021 and the data are now public available.