Validation of AERONET-Estimated Upward Broadband Solar Fluxes at the Top-Of-The-Atmosphere with CERES Measurements

The AERONET (Aerosol Robotic Network) global network provides estimations of broadband solar radiative fluxes at the surface and at the TOA (Top-Of-the-Atmosphere). This paper reports on the validation of AERONET flux estimations at the TOA with the CERES (Clouds and the Earth’s Radiant Energy System) instrument. The validation was made at eight AERONET sites worldwide with at least seven years of Level 2.0 and Version 3 data and representatives of mineral dust, biomass burning, background continental, and urban-industrial aerosol regimes. To co-locate in time and space the AERONET and CERES fluxes, several criteria based on time and distance differences and cloud coverage were defined. When the strictest criterion was applied to all sites, the linear relationship between the observed and estimated fluxes (y = 1.04x – 3.67 Wm−2) was very close to the 1:1 ideal line. The correlation coefficient was 0.96 and nearly all points were contained in the ±15% region around the 1:1 line. The average flux difference was –2.52 Wm−2 (−0.84% in relative terms). AERONET overestimations were observed at two sites and were correlated with large aerosol optical depth (AOD) (>0.2) Underestimations were observed at one desert site and were correlated with large surface albedos (>0.2).


Introduction
AERONET (Aerosol Robotic Network) is a well-known monitoring network of atmospheric aerosols at the global scale [1]. The inversion of the direct sun and diffuse sky radiance measurements gives a long list of aerosol and atmospheric properties, among them the broadband solar radiative fluxes at the surface and at the TOA (top of the atmosphere) in both upward and downward directions. These fluxes, if estimated with a known uncertainty, are precious information for the calculation of the radiative forcing effects produced by atmospheric aerosols. In this sense, the advantage of the broad and dense spatial coverage of AERONET sites at the global scale (>600 instruments in 2018 [2]) is certainly an added value.
While AERONET-estimated surface fluxes have been compared with surface flux measurements collected during the AMMA (African Monsoon Multidisciplinary Analysis) campaign by [3] and validated worldwide with surface flux measurements of networked pyranometers by [4], the TOA fluxes have never been validated in an intensive manner. The purpose of this article is to perform a validation, or at least to present a statistically representative estimation of the quality of AERONET-estimated TOA upward solar fluxes by comparing them with satellite measurements. A similar approach was undertaken by [5] who did a comparison at eight AERONET sites for a period of three years between . A brief description of the inversion assumptions and flux calculations is given in [7] and in the AERONET Inversion Products (Version 3) [8]. In brief, the flux calculation relies on the interpolation/extrapolation in the whole shortwave spectral range of the retrieved real part and imaginary part of the refractive index at AERONET wavelengths and of the assumed surface albedo at AERONET wavelengths. The assumed surface albedo is modelled by different BRDF (Bidirectional Reflectance Distribution Function) models depending on the surface type [8,9]. The gaseous absorption is accounted using the GAME (Global Atmospheric ModEl) radiative transfer model [10]. This model performs spectral integration using the correlated-k distribution based on line-by-line simulations [11]. As mentioned in the introduction, while the validation of AERONET downward flux at the surface has been performed against surface radiation measurements by [4], very little is known about the validity of AERONET flux estimates at the TOA.
The data shown in this work are based on instantaneous AERONET Level 2.0 (L2.0), cloud-screened and quality-assured data [12] inverted with the AERONET Version 3 (V3) retrieval algorithm [2]. All AERONET data are downloaded from the AERONET web page at http://aeronet.gsfc.nasa.gov/. In practice, the main difference between Version 2 and Version 3 is the full automation of the cloud screening and instrument anomaly quality controls.

CERES
CERES is a 3-channel radiometer measuring the reflected solar radiation in the 0.3-5 µm wavelength band, the emitted terrestrial radiation in the 8-12 µm band, and the total radiation from 0.3 µm to beyond 100 µm [13]. The ±78 • cross-track mode of the radiometer defines a large swath that allows for complete coverage of the Earth surface. The CERES instrument is currently onboard four satellites: Aqua, Terra, Suomi National Polar-orbiting Partnership (S-NPP), and NOAA-20 (formerly JPSS-1, named after the Joint Polar Satellite System-1 mission). However data of CERES/NOAA-20 are not available yet. CERES Data from Terra, Aqua, and S-NPP are used here, indistinctively. To fulfill the space-time co-location criterion, the only CERES data product useful for our study is the instantaneous Single Scanner Footprint (SSF) product which is given at the resolution of the field-of-view instantaneous footprint (20 km at nadir). We used SSF Level 2 Edition 4 products for CERES/Terra (1 March 2000 to 31 October 2018) and Aqua (3 July 2002 to 31 October 2018), and SSF Level 2 Edition 1 for CERES/S-NPP (27 January 2012 to 31 December 2018). All CERES data were downloaded from the CERES subsetting and browsing web page at https://ceres-tool.larc.nasa.gov/ord-tool/products?CERESProducts=SSFlevel-2_Ed4. CERES measures the upward energy (radiance) at the TOA. Shortwave upward fluxes are estimated using angular distribution models (ADMs) described by [14,15]. In this work we used the CERES shortwave TOA flux-upward (SSF-38) product, F SW,↑,TOA CER . In addition to the upward shortwave fluxes, we also downloaded the clear-sky fraction (SSF-81) which gives the portion of the CERES field-of-view for clear sky and up to two cloud layer combinations. In particular, we used the first coverage corresponding to clear sky. Finally the broadband surface albedo (SSF-50) was also downloaded to investigate possible correlations between the difference observed and the surface albedo at each site. This parameter is based on broadband albedo lookup tables, derived from on field observations, for each of the 17 IGBP (International Geosphere-Biosphere Programme) surface types and three surface types defined for CERES (tundra, fresh snow, and sea ice). For the calculation of the broadband surface albedo, up to 8 surface types per pixel can be considered. Values are scaled by the percent coverage of the corresponding surface type and summed. Information about these products can be found in the CERES collection guides available at https://ceres.larc.nasa.gov/collect_guide.php. In the CERES Data Quality Summaries available at: https://ceres.larc.nasa.gov/dqs.php for the different satellites and editions, one finds the data quality of the estimated fluxes which highly depends on the scene classification and the ADMs used to convert the measured radiance in a radiative flux [16]. To calculate an overall flux uncertainty for CERES/Terra (Aqua) we considered the largest seasonal value of the regional mean all-sky SW TOA flux bias, -0.20 (-0.17) Wm −2 , and the RMS error, 0.91 (0.87) Wm −2 . In addition, and because we do not know a priori the satellite viewing angle, we also summed up the relative RMS errors due to the difference between nadir-viewing and oblique-viewing (50-60º) in clear-sky conditions: 4.2%. No data quality summary has been released yet for CERES/S-NPP. Thus, in order to stay on the safe side, we apply the worst accuracy (CERES/Terra) to all CERES fluxes, i.e., − 0.20 ± 0.91 ± F SW,↑,TOA CER × 0.042 = +0.71/ − 1.11 ± F SW,↑,TOA

Strategy
Although initially the idea was to validate AERONET TOA upward fluxes at the same AERONET stations as the validation at the surface performed by [4], we quickly found out that space-truths (as opposed to ground-truths) are much more tedious to obtain in a reliable manner. Since the database of CERES is almost 18 years long, we looked for the AERONET stations with the longest records (>15 years). Only four AERONET stations have more than 15 years of V3L2.0 data: Banizoumbou, GSFC, Mauna Loa, and Sede Boker. These stations are included in this study. We then looked for stations with data for more than 10 years, and, if not, for more than 7 years. Besides the length of the AERONET records at a specific sites, we also wanted to select stations attempting to cover different aerosol influences: Mineral dust (MD), biomass burning (BB), background continental (BC), and urban-industrial (UI). Conversely to [4] and although Mauna Loa was initially considered in our work, the aerosol regimes of the free troposphere (Mauna Loa) and maritime aerosols were not further studied because of the complexity of extracting the radiative fluxes from space over dark surfaces (oceans) or over land surrounded by oceans in a reliable manner. With these criteria, we finally selected two AERONET stations per aerosol type considered. They are listed in Table 1  geographic position is shown in Figure 1. It is important to note the inherent limitation of the aerosol type representativeness of the selected sites due to the selection process itself. In such conditions, MD is probably more representative of North African mineral dust, BB of Brazilian biomass burning, and BC of Canadian background continental. (BC), and urban-industrial (UI). Conversely to [4] and although Mauna Loa was initially considered in our work, the aerosol regimes of the free troposphere (Mauna Loa) and maritime aerosols were not further studied because of the complexity of extracting the radiative fluxes from space over dark surfaces (oceans) or over land surrounded by oceans in a reliable manner. With these criteria, we finally selected two AERONET stations per aerosol type considered. They are listed in Table 1 and their geographic position is shown in Figure 1. It is important to note the inherent limitation of the aerosol type representativeness of the selected sites due to the selection process itself. In such conditions, MD is probably more representative of North African mineral dust, BB of Brazilian biomass burning, and BC of Canadian background continental.

Space-Time Co-Location Criteria
To guarantee the most exact possible space-time co-location of both AERONET estimates and CERES measurements we had to define several criteria based on the time difference, Δt, between AERONET estimates and CERES measurements, and on the distance difference, Δr, between AERONET coordinates and the center of the CERES pixel footprint. To guarantee time co-location we first allowed a restrictive criterion of 1 min on the time difference. With such a criterion, no coincidence was found in most of the stations. The main reason lies in the equator crossing time of the CERES satellites. For example CERES/Terra (Aqua) has a 10:30 AM (1:30 PM) equator crossing time. For small solar zenith angles near solar noon, when an optical airmass of 2 or less is not achieved, AERONET almucantar sequences (needed for the inversion algorithm) are made only Figure 1. AERONET stations and aerosol types selected for the validation. SEB stands for Sede Boker, BAN for Banizombou, ALF for Alta Floresta, RIB for Rio Branco, BRL for Bratts Lake, EGB for Egbert, GSF for GSFC, and BEI for Beijing.

Space-Time Co-Location Criteria
To guarantee the most exact possible space-time co-location of both AERONET estimates and CERES measurements we had to define several criteria based on the time difference, ∆t, between AERONET estimates and CERES measurements, and on the distance difference, ∆r, between AERONET coordinates and the center of the CERES pixel footprint. To guarantee time co-location we first allowed a restrictive criterion of 1 min on the time difference. With such a criterion, no coincidence was found in most of the stations. The main reason lies in the equator crossing time of the CERES satellites. For example CERES/Terra (Aqua) has a 10:30 AM (1:30 PM) equator crossing time. For small solar zenith angles near solar noon, when an optical airmass of 2 or less is not achieved, AERONET almucantar sequences (needed for the inversion algorithm) are made only hourly between 9:00 AM and 3:00 PM local solar time for the standard instrument and skipping only the noon almucantar for the polarization instrument [1]. Latitude-wise, most of our selected AERONET stations are not too far from the equator (see Figure 1), and in such conditions the AERONET measurements to compare are available only hourly at the best: The 1-min criterion was too restrictive. Then we tried a criterion of 5 min on the time difference. This criterion worked well at some stations but not at others. So, our first criterion was set to ∆t < 5 min and a distance difference ∆r < 10 km. This criterion, called 1a (see Table 2), is the most restrictive. As CERES SSF products are given at the instantaneous footprint resolution of diameter 20 km at nadir, a 10-km difference allows us to select usually no more than one pixel at a time, namely, the only one containing the AERONET coordinates. In very few cases the AERONET coordinates fall into the intersection of two CERES pixels, which results in two CERES coincident measurements for one AERONET estimate. These cases are not problematic if the assumption is made that both pixels are looking at nearly the same target, and they present the great advantage of enlarging the dataset, and thus of improving the statistics. At the stations with poor datasets with criterion 1a we also defined criterion 1b with ∆t < 5 min and ∆r < 20 km for the same purpose of improving the statistics. For the rest of the stations with no or almost no time co-located data with ∆t < 5 min, we set a much smoother criterion of ∆t < 60 min associated again to ∆r < 10 km (criterion 2a) or to ∆r < 20 km (criterion 2b). Criterion 1b (2b) was applied when the criterion 1a (2a) resulted in a number of points less than 40.
Finally we also force all cases to be clear-sky cases by selecting only cases with a CERES clear-sky fraction larger than 97%. The definition of the criteria used is given in Table 2. Table 2. Criteria for selecting space and time co-located AERONET and CERES measurements.

Corrections Applied
The spectral ranges of AERONET (0.2 to 4 µm) and CERES (0.3 to 5 µm) are slightly different. The shift of AERONET spectral range towards the ultra violet almost compensates the loss in the infrared. By means of a radiative transfer model, we performed various simulations of the diurnal cycle (from sunset to sunrise) of the upward fluxes at the TOA in both AERONET and CERES spectral ranges. These simulations showed that in the same conditions, the ratio of these fluxes (CERES divided by AERONET) was 0.998, which only started varying after the fourth digit. We multiplied all values of F SW,↑,TOA AER by this value of 0.998 in order to correct the very slight increase of radiation detected by AERONET with respect to CERES.
Criteria 2a and 2b (∆t < 60 min) can obviously lead to significant differences in the fluxes measured/estimated, since a 60-min difference at the hour of the day of highest solar flux gradient can lead to differences in solar zenith angles (SZA) of~10º and to differences in the solar fluxes of up to 20%. To cope with this problem, we perform an SZA correction on AERONET fluxes. The SZA correction is a two-step process. We first look for space-time co-located AERONET and CERES data without correction to get an estimate of the period of occurrence of these coincidences and of the mean SZA difference between both datasets (Table 3). Then we use a radiative transfer model, namely, GAME (the Global Atmospheric Model) described in [10,17], at the coordinates of the AERONET station and for the day corresponding to the middle of the occurrence period (overpass time in Table 3) to get an estimation of the upward shortwave fluxes at the TOA, i.e., the radiation reflected back to space, F SW,↑,TOA , as a function of solar time. We look for the solar zenith angles corresponding to our mean values, SZA CER and SZA AER , and calculate a correction factor as

Selected Stations and Aerosol Types
Eight sites were selected worldwide. The stations of Sede Boker and Banizombou were representatives of MD, Alta Floresta and Rio Branco of BB, Bratts Lake and Egbert of BC and GSFC and Beijing of UI. The geographical location of the stations are represented in Figure 1. Among them, five (SEB, ALF, RIB, BRL, and GSF) have already been described in [4], which the reader is referred to. To show the representativeness of each site selected, we plot in Figure 2 the monthly aerosol optical depth (AOD) per aerosol type.
While SEB is located in the northern part of the Israeli Negev desert, BAN is at the southern border of the Sahel desert. Both sites are influenced by MD from different source regions and have been used in the study of [18] who used the graphical framework introduced by [19] based on AOD, the Ångström exponent, AE, and its spectral curvature, δAE, i.e., the difference between AEs measured at two different pairs of wavelength. In brief, this study showed that each site is affected by MD at different periods of the year; that much higher AOD are found in BAN than in SED; and that AOD increases in BAN are often related to an increase of coarse particles (δAE < 0). Figure 2a reflects partially these results: AOD reaches its peak in April in BAN and in April-May and July -September in SEB. The AOD peak in BAN is three times higher than in SEB. Large AOD (>0.3) are observed all year round in BAN. There is a strong intra-month variation at both sites reflected by the large error bars.
The BB sites, ALF and RIB, are two out of the five stations that [4] averaged and considered as a unique case that they called Brazilian forest. Figure 2b shows that both sites have a similar annual cycle of the monthly AOD. The BB season is clearly visible between August and December with peaks in September. During the BB season, the AOD is larger in ALF than in RIB while the rest of the year the AOD (<0.2) is similar at both sites. Maximum monthly AOD of 1.31 and 0.92 are reached in September in ALF and RIB, respectively. This difference is mostly due to the location of both sites with respect to the synoptic conditions over South America during the dry season. [20] related the main corridors of smoke dilution with the presence of a high pressure area over central Brazil and the subsequent displacement of both the South Atlantic Subtropical High (westward) and the Intertropical Convergence Zone (northward), and the presence of the Andes Mountains acting as a barrier to the west.
BRL and EGB are located in non-industrial areas in the southern Canadian prairies. The type of aerosols is of a background nature, with episodic occurrences of BB. EGB may be sometime influenced by anthropogenic pollution as it is 80 km north of a large city (Toronto). Figure 2c shows similar annual cycles at both stations with AOD peaks in August of 0.21 and 0.18 in EGB and BRL, respectively. Except in April, the AOD in BRL is lower than in EGB by an average difference of 0.03. According to [21], Canadian western sites like BRL are more affected by forest fire smoke than southeastern sites like EGB. The fact that BRL AOD is slightly higher than EGB AOD in April, before the BB season, may be related to the presence in BRL of mineral dust from tilling at the start of the cropping season [21]. Finally the two UI sites, GSF and BEI, have quite different features (see Figure 2d). While GSF is described in [4], BEI is described in a recent paper from [22]. The most striking difference between both sites is the AOD difference: BEI has levels of pollution much more elevated than GSF, by an almost constant 0.5 AOD bias all year-round. If we compare the monthly AOD in BEI to the monthly occurrence vs. aerosol size of [22], one clearly sees that the AOD peak (1.17) in June is related to a predominance of fine particles, thus pollution. In GSF, the AOD peak (0.39) occurs in July and August.
respectively. Except in April, the AOD in BRL is lower than in EGB by an average difference of 0.03. According to [21], Canadian western sites like BRL are more affected by forest fire smoke than southeastern sites like EGB. The fact that BRL AOD is slightly higher than EGB AOD in April, before the BB season, may be related to the presence in BRL of mineral dust from tilling at the start of the cropping season [21].
Finally the two UI sites, GSF and BEI, have quite different features (see Figure 2d). While GSF is described in [4], BEI is described in a recent paper from [22]. The most striking difference between both sites is the AOD difference: BEI has levels of pollution much more elevated than GSF, by an almost constant 0.5 AOD bias all year-round. If we compare the monthly AOD in BEI to the monthly occurrence vs. aerosol size of [22], one clearly sees that the AOD peak (1.17) in June is related to a predominance of fine particles, thus pollution. In GSF, the AOD peak (0.39) occurs in July and August.

AERONET Flux Validation at the TOA: All Sites, Criterion 1a
We first had a look at the database. Four out of the eight sites selected had good temporal coincidences of AERONET and CERES and thus criterion 1 could be applied (see Table 3). In the rest of the sites, criterion 2 was applied. In these cases, the SZA difference goes from 1.1º up to 9.7º. Once the adequate criterion at each site was selected, we saw that the number of space-time co-located measurements was not very high taking into account the large time series considered (sometimes >15 years): It ranged from 45 in BRL (criterion 1b) up to 177 in ALF (criterion 2b). These numbers are relatively low. For example, let's consider SEB: Our dataset (>15 years) consists in 93 coincidences, while [4] had 1590 coincidences of AERONET estimates with ground-based pyranometer measurements for a period much shorter (2003-2005) than ours and a more restrictive temporal criterion (<30 s). This reflects the complexity of the validation of atmospheric parameters at the global

AERONET Flux Validation at the TOA: All Sites, Criterion 1a
We first had a look at the database. Four out of the eight sites selected had good temporal coincidences of AERONET and CERES and thus criterion 1 could be applied (see Table 3). In the rest of the sites, criterion 2 was applied. In these cases, the SZA difference goes from 1.1 • up to 9.7 • . Once the adequate criterion at each site was selected, we saw that the number of space-time co-located measurements was not very high taking into account the large time series considered (sometimes >15 years): It ranged from 45 in BRL (criterion 1b) up to 177 in ALF (criterion 2b). These numbers are relatively low. For example, let's consider SEB: Our dataset (>15 years) consists in 93 coincidences, while [4] had 1590 coincidences of AERONET estimates with ground-based pyranometer measurements for a period much shorter (2003-2005) than ours and a more restrictive temporal criterion (<30 s). This reflects the complexity of the validation of atmospheric parameters at the global scale with satellite-based measurements. An important note has to be made at this point. As it can be contrasted from Figure 2b and Table 3, the coincidences at the BB sites occur between April and August and are thus nearly outside of the BB season which is from August to December. In such conditions the coincidences at the BB sites are not going to be representative of BB but rather of Brazilian forest without nearly any BB influence. Before analyzing the results site-by-site or per aerosol influence, we applied criterion 1a to the entire database (all sites). Statistical results are shown in the first column ("All") of Table 4 and the scattergram CERES vs. AERONET is shown in Figure 3. Table 4 summarizes the results of our comparisons in terms of slope of the least-square fit, bias, root mean square error (RMSE), and correlation coefficient (R 2 ). In order to quantify the level of accuracy we also added the difference Di f f = F SW,↑,TOA AER − F SW,↑,TOA CER , the relative difference (i.e., the difference normalized to CERES flux, NormDi f f ) and the 5th and 95th percentiles. Figure 3a shows the scattergram of the coincidences and Figure 3b the histogram of the probability density function (PDF) of the difference F SW,↑,TOA AER − F SW,↑,TOA CER , as well as the AOD and the surface albedo (SA) as a function of this difference (each point represents the mean in each PDF bins). Figure 3b is intended to bring light on the possible causes of discrepancies observed between the fluxes estimated by AERONET and those measured by CERES. The total of coincidences with criterion 1a while considering all sites is 209. In Figure 3, BB is not represented. MD is represented by SEB (93 points), BC by BRL (12 points) and UI by GSF (75 points) and BEI (29 points). From these sites, SEB is the closest to the equator ( Figure 1) and has the lowest SZA (Table 3) it is thus logical that MD coincidences in Figure 3a  . Compared to this accuracy, the average flux difference, -2.52 Wm −2 , is also very small and the relative difference is lower than 1% in absolute values. If we have a look at the PDF histogram of the flux difference (Figure 3b), one sees that it is not perfectly symmetrical and that it is slightly shifted towards negative values. This tendency may suggest that AERONET fluxes are slightly lower, in average, than the measured ones. However, this result cannot be generalized and prescribed to a possible underestimation of AERONET fluxes since the average difference is within the accuracy of the measured fluxes. The interpretation of the plots of AOD and SA vs. the flux difference is not straightforward. As far as the AOD is concerned, large positive differences (>+10 Wm −2 ) seem to be caused by large AOD (>0.2). No such correlation exists for the large negative differences. The points of the positive tail of the histogram correspond to BEI: Large AOD and strong absorption properties [22]. In such conditions it seems that the AERONET radiative transfer model overestimates the flux reflected back to space, i.e., that either the atmospheric extinction is underestimated and/or the AERONET SA is overestimated. One of the most important improvements in the AERONET Version 3 algorithm (implemented in Version 2 already), regarding Version 1, is the assumption of a dynamic spectral and spatial satellite and model estimation of the surface albedo, including the bidirectional reflectance distribution function [9]. It has been shown that, at short wavelengths (440 nm), the retrieved spectral surface albedo increases with increasing AOD for AOD > 0.5; such dependence of surface retrievals on aerosol loading and Remote Sens. 2019, 11, 2168 9 of 15 vertical distribution can be due to contributions from multiple scattering interactions between aerosol and vertically stable Rayleigh (molecular) scattering, e.g., [23,24]. At short wavelengths where Rayleigh scattering is strong, the sensitivity of the algorithm to aerosol loading and vertical distribution increases with aerosol optical depth. Another possible source of error exists. According to [25], large AOD scenes, even in the presence of absorbing aerosols (the case of BEI), brightens the total reflectance (atmosphere + surface) over moderately reflecting targets. In such conditions, an underestimation of AERONET absorption contribution may also cause an overestimation of the TOA fluxes. If we now look at the SA vs. flux difference, a general tendency is observed: AERONET fluxes are underestimated for large SA (>0.2), i.e., bright surfaces. For large negative Di f f values (negative tail of the histogram, Figure 3b), all values of SA > 0.22 are from the SEB desert station. According to the sensitivity tests performed by [9], the AERONET surface albedo (used in the calculation of the TOA fluxes) uncertainty is produced mostly by an assumed aerosol profile and the retrieved angular shape uncertainty in the BRDF. They quantify the uncertainty on the SA due to both error source of the order of 0.04 for AOD < 0.7. Reference [26] developed a method to determine broadband surface albedo from CERES measurements and estimated in their introduction that an offset of 0.04 on the SA could produce a difference in terms of radiative fluxes at the TOA of the order of 15 to 20 Wm −2 . This range of values is comparable to what is observed in Figure 3b. However, this surface albedo error is not limited to AERONET retrievals. As [27] explained very well in their 2004 paper, satellite measurements of aerosol (dust) optical properties over bright desert surfaces are particularly difficult to make because: (1) the surface contribution to the radiance received by the satellite is larger than that over vegetated areas, (2) dust absorption is poorly known, and (3) the angular properties due to particle non-sphericity is also of limited availability. For more than two decades now, the atmospheric community has been seeking to develop time-varying BRDF models at the global scale to minimize the errors on the aerosol retrieval due to point (1). All in all, it seems that AERONET flux underestimation over bight surface may be due to an underestimation of the broadband surface albedo used in the calculation of the TOA fluxes. At the surface, [4] found an overestimation of AERONET downward fluxes over bright surfaces. It is important to recall that [4] used AERONET Version 1 products. Above bright desert surfaces the spectrally neutral SSA (single scattering albedo) of Version 1 tends to decrease the effect of iron-rich dust absorption compared to the improved SSA magnitude and spectra for coarse-mode dust implemented in Version 2 [9,27]. Such an underestimation of the absorption is probably the cause of the AERONET Version 1 overestimation of the downward fluxes over bright surfaces found by [4]. and spatial satellite and model estimation of the surface albedo, including the bidirectional reflectance distribution function [9]. It has been shown that, at short wavelengths (440 nm), the retrieved spectral surface albedo increases with increasing AOD for AOD > 0.5; such dependence of surface retrievals on aerosol loading and vertical distribution can be due to contributions from multiple scattering interactions between aerosol and vertically stable Rayleigh (molecular) scattering, e.g., [23,24]. At short wavelengths where Rayleigh scattering is strong, the sensitivity of the algorithm to aerosol loading and vertical distribution increases with aerosol optical depth. Another possible source of error exists. According to [25], large AOD scenes, even in the presence of absorbing aerosols (the case of BEI), brightens the total reflectance (atmosphere + surface) over moderately reflecting targets. In such conditions, an underestimation of AERONET absorption contribution may also cause an overestimation of the TOA fluxes. If we now look at the SA vs. flux difference, a general tendency is observed: AERONET fluxes are underestimated for large SA (>0.2), i.e., bright surfaces. For large negative Diff values (negative tail of the histogram, Figure 3b), all values of SA > 0.22 are from the SEB desert station. According to the sensitivity tests performed by [9], the AERONET surface albedo (used in the calculation of the TOA fluxes) uncertainty is produced mostly by an assumed aerosol profile and the retrieved angular shape uncertainty in the BRDF. They quantify the uncertainty on the SA due to both error source of the order of 0.04 for AOD < 0.7. Reference [26] developed a method to determine broadband surface albedo from CERES measurements and estimated in their introduction that an offset of 0.04 on the SA could produce a difference in terms of radiative fluxes at the TOA of the order of 15 to 20 Wm −2 . This range of values is comparable to what is observed in Figure 3b. However, this surface albedo error is not limited to AERONET retrievals. As [27] explained very well in their 2004 paper, satellite measurements of aerosol (dust) optical properties over bright desert surfaces are particularly difficult to make because: (1) the surface contribution to the radiance received by the satellite is larger than that over vegetated areas, (2) dust absorption is poorly known, and (3) the angular properties due to particle non-sphericity is also of limited availability. For more than two decades now, the atmospheric community has been seeking to develop time-varying BRDF models at the global scale to minimize the errors on the aerosol retrieval due to point (1). All in all, it seems that AERONET flux underestimation over bight surface may be due to an underestimation of the broadband surface albedo used in the calculation of the TOA fluxes. At the surface, [4] found an overestimation of AERONET downward fluxes over bright surfaces. It is important to recall that [4] used AERONET Version 1 products. Above bright desert surfaces the spectrally neutral SSA (single scattering albedo) of Version 1 tends to decrease the effect of iron-rich dust absorption compared to the improved SSA magnitude and spectra for coarse-mode dust implemented in Version 2 [9,27]. Such an underestimation of the absorption is probably the cause of the AERONET Version 1 overestimation of the downward fluxes over bright surfaces found by [4].

AERONET Flux Validation at the TOA: Site by Site Analysis
Site-by-site results are summarized in Table 4 and shown in Figure 4 (MD and BB) and Figure 5 (BC and UI) in the same way they were presented in Figure 3 for all sites. For the site-by-site analysis, the Ångström exponent AE has been added to the AOD plot to investigate possible correlations with a size-dependent parameter. With very few exceptions, the points of the scattergram at all sites fall within the ±15% region around the 1:1 line. Some of this variability is caused by the possible change in the aerosol optical properties due to the time difference (5 and 60 min) between both datasets. At all sites, the slope of the linear fitting is lesser than 1 and varies between 0.52 and 0.98. This result indicates that AERONET tendency, in average, is to underestimate low TOA fluxes and/or overestimate high TOA fluxes. However the flux interval of the coincident measurements at each site is usually rather small (<50 Wm −2 ); this is inherent to the principle of space-time co-location with satellites in a sun-synchronous orbit) which makes the analysis more sensitive to the variability of the estimated and measured fluxes. In addition, when all sites are analyzed together with criterion 1a (Figure 3), low and high fluxes compensate (slope = 1.04 and R 2 = 0.96). The correlation coefficient R 2 at individual sites is reasonably good and varies between 0.32 and 0.76. It is in the lower part of the interval (0.32-0.44) for MD and BB, and in the upper part (>0.54) for BC and UI, and seems anti-correlated with the number of points used in the linear fitting. The flux difference, Di f f , is in the range [-9.37; 6.76 Wm −2 ]. If we compare these absolute differences to the error associated to CERES fluxes, [5; 13 Wm −2 ] (see the previous section), for the interval of fluxes observed one sees that they are of the same order of magnitude, and that on average no systematic bias is detected. With respect to the criteria used, the only relevant difference between sites where criterion 1 was applied and those where criterion 2 was applied is in terms of the relative flux difference, NormDi f f . Where criterion 1 was applied, NormDi f f is less than 1% in absolute value (except in SEB where it is -2.13%), whereas at sites where criterion 2 was applied, NormDi f f varies, in absolute value, between 1.86% and 5.77%. The most probable explanation of this difference is the SZA correction which was made "on average" and not on a point-by-point basis.
The site-by-site analysis reveals that for MD the PDF is shifted slightly towards the negative values (Di f f = -5.87 and -9.37 Wm −2 for SEB and BAN, respectively) with a large interval of spread values (>40 Wm −2 , see the difference between 95th and 5th percentiles). No correlation of Di f f with AOD, AE, or SA is observed in SEB (Figure 4c) while a clear correlation is visible in BAN (Figure 4d): High AOD (>0.5) associated with low AE (<0.5) cause AERONET to overestimate the TOA fluxes. As seen in Section 3, desert dust produces much higher AOD values in BAN than in SEB. The surface albedo in BAN is flat and equal to 0.18 and smaller than in SEB (0.24). Consequently, the high AOD in BAN produces the same effect (an overestimation) than the high AOD in BEI (see the previous section). The cause is thus very likely the same: Multiple scattering interactions between aerosol and Rayleigh scattering as well as a possible underestimation of the aerosol absorption.
For the BB sites, the PDF are much more compact: The difference between 95th and 5th percentiles is less than 20 Wm −2 . The absolute difference between AERONET and CERES fluxes is positive (2.67 and 3.91 Wm −2 in ALF and RIB, respectively). The plots of AOD vs. flux difference show rather flat, low AOD and confirm that no strong BB events were detected. The highest AOD (~0.23; AE~1.59; Figure 4h) is reached in RIB and is well above the background value of the BB off-season, 0.11 (Figure 2b). Thus it contains very likely some coincidences corresponding to BB events. No clear correlation can be deduced between the flux difference and the AOD or the surface albedo.  At BC sites, like at MD sites, the PDF of the flux difference Di f f is quite spread as it ranges from -27.5 to 25 Wm −2 . No significant bias is observed in BRL (-0.41 Wm −2 ) while a positive bias is observed in EGB (6.76 Wm −2 ). In relative terms this bias corresponds to 5.77% of the observed flux. Such a high difference (compared to the other sites) is probably due to the approximation made in the SZA correction applied, as mentioned earlier. It is worth noting that EGB is the site with the longest coincident period between September and March, and thus with the largest variation of solar illumination. No clear correlation is observed between the flux difference and the AOD or the surface albedo either at BRL or at EGB. Although the analysis of [6] was made at background maritime sites, it seems appropriate to compare their results in the BC category. Out of their 29 co-located points, [6] found a linear relation of observed to modeled flux with a slope of 0.88, a positive bias of 8.70 Wm −2 (which in relative terms represented~8%-9%), and a correlation of R 2 = 0.87. Except for R 2 which is a little higher than in BRL (0.59) or EGB (0.76), the slope and the bias in [6] are within the range of values found at BRL and EGB.
As far UI sites are concerned, although the PDF is not symmetric in shape, the flux difference Di f f is small (-1.27 and 0.37 Wm −2 at GSF and BEI, respectively) and corresponds to less than 1% of the observed flux. No correlation between Di f f and the SA is visible at none of both sites. By contrast, the results in BEI discussed in the previous section are visible again in Figure 5h

Conclusions
We report for the first time a statistically representative comparison at global scale between time-space co-located AERONET-estimated upward solar fluxes at the top of the atmosphere and fluxes measured by CERES, a satellite-based 3-channel radiometer. The comparison was performed for eight AERONET sites with more than seven years of V3L2.0 data in the period 1993-2018 and representative of mineral dust, biomass burning, background continental, and urban-industrial aerosol types.
Prior to the analysis, we set up four criteria to guarantee space-time co-location and a minimum number of coincidences (>40) for the dataset to be statistically representative. The most restrictive criteria allows for a maximum time difference (∆t) of 5 min and a maximum distance difference (∆r) of 10 km. The other three criteria were set as (∆t < 5 min, ∆r < 20 km), (∆t < 60 min, ∆r < 10 km), and (∆t < 60 min, ∆r < 20 km). For the last two criteria, for which a 60 min time difference can lead to large solar zenith angle differences and thus to large flux differences, a correction factor was applied to estimate the TOA upward solar flux that AERONET would have detected at CERES solar zenith angle.
When considering all stations and the most restrictive criterion, 209 coincidences were found. The linear relation of observed to modeled flux is given by a slope of 1.04, a small, negative bias of -3.67 Wm −2 (which in relative terms represent less than 1%), well below the accuracy of CERES measurements, and a high correlation (R 2 = 0.96). From the histogram of the PDF of the flux difference Di f f = F SW,↑,TOA AER − F SW,↑,TOA CER , we observe that AERONET overestimation (the positive tail of the histogram), observed at MD (BAN) and UI (BEI) sites, is correlated with large AOD (>0.2). Our review of the literature and discussions point out that the driving causes are multiple scattering interactions between aerosol and Rayleigh scattering as well as a possible underestimation of the aerosol absorption in AERONET retrievals. We also observe that AERONET underestimation (the negative tail of the histogram), observed at the desert site of SEB, is correlated with large surface albedos (>0.2) and is probably due to an underestimation of the broadband surface albedo used in the calculation of the TOA fluxes in AERONET retrievals.
The site-by-site analysis reveals that the instantaneous difference, i.e., the difference of individual points, between AERONET-estimated and satellite-measured TOA upward solar fluxes never exceeds ±15%. Some of this variability is caused by the possible change in the aerosol optical properties due to the time difference (5 and 60 min) between both datasets. The slope of the linear fitting varies between 0.52 and 0.98, and the correlation coefficient at individual sites is reasonably good and varies between 0.32 and 0.76. The relative differences between AERONET and CERES fluxes is usually lesser than 1% at sites where the criteria with ∆t < 5 min were applied and between roughly 2% and 6% at the sites where the criteria with ∆t < 60 min were applied, pointing out the limitation of the "average" SZA correction performed at those sites. Despite (1) AERONET TOA fluxes being more complex to estimate than surface fluxes because of the inherent technique of the sun/sky-photometer it employs (a ground-based instrument, directly pointing at the sun), and (2) the complexity of obtaining space-truths (as opposed to ground-truths) and their reliability, we conclude that AERONET TOA fluxes compare rather well with satellite-based flux measurements. The agreement within 15% found at the TOA in this work is not much larger than the agreement of about 10% found at the surface by [3,4].