Synergistic Use of Hyperspectral UV-Visible OMI and Broadband Meteorological Imager MODIS Data for a Merged Aerosol Product

The retrieval of optimal aerosol datasets by the synergistic use of hyperspectral ultraviolet (UV)–visible and broadband meteorological imager (MI) techniques was investigated. The Aura Ozone Monitoring Instrument (OMI) Level 1B (L1B) was used as a proxy for hyperspectral UV–visible instrument data to which the Geostationary Environment Monitoring Spectrometer (GEMS) aerosol algorithm was applied. Moderate-Resolution Imaging Spectroradiometer (MODIS) L1B and dark target aerosol Level 2 (L2) data were used with a broadband MI to take advantage of the consistent time gap between the MODIS and the OMI. First, the use of cloud mask information from the MI infrared (IR) channel was tested for synergy. High-spatial-resolution and IR channels of the MI helped mask cirrus and sub-pixel cloud contamination of GEMS aerosol, as clearly seen in aerosol optical depth (AOD) validation with Aerosol Robotic Network (AERONET) data. Second, dust aerosols were distinguished in the GEMS aerosol-type classification algorithm by calculating the total dust confidence index (TDCI) from MODIS L1B IR channels. Statistical analysis indicates that the Probability of Correct Detection (POCD) between the forward and inversion aerosol dust models (DS) was increased from 72% to 94% by use of the TDCI for GEMS aerosol-type classification, and updated aerosol types were then applied to the GEMS algorithm. Use of the TDCI for DS type classification in the GEMS retrieval procedure gave improved single-scattering albedo (SSA) values for absorbing fine pollution particles (BC) and DS aerosols. Aerosol layer height (ALH) retrieved from GEMS was compared with Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) data, which provides high-resolution vertical aerosol profile information. The CALIOP ALH was calculated from total attenuated backscatter data at 1064 nm, which is identical to the definition of GEMS ALH. Application of the TDCI value reduced the median bias of GEMS ALH data slightly. 
The GEMS ALH bias approximates zero, especially for GEMS AOD values of >~0.4 and GEMS SSA values of <~0.95. Finally, the AOD products from the GEMS algorithm and MI were used in aerosol merging with the maximum-likelihood estimation method, based on a weighting factor derived from the standard deviation of the original AOD products. With the advantage of the UV–visible channel in retrieving aerosol properties over bright surfaces, the combined AOD products demonstrated better spatial data availability than the original AOD products, with comparable accuracy. Furthermore, pixel-level error analysis of GEMS AOD data indicates improvement through MI synergy.
Remote Sens. 2020, 12, 3987; doi:10.3390/rs12233987 www.mdpi.com/journal/remotesensing


Advantages for aerosol retrieval
Cirrus cloud detection; high spatial resolution (500 m-2 km); sensitive to aerosol size; sensitive to aerosol absorption and height information; aerosol retrieval over relatively brighter surfaces (e.g., city, desert); high spatial resolution (250 m); sensitive to aerosol size

Disadvantages (limitations) for aerosol retrieval
Difficult to retrieve aerosol over brighter land surfaces (e.g., city, desert) due to high surface reflectance in the visible channel; cirrus and sub-pixel cloud contamination; insensitive to aerosol size; cirrus cloud contamination

In this study, we investigated the possibility of constructing an optimal aerosol dataset by synergistic use of GEMS hyperspectral UV-visible data and AMI broadband MI data. As GEMS data are not yet released, the LEO satellite instruments OMI and MODIS were used to provide proxy datasets for GEMS and AMI, respectively. Both GEMS and OMI are grating imaging spectrometers, providing hyperspectral data for wavelengths of 300-500 nm with full-width at half-maximum (FWHM) values of ~0.6 nm and spectral sampling of 0.2 nm. The main differences between GEMS and OMI data are the spatial resolution (3.5 × 8 and 13 × 24 km, respectively) and temporal resolution (eight times and once per day, respectively). MODIS has all wavelengths corresponding to the AMI visible-thermal IR range (470 nm, 511 nm, 640 nm, 856 nm, 1.38 µm, 1.61 µm, 6.952 µm, 8.562 µm, 11.212 µm, and 12.364 µm) needed for this study. The AMI and MODIS systems differ in terms of spatial resolution for each wavelength, being respectively 1 km and 500 m at 470 nm; 1 km and 1 km at 511 nm; 500 m and 250 m at 640 nm; 1 km and 250 m at 856 nm; 2 km and 1 km at 1.38 µm; 2 km and 500 m at 1.61 µm; 2 km and 1 km at 6.952 µm, 8.562 µm, 11.212 µm, and 12.364 µm. The systems also differ in temporal resolution, with AMI observing once per 10 min and MODIS once per day. The overall procedure in constructing optimal aerosol datasets involved three stages: (1) improvement of GEMS cloud mask results with MI; (2) refining of GEMS predominant aerosol size mode selection among absorbing fine pollution particles (BC) and dust (DS) using broadband MI; (3) merging of the individual Level 2 (L2) AOD products and improved GEMS and MI products to create 'best' AOD datasets. Baseline GEMS aerosol products include AOD, SSA, and ALH. The AOD and SSA products were compared with Aerosol Robotic Network (AERONET) data, and the ALH product was compared with CALIOP data by calculating ALH from backscatter coefficient data at 1064 nm.
OMI and MODIS data are described in Section 2; algorithms for synergistic use of the broadband MI are described in Section 3; results of synergistic use of the MI in terms of cloud masking and aerosol-type selection are described with aerosol product fusion results in Section 4; error analyses before and after synergistic use of MI are investigated in Section 5. Sections 6 and 7 summarize the results with future perspectives.

OMI
OMI [34,35] is a grating imaging spectrometer of the push-broom type that simultaneously measures solar radiation backscattered from Earth at 270-500 nm with 0.6 nm spectral resolution using a two-dimensional charge-coupled device (CCD) of 580 spatial pixels × 780 spectral pixels. The instrument is designed to retrieve concentrations of atmospheric trace gases such as O3, NO2, SO2, and HCHO, as well as aerosols. Spatial resolution at nadir is ~13 km (along-track) × 24 km (across-track). The OMI instrument is aboard the Aura satellite, in the A-train satellite constellation. Aura was launched in July 2004, and the OMI has since provided data continuously. The instrument covers the globe once per day, but has provided global coverage only every two days because of the 'row anomaly' problem that has existed since 2007 [36]. The row anomaly corresponds to a row on the CCD detector with poor-quality Level 1B (L1B) radiance data at all wavelengths that correspond to a particular viewing direction of the OMI. OMI L1B version 3 radiance data produced by a visible detector (OML1BRVG) at wavelengths of 349-504 nm were used in this study, and OMI L1B version 3 visible irradiance data (OML1BIRR) were used to calculate normalized radiance. Apart from the row-anomaly CCD pixels, OMI radiances are known to degrade by 1-2%, and irradiances by 3-8%, over the mission lifetime of ~12 years [36]. The GEMS aerosol retrieval algorithms [33,37] were adopted for OMI L1B data to retrieve AOD and SSA data at 443 nm and ALH from 1 January to 31 December 2006. The spatial domain was East Asia (110°E-150°E; 20°N-50°N).

MODIS L1B Data
MODIS aboard the Aqua satellite is used to take advantage of the continuous eight-minute time difference from OMI. The Aqua satellite was launched by the National Aeronautics and Space Administration (NASA) in 2002 and passes from south to north over the equator in the afternoon (1:30 p.m. local time). The ground swath of MODIS/Aqua is 2330 km (across-track), covering the entire Earth surface every two days and acquiring data in 36 spectral bands from visible to IR with a spatial resolution of 250 m for bands 1-2, 500 m for bands 3-7, and 1000 m for bands 8-36. Here, Aqua MODIS L1B Collection 6 (C6) [38] data were used to generate the total dust confidence index (TDCI) [39] from MODIS IR data from 1 January to 31 December 2006, for the above spatial domain. Calibration for MODIS L1B C6 was improved compared with MODIS L1B Collection 5 (C5) data, and a number of aerosol algorithm improvements have been published [40,41] (Section 3.2).
Although the TDCI can be retrieved from MODIS aboard Terra, we used only that aboard Aqua. The goal of this study was to investigate the possibility of synergy between GEMS and AMI data, with the time difference between the two being <10 min. Terra crosses the equator at 10:30 a.m. local time, so there is a three-hour time difference from the Aqua MODIS system. Terra TDCI data may thus not apply due to aerosol transport within the three hours, hence the use of Aqua data only.

MODIS Dark Target Aerosol Products
MODIS DT aerosol algorithms have been developed for dark surfaces (e.g., clear ocean surfaces and dense vegetation over land) [20,42]. The algorithms retrieve AOD parameters based on the assumption that for the two visible channels of 470 and 644 nm, surface reflectance can be estimated linearly from that of the shortwave IR channel at 2119 nm, which is transparent for aerosol signals. MODIS DT collection 6.1 (C6.1) was recently updated with surface reflectance data over land where urban coverage is >20% [43].
MODIS DT aerosol products were used to create 'best' AOD data as proxy data for AMI aerosol products. Other MODIS aerosol algorithms (e.g., Deep Blue (DB) and MODIS Multi-Angle Implementation of Atmospheric Correction (MAIAC)) were not used because the MODIS DT aerosol algorithm uses only wavelengths of >470 nm. With wavelengths of >470 nm it is difficult to retrieve accurate AOD data for bright surfaces over land such as deserts and cities, because surface reflectance contributions are larger than the TOA aerosol contribution; the same properties can be observed in AMI aerosol products.

MODIS Black Sky Albedo
In the merging of individual AOD products to create the 'best' AOD datasets, the accuracies of the original satellite AOD products were assessed with respect to the corresponding surface albedo. MODIS black sky albedo (BSA) visible (0.3-0.7 µm) products (MCD43A3 version 6) were used as surface albedo in the absence of diffuse radiation, providing data daily at 0.5 km (~0.005°) spatial resolution. The BSA was derived by integrating the bidirectional reflectance distribution function (BRDF) over the upper hemisphere with respect to reflected angle [44]. MODIS BRDF products are retrieved using multidate, multiband, atmospherically corrected surface-reflectance data from the Aqua and Terra MODIS systems with a 16-day cycle. In the expression for BSA (Equation (1)), dI(ω_r) is the portion of total radiance reflected in the direction defined by ω_r, θ_r is the angle between the normal to the surface and the direction of reflected light, dω_r is the element of the solid angle, I(ω_i) is the radiance incident on the surface from the direction ω_i, and θ_i is the angle between the normal to the surface and the direction of incident light. The accuracy of MCD43 in clear-sky situations is within 5%, as described by the MODIS land validation team [45]. Here, BSA was selected for a given day, with all quality data being used.

CALIOP
CALIOP, aboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite launched on 28 April 2006, is an LEO active lidar instrument that passes over the equator from south to north at 1:30 p.m. local time in the A-train constellation. CALIOP utilizes three receiver channels: one measuring 1064 nm backscatter intensity, and two measuring orthogonally polarized components of the 532 nm backscattered signal. From these three receiver channels, high-resolution vertical aerosol and cloud profiles for up to 40 km above sea level have been provided since mid-June 2006 during both day- and night-time. The horizontal spatial resolution of CALIOP is high at 333 m [46], and because of the consistently short time gap between CALIOP and MODIS (<~2 min) or OMI (<~8 min during 2005 to 2007), it provides useful synchronous data for altitudes of clouds and aerosols, which cannot be probed by nadir-viewing satellites [13]. CALIOP provides clear-sky, cloud, tropospheric aerosol, and stratospheric aerosol data separately through the 'vertical feature mask' variable, as well as high-resolution vertical profiles of aerosol and cloud.
In this study, the most recent version 4.2.0 standard product (CAL_LID_L2_05kmAPro-Standard-V4-20) was adopted to calculate ALH from 1064 nm total-attenuated-backscatter (km⁻¹ sr⁻¹) daytime measurement data. Although CALIOP provides both 532 and 1064 nm non-polarized backscatter attenuation data, the 532 nm data appear to lose sensitivity in the presence of carbonaceous aerosols due to attenuation [13], so 1064 nm data were used. ALH values calculated from CALIOP data were used in GEMS ALH validation. ALH (Z_aer) is defined as the attenuated-backscatter-weighted height [13,47], expressed as:

$$ Z_{aer} = \frac{\sum_{i=1}^{n} B_{sc}(i)\, H(i)}{\sum_{i=1}^{n} B_{sc}(i)} \qquad (2) $$

where n represents the number of layers from the surface to 10 km altitude, and B_sc(i) represents the attenuated backscatter at height H(i). A Gaussian shape with 1 km FWHM was assumed for the aerosol vertical distribution. This ALH concept was first introduced to the Total Ozone Mapping Spectrometer (TOMS) near-UV aerosol algorithm and was also used in the OMI aerosol algorithm [13,47]. Torres et al. [47] investigated the sensitivity of TOMS retrieved SSA and AOD data to aerosol vertical distribution using the Micropulse Lidar data of the South African Regional Science Initiative (SAFARI 2000) campaign [48]. The Gaussian shape implies that ALH has no significant effect on AOD (2-13%) and SSA (~0.1%) retrieval results, except for days with two aerosol layers [47].
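The backscatter-weighted height described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the study: the function name `aerosol_layer_height`, its arguments, and the choice to skip non-positive backscatter values are assumptions made for the example.

```python
def aerosol_layer_height(heights_km, backscatter, top_km=10.0):
    """Attenuated-backscatter-weighted mean height, in km.

    heights_km  : layer heights H(i), in km
    backscatter : attenuated backscatter B_sc(i) for the same layers
    top_km      : layers above this altitude are ignored (10 km in the text)
    """
    weighted_sum = 0.0
    total = 0.0
    for h, b in zip(heights_km, backscatter):
        # Assumption: non-positive (noise-dominated) samples carry no weight.
        if h > top_km or b <= 0.0:
            continue
        weighted_sum += h * b
        total += b
    if total == 0.0:
        return None  # no usable signal in the column
    return weighted_sum / total
```

With heights of 1, 2, and 3 km and backscatter weights of 1, 2, and 1, the function returns 2.0 km, the centre of the symmetric profile.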

AERONET
AERONET is a global aerosol monitoring network of ground-based sun-sky photometers [49]. Sun-sky photometers measure direct sun irradiance and directional sky radiance in almucantars [50]. The direct sun data provide column-integrated AOD at nominal standard wavelengths of 340, 380, 440, 500, 675, 870, 935, 1020, and 1640 nm, and have been used for ground-truth AOD measurements for over 25 years. Column-averaged aerosol microphysical and optical properties (e.g., refractive index, volume size distribution (VSD), and SSA) at 440, 670, 870, and 1020 nm are retrieved by combining the AOD data and almucantar sky measurements at the four wavelengths.
Uncertainties in AERONET direct-measurement AOD data range from ±0.01 at visible wavelengths to ±0.02 at near-UV wavelengths [51]. For AERONET version 2 L2 inversion data, the theoretical uncertainty of SSA data is ±0.03 for 440 nm AOD > 0.4 [51,52]. AERONET inversion SSA data are provided only for 440 nm AOD > 0.4, so SSA data can be compared only under those conditions. Uncertainties in SSA are caused mainly by instrument calibration inaccuracies [52]. The instruments are calibrated regularly with reference Cimel instruments at the NASA Goddard Space Flight Center, using the Langley method. The AERONET database has been updated recently to version 3 [50], with the new algorithm providing cloud screening and automatic instrument anomaly control. The AERONET version 3 L2 database also allows more AOD observations under partly cloudy conditions [53]. Reported AERONET AOD differences between versions 3 and 2 average +0.002 (±0.004 standard deviation, SD) for time-matched observations [50].
Here, AERONET L2 AOD values at 380 and 440 nm, and SSA values at 440 nm, were used for comparison with satellite-retrieval AOD and SSA data. AERONET direct-measurement AOD data were used as true references for validation of satellite-retrieval AOD data, whereas the comparison of satellite-retrieval SSA and AERONET SSA data is not considered appropriate for validation because both inversion techniques involve assumptions [32]. Based on field studies [54,55], Jethva et al. [32] found that although the theoretical uncertainty of AERONET SSA is ±0.03, the actual uncertainty may be ~±0.05. We therefore used both ±0.03 and ±0.05 uncertainty thresholds in comparisons of retrieved SSA and AERONET data.

Algorithm for Synergistic Broadband Meteorological Imager Use
Flowcharts for producing an optimal aerosol dataset are shown in Figures 1 and 2, with the former pertaining to the improvement of GEMS aerosol algorithms in terms of cloud masking and predominant aerosol size mode selection, and the latter to the merging of the improved GEMS and MI aerosol products.

Figure 1. Flowchart of the GEMS aerosol algorithm [33,37]. The figure is modified from Figure 1 of Go et al. [33]. Blue shading indicates synergistic use of meteorological instruments.

The unshaded components of Figure 1 correspond to the GEMS aerosol algorithm, which was developed and tested continuously with OMI L1B data; improvements have been described in previous studies [33,37]. The blue shading indicates synergistic use of a meteorological instrument, with MI cloud masking (Section 3.1) being applied first. The high spatial resolution of the MI with IR channels helped mask sub-pixel clouds and cirrus clouds for GEMS aerosol retrieval results. The TDCI, calculated from MI IR channels, was applied next to the GEMS aerosol algorithm to separate dust from the GEMS aerosol type. Corrected aerosol types were applied to the GEMS aerosol algorithm (Section 3.2). Finally, as shown in Figure 2, GEMS and MI AOD data were combined using the MLE method by applying weights calculated from the RMSE of the original AOD products (Section 3.3). Detailed descriptions of each step are provided below.

GEMS Aerosol Algorithm and Synergistic Use of Cloud Masking
Detailed descriptions of improvements to the GEMS aerosol algorithm have been published previously [33,37]. GEMS aerosol algorithms are based on optimal estimation (OE) involving an off-line look-up table (LUT). Aerosol models included in the LUT were integrated from the long-term AERONET inversion dataset. The spectral dependence of aerosol absorption at wavelengths of <440 nm was refined using UV Multi-Filter Rotating Shadowband Radiometer (UV-MFRSR) ground-based measurement data obtained over Seoul, Korea, to give the most appropriate spectral aerosol absorption model for East Asia [49,56]. The Dubovik package [57][58][59] was applied to generate the LUT for the DS aerosol model. The aspect ratio distribution of DS was adopted from previous field data [60][61][62][63], because the aspect ratio distribution varies by region [58]. For the inversion procedure, the aerosol types BC, DS, and non-absorbing (NA) were first selected by calculating UV aerosol index (UVAI) and visible aerosol index (VISAI) values. Surface reflectance corrections were based on the minimum Lambertian equivalent reflectivity (LER) [64]. Initial AOD and SSA estimates were retrieved using a two-channel method, and ALH was retrieved using an OE method involving six channels (354, 388, 412, 443, 477, and 490 nm). ALH in this study is defined by Equation (2).
MODIS L2 provides cloud-mask products, which were applied in the aerosol retrieval algorithm at a 500-m spatial resolution, with 0 indicating clear conditions and 1 cloudy conditions. For each pixel of MODIS L2, the nearest GEMS pixel was allocated, and MODIS L2 pixels within a GEMS pixel were averaged. If the average cloud fraction for a GEMS pixel was above a certain value (from 20% to 80%), that pixel was considered contaminated by cloud, and masked.
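The aggregation step above (average the MODIS cloud flags that fall in each GEMS pixel, then mask by threshold) can be illustrated with a short sketch. It assumes the nearest-GEMS-pixel index for every MODIS pixel has already been computed; the function name `gems_cloud_mask` and the flat index representation are hypothetical choices for this example.

```python
from collections import defaultdict


def gems_cloud_mask(nearest_gems_index, modis_cloud_flags, threshold=0.2):
    """Flag GEMS pixels whose mean MODIS cloud fraction exceeds `threshold`.

    nearest_gems_index : for each MODIS pixel, the id of its nearest GEMS pixel
                         (assumed precomputed by a nearest-neighbour search)
    modis_cloud_flags  : for each MODIS pixel, 0 (clear) or 1 (cloudy)
    threshold          : cloud-fraction limit; the text tests 0.2 to 0.8
    Returns a dict {gems_pixel_id: True if the pixel should be masked}.
    """
    cloudy = defaultdict(int)
    count = defaultdict(int)
    for gems_id, flag in zip(nearest_gems_index, modis_cloud_flags):
        cloudy[gems_id] += flag
        count[gems_id] += 1
    return {g: (cloudy[g] / count[g]) > threshold for g in count}
```

Lowering `threshold` makes the screening stricter: a GEMS pixel in which one of four MODIS pixels is cloudy is masked at 0.2 but kept at 0.8.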

Total Dust Confidence Index
The 300-500 nm channels cannot easily distinguish between smoke and dust, because both types of particles cause Mie scattering in that range. Mie scattering efficiency is at its maximum when particle diameter and wavelength are similar, so particle size alone does not separate smoke from dust at 300-500 nm. However, IR channels can distinguish between smoke and dust, with smoke involving Rayleigh scattering and dust Mie scattering. Previous studies have demonstrated dust particle detection using IR channels such as 8.6, 11, and 12 µm. Park et al. [39] developed a new dust detection algorithm involving the TDCI, overcoming previous dust-detection limitations in distinguishing clear-sky pixels, optical properties of dust particles, and cloud contamination.
The TDCI algorithm uses 6.3, 8.6, 11, and 12 µm wavelengths, which are applicable for both MODIS and AMI instruments, with TDCI > 40 indicating a dust aerosol type [39]. TDCI values are more accurate with high AOD and low fine-mode fraction (FMF) values [39]. For cases with TDCI > 40, FMF < 0.4, and AOD > 0.4, the agreement of TDCI was ~0.146 and the false detection 0.444 [39]. However, for cases with TDCI > 40, FMF < 0.4, and AOD > 1.0, the agreement of TDCI was 0.764 and the false detection 0.379 [39].

Maximum Likelihood Estimation Method
The MLE method provides a weighted average of original data products, with statistical variables such as mean, number of counts, and SD being used as weighting factors. The method can account for pixel-level uncertainties of aerosol products. In this study, Equations (3) and (4) of the MLE method [23] were used to calculate merged AOD products at each point, with merged AOD re-gridded from the original AOD values:

$$ \tau_i^{MLE} = \frac{\sum_{k=1}^{N} R_{i,k}^{-2}\, \tau_{i,k}}{\sum_{k=1}^{N} R_{i,k}^{-2}} \qquad (3) $$

$$ R_{i,k} = \sqrt{\frac{\sum_{i=1}^{M} \left( s_{i,k} - g_i \right)^2}{M}} \qquad (4) $$

where τ_i^MLE is the final AOD at the new spatial resolution of grid point i; R_i,k is the SD of the new grid point i for the original AOD product k (SD can be used as a measure of AOD uncertainty for each AOD and surface albedo interval, and AOD expected error when no information is available for the true reference dataset, such as AERONET data); τ_i,k is the mean AOD value at the new spatial resolution of grid point i (from the spatial resolution of instrument-AOD original product k); N is the total number of original AOD products; s_i,k is the mean instrument AOD at the new spatial resolution grid point i; g_i is the mean AERONET AOD value for gridded point i; M is the number of collocated points between s_i,k and g_i. The term s_i,k is the spatial average over 0.4°, and g_i is the temporal average over 30 min.
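As a rough illustration of the merging step, the sketch below computes an uncertainty-weighted mean of collocated AOD values at one grid point. It assumes inverse-variance weights (1/R², with R the per-product error estimate), which is the usual MLE weighting; the function name `mle_merge` and the use of `None` for a missing retrieval are assumptions for this example rather than details taken from the study.

```python
def mle_merge(aod_values, error_estimates):
    """Merge collocated AOD values from several products at one grid point.

    aod_values      : AOD from each product (None where no retrieval exists)
    error_estimates : error estimate R for each product at this grid point
    Returns the weighted-mean AOD, or None if no product has a retrieval.
    """
    numerator = 0.0
    denominator = 0.0
    for tau, r in zip(aod_values, error_estimates):
        if tau is None:
            continue  # this product contributes nothing at this grid point
        weight = 1.0 / (r * r)  # assumed inverse-variance weight
        numerator += weight * tau
        denominator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator
```

For AOD values of 0.4 and 0.6 with error estimates of 0.1 and 0.2, the weights are 100 and 25, so the merged value is 0.44, closer to the product with the smaller error. Where only one product has a retrieval, that value passes through unchanged, which is how merging can improve spatial coverage.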
GEMS 443 nm AOD values simulated with OMI L1B data (Figure 2) were converted to 550 nm AOD values using the assumed absorption Ångström exponent (AAE) value. The MYD04 550 nm AOD data were converted to GEMS grid data. Each MYD04 pixel was assigned to the nearest GEMS pixel, and the collected MODIS data within each GEMS pixel were averaged. The RMSE values of each original instrument AOD product (MYD04 AOD and GEMS L2 AOD) were calculated. To merge several instruments' aerosol products without using ground-based measurement data such as AERONET, it was necessary to find the relationship of RMSE with other parameters concerning regional or global AOD products. Surface reflectance is a major source of AOD error. AOD magnitude is also a key parameter that affects AOD error; with strong AOD plumes, surface reflectance will not be the major source of error in AOD retrievals. Therefore, four sets of surface albedo values (0.0-0.05, 0.05-0.08, 0.08-0.11, and 0.11-0.25) and four sets of AOD values (0.0-0.25, 0.25-0.5, 0.5-0.8, and 0.8-5.0) were considered when calculating RMSE values [23]. For surface albedo, MODIS BSA visible (0.3-0.7 µm) products (MCD43A3 version 6) were used (MCD43 products are independent variables from the GEMS aerosol and MODIS DT aerosol products).
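The binned RMSE described above amounts to a 4 × 4 look-up indexed by surface albedo and AOD interval. The sketch below shows one way such a look-up could be organised. The interval edges come from the text; the function names, the handling of values that fall exactly on an edge, and the table layout (`rmse_table[albedo_bin][aod_bin]`) are assumptions for illustration.

```python
from bisect import bisect_right

# Interval edges taken from the text.
ALBEDO_EDGES = [0.0, 0.05, 0.08, 0.11, 0.25]
AOD_EDGES = [0.0, 0.25, 0.5, 0.8, 5.0]


def bin_index(value, edges):
    """Return the interval index for `value`, or None if it lies outside."""
    if value < edges[0] or value > edges[-1]:
        return None
    # Assumption: a value on an interior edge goes to the upper interval,
    # and the top edge is clamped into the last interval.
    return min(bisect_right(edges, value) - 1, len(edges) - 2)


def lookup_rmse(rmse_table, surface_albedo, aod):
    """Pick the RMSE for one pixel from a 4 x 4 table of binned RMSE values."""
    i = bin_index(surface_albedo, ALBEDO_EDGES)
    j = bin_index(aod, AOD_EDGES)
    if i is None or j is None:
        return None
    return rmse_table[i][j]
```

The value returned for each product can then serve as that product's error estimate in the weighted merge, so that no AERONET data are needed at the pixel being merged.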

Synergistic Use of Meteorological Imagers for Cloud Masking
Case-study results for synergistic use of an MI in cloud masking are shown in Figure 3. The IR channels detect cirrus clouds, owing to ice particle absorption properties. The GOCI red-green-blue (RGB) images for 22 May and 25 May 2016 are shown in Figure 3a,b, respectively, with each being a combined true-color TOA-reflectance image for 660 (red), 555 (green), and 490 nm (blue) after correction of the Rayleigh optical depth signal. On 22 May, cirrus clouds were present over the northeastern Korean Peninsula and Russia at 130 [...]. The GEMS AOD results retrieved using OMI L1B data are shown in Figure 3c, with cirrus cloud regions being misidentified as high aerosol plumes with AOD values of >2.5. The MODIS and AHI cloud-mask data were used with the GEMS AOD algorithm to mask the misidentified pixels. Figure 3e,g show the effect of synergistic use of MODIS (500 m spatial resolution) and AHI cloud-mask data, respectively; in both cases, the cirrus clouds are well masked. In contrast, Figure 3b,d,f,h show thick aerosol plumes, including over the western Korean Peninsula and the ocean (Figure 3b,d). The aerosol plumes over the ocean remained after applying MODIS and AHI cloud masking, as shown in Figure 3f,h, respectively.
Results of a GEMS 443 nm AOD validation test with OMI L1B data, 2005-2007, are shown in Figure 4, with AERONET L2 direct-sun AOD data being used as true references. Validation of GEMS AOD data constrained with MODIS cloud fractions of <80% to <20% (in sequential order) is demonstrated in Figure 4a-d.
Clouds generally have a higher optical depth than aerosols, and most of the overestimated satellite-retrieval AOD points were caused by clouds. As the AOD-pixel cloud fraction threshold decreased from Figure 4a to Figure 4d, the overestimated pixels were screened out, especially over areas with AERONET AOD values of <1.0, with the correlation coefficient (R) increasing from 0.786 to 0.871, the RMSE decreasing from 0.341 to 0.276, and the Q (fraction of data points within the expected error range of 30% or 0.1; [65]) increasing from 46.77% to 57.32%. This demonstrates that MODIS cloud masking (i.e., high-spatial-resolution MI IR channel data) can be useful for improving GEMS AOD data in cases of cloud contamination.
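The three validation statistics quoted above (R, RMSE, and Q) can be computed from collocated satellite and AERONET AOD pairs as in the following sketch. It assumes that the expected-error envelope is the larger of 30% of the AERONET AOD and 0.1, which is one common reading of "30% or 0.1"; the function name `validation_stats` is hypothetical.

```python
import math


def validation_stats(satellite_aod, aeronet_aod):
    """Return (R, RMSE, Q) for collocated satellite and AERONET AOD pairs.

    Q is the percentage of pairs whose difference lies within the expected
    error, assumed here to be max(0.3 * AERONET AOD, 0.1).
    """
    n = len(satellite_aod)
    mean_s = sum(satellite_aod) / n
    mean_g = sum(aeronet_aod) / n
    cov = sum((s - mean_s) * (g - mean_g)
              for s, g in zip(satellite_aod, aeronet_aod))
    var_s = sum((s - mean_s) ** 2 for s in satellite_aod)
    var_g = sum((g - mean_g) ** 2 for g in aeronet_aod)
    r = cov / math.sqrt(var_s * var_g)  # Pearson correlation coefficient
    rmse = math.sqrt(sum((s - g) ** 2
                         for s, g in zip(satellite_aod, aeronet_aod)) / n)
    within = sum(1 for s, g in zip(satellite_aod, aeronet_aod)
                 if abs(s - g) <= max(0.3 * g, 0.1))
    q = 100.0 * within / n
    return r, rmse, q
```

A single cloud-contaminated pair with a large positive bias raises the RMSE and lowers Q, which is the pattern the stricter cloud-fraction thresholds remove.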

Application of TDCI
The application of TDCI to the GEMS aerosol retrieval procedure for a severe dust case (8 April 2006) is demonstrated in Figure 5, where (a) shows MODIS RGB TOA reflectance data; (b) shows GEMS aerosol types (BC, DS, NA) determined with UVAI and VISAI; (c) is the corresponding OMI UVAI; (d) shows the calculated MODIS TDCI as an indicator of dust (to clearly identify dust pixels, only TDCI values >40.0 are plotted); (e) shows GEMS aerosol types corrected from (b) after using TDCI; and (f) shows aerosol types from the OMI operational aerosol products for qualitative comparison. OMI operational aerosol products classify aerosol types as 'SMOKE', 'DUST', or 'SULFATE' based on UVAI and AIRS CO data [13]. The OMI aerosol types 'SMOKE', 'DUST', and 'SULFATE' in Figure 5f may correspond to the 'BC', 'DS', and 'NA' aerosol types in GEMS (Figure 5b,e), respectively. On that particular day, high UVAI values with high TDCI (>40) were recorded for long, thick aerosol bands in the Manchuria region, with coarse absorbing particles detected. The original GEMS aerosol type (Figure 5b) displayed most pixels as DS aerosols; however, after applying TDCI, this type remained only for severe aerosol plume areas. Conversely, the OMI aerosol type was 'SMOKE' in dust plume areas. Although OMAERUV applies a straightforward separation of the 'SMOKE' aerosol type based on AIRS CO data, dust aerosol can be classified as 'SMOKE' when present over a high-CO area such as eastern China during the spring season [13]. Here, OMAERUV (Figure 5f) classified many pixels as 'SULFATE' that did not appear as NA in GEMS (Figure 5e), because of the different thresholds for aerosol type selection. OMAERUV first applies UVAI to separate 'SMOKE' and 'DUST' (high UVAI) from 'SULFATE' (low UVAI), followed by AIRS CO data to separate 'SMOKE' (high CO) from 'DUST' (low CO). The GEMS aerosol algorithm first applies UVAI to separate BC and DS (high UVAI) from NA (low UVAI) types, followed by VISAI data to separate BC (low VISAI) and DS (high VISAI) types. Both the GEMS and OMI aerosol algorithms use UVAI, but our UVAI threshold was lower than that of OMI, causing detection of more BC aerosols. However, the BC-detection area in Figure 5e, over southern Japan, appears to have low AOD in the RGB image (Figure 5a), so the aerosol type has little practical meaning in this low-AOD case; rather, it is required only as a priori information for the aerosol retrieval procedure.
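The two-step GEMS type selection and the subsequent TDCI correction can be written as a small decision function. This is a sketch under stated assumptions: the UVAI and VISAI thresholds are not given in the text, so they are passed in as parameters (`uvai_min`, `visai_min`, both hypothetical names), and the correction rule (a DS pixel is kept only when TDCI > 40 and is otherwise reassigned to BC) is inferred from the description of Figure 5 rather than stated explicitly.

```python
TDCI_DUST_THRESHOLD = 40.0  # TDCI > 40 indicates dust, per the text [39]


def classify_aerosol_type(uvai, visai, uvai_min, visai_min, tdci=None):
    """Return 'NA', 'BC', or 'DS' for one pixel.

    uvai_min, visai_min : hypothetical thresholds (not given in the text)
    tdci                : optional MI-derived dust index; None skips the
                          synergistic correction
    """
    if uvai < uvai_min:
        return "NA"  # low UVAI: non-absorbing
    aerosol_type = "DS" if visai >= visai_min else "BC"
    # Assumed correction: dust is retained only where the IR index confirms it.
    if tdci is not None and aerosol_type == "DS" and tdci <= TDCI_DUST_THRESHOLD:
        aerosol_type = "BC"
    return aerosol_type
```

Under this rule NA pixels are never changed by the TDCI, which matches the observation that the NA statistics are identical before and after the correction.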
The MODIS RGB image for the same day (8 April 2006) is shown in Figure 6a, with Figure 6b,c showing the GEMS AOD and SSA data, respectively, before applying the TDCI-corrected aerosol type. Figure 6d displays TDCI calculated with MODIS, and Figure 6e,f show GEMS AOD and SSA, respectively, after application of the TDCI-corrected aerosol type. The retrieved SSA value for the northern Korean Peninsula decreased from 0.96 to 0.92 (Figure 6c-f), whereas the retrieved AOD value (Figure 6b-e) for low-AOD dust-storm regions (105°E-130°E; 40°N-50°N) showed no significant change. Sensitivity tests of the GEMS aerosol algorithm (not shown here) indicate that when AOD is low, TOA reflectance does not change significantly with aerosol type. In contrast, for SSA, the lower the AOD, the lower its sensitivity to TOA reflectance, making it difficult to retrieve SSA accurately.

POCD and POFD Analysis
AERONET aerosol data were selected as reference data to validate aerosol types between forward and inverse models, based on GEMS (before and after applying the TDCI during the inversion procedure) and OMI aerosol types. Aerosol types were investigated qualitatively using AERONET data, applying traditional statistical analysis such as accuracy, probability of correct detection (POCD), and the probability of false detection (POFD) [66]. FMF and 440 nm SSA data were first used to classify aerosol type based on AERONET inversion L2 data [31] before Equations (5)-(7) for accuracy, POCD, and POFD were applied to compare the GEMS and AERONET aerosol types.
Here, a represents true positives (i.e., the number of collocated points where both GEMS and AERONET algorithms indicate the existence of 'dust'); b represents false positives (i.e., the number of collocated points where AERONET indicates 'no dust' but GEMS indicates 'dust'); c represents false negatives (i.e., the number of collocated points where AERONET indicates 'dust' but GEMS indicates 'no dust'); d represents true negatives (i.e., the number of collocated points where both AERONET and GEMS algorithms indicate 'no dust'). Results are given in Tables 2-4, with statistics before and after applying the TDCI shown together for the GEMS aerosol type. For aerosol type BC, the GEMS POCD increased from 38% to 46% with the application of TDCI, while POFD increased from 16% to 47% and accuracy decreased from 62.5% to 52%. For the DS aerosol type, the GEMS POCD decreased from 100% to 66%, while the POFD decreased from 93% to 82%. However, there were few DS data points for a and c because the statistical analysis relied on AERONET inversion data points, so the DS POCD is not reliable. Nevertheless, the GEMS DS accuracy increased from 72% to 94% after applying the TDCI, due to the increased number of data points available for d. For the NA aerosol type, the number of data points for a, b, c, and d was the same before and after applying the TDCI. For the OMI aerosol type [13], the overall BC and DS accuracies were higher than those of the GEMS aerosol type. OMI uses AIRS CO data for separating the BC and DS aerosol types. The OMI NA accuracy, however, was lower than that of the GEMS aerosol type.
For nadir-view instrument data, accurate aerosol-type retrieval is difficult because of its low information content. Although AERONET inversion data have high accuracy [51,52,58], the aerosol-type classifications from forward modeling (based on the method of Lee et al. [31]) and inverse modeling (based on UVAI, VISAI, and TDCI) are not identical. Therefore, a qualitative comparison method was used here to assess GEMS aerosol-type accuracy.
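The contingency counts a, b, c, and d defined above can be turned into scores as in the following sketch. Equations (5)-(7) are not reproduced in this section, so the formulas used here are the conventional contingency-table definitions, accuracy = (a + d)/(a + b + c + d), POCD = a/(a + c), and POFD = b/(b + d), and may differ from the exact forms used for Tables 2-4. The function names are hypothetical.

```python
def count_cells(satellite_flags, reference_flags):
    """Count contingency cells from paired boolean detections.
    a: both positive, b: satellite only, c: reference only, d: both negative."""
    a = b = c = d = 0
    for sat, ref in zip(satellite_flags, reference_flags):
        if sat and ref:
            a += 1
        elif sat:
            b += 1
        elif ref:
            c += 1
        else:
            d += 1
    return a, b, c, d


def contingency_scores(a, b, c, d):
    """Return (accuracy, POCD, POFD) using the conventional definitions
    (an assumption; see the lead-in). Undefined ratios become NaN."""
    nan = float("nan")
    total = a + b + c + d
    accuracy = (a + d) / total if total else nan
    pocd = a / (a + c) if (a + c) else nan
    pofd = b / (b + d) if (b + d) else nan
    return accuracy, pocd, pofd
```

When a + c is small, as reported for the DS type, POCD changes by large steps for every reclassified point, which is one way to see why the text treats the DS POCD as unreliable.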

Validation of Aerosol Results after Applying TDCI
AOD validation and SSA comparison results before and after applying the TDCI are shown in Figure 7, where (a), (c), and (e) represent results before applying TDCI and (b), (d), and (f) represent results after applying TDCI. The GEMS 380 nm AOD values are extrapolated from 443 nm AOD data based on the AAE value of the selected aerosol type. SSA comparison results improved significantly after the application of the TDCI, but AOD validation showed no significant improvement. For SSA, the fraction of data points within the 0.03 difference range increased from 42% to 51%. The sensitivity test of the GEMS aerosol algorithm indicates that SSA is significantly affected by incorrect aerosol size information, while AOD is less affected [37]. The 380 nm AOD validation results improved for high-AOD cases. The remaining error may thus be due to errors in surface reflectance data or forward aerosol modeling.
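The wavelength extrapolation mentioned above can be illustrated with a power-law spectral dependence. This is a sketch under an assumption: the text states only that the exponent is taken from the selected aerosol type, so the function below accepts the exponent as an input and the value used in the usage note is hypothetical.

```python
def extrapolate_aod(aod_ref, wavelength_ref_nm, wavelength_target_nm, exponent):
    """Scale AOD from a reference wavelength to a target wavelength,
    assuming AOD is proportional to wavelength**(-exponent)."""
    ratio = wavelength_target_nm / wavelength_ref_nm
    return aod_ref * ratio ** (-exponent)
```

For example, `extrapolate_aod(0.5, 443.0, 380.0, 1.0)` scales a 443 nm AOD of 0.5 by 443/380, giving a larger value at 380 nm, as expected for a positive exponent. The same function covers the 443 nm to 550 nm conversion used for the fused product.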
Results of bias analyses for retrieved ALH data are shown in Figure 8 before (a,c,e) and after (b,d,f) application of the TDCI. As mentioned in Section 2.3, the 1064 nm total attenuated backscatter (km⁻¹ sr⁻¹) from the version 4.2.0 Standard product (CAL_LID_L2_05kmAPro-Standard-V4-20) was used to calculate CALIOP ALH. To exclude the presence of clouds and the effects of noise, profiles with an average attenuated backscatter (averaged over all 399 layers) larger than 0.005 or smaller than 0.0015 were rejected, respectively [13]. CALIOP ALH values were then calculated using Equation (2). GEMS ALH pixels within a 20 km radius of the CALIOP path were collected and used in the validation. Due to the lack of CALIOP data, the validation period was 13 June to 31 December 2006, with a total of 3064 data points. ALH bias values (GEMS ALH minus CALIOP ALH) were sorted in order of the corresponding GEMS AOD and grouped into bins of 80 points each. The circle and error bar in each bin represent the mean and SD of the 80 collocated GEMS and CALIOP data points. Figure 8a,b shows the ALH bias with respect to retrieved GEMS AOD values. The overall mean ALH bias decreased slightly after application of the TDCI, with the bias approximating zero, especially for AOD >~0.4. The ALH bias with respect to retrieved SSA for all AOD cases is shown in Figure 8c,d. To clearly show the ALH bias with respect to SSA only, pixels with GEMS AOD < 0.4 were omitted in Figure 8e,f. AERONET SSA is reported only for AOD > 0.4, because SSA sensitivity and accuracy are low when AOD is low [51,52]. Figure 8e,f shows the ALH bias with respect to retrieved SSA for GEMS AOD > 0.4, with the bias being close to zero for both cases, especially with absorbing particles (SSA < 0.95). This is consistent with UV channels being more sensitive than visible channels to aerosol absorption signals due to the predominant Rayleigh scattering.
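The binning used for Figure 8 (sorting the bias by AOD and summarizing every 80 collocated points) can be sketched as follows. Two details are assumptions because the text does not specify them: an incomplete final bin is dropped, and the SD is computed as a population standard deviation. The function name is hypothetical.

```python
from statistics import fmean, pstdev


def binned_bias(aod, bias, points_per_bin=80):
    """Sort (AOD, bias) pairs by AOD and summarize fixed-size groups.
    Returns three lists: mean AOD, mean bias, and SD of bias per bin.
    Assumption: leftover points that do not fill a bin are ignored."""
    pairs = sorted(zip(aod, bias), key=lambda pair: pair[0])
    n_bins = len(pairs) // points_per_bin
    centers, means, sds = [], [], []
    for i in range(n_bins):
        chunk = pairs[i * points_per_bin:(i + 1) * points_per_bin]
        chunk_bias = [b for _, b in chunk]
        centers.append(fmean(a for a, _ in chunk))
        means.append(fmean(chunk_bias))
        sds.append(pstdev(chunk_bias))  # population SD (assumed)
    return centers, means, sds
```

With the 3064 collocated points mentioned above and 80 points per bin, this sketch would yield 38 complete bins. Using equal-count bins rather than equal-width AOD intervals keeps the sampling uncertainty comparable across bins, even where high-AOD cases are sparse.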

Weight Calculation
Calculated RMSE values for GEMS AOD simulated with OMI L1B data and for MODIS DT AOD data for the three years 2005-2007 are given in Table 5, which shows that the RMSE values reflect the characteristics of GEMS and MODIS AOD well. For bright-surface albedo ranges of 0.08-0.11 and 0.11-0.25, the number of GEMS AOD data points (36, 5, and 13) and data number percentages (3.54%, 0.49%, and 1.28%; see Table caption for definition) were higher than those of MODIS DT AOD data. In the lowest surface albedo (0.0-0.05) and AOD (0.0-0.25) ranges, the GEMS RMSE (0.087) is lower than that of MODIS DT (0.123). The GEMS aerosol algorithm uses UV channels in the inversion procedure, where Rayleigh signals predominate over surface-reflectance signals, so GEMS AOD products are less affected by surface-reflectance variations. However, the data number percentage of GEMS AOD (15.93%) is lower than that of MODIS DT products (23.97%), indicating that the GEMS aerosol algorithm sometimes retrieves negative AOD for very clear regions. When AOD values are very low, TOA reflectance differences (e.g., the TOA reflectance difference between AOD of 0.0 and 0.1) are very small. Therefore, a combination of inaccurate aerosol-type assumptions and either over-compensation of Rayleigh scattering or overestimation of surface albedo may cause this negative AOD problem, which requires further assessment.
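To illustrate how such RMSE values can act as weights, the following sketch merges collocated AOD values by inverse-variance weighting, which is the usual closed form of an MLE merge under Gaussian, zero-bias errors. It is an assumption that this matches the weighting defined earlier in Equation (4), which is not reproduced here. The function name is hypothetical, and `None` marks a product with no retrieval for the pixel.

```python
def mle_fuse(values, rmses):
    """Merge collocated AOD values with weights 1 / RMSE**2.
    Entries equal to None (no retrieval) are skipped; returns None
    if no product provides a value for this pixel."""
    numerator = 0.0
    denominator = 0.0
    for value, rmse in zip(values, rmses):
        if value is None:
            continue
        weight = 1.0 / (rmse * rmse)
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator
```

As a usage example with the RMSE values quoted above for the darkest, lowest-AOD range, `mle_fuse([0.10, 0.20], [0.087, 0.123])` returns a value closer to the first input, because the product with the smaller RMSE receives the larger weight. Where only one product retrieves a value (for instance GEMS over bright surfaces), that value is passed through unchanged.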

Validation Results
Results of a case study of AOD data fusion on 24 March 2018 are shown in Figure 9. On that day, the MODIS RGB image (Figure 9a) indicated aerosol plumes over eastern China with clouds over the ocean, while there were low aerosol concentrations over northwestern China. GEMS 443 nm AOD data simulated with OMI L1B are shown in Figure 9c. OMI has row-anomaly issues in the middle part of the across-track sequence, so the affected rows are not suitable for aerosol retrievals [36]. MODIS DT C6.1 550 nm AOD products are shown in Figure 9d. MODIS DT uses visible channels for retrieval, so it does not retrieve aerosol properties over bright surfaces, as shown in Figure 9d. Fused 550 nm AOD products are shown in Figure 9b, based on extrapolation of 443 nm GEMS AOD to 550 nm using selected aerosol model assumptions. Over bright surface areas, GEMS aerosol products were used. For the aerosol plume area, GEMS and MODIS DT aerosol products were fused using the MLE method. The western coast of the Korean Peninsula was within the sun-glint area for both OMI and MODIS instruments, so aerosol data were not retrieved. The fused 550 nm AOD products exhibit smooth and consistent spatial distributions compared with the original individual AOD products, implying that the ranges of AOD and surface albedo are well established. If the two ranges were not well established, the spatial distribution of the fused AOD products would have been rough and uneven.
Validation results for GEMS AOD (using OMI L1B as a proxy), MODIS DT AOD, and fused AOD products are shown in Figure 10; Figure 10b,d,f shows the corresponding normalized frequency histograms of AOD bias. The fused AOD products have accuracies comparable with those of MODIS DT AOD products, with an increased number of data points. Moreover, the AOD bias distribution of the fused AOD products is similar to that of MODIS DT AOD products.
From the perspective of GEMS AOD (Figure 10a), the merged products (Figure 10e) produced better statistics overall, with increasing slope (from 0.618 to 0.776; Figure 10a,e), decreasing offset (from 0.184 to 0.112), increasing correlation coefficient (from 0.883 to 0.887), decreasing RMSE (from 0.234 to 0.205), and increasing Q (from 59.31% to 69.79%), with an increasing number of total validation points (from 548 to 1066). Fused results are thus beneficial for GEMS AOD in terms of increasing accuracy and spatial coverage. Furthermore, compared with the OMAERUV AOD product, fusion may provide a straightforward means of validation, as the discarding of data with SD > 0.3 (to mitigate possible sub-pixel cloud contamination [65]) is avoided.
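The validation statistics quoted above (slope, offset, correlation coefficient, RMSE, and Q) can be computed from collocated pairs as in this sketch. Q is not defined in this section; the sketch assumes it is the percentage of retrievals falling within an expected-error envelope of ±(0.05 + 0.15 × AOD) around the reference, which is the MODIS DT land envelope cited in the Discussion. The function name and the ordinary least-squares fit are likewise assumptions.

```python
import math


def validation_stats(reference, retrieved):
    """Return slope, offset, correlation, RMSE, and Q (percent) for
    paired AOD values. Assumes at least two distinct reference values."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(retrieved) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in retrieved)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, retrieved))
    slope = sxy / sxx  # ordinary least squares of retrieved on reference
    offset = mean_y - slope * mean_x
    correlation = sxy / math.sqrt(sxx * syy)
    rmse = math.sqrt(sum((y - x) ** 2 for x, y in zip(reference, retrieved)) / n)
    # Assumed definition of Q: share of points inside +/-(0.05 + 0.15 * AOD).
    inside = sum(1 for x, y in zip(reference, retrieved)
                 if abs(y - x) <= 0.05 + 0.15 * x)
    return {"slope": slope, "offset": offset, "r": correlation,
            "rmse": rmse, "q": 100.0 * inside / n}
```

A slope below 1 together with a positive offset, as reported for GEMS AOD before fusion, corresponds to overestimation at low AOD and underestimation at high AOD; both move toward the one-to-one line after merging.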
Ratios of retrieved pixels to total measurement L1B pixels for each original AOD product and the fused AOD dataset are plotted in Figure 11. Both GEMS AOD (using OMI L1B as a proxy) and MODIS DT products had high data availability in autumn and low availability in summer due to low and high cloud fractions, respectively. GEMS AOD products had better spatial coverage, especially during autumn and winter. The fused products displayed better spatial data availability than each source AOD product.
Figure 11. Ratio of retrieved area for fused AOD dataset and source AOD products for 1 January to 31 December 2006.

Error Analysis Before and After Synergistic Meteorological Imager Use
Error analyses were undertaken for GEMS AOD before and after aerosol fusion, with results presented in Figures 12 and 13, which show GEMS AOD bias with respect to parameters related to the inversion procedure. The scattering angle, calculated from OMI measurement geometry for each pixel, is considered in Figure 12a, and AE values, calculated from AERONET AOD at 440 and 675 nm, are used in Figure 12b as indicators of aerosol size. The bias with respect to the 443 nm surface reflectance from the OMI surface climatology database [64], as used here in the GEMS AOD inversion procedure, is plotted in Figure 13a, and the GEMS-AERONET AOD difference is plotted against AERONET AOD in Figure 13b. In Figures 12 and 13, the blue color represents GEMS AOD simulated with OMI L1B data, and the red color indicates the fused AOD dataset, with each bin representing the median and SD of 40 collocated GEMS AOD (using OMI L1B as a proxy) and AERONET data points. Overall, bias analysis indicates decreased absolute median bias (AMB) for all related parameters: 0.0549 to 0.0239 for scattering angle; 0.0241 to 0.0044 for AE; 0.0612 to 0.0232 for surface reflectance; 0.0612 to 0.0344 for AOD. Both the AMB and the bin median bias values decreased after merging of the AOD products.
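The AE used as a size indicator in Figure 12b follows from AOD at two wavelengths through the standard Ångström relation, AE = −ln(τ₁/τ₂)/ln(λ₁/λ₂). The sketch below applies it to the 440 and 675 nm pair named above; the function name and default arguments are illustrative choices rather than part of the original analysis.

```python
import math


def angstrom_exponent(aod_1, aod_2, wavelength_1_nm=440.0, wavelength_2_nm=675.0):
    """Angstrom exponent from AOD at two wavelengths (both AODs > 0)."""
    return -math.log(aod_1 / aod_2) / math.log(wavelength_1_nm / wavelength_2_nm)
```

Small AE values indicate weak spectral dependence and hence coarse particles, which is why the 0.5-0.7 range is associated with coarse particles in the discussion of Figure 12b.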

The greatest decrease in bias occurred in the backscattering geometry (Figure 12a). Physically, the same surface may appear brighter in backscattering geometry than in forward-scattering geometry. The GEMS bias in backscattering geometry may be caused by our use of the Kleipool dataset [64], which is a monthly composite that does not account for geometry. The MODIS DT algorithm takes into account the angular effect (scattering angle) in surface reflectance estimation, so the backscattering error might have decreased after fusion with MODIS DT. This suggests that a surface reflectance product, rather than the Kleipool dataset, should be developed further for use with GEMS.
The AOD bias did not change significantly over the AE range of 0.7-1.3 (Figure 12b), which indicates that neither GEMS nor MODIS DT has large errors for medium-sized particles. The bias at low AE (0.5-0.7, corresponding to coarse particles) decreased, and that at high AE (1.3-1.8) slightly increased. This suggests that the two AODs are mixed, with the bias being fused evenly in terms of particle size. Figure 13a indicates that the bias decreased evenly over the entire range of surface reflectance, implying that the surface reflectance range used in Equation (4) was set correctly. In Figure 13b, the bias approaches zero over the entire AOD range after fusion. At very low AOD (0.0-0.2), there was a bias of 0.1-0.2 for GEMS, which decreased after fusion. The RMSE of GEMS AOD in Table 5 was 0.087 for BSA 0.0-0.05 and AOD 0.0-0.25; 0.117 for BSA 0.05-0.08 and AOD 0.0-0.25; 0.145 for BSA 0.08-0.11 and AOD 0.0-0.25. When considering the GEMS AOD bias with respect to the scattering angle (Figure 12a), it can be inferred that this bias arises largely from the backscattering geometry, and this point should be investigated in future applications of the GEMS aerosol algorithm.
The MLE method uses RMSE as a weighting factor, so the more accurate satellite product contributes a larger fraction to the fused AOD. Validation results might be improved significantly if the merged AOD were based on the more accurate AOD data from the two satellites without applying the RMSE weighting factor, although the AOD spatial distribution could then have significant discontinuities. The advantage of the MLE method is therefore that the errors of the two satellites can be considered using RMSE, and it is possible to calculate the merged AOD with minimal spatial discontinuity. The disadvantage, however, is the assumption that all outputs have a bias of zero. If the merged AOD were calculated after correction of the AOD bias from each satellite L2 AOD product, more accurate merged output could be obtained.
Figure 13. (a) Bias of GEMS AOD with respect to surface reflectance (443 nm), as used for AOD retrieval. (b) Bias of GEMS AOD with respect to collocated AERONET AOD direct-measurement data (L2). Blue and red refer to before and after aerosol fusion, respectively. The AMB was calculated for the total collocated blue and red points. Circles and bars represent median and SD values for 40 collocated data points for GEMS and AERONET data.

Discussion
In this study, we presented a method for synergistic use of hyperspectral UV-visible and MI data to improve aerosol retrieval products and to produce optimal AOD datasets. The proposed methods were simulated using OMI and MODIS data, which can be proxies for GEMS and AMI data, highlighting improvements in aerosol results. The methodology presented here is generally applicable to other UV-visible instruments and meteorological satellites. Previous studies generally aimed to improve aerosol retrieval products [12-19], or to create 'best' AOD datasets by merging individual AOD products [20-25] to provide users with optimal AOD data for points observed simultaneously by multiple satellites.
With regard to cloud masking, if clouds are much smaller than the pixel spatial resolution, their signals cannot be distinguished from aerosol signals. For example, a thin cloud 500 m in size is clearly visible in satellite RGB imagery with a resolution of 500 m, but is indistinguishable at a spatial resolution of 10 km. Many UV-visible instruments have spatial resolutions of >3.5 km and may have sub-pixel cloud contamination issues that affect validation results. We have shown that it is possible to separate clouds more accurately using high-resolution MI cloud data for observations of the same location. As the cloud fraction in the AOD pixel decreased from 80% to 20%, the AOD correlation coefficient increased from 0.786 to 0.871. Our method thus reduces the need to screen clouds at the validation stage using the SD of the data, and helps determine where clouds are in the aerosol retrieval region.
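The sub-pixel screening idea can be illustrated by computing, for one coarse aerosol pixel, the fraction of collocated high-resolution imager pixels flagged as cloudy, and rejecting the retrieval when that fraction is too high. This is a minimal sketch: the function names are hypothetical, the collocation of imager pixels to the coarse pixel is assumed to have been done already, and the 20% limit simply echoes the lower cloud-fraction value quoted above rather than an operational setting.

```python
def cloud_fraction(mask_block):
    """Fraction of cloudy high-resolution pixels (truthy = cloudy)
    among those falling inside one coarse aerosol pixel."""
    flags = [bool(value) for row in mask_block for value in row]
    return sum(flags) / len(flags)


def screen_aod(aod, mask_block, max_cloud_fraction=0.2):
    """Return the AOD if the sub-pixel cloud fraction is acceptable,
    otherwise None. The default limit is illustrative only."""
    if cloud_fraction(mask_block) <= max_cloud_fraction:
        return aod
    return None
```

For example, a coarse pixel covered by a 2 × 2 block of imager pixels with one cloudy flag has a cloud fraction of 0.25 and would be rejected at the 0.2 limit, even though the coarse-resolution radiance alone might not reveal the contamination.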
We have also proposed a new method for applying the TDCI algorithm to UV-visible instrument aerosol retrieval processes, based on IR data being suitable for DS discrimination. Statistical analysis indicates that GEMS DS aerosol detection accuracy was improved from 72% to 94% using TDCI for aerosol type selection. In addition, SSA results were improved, particularly for aerosol types BC and DS. UV-visible instruments generally face the difficulty of accurately separating aerosol size information when retrieving aerosol data. It is difficult to accurately determine the aerosol type from nadir-view satellites, owing to their low information content, but it is possible to determine aerosol size or absorption information within broad aerosol-type groups from wavelength-dependent aerosol optical properties. Torres et al. [13] provided a novel method for using AIRS CO as a tracer of smoke and for its application in the aerosol retrieval algorithm. However, since neither GEMS nor AMI can obtain CO data, that method is difficult to apply to them. After synergistic use of MI at the algorithm stage, we have shown that fused AOD data may provide spatially wider AOD coverage with consistent accuracy. If a problem occurs with one payload, another satellite can compensate spatially. Here, only uncertainties at pixel level were considered for the output of AOD, and this could be improved if the bias characteristics of each output were utilized.
In this study, the LEO OMI and MODIS instruments were used as proxy datasets for GEMS and AMI. OMI and MODIS have points in common with GEMS and AMI, such as FWHM, spectral sampling, and wavelength, but there are also differences in temporospatial resolution and instrument errors. Previous studies indicated that the OMI operational aerosol algorithm (OMAERUV) can be tested using the Global Ozone Monitoring Experiment (GOME) L1B as pre-launch product validation (OMI Algorithm Theoretical Basis Document (ATBD) [67]) [68,69]. The OMAERUV algorithm has an established accuracy (~30%) based on heritage TOMS validation results [67,70]. The MODIS DT aerosol algorithm also exhibited the expected algorithm errors of ±(0.03 + (0.05 × AOD)) over ocean and ±(0.05 + (0.15 × AOD)) over land in the pre-launch Tropospheric Aerosol Radiative Forcing Observational Experiment (TARFOX) through use of a MODIS airborne simulator (MAS) aboard the NASA ER-2 aircraft [71-73]. After launch, OMI and MODIS satisfied the ~30% accuracy and the expected algorithm error of ±(0.05 + (0.15 × AOD)), respectively, as set pre-launch [20,65]. Of course, algorithm accuracy generally depends on payload calibration accuracy [20,40,41].
It is well known that the main sources of error in aerosol products are cloud contamination, errors in the assumed (or retrieved) surface reflectivity, instrumental errors, and errors in the aerosol models (in forward modeling) [67]. For instrument errors, aerosol product accuracy is affected by radiometric calibration errors such as radiometric calibration offsets, radiometric calibration scale factors, and radiometric noise, rather than spectral calibration errors.
Errors due to cloud contamination depend mainly on the observed spatial resolution, as mentioned in Sections 1 and 4.1. In the case of surface reflectivity, monthly reflectivity climatology or surface reflectance estimated from the IR channel is primarily used, as derived from the payload [9,20,29,64,74]. If it is not available at the launch stage, the algorithm can rely on existing surface climatology [64]. The aerosol model used in forward modeling does not require a change of payload; therefore, we anticipate that an aerosol algorithm tested before satellite launch will rely on the calibration state of the payload after launch, except for a slight threshold modification for cloud screening and re-calculation of surface reflectivity. The algorithm performance itself will not change significantly.
As an example, the GOCI aerosol algorithm (operational since 2010; cf. Table 1) was tested using MODIS L1B before launch. With MODIS L1B, the correlation coefficient between AERONET AOD and the GOCI AOD products was 0.78-0.92 for the Gosan and Shirahama sites [75]. After GOCI launch, the correlation coefficient between AERONET AOD and the GOCI AOD products was ~0.8 over the field of regard [76]. Changes over time were limited to the refinement of cloud screening, turbid-water detection, and surface reflectivity. The original frame of the algorithm has not changed. In changing from LEO to GEO satellites, the major change in the aerosol algorithm is the reconstruction of surface reflectivity. In the case of the GOCI, surface reflectance was changed to a method composited by time, with a 30-day minimum for calculation of MODIS (because surface reflectivity depends on geometry).
In considering the results of previous studies with respect to the present study, high-resolution AMI data can be applied for cloud screening provided there is no major problem with GEMS or AMI calibration, and the GEMS aerosol algorithm should yield stable aerosol data even over bright surfaces. Because the shortest wavelength band in AMI is 470 nm, it may have a higher AOD uncertainty than GEMS over bright surfaces. The TDCI index can be re-determined only by adjusting the threshold. Therefore, if the SD (Equation (4)) with respect to AOD and surface reflectivity is selected, the final merged aerosol product may increase the overall spatial coverage while also complementing the uncertainty of each satellite AOD product.
For TDCI calculations, we used MODIS L1B C6 data, while the most recent MODIS L1B dataset is C6.1. However, of the three MODIS bands examined, only Terra MODIS band 29 (8.55 µm) exhibits any difference between the MODIS C6- and C6.1-based results [77], so the use of C6 is unlikely to affect the significance of our results. This study explored the possibility of constructing an optimal aerosol dataset by the synergistic use of hyperspectral UV-visible instruments such as GEMS and broadband MI such as AMI. Future studies should focus on the application of these possibilities with new geostationary satellites such as GEMS, TEMPO, Sentinel-4, AHI, AMI, and ABI.
The improved aerosol products developed through the fusion of AOD products can be used to monitor air quality and improve air-quality forecasting accuracy over East Asia. Previous particulate-matter estimation studies have indicated that satellite AOD coverage and aerosol product accuracy may affect the accuracy of predicted PM2.5 (particulate matter of diameter < 2.5 µm). Sorek-Hamer et al. [78] used MODIS DB AOD products and combined DB-DT algorithm AOD products to better predict PM2.5 concentrations over bright surfaces. Bilal et al. [79] sought the best AOD among the SARA, DT, and DB aerosol products, since the accuracy of AOD affects the accuracy of PM2.5. Moreover, Song et al. [80] found that a high sampling rate for AOD data with unbiased sampling is preferable for estimating surface PM2.5 concentrations, with PM2.5 estimates differing, on average, by 11.2 µg m⁻³ in autumn and 8.5 µg m⁻³ in winter after including AOD data over bright surfaces. Methods developed here may aid hourly prediction of surface PM2.5 concentrations.
The aerosol algorithm improved in this study may also contribute to trace-gas retrieval data. Trace-gas data are retrieved using a hyperspectral spectrometer that can detect cross-sectional features. If AOD, SSA, and ALH data could be obtained from the same sensor, the temporospatial collocation errors of other satellites could be reduced.

Conclusions
The main purpose of this study was to generate an optimal aerosol dataset by the synergistic use of hyperspectral UV-visible instruments and broadband MI. The procedure involves overcoming the physical limitations of the GEMS aerosol algorithm in terms of aerosol size insensitivity and cloud masking by using MI data, and then merging the improved GEMS aerosol products with MI aerosol products.
MI IR measurements are useful for cirrus cloud masking of GEMS aerosol retrievals. The high spatial resolutions of MI TOA reflectance data together with IR data are helpful for refining sub-pixel cloud masking, which is difficult for GEMS. In terms of improvements in aerosol type classification, statistical analysis indicates that GEMS DS aerosol detection accuracy improved from 72% to 94% using TDCI for DS selection. Application of TDCI in the detection of DS in the GEMS aerosol retrieval procedure improved SSA results, particularly for aerosol types of BC and DS, which is consistent with results of the theoretical sensitivity test of the GEMS aerosol algorithm in terms of aerosol-type misclassification. Due to the advantage of retrieving aerosols over bright surfaces from the UV-visible channel, the fused AOD products exhibit better spatial data availabilities than any of the AOD source products, with an accuracy matching that of source products. AMB values for scattering angle, surface reflectance, AE, and AERONET AOD are all significantly improved.
The improved aerosol retrieval algorithm presented in this study is applicable to the GEMS aboard the GEO-KOMPSAT-2B satellite and AMI aboard the GEO-KOMPSAT-2A satellite, which were launched in February 2020 and December 2018, respectively. The synergistic use of MI data indicates the future possibility of improving GEMS aerosol algorithms and constructing optimal aerosol datasets.