Development and Application of HECORA Cloud Retrieval Algorithm Based on the O2-O2 477 nm Absorption Band

Abstract: In this paper, we present the Hefei EMI Cloud Retrieval Algorithm (HECORA), which uses information from the O2-O2 absorption band around 477 nm to retrieve effective cloud fraction and effective cloud pressure from satellite observations. The retrieved cloud information is intended to improve the atmospheric trace gas products based on the Environment Monitoring Instrument (EMI) spectrometer. The HECORA method builds on OMCLDO2 and introduces several changes. The Vector Linearized Discrete Ordinate Radiative Transfer (VLIDORT) model is used to produce top-of-atmosphere (TOA) reflectance look-up tables (LUT) as a function of cloud fraction and cloud pressure. By applying the Differential Optical Absorption Spectroscopy (DOAS) technique to the synthetic reflectance LUT, the reflectance spectra are associated with O2-O2 geometrical vertical column densities (VCDgeo) and continuum reflectance. This is the core of the retrieval method: for a given illumination and observing geometry, surface height, and surface albedo, there is a one-to-one relationship between O2-O2 VCDgeo and continuum reflectance, on the one hand, and effective cloud fraction and effective cloud pressure, on the other. We first used VLIDORT synthetic spectra to verify the HECORA algorithm and obtained good results with both the Lambertian cloud model and the scattering cloud model. Secondly, HECORA is applied to OMI and TROPOMI and compared with the OMCLDO2, FRESCO+, and OCRA/ROCINN cloud products. Next, the cloud pressure results from TROPOMI observations obtained using HECORA and FRESCO+ are compared with CALIOP cloud layer data to examine the sensitivity of the O2-O2 band and the O2 A-band to cloud vertical information. Finally, HECORA is applied to the TROPOMI NO2 retrieval. Validation of the tropospheric NO2 VCD with ground-based MAX-DOAS measurements shows that using HECORA cloud products to correct for photon path variations in the TROPOMI tropospheric NO2 VCD retrievals performs better than using FRESCO+ under low cloud conditions. In conclusion, this paper shows that the HECORA cloud products are in good agreement with well-established cloud products and that they are suitable for correcting the effect of clouds in trace gas retrievals. Therefore, HECORA has the potential to be applied to EMI.

complex, a variety of simplifying assumptions were used to deal with clouds in the trace gas algorithm [23]. The TROPOMI NO2 product uses the cloud information from the FRESCO+ algorithm to correct for the cloud effect. FRESCO+ is based on the O2 A-band around 760 nm. The O2 A-band suffers less interference from other trace gases. However, it is far from the trace gas retrieval bands, such as the 405-465 nm NO2 retrieval band. Deschamps et al. [24] found that the O2-O2 method can provide a better retrieval result over vegetated surfaces than the O2 A-band because the surface reflectance is lower at 477 nm than at 760 nm. The O2-O2 absorption band at 477 nm is closer to the trace gas retrieval spectral range and is less affected by the surface albedo.
EMI is one of the payloads of China's first hyperspectral atmospheric monitoring satellite, Gaofen-5, which was successfully launched on 9 May 2018. The goal of EMI is to achieve global-scale atmospheric trace gas monitoring. The observational data from EMI will contribute to the understanding of global air quality evolution and atmospheric chemistry mechanisms. EMI has similar instrument characteristics to OMI and TROPOMI, including a ~13:30 local overpass time, push-broom imaging technology, and daily global coverage. EMI covers the ultraviolet and visible spectral ranges from 240 to 710 nm, with a spectral resolution of 0.3-0.5 nm and a nadir resolution of 12 × 13 km² [25,26]. When the EMI observation spectrum is used for trace gas retrieval, cloud correction is required. Since the EMI spectral range does not include the O2 A-band, the existing cloud retrieval algorithms based on the O2 A-band cannot be applied to EMI. A suitable solution is to develop a cloud retrieval algorithm for EMI based on OMCLDO2 using the 477 nm O2-O2 absorption band.
In this paper, a new cloud retrieval algorithm using an O2-O2 VCDgeo LUT to process hyperspectral data is presented. The purpose of developing HECORA is to correct the influence of clouds in the trace gas retrieval process. HECORA is similar to OMCLDO2, but due to the use of the VCDgeo LUT, it has a lower interpolation error. HECORA performs better for low clouds than for high clouds. The HECORA cloud results are therefore well suited to the correction of trace gas retrievals over China, where near-surface pollution is serious. In Section 2, we introduce the essential facts and the implementation of the HECORA algorithm. In Section 3, the HECORA results are validated using VLIDORT synthetic spectra. The HECORA algorithm is then applied to OMI and TROPOMI and compared with the existing cloud retrieval algorithms. Next, we apply the HECORA algorithm to retrieve the TROPOMI NO2 VCD and compare these results with MAX-DOAS results. In Section 4, we summarize and review the characteristics of HECORA and draw conclusions. However, there have been some unexpected problems with EMI since launch, specifically the oversaturation of the spectrum [25]. Tests of the HECORA algorithm on EMI will be carried out as soon as possible after the spectral calibration problems are solved.

Algorithm Description
The HECORA algorithm uses measurements of the O2-O2 absorption band at 477 nm. This algorithm adopts a LUT similar to that of the OMCLDO2 algorithm [14,15]. Firstly, we use VLIDORT [27] instead of the Doubling Adding KNMI model [14] to generate the simulated reflectance spectrum. The VLIDORT model is designed to generate simultaneous output of the Stokes vector of polarized light and its derivatives for an arbitrary combination of atmospheric and surface properties. VLIDORT can deal with the attenuation along the solar and line-of-sight light paths in a pseudo-spherical atmosphere. Secondly, compared with OMCLDO2, we have made some improvements in the DOAS fitting process. OMCLDO2 takes into account the absorption effects of NO2, O3, and O2-O2 in the DOAS process [15]. HECORA additionally accounts for water vapor and the Ring structure. The last but most important point is that the LUT we created uses VCDgeo as nodes. The VCDgeo represents the molecular distribution characteristics of a single pixel in the vertical direction and is better suited as the LUT node for the cloud retrieval. The specific details of the algorithm are described below.

Look-up Tables
When creating the LUT, a Lambertian cloud model and the independent pixel approximation (IPA) [28,29] were used in VLIDORT to obtain the simulated spectrum. The IPA assumes that a spatial pixel (3.5 × 7 km² in the case of TROPOMI) can be divided into several sub-pixels and that there is no lateral transport of radiance across sub-pixel boundaries. Therefore, a pixel containing clouds can be divided into cloud-covered and cloud-free sub-pixels. The top-of-atmosphere (TOA) reflectance can be expressed as:

$$R(\lambda; \theta, \theta_0, \phi, c, P_c, A_s, P_s) = c\,R_{cloud}(\lambda; \theta, \theta_0, \phi, A_c, P_c) + (1 - c)\,R_{clear}(\lambda; \theta, \theta_0, \phi, A_s, P_s) \qquad (1)$$

where $R(\lambda; \theta, \theta_0, \phi, c, P_c, A_s, P_s)$ is the reflectance at TOA for wavelength $\lambda$; $\theta_0$ is the solar zenith angle (SZA); $\theta$ is the viewing zenith angle (VZA); $\phi$ is the relative azimuth angle (RAA); $c$ is the effective cloud fraction; $P_c$ is the cloud top pressure; $A_c$ is the cloud albedo; $A_s$ is the surface albedo (SA); $P_s$ is the surface pressure (SP); $R_{clear}$ is the reflectance of the clear part of the pixel; and $R_{cloud}$ is the reflectance of the cloudy part of the pixel. In the simulation process, a Lambertian cloud model with an albedo of 0.8 was selected, following previous studies [2,14,19] that found this to be an adequate choice for cloud correction and trace gas retrieval. The input parameters used for generating the reflectance LUT include SZA, VZA, RAA, SP, SA, effective cloud fraction, and effective cloud pressure. The reflectance from VLIDORT can be defined as:

$$R(\lambda) = \frac{\pi I(\lambda)}{\mu_0 E_0(\lambda)}$$

where $I$ is the reflected radiance, $E_0$ is the corresponding solar irradiance, and $\mu_0 = \cos\theta_0$. By fitting the LUT reflectance spectra with the DOAS technique [30], the continuum reflectance and the O2-O2 slant column density (SCD) are calculated. The air-mass factor (AMF) is the ratio of the SCD to the VCD of the absorber as viewed by the satellite in the measured radiance spectrum.

The AMF depends on factors such as the vertical distribution of the absorbing trace gas in the atmosphere, the length of the optical path, and the reflectivity of the Earth's surface. The length of the light path also depends on the position of the sun and the satellite [14]. In the absence of atmospheric scattering, the AMF depends only on geometrical parameters describing the incident solar illumination and the satellite observing geometry. This geometrical AMF, AMFgeo, is expressed as follows:

$$\mathrm{AMF}_{geo} = \frac{1}{\cos\theta_0} + \frac{1}{\cos\theta}$$

Although the HECORA radiative transfer (RT) simulations take atmospheric scattering into account, they use the geometrical VCD, VCDgeo, defined as:

$$\mathrm{VCD}_{geo} = \frac{N_s}{\mathrm{AMF}_{geo}}$$

where $N_s$ is the O2-O2 SCD, as nodes for the reflectance LUT instead of directly using the SCD derived by the DOAS fit. The justification for this choice is that normalizing $N_s$ with the AMFgeo reduces the distance between the O2-O2 column nodes in the LUT and, in this way, decreases the interpolation error in the VCD dimension of the multidimensional space. To recap, for fixed values of geometry, SA, and SP, we can obtain the continuum reflectance $R_c$ and the O2-O2 VCDgeo as a function of the effective cloud fraction and cloud pressure as follows:

1. Input: $(\theta, \theta_0, \phi, c, P_c, A_s, P_s)$ → reflectance LUT (Equation (1)) → Output: $R(\lambda; \theta, \theta_0, \phi, c, P_c, A_s, P_s)$.
2. Input: $R(\lambda; \theta, \theta_0, \phi, c, P_c, A_s, P_s)$ → DOAS fit → Output: $(R_c, N_s)$.
3. Input: $(N_s, \theta, \theta_0)$ → normalization by AMFgeo → Output: VCDgeo.

4. All three steps together form the forward LUT: $(\theta, \theta_0, \phi, c, P_c, A_s, P_s)$ → LUTforward → $(R_c, \mathrm{VCD}_{geo})$.

The final objective is to obtain $(c, P_c)$ as a function of $(R_c, \mathrm{VCD}_{geo})$ and the remaining geometric and surface parameters. Exchanging the $(c, P_c)$ and the $(R_c, \mathrm{VCD}_{geo})$ columns between the input and the output of LUTforward, we obtain the desired relationship:

5. Inverse LUT: $(\theta, \theta_0, \phi, A_s, P_s, R_c, \mathrm{VCD}_{geo})$ → LUTinverse → $(c, P_c)$.

However, there is still a problem to solve. As defined above, LUTforward is regular in the input parameters $(\theta, \theta_0, \phi, c, P_c, A_s, P_s)$, but the output values are scattered in the 2D $(R_c, \mathrm{VCD}_{geo})$ plane. Accordingly, LUTinverse has two dimensions, $(R_c, \mathrm{VCD}_{geo})$, in which the nodes are not distributed on a regular grid. In order to define the LUT on a multidimensional (7-dimensional) mesh, the input nodes have to be interpolated. We interpolated the multidimensional scattered data of LUTinverse using radial basis functions (RBF) (refer to Veefkind et al., 2016 [15]), leading to the desired inverse LUT on a regular 7-dimensional mesh:

6. Inv_regular LUT: $(\theta, \theta_0, \phi, A_s, P_s, R_c, \mathrm{VCD}_{geo})$ → LUTinv_regular → $(c, P_c)$.

A typical slice of effective cloud fraction and effective cloud pressure is illustrated in Figure 1. The comparison between the HECORA O2-O2 VCDgeo nodes and the OMCLDO2 O2-O2 SCD nodes is shown in Table 1. HECORA uses the same number of column density nodes as OMCLDO2; the value of each HECORA O2-O2 VCDgeo node is 1/5 of the corresponding OMCLDO2 O2-O2 SCD node.
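The column swap and the scattered-node interpolation described in steps 4 to 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_forward` is a hypothetical analytic stand-in for the VLIDORT and DOAS steps, the surface albedo and pressure are placeholder values, and inverse-distance weighting is used as a simple substitute for the radial basis functions named in the text.

```python
A_CLOUD = 0.8    # Lambertian cloud albedo quoted in the text
A_SURF = 0.05    # hypothetical surface albedo
P_SURF = 1013.0  # hypothetical surface pressure in hPa


def toy_forward(c, p_c):
    """Toy stand-in for VLIDORT + DOAS: (c, P_c) -> (R_c, VCD_geo).

    R_c mixes cloud and surface albedo (independent pixel approximation).
    VCD_geo is a radiance-weighted column relative to clear sky, with the
    O2-O2 column above a reflector assumed to scale with pressure squared.
    """
    r_c = c * A_CLOUD + (1.0 - c) * A_SURF
    vcd = (c * A_CLOUD * (p_c / P_SURF) ** 2 + (1.0 - c) * A_SURF) / r_c
    return r_c, vcd


def build_forward_lut(c_nodes, p_nodes):
    """Regular grid in (c, P_c); the outputs are scattered in (R_c, VCD_geo)."""
    return [((c, p), toy_forward(c, p)) for c in c_nodes for p in p_nodes]


def invert(lut, r_c, vcd, power=2.0):
    """Swap input and output columns and interpolate the scattered nodes.

    Inverse-distance weighting is an assumption made for brevity; the
    text uses radial basis functions for this step.
    """
    r_vals = [out[0] for _, out in lut]
    v_vals = [out[1] for _, out in lut]
    r_span = max(r_vals) - min(r_vals)
    v_span = max(v_vals) - min(v_vals)
    sum_w = sum_c = sum_p = 0.0
    for (c, p), (r, v) in lut:
        # Normalise both axes so that neither dominates the distance.
        d2 = ((r - r_c) / r_span) ** 2 + ((v - vcd) / v_span) ** 2
        if d2 == 0.0:
            return c, p  # the query coincides with a LUT node
        w = d2 ** (-power / 2.0)
        sum_w += w
        sum_c += w * c
        sum_p += w * p
    return sum_c / sum_w, sum_p / sum_w


C_NODES = [i / 10 for i in range(1, 11)]
P_NODES = [float(p) for p in range(300, 1001, 100)]
LUT = build_forward_lut(C_NODES, P_NODES)
```

Calling `invert(LUT, *toy_forward(0.5, 700.0))` returns the node itself, while off-node queries return a weighted blend of neighbouring nodes whose accuracy depends on the node spacing, which is the motivation given in the text for using denser VCDgeo nodes.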

Retrieval Algorithm
HECORA is based on the DOAS method [30][31][32]. By means of a DOAS fit, the O2-O2 SCD and the reflectance polynomial coefficients are retrieved from the observed reflectance spectrum within the fit window of 460-490 nm. In the DOAS equation used in HECORA, $R(\lambda)$ is the reflectance spectrum, $R_c = \gamma_0 + \gamma_1 \lambda$ is the continuum reflectance at wavelength $\lambda$, and $N_s$, $N_{s,O_3}$, and $N_{s,NO_2}$ are the slant column densities of O2-O2, O3, and NO2, respectively.

To calculate the O2-O2 VCDgeo and to retrieve cloud parameters, auxiliary data specifying the surface and geometric properties, including the SZA, VZA, RAA, SP, and SA, are needed. SZA, VZA, solar azimuth angle (SAA), and viewing azimuth angle (VAA) can be obtained from the illumination and viewing geometry information of the specified satellite payload, and the RAA can be calculated from SAA and VAA. SP and SA are generated by interpolation from the surface height dataset [38] and the surface albedo dataset [39] as a function of latitude and longitude. The effective cloud fraction and cloud pressure are obtained by multidimensional linear interpolation of the above parameters in LUTinv_regular.
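The auxiliary quantities and the final interpolation step can be sketched as below. This is a simplified illustration under several assumptions: the RAA convention (azimuth difference folded into 0 to 180 degrees) is one common choice and may differ from the one used by a given instrument, the function names are hypothetical, and the interpolation is shown in only two dimensions (continuum reflectance and VCDgeo at fixed geometry and surface properties) rather than in the full seven-dimensional LUT.

```python
import math
from bisect import bisect_right


def relative_azimuth(saa_deg, vaa_deg):
    """Fold the azimuth difference into [0, 180] degrees (assumed convention)."""
    diff = abs(saa_deg - vaa_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def amf_geo(sza_deg, vza_deg):
    """Geometric air-mass factor: 1/cos(SZA) + 1/cos(VZA)."""
    return (1.0 / math.cos(math.radians(sza_deg))
            + 1.0 / math.cos(math.radians(vza_deg)))


def vcd_geo(scd, sza_deg, vza_deg):
    """Normalise the fitted O2-O2 slant column by the geometric AMF."""
    return scd / amf_geo(sza_deg, vza_deg)


def _bracket(nodes, x):
    """Index of the lower node and the clamped fractional position of x."""
    i = min(max(bisect_right(nodes, x) - 1, 0), len(nodes) - 2)
    t = (x - nodes[i]) / (nodes[i + 1] - nodes[i])
    return i, min(max(t, 0.0), 1.0)


def bilinear(x_nodes, y_nodes, values, x, y):
    """Linear interpolation on a regular 2-D grid.

    values[i][j] holds the tabulated quantity (for example the cloud
    pressure) at (x_nodes[i], y_nodes[j]). Queries outside the grid are
    clamped to the nearest edge.
    """
    i, tx = _bracket(x_nodes, x)
    j, ty = _bracket(y_nodes, y)
    return ((1 - tx) * (1 - ty) * values[i][j]
            + tx * (1 - ty) * values[i + 1][j]
            + (1 - tx) * ty * values[i][j + 1]
            + tx * ty * values[i + 1][j + 1])
```

In a full retrieval the same linear weighting would be applied along each of the remaining LUT dimensions (SZA, VZA, RAA, SA, SP).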

HECORA Results from the Simulated Spectrum
To verify the HECORA algorithm, the reflectance spectrum was simulated using the VLIDORT radiative transfer model. The simulations were performed for Lambertian cloud and scattering cloud models. To test the accuracy of the effective cloud pressure retrieved by HECORA, we created an O2-O2 SCD LUT using the same SCD nodes as OMCLDO2 for comparison.
First, we consider the Lambertian cloud. In this model, the cloud is regarded as a Lambertian surface with a cloud albedo of 0.8, and transmission is ignored. In the simulation, the cloud is at 3 and 7 km, and the corresponding cloud pressure is 701 and 411 hPa. The cloud fraction is set to 0.5 and 1. For all situations, VZA is 0.1 degrees, RAA is 0 degrees, the surface albedo is 0.05, the surface pressure is 1013 hPa, and SZA is 0, 10, 20, 30, 40, 50, 60, 70, and 80 degrees. Figure 2 shows the cloud pressure retrieved by HECORA as a function of SZA.
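The height and pressure pairs quoted above (3 km and 701 hPa, 7 km and 411 hPa) are consistent with a standard-atmosphere troposphere. The text does not state which profile was used, so the sketch below assumes the International Standard Atmosphere constants (sea-level pressure 1013.25 hPa, sea-level temperature 288.15 K, lapse rate 6.5 K/km); the function name is hypothetical.

```python
def pressure_hpa(height_m, p0=1013.25, t0=288.15, lapse=0.0065):
    """Barometric formula for a constant-lapse-rate troposphere (assumed ISA)."""
    g = 9.80665          # gravitational acceleration, m s^-2
    molar_mass = 0.0289644  # molar mass of dry air, kg mol^-1
    gas_const = 8.3144598   # universal gas constant, J mol^-1 K^-1
    exponent = g * molar_mass / (gas_const * lapse)
    return p0 * (1.0 - lapse * height_m / t0) ** exponent


for height in (3000.0, 7000.0):
    print(f"{height / 1000:.0f} km -> {pressure_hpa(height):.1f} hPa")
```

Under these assumptions the two simulated cloud heights map to within about 1 hPa of the cloud pressures used in the simulation.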
In general, the LUTinv_regular used by HECORA performs better than the O2-O2 SCD LUT. The HECORA retrieval results are very close to the real cloud pressure when SZA is small. When SZA is larger than 60 degrees, HECORA overestimates the cloud pressure. In contrast, the O2-O2 SCD LUT retrieval underestimates the cloud pressure, and its bias is larger than that of HECORA. When the cloud fraction is 0.5, the average HECORA retrieved cloud pressure is 702.32 hPa, 1.2 hPa higher than the real cloud pressure, with a standard deviation of 2.5 hPa. When the cloud fraction is 1.0, the mean retrieved cloud pressure is 701.76 hPa, 0.76 hPa higher than the real cloud pressure, with a standard deviation of 1.7 hPa. These results show that the bias between the retrieved and the real cloud pressure increases as the cloud fraction decreases. Figure 2c,d shows the same behavior, confirming the finding of Acarreta [14] that retrieval biases of cloud pressure increase with decreasing cloud fraction.
To study the stability of HECORA under different cloud fraction conditions, VLIDORT is used to simulate the reflectance of clouds at different pressures (450, 550, 650, 750, 850 hPa) with different cloud fractions, and HECORA is used to retrieve the cloud pressure.
The HECORA retrieved cloud pressure and its comparison with the O2-O2 SCD LUT retrieval results are presented in Figure 3. When the effective cloud fraction is 1.0, the maximum bias between the HECORA retrieved cloud pressure and the real cloud pressure is 1.4 hPa; when the effective cloud fraction is 0.1, the retrieval bias is worse, with a maximum of 40.4 hPa. HECORA still shows a more stable performance and a lower bias under different cloud fractions than the O2-O2 SCD LUT retrieval. For clouds of different heights, the bias of both the HECORA and the O2-O2 SCD LUT retrieved cloud pressure increases as the cloud fraction decreases. When the cloud fraction is small (≤0.2), the retrieval bias for high clouds is larger than for low clouds, which indicates that the cloud algorithm based on the O2-O2 477 nm absorption band is less sensitive to high clouds [15].
When the Lambertian cloud model is replaced with the scattering cloud model in VLIDORT, the settings for SZA, VZA, RAA, surface pressure, and surface albedo are the same as for the Lambertian cloud model. The cloud type is set as water cloud with a droplet size of 0.20 µm. The Ångström exponent is set to 2.0455. The effective radius is set to 10.0 µm. The single scattering albedo is set to 1.00. The refractive index is taken from Hale and Querry [40]. The cloud fraction is set to 0.5, and the cloud height is set to 4-5 km (540-617 hPa), with a cloud optical depth (COD) of 7 and 14, respectively. Figure 4 shows the retrieval results. When the COD is 7, the average cloud pressure retrieved by HECORA is 586.81 hPa; when the COD is 14, it is 583.62 hPa. The HECORA retrieved cloud pressure is close to the mid-cloud pressure. Compared with the SCD LUT results, HECORA has a smaller standard deviation. For larger cloud optical thicknesses, the corresponding effective cloud fraction is larger, and the cloud pressure retrieval errors are smaller.
Overall, HECORA shows a more stable performance than the SCD LUT algorithm with both the Lambertian cloud model and the scattering cloud model. In the HECORA algorithm, SZA and VZA are the parameters used for calculating VCDgeo. This reduces some of the errors that SZA and VZA introduce when they participate directly in the multivariate linear interpolation. In addition, the column density nodes of the HECORA LUTinv_regular are denser than those of the OMCLDO2 O2-O2 SCD LUT, which reduces the interpolation error generated in the cloud parameter calculation.


HECORA Cloud Retrievals from OMI Data
We have applied HECORA to OMI. We selected a random day, 1 June 2006, to compare the global cloud parameters retrieved by HECORA and OMCLDO2. Figure 5 shows the global spatial distribution of cloud fraction and cloud pressure. The spatial distributions of effective cloud fraction and effective cloud pressure from OMCLDO2 and HECORA have a strong positive correlation. Figure 6 compares the cloud retrieval results of OMI orbit 9993 from OMCLDO2 and HECORA on 1 June 2006. The regression analysis shows that the retrievals of both effective cloud fraction and cloud pressure correlate very well. For the effective cloud fraction, the correlation coefficient is 0.997; for cloud pressure, the correlation coefficient is 0.985. In the case of low cloud pressure, the cloud pressure difference between OMCLDO2 and HECORA increases because of the low sensitivity of the O2-O2 477 nm absorption band to high clouds. From the comparison with the OMCLDO2 product, we conclude that HECORA provides reliable results and could be used for cloud retrieval with hyperspectral payloads to provide cloud correction for trace gas retrieval.
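The correlation coefficients quoted for such product comparisons can be computed as in the minimal sketch below. It assumes the Pearson correlation coefficient over collocated pixels, with invalid retrievals represented as `None`; the function names and the missing-value convention are hypothetical and not taken from the paper.

```python
import math


def valid_pairs(a, b):
    """Keep only pixels where both products report a value (None = missing)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A typical use would be `pearson_r(*valid_pairs(product_a, product_b))` for the cloud fraction or cloud pressure arrays of one orbit.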


HECORA Cloud Retrievals from TROPOMI Data
In this section, we compare the cloud parameters retrieved by HECORA with other TROPOMI cloud products and with CALIOP. Figure 7 shows the global spatial distribution of the TROPOMI cloud parameters from FRESCO+ [20], OCRA/ROCINN [3], and HECORA on 2 January 2019. OCRA does not assume a cloud albedo of 0.8; instead, cloud albedo is one of the result parameters of ROCINN [3]. For a more reasonable comparison with HECORA, the OCRA cloud fraction was therefore recalculated. In this recalculation, OCRA_RCF is the recalculated OCRA cloud fraction used for comparison with HECORA and FRESCO+, CF_crb is the cloud fraction retrieved by OCRA using the CRB (Cloud as Reflecting Boundary) forward model, and CA_crb is the cloud albedo used in OCRA.
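The recalculation formula itself is not reproduced in this text. One plausible form, assuming that the rescaling conserves the product of cloud fraction and cloud albedo when the albedo is fixed at the value of 0.8 used by HECORA and FRESCO+, would be:

$$\mathrm{OCRA\_RCF} = \mathrm{CF\_crb} \times \frac{\mathrm{CA\_crb}}{0.8}$$

This expression is an assumption made for illustration and is not necessarily the one used by the authors. Under it, a pixel with CF_crb = 0.5 and CA_crb = 0.4 would correspond to a recalculated cloud fraction of 0.25.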

Remote Sens. 2020, 12, 3039

Comparison with Other TROPOMI Cloud Products
We selected the TROPOMI spectra on 2 January and 1 June 2019 to test the HECORA performance in different seasons. Compared with the other existing TROPOMI cloud algorithms, we found that the HECORA retrieval results are well correlated with FRESCO+ (0.90 < R < 0.99). The HECORA retrieved effective cloud fraction is also consistent with the recalculated OCRA cloud fraction (0.80 < R < 0.91). The HECORA cloud pressure results have roughly the same trend as the ROCINN cloud top pressure results (0.83 < R < 0.93). This shows the commonality of different cloud retrieval algorithms. However, in some cases, the HECORA retrieved cloud pressure is larger than those of FRESCO+ and ROCINN. From the comparison, we found: (1) HECORA cloud pressure retrieval results are more strongly correlated with FRESCO+ and OCRA in winter than in summer. (2) The differences between the results of the cloud retrieval algorithms are smaller over ocean than over land. (3) With increasing cloud pressure, the consistency of the cloud fraction retrieval results of the different algorithms also increases. Overall, HECORA and the other cloud products show similar trends. Figure 8 shows two typical orbit correlation figures for different cloud products.

Furthermore, we statistically analyzed the HECORA, FRESCO+, and OCRA/ROCINN cloud parameters over a longer period, from 5 November 2018 to 4 December 2018, for latitudes between −70 and 70 degrees, excluding snow and ice cases. During this period, the cloud parameters were widely distributed and representative.
The mean cloud fraction retrieved by FRESCO+, HECORA, and OCRA for different surface types is shown in Table 3. The differences in cloud fraction are minimal (~0.03), which is expected because the effective cloud fraction is mainly determined by the reflectance measured in the non-absorbing part of the spectrum. The effective cloud fraction retrieved by HECORA is lower than the one retrieved by FRESCO+ and larger than OCRA_RCF, especially for the land and vegetation cases.

Table 3. Mean cloud fraction parameter and standard deviations for HECORA, FRESCO+, and OCRA, as well as the mean difference for TROPOMI measurements.

The mean cloud pressure and the pressure differences between FRESCO+, HECORA, and ROCINN are shown in Table 4. The mean pressures are 852 ± 163 hPa for HECORA and 825 ± 207 hPa for FRESCO+. The ROCINN results differ sharply. We also analyzed the pressure differences for different surface types. Over the oceans, the HECORA mean effective cloud pressure is greater than that of FRESCO+ and ROCINN. However, over land and vegetation, the effective cloud pressure retrieved by HECORA is lower than that of FRESCO+ and much higher than that of ROCINN. Over land, the mean cloud pressures of HECORA, FRESCO+, and ROCINN are 795 ± 212 hPa, 801 ± 209 hPa, and 658 ± 192 hPa, respectively. Land is the surface type with the smallest difference between the results of the three cloud retrieval algorithms. The cloud pressures retrieved by HECORA are much greater than those of ROCINN because the visible band used by HECORA is more sensitive to the middle of the cloud, while the ROCINN results using the O2 A-band are closer to the actual cloud top pressure [15]. The O2 A-band and the O2-O2 band have different absorption and scattering characteristics, and their sensitivities to the cloud and the surface differ.

Comparison with CALIOP
The primary payload aboard CALIPSO is CALIOP, an elastic backscatter lidar that transmits linearly polarized laser light at 532 and 1064 nm. It measures range-resolved backscatter intensities at both wavelengths using a three-channel receiver and mainly provides cloud and aerosol information. In this study, we use the CALIOP Level 2 Cloud Layer product, version 3.4. We compared the cloud pressure information from HECORA and FRESCO+ with the CALIOP cloud layer from 5 November to 4 December 2018. We further filter this month of data as follows: (1) The CALIOP cloud layer is a single layer with a cloud optical depth greater than 4. (2) For every CALIOP pixel, the collocated TROPOMI pixel is within ±0.025 degrees of the CALIOP latitude and longitude. These criteria leave 11,230 cases. We take the average of the CALIOP cloud top pressure and cloud base pressure as the CALIOP cloud pressure. The mean CALIOP pressure is 733 ± 218 hPa; the values retrieved by FRESCO+ and HECORA are 823 ± 185 hPa and 843 ± 142 hPa, respectively. The cloud pressures retrieved by FRESCO+ and HECORA are thus both greater than the CALIOP value, but the FRESCO+ result is closer to CALIOP. We then classified all cases into ocean, land, and vegetation based on MODIS NDVI products; the comparison results are shown in Table 5. For all surface types, the average FRESCO+ values are closer to the CALIOP cloud layer pressure than those retrieved by HECORA.
Table 5. Mean cloud pressure and standard deviations for HECORA and FRESCO+ from TROPOMI, as well as for CALIOP.
We also compared the global distribution of the cloud pressure retrieved by HECORA and FRESCO+ from TROPOMI measurements, taking the CALIOP results as the true cloud pressure. The comparison results for two typical CALIOP orbits are shown in Figure 9. For the selected orbits on 6 and 28 November 2018, the HECORA retrieval results are closer to the CALIOP cloud pressure under low cloud conditions, whereas FRESCO+ performs better for high and thick clouds. These results are consistent with the view of Acarreta [14] that O2-O2 is distributed lower in the atmosphere than O2, which in principle increases the sensitivity to lower clouds.
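The filtering and collocation criteria described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the record types `CaliopLayer` and `TropomiPixel`, their field names, and the helper functions are hypothetical. Picking the nearest TROPOMI pixel when several fall inside the ±0.025 degree window is an assumption, and longitude wrap-around at the date line is ignored.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

MAX_OFFSET_DEG = 0.025  # collocation window in latitude and longitude (from the text)
MIN_COD = 4.0           # cloud optical depth threshold (from the text)


@dataclass
class CaliopLayer:      # hypothetical record, one per CALIOP pixel
    lat: float
    lon: float
    n_layers: int       # number of cloud layers detected
    cod: float          # cloud optical depth
    top_hpa: float      # cloud top pressure
    base_hpa: float     # cloud base pressure


@dataclass
class TropomiPixel:     # hypothetical record, one per TROPOMI pixel
    lat: float
    lon: float
    cloud_pressure_hpa: float


def caliop_cloud_pressure(layer: CaliopLayer) -> float:
    """Average of cloud top and cloud base pressure, as in the text."""
    return 0.5 * (layer.top_hpa + layer.base_hpa)


def passes_filter(layer: CaliopLayer) -> bool:
    """Criterion (1): single-layer cloud with optical depth greater than 4."""
    return layer.n_layers == 1 and layer.cod > MIN_COD


def collocate(layer: CaliopLayer,
              pixels: Iterable[TropomiPixel]) -> Optional[TropomiPixel]:
    """Criterion (2): a TROPOMI pixel within the window, or None.

    Choosing the nearest candidate is an assumption of this sketch.
    """
    best, best_d2 = None, float("inf")
    for pixel in pixels:
        dlat = abs(pixel.lat - layer.lat)
        dlon = abs(pixel.lon - layer.lon)
        if dlat <= MAX_OFFSET_DEG and dlon <= MAX_OFFSET_DEG:
            d2 = dlat * dlat + dlon * dlon
            if d2 < best_d2:
                best, best_d2 = pixel, d2
    return best


def matched_pairs(layers: Iterable[CaliopLayer],
                  pixels: Iterable[TropomiPixel]) -> List[Tuple[float, float]]:
    """Return (CALIOP pressure, TROPOMI pressure) pairs that pass both criteria."""
    pixel_list = list(pixels)
    pairs = []
    for layer in layers:
        if not passes_filter(layer):
            continue
        pixel = collocate(layer, pixel_list)
        if pixel is not None:
            pairs.append((caliop_cloud_pressure(layer), pixel.cloud_pressure_hpa))
    return pairs
```

The mean and standard deviation reported for each product would then be computed over the second element of the pairs (per retrieval algorithm) and compared with the first.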

Application to TROPOMI NO2 Retrieval and Comparison with MAX-DOAS
The TROPOMI NO2 operational retrieval algorithm uses FRESCO+ to correct for cloud effects [41]. The purpose of developing HECORA is to correct for cloud effects in the retrieval of trace gases from China's EMI. However, the EMI spectral calibration is still in progress, so we use TROPOMI spectra to test the effect of HECORA on the NO2 retrieval results. The NO2 VCD retrieval algorithm used here was previously implemented for the EMI NO2 operational product (see Zhang et al., 2019 [42]) and has been adapted to the TROPOMI instrument. The NO2 retrieval follows three steps: (1) NO2 SCDs are fitted in the 405-465 nm wavelength range using the DOAS technique. (2) Stratospheric and tropospheric NO2 AMFs are calculated pixel by pixel with the VLIDORT model, using daily high-resolution NO2 a priori profiles from the WRF-Chem model. (3) A modified reference sector method [43] is used to estimate the stratospheric contribution to the total column and to derive the final tropospheric NO2 VCD. The AMF calculation in the second step requires auxiliary information, including the cloud fraction and the cloud top pressure. The FRESCO+ cloud parameters involved in the calculation are the cloud pressure, the cloud fraction, and the cloud radiance fraction from the NO2 spectral window itself at 440 nm.
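To make the role of the cloud parameters in steps (2) and (3) concrete, the sketch below uses the widely used independent pixel approximation (IPA) form of the cloud-corrected AMF and the usual stratosphere-troposphere separation. It is an illustration under stated assumptions and is not taken from the retrieval code described here: the function and argument names are hypothetical, and the clear-sky and cloudy AMFs and radiances are assumed to come from a radiative transfer model such as VLIDORT.

```python
def cloud_radiance_fraction(f_eff: float, i_cloud: float, i_clear: float) -> float:
    """Fraction of the measured radiance that comes from the cloudy part of the pixel.

    f_eff is the effective cloud fraction; i_cloud and i_clear are the modeled
    radiances of a fully cloudy and a fully clear pixel (assumed inputs).
    """
    cloudy = f_eff * i_cloud
    total = cloudy + (1.0 - f_eff) * i_clear
    if total <= 0.0:
        raise ValueError("total radiance must be positive")
    return cloudy / total


def ipa_amf(w: float, amf_cloud: float, amf_clear: float) -> float:
    """IPA air mass factor: radiance-weighted mix of cloudy and clear AMFs.

    The cloud pressure enters through amf_cloud, which is computed for a cloud
    placed at the retrieved pressure level.
    """
    return w * amf_cloud + (1.0 - w) * amf_clear


def tropospheric_vcd(scd_total: float, vcd_strat: float,
                     amf_strat: float, amf_trop: float) -> float:
    """Remove the stratospheric slant column, then convert with the tropospheric AMF."""
    return (scd_total - vcd_strat * amf_strat) / amf_trop
```

Because NO2 in polluted regions sits mostly below low clouds, `amf_cloud` is sensitive to whether the cloud is placed above or within the NO2 layer, which is why the choice of cloud product can change the resulting tropospheric VCD.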
To explore the effect of the cloud parameters on the tropospheric NO2 retrieval, we compared ground-based multi-axis differential optical absorption spectroscopy (MAX-DOAS) observations with the TROPOMI NO2 VCD to evaluate the performance of HECORA and FRESCO+. MAX-DOAS measures trace gas concentrations from the ground and can retrieve aerosol profiles with the corresponding aerosol properties (e.g., AOD) as well as trace gas profiles from the measured spectra.

Figure 10a-d shows the tropospheric NO2 VCD obtained with the two cloud retrieval algorithms together with the MAX-DOAS NO2 VCD at the four observation sites: CAMS, UCAS, NC, and GC. The correlations between the tropospheric NO2 VCD obtained with either cloud retrieval algorithm and the MAX-DOAS NO2 VCD are very high at all four sites, especially at UCAS and GC, which also have a relatively lower NO2 VCD than the other two sites. The results show that both cloud retrieval algorithms are effective in correcting the effect of clouds on the tropospheric NO2 VCD retrieval.
Based on the comparison in Section 3.3.2, under low-altitude cloud conditions the cloud pressure retrieved by HECORA is closer to that of the CALIOP cloud layer product than the cloud pressure retrieved by FRESCO+. In polluted areas, NO2 is mainly concentrated in the boundary layer, usually below 2 km (>800 hPa). Even a small difference in the cloud pressure input therefore has a meaningful impact on the retrieved NO2 concentration for low-altitude cloud pixels [20]. Figure 11 shows the differences between the MAX-DOAS observations and the TROPOMI retrievals using HECORA and FRESCO+ for cloud correction in different cloud pressure ranges. We choose 800 hPa as the boundary between high and low clouds. As can be seen in Figure 11, whether HECORA or FRESCO+ is used for cloud correction, the TROPOMI NO2 retrieval results tend to be smaller than the MAX-DOAS observations, especially in such polluted areas. Part of the underestimation is related to the large satellite pixels: the spatial averaging over a satellite pixel cannot capture the spatial gradient of NO2 [45]. Under low cloud conditions, the mean, median, and standard deviation of the differences between the TROPOMI retrievals using HECORA cloud parameters and the MAX-DOAS observations are smaller than those obtained with FRESCO+. This shows that the cloud correction of HECORA under low cloud conditions is better than that of FRESCO+. In contrast, the TROPOMI NO2 products using FRESCO+ perform well when the cloud pressure is smaller than 800 hPa. In summary, HECORA and FRESCO+ each have their own advantages. HECORA is suitable for low cloud pixels, especially polluted pixels with high NO2 concentrations, whereas FRESCO+ is suitable for cloud correction of high cloud pixels. Considering the current air pollution situation in China, it is appropriate to use HECORA cloud parameters for cloud correction in the EMI trace gas retrieval.
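The binned statistics discussed for Figure 11 can be reproduced with a short routine such as the one below. It is only a sketch: the function name, the input layout, and the sign convention (satellite minus MAX-DOAS) are assumptions, as are the choices to count a pixel at exactly 800 hPa as low cloud and to use the sample standard deviation.

```python
import statistics
from typing import Dict, Iterable, Optional, Tuple

LOW_CLOUD_THRESHOLD_HPA = 800.0  # boundary between high and low clouds (from the text)


def summarize_by_cloud_regime(
    records: Iterable[Tuple[float, float, float]],
) -> Dict[str, Dict[str, Optional[float]]]:
    """Summarize satellite-minus-ground differences for low and high clouds.

    Each record is (cloud_pressure_hpa, satellite_vcd, maxdoas_vcd).
    Low cloud means cloud pressure >= 800 hPa (assumed inclusive).
    """
    diffs = {"low": [], "high": []}
    for cloud_pressure, sat_vcd, ground_vcd in records:
        regime = "low" if cloud_pressure >= LOW_CLOUD_THRESHOLD_HPA else "high"
        diffs[regime].append(sat_vcd - ground_vcd)

    summary = {}
    for regime, values in diffs.items():
        n = len(values)
        summary[regime] = {
            "n": n,
            "mean": statistics.mean(values) if n else None,
            "median": statistics.median(values) if n else None,
            # Sample standard deviation needs at least two points.
            "std": statistics.stdev(values) if n >= 2 else None,
        }
    return summary
```

Running the routine once with the VCDs obtained using HECORA cloud parameters and once with those obtained using FRESCO+ gives the per-regime mean, median, and standard deviation that the comparison relies on.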

Conclusions
We have developed a new cloud retrieval algorithm, HECORA, based on the O2-O2 477 nm absorption band. Compared with the OMCLDO2 algorithm, the main updates are (1) adopting the VLIDORT radiative transfer model, (2) improving the DOAS retrieval settings, and (3) the first use of O2-O2 VCD nodes to retrieve cloud information: we use an O2-O2 VCDgeo LUT with denser column density nodes to retrieve the cloud parameters. HECORA is intended to improve the accuracy of cloud information retrieved from hyperspectral satellite payloads and is planned to be applied to EMI.
We first verified HECORA using VLIDORT-simulated spectra. When the Lambertian cloud model is used in the simulation, the HECORA cloud pressure results are mostly consistent with those of VLIDORT; the error is small (<20 hPa) and decreases with increasing cloud fraction. When the scattering cloud model is used in the simulation, the retrieved cloud pressure is close to the pressure at the middle of the cloud. For both cloud models, HECORA's retrieval results are better than those obtained using the O2-O2 SCD LUT, which uses the same SCD nodes as the OMCLDO2 algorithm.
HECORA was also validated by comparing the retrieved effective cloud fraction and cloud pressure with the OMI OMCLDO2, TROPOMI FRESCO+, OCRA/ROCINN, and CALIOP cloud layer products. For OMI measurements, the comparison showed very good agreement between HECORA and OMCLDO2. When HECORA is applied to TROPOMI, its cloud fraction results correlate highly with the FRESCO+ results and the recalculated OCRA cloud fraction, whereas the cloud pressure results of the different algorithms differ somewhat. To compare the performance of HECORA and FRESCO+ at different cloud heights, we compared the cloud pressure from HECORA and FRESCO+ with the CALIOP cloud layer product. Under low cloud conditions, HECORA performs better, while the FRESCO+ retrieval results are slightly better under high cloud conditions. These differences have a significant impact on the NO2 retrieval results.
Finally, the HECORA and FRESCO+ cloud parameters were used for the retrieval of the TROPOMI NO2 VCD, and the results were compared with MAX-DOAS observations in the same region. The retrieved TROPOMI tropospheric NO2 VCD is in good agreement with that of MAX-DOAS. Under low cloud conditions, the cloud correction of HECORA is better than that of FRESCO+. Given the vertical distribution of NO2 in typical polluted areas, HECORA is suitable for cloud correction in trace gas retrievals over China and is expected to be applied to EMI shortly.

Discussion
We found that both FRESCO+ and HECORA overestimate the cloud pressure of high clouds in some cases. The possible reasons are as follows. Very high clouds (<400 hPa) are mainly ice clouds. The optical properties of ice clouds are complicated by the geometries of the ice particles and by the uncertainties in the ice crystal concentration and size spectra [46]. The cloud models used by the two algorithms set cloud parameters that are closer to those of water clouds, whose scattering and absorption characteristics are quite different from those of ice clouds. As a result, HECORA and FRESCO+ cannot correctly retrieve the cloud pressure of very high clouds.
Although HECORA applies the O2-O2 VCDgeo LUT to cloud retrieval for the first time, it already achieves good performance. However, some aspects of HECORA still need to be improved in future work. For example, HECORA performs worse than FRESCO+ under high cloud conditions; it does not account for the temperature dependence of the O2-O2 absorption cross-section; and the IPA assumption used by HECORA does not consider anisotropy, such as an anisotropic surface BRDF. We will address these points in the next release of HECORA. Finally, we will apply HECORA to EMI as soon as possible.
for the TROPOMI Level 1b Radiance and FRESCO+, NO2 products. We thank the German Aerospace Center (DLR) for providing the TROPOMI operational cloud data and the developers of the VLIDORT and QDOAS software. Heartfelt appreciation and gratitude go to everyone who contributed to this article.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: