Calibration and Validation of CYGNSS Reflectivity through Wetlands’ and Deserts’ Dielectric Permittivity

Abstract: The reflection of Global Navigation Satellite Systems (GNSS) signals, namely GNSS-Reflectometry (GNSS-R), has recently proven able to monitor land surface properties in the microwave spectrum, at a global scale, and with a very short revisit time. Moreover, this new technique has numerous additional advantages over conventional monostatic microwave remote sensing, including low cost, low power consumption, lightweight and small payloads, and near real-time massive data availability. However, the GNSS-R surface reflectivity values estimated through the bistatic radar equation and the Fresnel coefficients have shown poor agreement with real surface reflectivity data, mostly due to calibration issues. Previous studies have attempted to avoid this matter with direct regression methods between uncalibrated GNSS-R reflectivity data and external soil moisture content (SMC) products. However, calibration of the GNSS-R reflectivity used in traditional inversion models, such as those to estimate SMC, freeze/thaw, or biomass, is still a challenge. In this paper, a successful procedure for GNSS-R reflectivity calibration is established using data from the CYGNSS (Cyclone GNSS) constellation. The scale and bias parameters are estimated from the theoretical dielectric properties of water and dry sand, which are well-known and empirically validated values. We employ four calibration areas, deserts and wetlands, that provide the maximum range limits of reflectivity. The resulting CYGNSS scale factor and bias parameter are a = 3.77 and b = 0.018, respectively. The derived scale and bias parameters are applied to the CYGNSS dataset, and the SMC values retrieved through the Fresnel reflection coefficients are in excellent agreement with the Soil Moisture Active Passive (SMAP) SMC product. Using the SMAP SMC as the reference true value, a standard linear regression yields an R-square coefficient of 0.803, a root mean square error (RMSE) of 0.084, and a Pearson's correlation coefficient of 0.896.


Introduction
During the last few decades, a new method based on the reflection of existing signals, also known as 'signals of opportunity' (SoOp), has emerged as a very attractive technique for remote sensing. The SoOp method employs receivers that take advantage of existing signals transmitted by other systems, which can be exploited for specific Earth observation applications, such as mean surface height, sea surface wind speed, soil moisture, freeze/thaw, inundation, wetlands, surface roughness, and other characteristics of vegetation and aboveground biomass [1,2]. In particular, the use of Global Navigation Satellite Systems (GNSS) signals is called GNSS Reflectometry (GNSS-R). In GNSS-R, the signals can come from different GNSS constellations, including GPS (United States), Galileo (Europe), GLONASS (Russia), and BDS (China). GNSS-R technology is based on measuring the GNSS signals reflected by the Earth's surface and can be used to monitor land and sea surface geophysical properties [3,4]. Although GNSS-R missions were initially designed to measure geoid altitudes, ocean winds, and tropical cyclones, current studies have proven the ability of GNSS-R to sense land surface physical properties [5][6][7][8][9][10][11][12]. These parameters are usually called essential climate variables (ECVs), and are related not only to surface physics, but also to the biosphere and hydrosphere. The first Low Earth Orbit satellite carrying a GNSS-R receiver was the UK Disaster Monitoring Constellation (UK-DMC) mission in 2003 [13]. Several other missions followed, including TechDemoSat-1, NASA's Cyclone GNSS (CYGNSS), the Spire GNSS-R satellites, the Chinese Bufeng-1 satellites [14], the upcoming HydroGNSS and GEROS (GNSS REflectometry, Radio Occultation, and Scatterometry) experiments from ESA, and the 3CAT-2 satellite of the Polytechnic University of Catalonia (UPC) [15].
A GNSS-R system has the same geometrical configuration as a bistatic radar [16]. The principle of measurement is based on observing, with a GNSS-R receiver, the left-hand circularly polarized (LHCP) reflection of a right-hand circularly polarized (RHCP) GNSS signal [17,18]. In this case, a perfectly smooth surface produces a near-specular reflection, while a rough surface spreads the transmitted signal over a larger area, producing a scattered reflection. These features allow us to characterize the physical properties of different land surfaces over time and space [19]. In this scheme, the scattered signal is sampled over the illuminated zone in the delay and frequency domains, creating the so-called delay Doppler map (DDM), which is the basic product containing the physical information of a surface [15,18]. An important application of GNSS-R is the measurement of surface reflectivity, which allows for the evaluation of soil dielectric properties and the estimation of soil moisture content (SMC) [16,20,21]. An important property of the returned GNSS signals is the coherency parameter, which has high values for nearly smooth surfaces, and low values for very rough surfaces. This parameter, called bare soil roughness (BSR), has an important implication when estimating the geophysical properties of different land surfaces [17,20]. In addition to BSR, when retrieving SMC from vegetated areas, a parameter related to vegetation attenuates the signal. This parameter is named vegetation optical depth (VOD), and it must be taken into account to obtain accurate SMC estimates [22].
In general, and in particular for SMC retrieval, the GNSS-R reflectivity needs to be calibrated due to uncertainties in the receiver and transmitter ranges, and other instrumental issues [17,20,21]. In this sense, the calibration of GNSS-R instruments is very important so that the reflectivity values observed by different systems can be compared for validation purposes, data fusion, etc. A methodology for estimating SMC from CYGNSS data was presented by [20], where the authors employed a multivariable regression based on parameterizing SMC values in terms of GNSS-R reflectivity, VOD, and BSR. A similar approach was used for the University Corporation for Atmospheric Research (UCAR) SMC product [23,24], where CYGNSS SMC was estimated through a linear regression between the uncalibrated CYGNSS reflectivity and the Soil Moisture Active Passive (SMAP) SMC product. Although these authors obtained good correlations due to the tight coupling between SMC and the uncalibrated CYGNSS reflectivity data, direct regression analyses between uncalibrated GNSS-R reflectivity data and ancillary SMC products have no physical significance and cannot be used for instrumental calibration. Another interesting work was presented by [17], where the authors employed data collected by a GNSS receiver installed on a small aircraft to estimate a bias parameter based on observations of water bodies, and a similar calibration approach was applied in [21] for CYGNSS SMC estimation. Then, the authors of [25] proposed a scale factor calibration for CYGNSS reflectivity data and employed the resulting estimates to retrieve SMC through several calibration steps based on the parameterized water-cloud model of [26]. However, the authors normalized the reflectivity values, estimated the scale factor in decibel units, and included no bias parameter, and a second calibration scheme was subsequently required to achieve "reasonable" correlations with ancillary SMC products.
Previous works on GNSS-R calibration have presented interesting methodologies and obtained "reasonable" correlations to ancillary humidity measurements, including those from in situ sensors and SMAP SMC products. A benefit of the calibration procedure using reflectivity obtained from water bodies is that the surface roughness component is considered negligible, as well as other attenuation factors on the coherent component of the reflected signal. Here, our starting hypothesis consists of considering different types of land surfaces that exhibit similar diffusion properties, that is, with a specific specular behavior, with known dielectric properties, and avoiding attenuation factors on the coherent component of the signal, such as BSR and VOD. Thus, specific zones should be studied as potential surfaces for GNSS-R calibration and/or for validation purposes.
According to previous works, and taking into account the aforementioned methodologies, our calibration method enhances and combines the previous authors' ideas and employs a theoretical maximum range through calibration areas with extreme reflectivity. Moreover, our new technique improves the current state of the art of GNSS-R calibration with both a scale factor and a bias parameter. In this scheme, on the surface of the Earth, it is possible to identify dry areas that meet certain requirements to provide the lowest possible reflectivity and set a bias parameter centered at 'zero'. Similarly, it is possible to identify wet areas that provide the highest possible values and set a scale factor. In this scheme, the scale factor would not affect the reflectivity estimates calibrated by the bias parameter, and this itself would adjust the reflectivity ranges to certain theoretical values, such as those of water and dry sand. We assume a linear calibration, and higher-degree polynomial functions are dismissed.
These preliminaries can be translated into the following objectives for the signal calibration algorithm: (a) To identify areas that exhibit theoretical scattering properties and suitable dielectric conditions so that the reflectivity values are minimally affected by contributions such as BSR and VOD.
(b) From the zones in the previous objective, verify the potential and capacity of desert areas to obtain a calibration bias parameter using GNSS-R reflectivities.
(c) Once the bias parameter is estimated, verify the suitability of a scale factor based on wetlands' reflectivity, according to the method proposed by [25].
(d) To perform the conversion of calibrated GNSS-R reflectivities into SMC values.
(e) To validate the SMC values estimated from the calibrated reflectivities.
Areas fulfilling the prerequisites of (a) are described in the next section, in terms of biome classes, BSR, and VOD. To accomplish objectives (b) and (c), the GNSS-R reflectivity values are obtained from the data acquired by the CYGNSS mission; the CYGNSS data are described in Section 2.3. To satisfy objective (d), we convert the calibrated CYGNSS reflectivity estimates into linear Fresnel reflection coefficients to estimate SMC. This is possible for low incidence angles, as was pointed out by [17,27]. The methodology for SMC retrieval, along with our proposed method for CYGNSS reflectivity calibration, is described in Section 3. Then, the theoretical dielectric properties of wetlands and deserts, and the analysis and calibration of CYGNSS reflectivity from the calibration areas, are presented in Section 4. Under these conditions, our results will show the potential capacity of the inversion procedure and highlight the conditions for a correct assessment of SMC. To achieve the last objective of this approach, the specific geographical areas are tested using the existing SMAP SMC products. The validation of the CYGNSS SMC estimates with the SMAP SMC products is presented in Section 4.3. The final discussion and conclusions are given in the last sections.

Calibration Areas
To achieve the objectives proposed in this study, two areas each for wetlands and deserts were selected, for both calibration and validation procedures. This step corresponds to objective (a) described in the Introduction Section. These four zones are shown in Figure 1, and the coordinates of the corners are listed in Table 1. Concerning the calibration values based on reflectivity from wetlands, the GNSS-R data comply with certain characteristics as follows. Large wetland areas are selected in order to contain a high number of water bodies so that the probability of identifying a significant number of specular points at water surfaces is very high.
Calibration areas used in this study with very different dielectric properties. On the one hand, the tropical rainforest at the Savanna of Beni District (Bolivia) and at the Ganges Delta (Bangladesh). On the other hand, the dry deserts of Sahara (Mali) and Rub'Al Khali (Saudi Arabia). The corresponding images and the biome classes are adapted from [28].
Two zones are located in tropical rainforest biomes, one in Bolivia and the other in Bangladesh. The other two zones are located in desert biomes, one in the Sahara and the other in Saudi Arabia. For this purpose, some conditions are considered as follows. The dry areas are selected from those falling in the arid category [29]. For these areas to be used for calibration and validation purposes, and to ensure optimal SMC conditions, the months with minimum rainfall are used. According to [30], the average rainfall in the western and eastern areas of the Sahara Desert, along the 20°N parallel, is practically null. Similar areas can also be identified in the Arabian Peninsula [31]. These zones are shown in Figure 1. To ensure the quality of the CYGNSS data, we employ specular points with heights below 700 m, as pointed out in the GNSS-R studies of [31,32]. The corresponding land surface

SMAP BSR, VOD, and SMC Data from the Calibration Areas
Concerning objective (a) described in the Introduction Section, the SMAP/Sentinel-1 L2 Radiometer/Radar at 30-s Scene and 3 km EASE-Grid Soil Moisture (Version 2) product [34] provides BSR and VOD estimates along with other parameters, e.g., SMC for the first 5 cm of surface depth, volumetric water content, etc. In this work, we employ SMAP SMC estimates as reference true values for validating our GNSS-R SMC retrieval technique (see Section 4.3). SMAP products are available at [34]. Figures 3 and 4 show the SMAP BSR and VOD estimates, respectively, for July and August of 2021. For the selection of the calibration zones, we require a very low BSR, as suggested by [31]. The SMAP BSR values for the four calibration areas are shown in Figure 3, with values below 0.3 m and, specifically, values below 0.15 m for the zone of Bangladesh. The values of VOD from SMAP are shown in Figure 4. As expected, the desert areas have practically null VOD, while the wetlands show a larger range with a variety of values. High VOD values (>0.9) are observed in some areas of Bolivia, specifically at high altitudes, as we can identify in Figure 2. The VOD values are below 0.45 in the center of the figure. Notwithstanding, the possible effects due to VOD attenuation will not influence our calibration procedure, since our method employs the 99% quantile of the data (see next section). On the other hand, the Bangladesh area shows, in general, lower VOD values than those seen in Bolivia, with VOD values below 0.45 in practically all scenes. The low VOD and BSR values for the area of Bangladesh make it an excellent area for calibration purposes.

CYGNSS Data
The CYGNSS mission of NASA is the most recent Earth-observing GNSS-R constellation; it consists of 8 satellites launched onboard a Pegasus XL Rocket on 15 December 2016 (online measurements are available from March 2017 at [35]). The CYGNSS satellites are located in an equatorial orbit with an inclination of 35° at about 500 km altitude. The initial research objectives of CYGNSS were the observation of tropical cyclones, so the coverage is limited to low and middle latitude ranges (38°S to 38°N). However, the land surface scattered signals have provided an unprecedented opportunity to study BSR and dielectric properties [18,19,23]. The CYGNSS data will be used to cover objective (b) described in the Introduction.
In CYGNSS, the geometry of observation follows a bistatic configuration [36]. Each CYGNSS satellite is equipped with a GNSS up-looking RHCP antenna and two GNSS down-looking LHCP antennas. These antennas are categorized as bistatic radar receivers at the L-band (L1-band frequency at 1.57542 GHz), namely DDM instruments. These instruments are designed to map the scattered signal on the oceans and land surfaces, which are sampled in time and frequency, thus delivering DDMs at the proximity of the specular point. These DDM instruments compute 4 measurements every second, which are compressed and downloaded to the ground processing facilities [19]. There are 3 CYGNSS processing levels freely available to the scientific community, ranging from level 0 to 3, and subsequent sub-levels. In this study, we employ level-1 data, which contain geolocated DDMs calibrated into an ideal (analog) power sensor (PDDM). Additional level-1 parameters are required in this study; these include DDM timestamp, latitude (φSP), longitude (λSP), and incidence angle (θSP) of the specular points (SP), receiver antenna gain (Gr), GPS effective isotropic radiated power (EIRP), and ranges from the receiver (Rr) and transmitter (Rt) antennas. The CYGNSS DDM data specifications and processing schemes are given in [37][38][39]. The transmitter power (Pt) and gain (Gt) of the antenna, as well as the gain of the receiver antenna (Gr) in the direction of the specular point, are available in the L1 CYGNSS product file.
The geometric distribution of the observed specular points follows a quasi-random spatiotemporal distribution, due to the changing geometry of the GNSS and CYGNSS satellites [23]. This distribution is very different from the conventional data delivered by active/passive microwave instruments onboard satellites. Therefore, the challenge includes a very different condition from the first Fresnel ellipses (areas of specular points), because the distances between the transmitter and the receiver differ considerably from one observation to another. In addition, the size of the specular points varies according to other factors, including the incidence angle and the height of the receiver [36,40,41]. The authors in [19] quantified a mapping resolution near 150 m depending on the observation geometry. The authors in [42] assessed the spatial resolution of the coherent Fresnel reflection zone to be approximately 0.65 × 0.85 km², also depending on the above factors. The CYGNSS data used in this study range from July to August of 2021, and we apply the following filter criteria, e.g., [31], for quality purposes: (1) CYGNSS reflectivity values range between −35 dB and −5 dB; (2) Incidence angles range between 0° and 25°; (3) DDM signal-to-noise ratio (SNR) is greater than 3 dB; (4) Gr (receiver antenna gain towards the specular point) is greater than 5 dB; (5) Land surface heights are lower than 700 m.
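The five quality criteria listed above can be expressed as a simple per-observation test. The following is a minimal sketch in plain Python; the function name, the argument names, and the sample values are hypothetical, and treating the range limits in (1) and (2) as inclusive is an assumption.

```python
def passes_quality_filter(refl_db, inc_deg, snr_db, rx_gain_db, height_m):
    """Return True if one specular point satisfies criteria (1) to (5)."""
    return (
        -35.0 <= refl_db <= -5.0      # (1) reflectivity window in dB (inclusive bounds assumed)
        and 0.0 <= inc_deg <= 25.0    # (2) low incidence angles only
        and snr_db > 3.0              # (3) DDM signal-to-noise ratio
        and rx_gain_db > 5.0          # (4) receiver antenna gain towards the specular point
        and height_m < 700.0          # (5) land surface height
    )


# Hypothetical specular points, for illustration only.
points = [
    dict(refl_db=-12.0, inc_deg=10.0, snr_db=6.0, rx_gain_db=8.0, height_m=150.0),
    dict(refl_db=-12.0, inc_deg=40.0, snr_db=6.0, rx_gain_db=8.0, height_m=150.0),
    dict(refl_db=-3.0, inc_deg=10.0, snr_db=6.0, rx_gain_db=8.0, height_m=150.0),
    dict(refl_db=-20.0, inc_deg=5.0, snr_db=2.0, rx_gain_db=8.0, height_m=900.0),
]
kept = [p for p in points if passes_quality_filter(**p)]
```

Only the first sample point passes; the others fail on incidence angle, reflectivity range, and SNR/height, respectively.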

Methodology
After covering the first objective, (a), described in the previous section, the proposed methodology is divided into two parts. First, the reflectivities of water and deserts are calculated for the specular points obtained by the eight CYGNSS satellites. The data are filtered for the calibration areas indicated in Figure 1 and Table 1. Then, in the second part, the derived CYGNSS reflectivity values are contrasted with theoretical water and dry sand reflectivity values. This step corresponds to objectives (b) and (c) described in the Introduction.
To compute CYGNSS reflectivity values, we only consider the strongest scattering power provided by natural land surfaces, which is received from the coherent part of the reflected signal [19,43]. In this sense, land surface reflectivity can be sensed from GNSS-R data through the bistatic radar equation for the coherent component of LHCP GNSS bistatic microwave signals [17,36,44,45,46], which takes the following expression [20] when dealing with GNSS-R data:
$$\Gamma_{lr}(\theta) = \frac{(4\pi)^2 \, P_{DDM} \, (R_t + R_r)^2}{\lambda^2 \, G_t \, P_t \, G_r} \qquad (1)$$
In this equation, PDDM is the DDM peak value from the analog scattered power. The subscript lr stands for a scattering mechanism in which the incident RHCP signal is scattered by the surface and inverts its polarization to LHCP at the receiver position. Γlr is the surface reflectivity from which the SMC might be estimated, after correction of the noise floor component (N) in the DDM [31,46]. Rt and Rr are the transmitter and receiver ranges to the specular point, respectively. Gt·Pt is the transmitter effective isotropic radiated power (EIRP). Gr is the receiver antenna gain in the direction of the specular point. λ is the wavelength of the system, and θ refers to the incidence angle of the signal. Note that θ is constrained to the range between 0° and 25°, as indicated in the previous section.
For each specular point, the values of the variables (PDDM, Pt, Rt, Rr, Gt, and Gr) involved on the right side of Equation (1) are available in the CYGNSS L1 product, and they are crucial for calculating the CYGNSS measured reflectivity Γlr. Moreover, as the scattered analog power (PDDM) is affected by a system noise N, it must be corrected for this effect, as mentioned before. The parameter N is estimated from a subset of the DDM where no signal is present, above the horseshoe shape of the DDM [20]. The above equation is the final expression for a CYGNSS specular point reflectivity, or measured reflectivity Γlr.
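As an illustration of how the measured reflectivity could be computed from the L1 variables named above, the following minimal sketch assumes the usual coherent form of the bistatic radar equation, Γlr = (4π)² P (Rt + Rr)² / (λ² Gt Pt Gr). The function and argument names are hypothetical, the gains are assumed to be in linear units rather than dB, and subtracting the noise floor N from the peak power is an assumption about how the noise correction is applied.

```python
import math

C_M_S = 299_792_458.0            # speed of light
L1_FREQ_HZ = 1.57542e9           # GPS L1 frequency quoted in the text
WAVELENGTH_M = C_M_S / L1_FREQ_HZ


def coherent_reflectivity(p_ddm_w, noise_w, eirp_w, g_r, r_t_m, r_r_m,
                          wavelength_m=WAVELENGTH_M):
    """Linear reflectivity from the coherent bistatic radar equation.

    p_ddm_w : DDM peak power (W); noise_w : noise floor N (W), subtracted here
    eirp_w  : Gt * Pt (W); g_r : receiver antenna gain (linear, not dB)
    r_t_m, r_r_m : transmitter and receiver ranges to the specular point (m)
    """
    signal_w = p_ddm_w - noise_w  # assumed form of the noise correction
    return ((4.0 * math.pi) ** 2 * signal_w * (r_t_m + r_r_m) ** 2
            / (wavelength_m ** 2 * eirp_w * g_r))


def to_db(linear_value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(linear_value)
```

A typical use would evaluate `coherent_reflectivity` per specular point and apply `to_db` only for plotting or filtering, keeping linear values for the calibration itself.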
Furthermore, Γlr needs to be calibrated from instrumental bias and corrected for surface roughness and vegetation effects, as explained as follows.For any surface in general, but in particular for land surfaces, the inherent BSR and VOD affect the observed reflectivity [21], and are defined as follows: where τ stands for the VOD and k = 2π/λ.Here, λ is the wavelength of the GNSS system, and σ is the standard deviation of the surface roughness.These variables are found in [34].
Then, the implementation of the BSR and VOD effects in Equation ( 1) is as follows [17,21,22,27,34,36]: In this equation, Rlr is the Fresnel reflection coefficient identified as the surface reflectivity without BSR and VOD effects.The subscripts lr and rl stand for circular cross-polarized reflections.Rlr is required for extracting the soil moisture parameter, which is obtained through inversion of Equation (3).For this study, as we employ smooth surfaces, the BSR component can be avoided.Moreover, for some wide desert areas, vegetation attenuation effects can also be omitted.
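To make the role of the BSR and VOD terms concrete, the following minimal sketch removes both attenuations from an observed reflectivity. It assumes the exponential forms that are common in the GNSS-R literature, a roughness loss exp(−4k²σ²cos²θ) and a two-way vegetation loss exp(−2τ/cos θ); these forms, the function names, and the default wavelength are assumptions and are not necessarily identical to the expressions used by the authors.

```python
import math


def roughness_loss(sigma_m, inc_deg, wavelength_m):
    """Assumed coherent roughness attenuation, exp(-4 k^2 sigma^2 cos^2(theta))."""
    k = 2.0 * math.pi / wavelength_m
    cos_t = math.cos(math.radians(inc_deg))
    return math.exp(-4.0 * k ** 2 * sigma_m ** 2 * cos_t ** 2)


def vegetation_loss(tau, inc_deg):
    """Assumed two-way vegetation attenuation, exp(-2 tau / cos(theta))."""
    return math.exp(-2.0 * tau / math.cos(math.radians(inc_deg)))


def remove_attenuation(gamma_obs, sigma_m, tau, inc_deg, wavelength_m=0.1903):
    """Divide out both losses to approximate the Fresnel reflectivity |Rlr|^2."""
    loss = roughness_loss(sigma_m, inc_deg, wavelength_m) * vegetation_loss(tau, inc_deg)
    return gamma_obs / loss
```

For smooth, bare surfaces (σ and τ close to zero) both factors approach one, which is consistent with omitting them over the desert calibration areas.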
Then, since the reflectivity for theoretical water and dry sand dielectric constant values can be determined using the Fresnel reflection coefficients (Rvv and Rhh), we can relate them to the Rlr and Rrl circular cross-polarized reflection coefficients as follows [17,27,36,40]:
$$R_{lr} = R_{rl} = \frac{1}{2}\left(R_{vv} - R_{hh}\right) \qquad (4)$$
where the subscripts vv and hh stand for vertical and horizontal polarization, respectively. Note that the variables in Equation (4) are functions of θ. The dielectric constant of the land surface (εr) and the incidence angle of the signal (θ) are the variables defining the Fresnel reflection coefficients. Expressions for these coefficients are found in many works (e.g., [21,47], and references therein). An important point in this context is the use of low incidence angles θ ≤ 25°, where the difference between the reflection coefficients for vertical and horizontal polarization can be considered negligible [17,40]. Therefore, under these circumstances, the Fresnel equations can be solved for either of these two coefficients. Thus, by expressing Equation (3) in terms of Equation (4), the reflectivity of a land surface sensed by GNSS-R can be calibrated using empirically estimated dielectric constants.
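The relation between the linear and circular coefficients can be checked numerically. The sketch below uses the standard Fresnel expressions for a flat interface between air and a medium of relative permittivity εr, and combines them as Rlr = (Rvv − Rhh)/2; the function names are hypothetical, and the permittivity of 80 used for water in the usage note is an illustrative round value, not a figure taken from this study.

```python
import cmath
import math


def fresnel_hh_vv(eps_r, inc_deg):
    """Fresnel reflection coefficients (Rhh, Rvv) for an air/medium interface."""
    theta = math.radians(inc_deg)
    cos_t = math.cos(theta)
    root = cmath.sqrt(eps_r - math.sin(theta) ** 2)
    r_hh = (cos_t - root) / (cos_t + root)
    r_vv = (eps_r * cos_t - root) / (eps_r * cos_t + root)
    return r_hh, r_vv


def cross_pol_reflectivity(eps_r, inc_deg):
    """Reflectivity |Rlr|^2 with Rlr = (Rvv - Rhh) / 2."""
    r_hh, r_vv = fresnel_hh_vv(eps_r, inc_deg)
    return abs(0.5 * (r_vv - r_hh)) ** 2


def to_db(linear_value):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(linear_value)
```

For example, `cross_pol_reflectivity(2.904, 0.0)` and `cross_pol_reflectivity(2.904, 25.0)` differ by roughly 1e-4, which illustrates why the dependence on incidence angle is weak below 25°, and `to_db(cross_pol_reflectivity(80.0, 0.0))` is close to −1.95 dB.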
The second part of this methodology involves the calibration of the observed CYGNSS reflectivity values (Γobs) obtained with Equation (3). This step corresponds to objectives (b) and (c) described in the Introduction. For this purpose, we employ the following linear parameterization to obtain calibrated reflectivity values (Γcal):
$$\Gamma_{cal} = a \, \Gamma_{obs} + b \qquad (5)$$
where the bias parameter b is estimated from the minimum reflectivity values, specifically those obtained from deserts, whose theoretical reflectivity is approximately 0.06 (−12 dB).
The scale factor a is estimated from the maximum reflectivity values, specifically those obtained from water bodies in the wetlands, whose theoretical reflectivity is approximately 0.64 (−1.96 dB). These theoretical values are deduced in Section 4.1. For a correct determination of these parameters, we first perform a statistical analysis of the retrieved surface reflectivity without calibration, considering the possibility of excluding outliers, if present, and other possible dependences accounting for the different GNSS Pseudorandom Noise (PRN) IDs and the different CYGNSS satellites. In the next step, we estimate the median and the 99% quantile of the observed reflectivity data for deserts and wetlands, respectively, and estimate the calibration parameters using Equation (5) in a linear least-squares fit. In this way, the bias parameter b is constrained by the deserts' reflectivity, and the scale factor a is constrained by the wetlands' reflectivity. Subsequently, accounting for 99% of the data, the minimum and maximum values of the calibrated reflectivity will correspond to the theoretical reflectivity of the dry sand and the water bodies, respectively. Finally, we apply the calibration parameters to the reflectivity values following Equation (5) and estimate CYGNSS SMC with the ancillary BSR and VOD data from SMAP following the SMC inversion method of [21]. This step corresponds to objective (d) described in the Introduction Section. Then, for objective (e) described in the Introduction Section, the validation of the new CYGNSS SMC estimates can be achieved by comparison to the SMAP SMC estimates.
In brief, the methodology for the calibration of GNSS-R reflectivity is summarized in Figure 5. Starting with the available L1 data from CYGNSS, the observed reflectivity (Γobs) for both wet and dry datasets is estimated with Equation (3). Then, the "Filters" are those described in Section 2, including the limits of the study areas, altitude of the specular points, incidence angles, etc. This filtering provides the observed reflectivity for wetlands and deserts, their incidence angles, and the coordinates of each specular point. Then, the observed reflectivity data of wetlands and deserts are used to obtain the calibration parameters as detailed above, using the 99% quantile and the median, respectively, in the least-squares regression. These parameters are used to calibrate the observed reflectivity data from any location on Earth. The resulting reflectivity values are used to estimate SMC with ancillary data of VOD and BSR. Finally, the validation is performed with external data sources, for example, SMAP SMC. Note that the ancillary data points and the CYGNSS specular points are not coincident, and therefore a 2D linear interpolation is required.
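The estimation of the scale and bias parameters can be sketched as follows, assuming the linear form Γcal = a·Γobs + b and an ordinary least-squares fit through one anchor per calibration area (median for deserts, 99% quantile for wetlands). All observed values below are hypothetical, the helper names are illustrative, and the theoretical targets are entered through the decibel values quoted in the text so that the fit is carried out in linear units.

```python
import math
import statistics


def db_to_linear(value_db):
    """Convert decibels to a linear power ratio."""
    return 10.0 ** (value_db / 10.0)


def quantile(values, q):
    """Linear-interpolation quantile of a sequence, for 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def fit_scale_bias(observed, theoretical):
    """Ordinary least-squares fit of theoretical = a * observed + b."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(theoretical) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, theoretical))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Hypothetical uncalibrated reflectivities (linear units) for two deserts and two wetlands.
desert_obs = [[0.010, 0.012, 0.011], [0.013, 0.011, 0.012]]
wetland_obs = [[0.05, 0.12, 0.16], [0.04, 0.10, 0.17]]

gamma_dry = db_to_linear(-12.0)   # dry-sand target, from the dB value in the text
gamma_wet = db_to_linear(-1.96)   # water target, from the dB value in the text

x = [statistics.median(d) for d in desert_obs] + [quantile(w, 0.99) for w in wetland_obs]
y = [gamma_dry, gamma_dry, gamma_wet, gamma_wet]
a, b = fit_scale_bias(x, y)
calibrated = [a * g + b for g in wetland_obs[0]]
```

With only four anchors the fit is a straight line through two clusters, so the desert anchors mainly constrain b and the wetland anchors mainly constrain a.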

Theoretical Water/Dry Soil Dielectric Properties
The interactions of electromagnetic fields with materials are described through a fundamental electrical property, namely the relative permittivity of the material (ε). This value is a complex quantity with real and imaginary components given by the following equation:
$$\varepsilon = \varepsilon' - j\,\varepsilon'' \qquad (6)$$
where ε′ is the real component of ε, also called the dielectric constant, and ε″ is the imaginary component, referred to as the dielectric loss factor. The real part of the permittivity represents the ability of the material to store electric energy. Microwave remote sensing is mainly focused on this component, as it allows us to derive an estimate of the water content in materials such as soil or vegetation. In the case of water, it allows for characterizing the chemical composition and thermodynamic properties. Dielectric properties of water and soils are generally well known [48][49][50][51]. For both types of materials, several authors have addressed relevant physical characteristics revealing different behaviors with respect to various physical parameters, such as temperature and frequency, as well as to other constituent contents, like salinity and sediments (e.g., [52][53][54]).
For water bodies, the driving variables are temperature and frequency [55][56][57]. The effects of these two variables on the dielectric constant are shown graphically in Figure 6a; these values are derived from the model of [56]. In this study, the measurements are acquired at a fixed frequency (1.57542 GHz). The water temperature is assumed to be 20 °C. Other effects, including temperature variability and roughness, must be revisited in future studies. For soils, several authors have thoroughly addressed the different behaviors of the dielectric constant with respect to various physical parameters (e.g., [48,51,58,59,60,61]). While soils are largely affected by textural and moisture conditions, in the 1-1.6 GHz range, the dielectric properties of dry soils remain unchanged [62]. Moreover, in the absence of liquid water, the microwave dielectric constant of soils lacks significant dependencies on either temperature or frequency. The dielectric properties of dry soils are only dependent on the soil bulk density. In general, due to the soil bulk density, the real part of the dielectric constant for dry sandy soils might vary between 2.0 and 4.0. The authors of [49] empirically related the real component of soil permittivity to bulk density by the following expression:
$$\varepsilon' = \left(1 + 0.44\,\rho_b\right)^2 \qquad (7)$$
where ρb is the bulk density of the soil in g/cm³. Figure 6b shows the effects of bulk density on the dielectric constant of three soils with different water content (SMC units are cm³/cm³). Usually, the bulk density of the soil varies between 1.1 and 1.9 g/cm³ [63]. In this work, we obtained the soil bulk density from the online tool of the Global Gridded Surfaces of Selected Soil Characteristics [64] database. The soil bulk density for the deserts is identified as 1.6 g/cm³. The corresponding dielectric constant value is 2.904; this value is computed using Equation (7). The other values, for SMC of 0.1 and 0.2 in Figure 6b, are computed using the model of [45]. Note that the dielectric constant increases considerably with SMC, as compared to the increases due to bulk density variability. Another factor influencing the dielectric behavior of these materials is the effect of the observation angles.
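The dependence of the dry-soil dielectric constant on bulk density can be reproduced with a one-line function. The sketch below assumes the quadratic form ε′ = (1 + 0.44 ρb)², with ρb in g/cm³, which reproduces the value of 2.904 quoted above for a bulk density of 1.6 g/cm³; the function name is hypothetical.

```python
def dry_soil_permittivity(bulk_density_g_cm3):
    """Real part of the dry-soil permittivity, assuming eps' = (1 + 0.44 * rho_b)^2."""
    return (1.0 + 0.44 * bulk_density_g_cm3) ** 2


# Evaluate over the usual bulk-density range of 1.1 to 1.9 g/cm^3 mentioned in the text.
densities = [1.1, 1.3, 1.6, 1.9]
permittivities = [dry_soil_permittivity(rho) for rho in densities]
```

Over this density range the resulting values stay between about 2.2 and 3.4, inside the 2.0 to 4.0 interval given for dry sandy soils.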
Figure 7 shows the reflectivity for different incidence angles and different dielectric constant values, where the reflectivity values start being affected for incidence angles above 45° [65,66]. Under these cases, following
[Figure 7: reflectivity versus incidence angle for different dielectric constant values. Rhh and Rvv are calculated as specified in [17,27,36]. Usual values for dry and wet soils are shown with blue and yellow arrows, respectively. SMC units are cm³/cm³.]

Analysis and Calibration of CYGNSS Reflectivity
The statistical analysis of the retrieved surface reflectivity without calibration for the different GNSS PRN IDs and the different CYGNSS satellites is shown in Figure 8. The central marks indicate the median values, and the bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points that are not considered outliers, and the outliers are plotted individually using the '+' symbol. In this figure, we can observe that the different satellites provide similar reflectivity responses for each of the calibration areas. As expected, the deserts provide lower values than the wetlands. In each case, the biases for the different GNSS PRN IDs are not very consistent (e.g., PRN 15 shows different biases between Bolivia and Bangladesh), while the different CYGNSS satellites show lower variability. Notwithstanding, few outliers are observed, and the calibration of these biases will be addressed in future research with a larger amount of data.
The statistics of the 99% quantile for the wetlands and the median for the deserts are shown in Table 2, and the least-squares estimation provided a bias parameter of b = 0.018 and a scale factor of a = 3.77. Similar to Figure 8, Figure 9 provides the reflectivity values after applying these calibration parameters with Equation (5). In this figure, we can observe that all the reflectivity values were rearranged according to the wet and dry constraints specified in Section 3. The 99% quantile of the wetlands data reaches −1 dB, and the deserts are between −15 dB and −8 dB, as implemented by the constraints. Although Figures 8-10 show reflectivity values expressed in decibels (dB), the calibration adjustment procedure must be performed using decimal reflectivity values, as detailed in the previous section. In Figure 10, the histograms of the reflectivity estimates before and after calibration are presented for each calibration area. In this figure, we can better observe that the reflectivity values have shifted to higher ranges. For instance, in the case of desert areas, we can observe nearly coincident mean values for the calibrated data.

Validation of CYGNSS SMC with SMAP SMC
Our validation process is carried out by comparing the SMC retrieved from CYGNSS calibrated reflectivities with the SMAP SMC products. We directly employ the SMC inversion models as indicated in the literature [17,21,27,36,40,68,69], with no additional adjustments. Although corrections due to VOD attenuation were only necessary for wetlands, we corrected BSR and VOD for all four calibration areas. In the desert areas, the VOD correction was minimal due to the lack of vegetation. Regarding the BSR contribution, the low values shown in Figure 3 generate negligible attenuations because of the system frequency used in GNSS, as suggested in [21]. However, the significance of VOD values is rather diverse for wetlands (Figure 4).
The calibrated CYGNSS reflectivity values for the four test areas are converted to SMC as described in the previous section. Figures 11 and 12 show the SMC estimates for desert and wet areas, respectively. In general, the SMC estimates from calibrated CYGNSS reflectivity values agree very well with SMAP SMC. Note first the large difference between the SMC ranges of the deserts and the wetlands (Figures 11 and 12, respectively); the CYGNSS and SMAP SMC ranges agree very well in both cases. Concerning the dry areas, in the Arabian Desert (Figure 11b,d), most of the estimates are below 0.1 SMC, and a small number of samples between 0.1 and 0.2 SMC can be observed in both CYGNSS and SMAP products near 20.5°N 51°E. For the Sahara Desert (Figure 11a,c), most of the estimates are below 0.1 SMC, while some small deviations from SMAP are observed in the northern areas. Concerning the wet areas, in Bangladesh (Figure 12b,d), most of the estimates are above 0.4 SMC, and a small number of samples between 0.3 and 0.4 SMC can be observed in both CYGNSS and SMAP products near 24.5°N 88.5°E. For Bolivia, very clear and well-defined structures of enhanced SMC are observed in both SMAP and CYGNSS products, mostly along the river at the meridian 65°W, as well as in other minor elongated structures, e.g., in the northwest and southwest regions. Our CYGNSS SMC provides excellent results in Bolivia. The histograms of SMC from both SMAP and CYGNSS products are shown in Figure 13, which provides the statistical comparison of the SMC extracted from the CYGNSS calibrated reflectivity values and the SMAP SMC product. The CYGNSS ranges agree very well with SMAP for all four calibration areas, although a slightly broader range is seen for CYGNSS. This may be due to the PRN dispersion caused by signals from different GNSS satellites (Figures 8 and 9).
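To illustrate the kind of physical inversion involved, the sketch below converts a calibrated reflectivity into SMC in two steps: reflectivity to permittivity, then permittivity to SMC. It is not the inversion chain used in this paper. It assumes normal incidence and a real permittivity, omits the BSR and VOD corrections, and swaps in the empirical polynomial of Topp et al. (1980) as the dielectric mixing model. All function names are hypothetical.

```python
import math

def permittivity_from_reflectivity(gamma):
    """Invert the normal-incidence Fresnel reflectivity.

    Uses gamma = ((n - 1) / (n + 1))**2 with n = sqrt(eps),
    assuming a real (lossless) permittivity.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("reflectivity must lie strictly between 0 and 1")
    root = math.sqrt(gamma)
    n = (1.0 + root) / (1.0 - root)
    return n * n

def smc_topp(eps):
    """Topp et al. (1980) polynomial: permittivity -> SMC (cm3/cm3).

    Swapped-in mixing model, used here only for illustration.
    """
    return -5.3e-2 + 2.92e-2 * eps - 5.5e-4 * eps ** 2 + 4.3e-6 * eps ** 3

def smc_from_reflectivity_db(gamma_db):
    """Chain both steps, starting from a calibrated reflectivity in dB."""
    gamma = 10.0 ** (gamma_db / 10.0)
    return smc_topp(permittivity_from_reflectivity(gamma))

# A desert-like and a wetter reflectivity value, for comparison.
smc_dry = smc_from_reflectivity_db(-12.0)
smc_wet = smc_from_reflectivity_db(-5.0)
```

Under these assumptions, a reflectivity near −12 dB maps to an SMC close to zero, while higher reflectivities map to progressively wetter soils. The Topp polynomial is empirical and is not intended for open-water permittivities.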
Figure 14 shows the linear fit between the retrieved CYGNSS and the SMAP SMC for all the data used in this study (four test areas from July to August 2021). In this figure, three clusters are distinguished, corresponding to the desert areas and the two wetlands; the upper-right cluster corresponds to the Bangladesh area. Pearson's correlation coefficient reaches 0.896, and the standard linear regression statistics are an R-square of 0.803 and a root mean square error (RMSE) of 0.084. These results show the good agreement between the SMAP and CYGNSS SMC values, which emphasizes the good performance of the applied calibration method, as well as the correct conversion of calibrated CYGNSS reflectivity into SMC through the Fresnel coefficients method, as pointed out in [21].
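The statistics quoted above can be computed from paired SMC samples as in the following sketch. The sample values are hypothetical, and the RMSE is assumed here to be that of the regression residuals, since the exact definition is not stated in the text. For a simple linear regression, R-square equals the squared Pearson coefficient, which is consistent with the reported values (0.896² ≈ 0.803).

```python
import math

def regression_stats(x, y):
    """Pearson r, R-square, and residual RMSE of the fit y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    pearson_r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # RMSE of the residuals about the fitted line (assumed definition).
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    rmse = math.sqrt(sum(e * e for e in residuals) / n)

    return {
        "pearson_r": pearson_r,
        "r_squared": pearson_r ** 2,
        "rmse": rmse,
        "slope": slope,
        "intercept": intercept,
    }

# HYPOTHETICAL paired samples (cm3/cm3), for illustration only.
smap_smc = [0.05, 0.08, 0.30, 0.45, 0.50]
cygnss_smc = [0.04, 0.10, 0.33, 0.41, 0.52]
stats = regression_stats(smap_smc, cygnss_smc)
```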

Discussion
Calibration of GNSS-R reflectivity is an important step that must be accomplished to provide correct values that can be converted into accurate estimates of physical or bio-physical variables. Different studies have addressed the issue of calibrating GNSS-R reflectivity estimates. However, these studies were incomplete or not well posed (e.g., [17,21,25]), or were based on regression approaches with no physical inversion models [20,23,24]. Among these studies, based on the results of several experiments and testing different areas in eastern China, Clarizia et al. [21] used a bias value of 0.15 to obtain "reasonable" correlations in SMC retrieval using CYGNSS reflectivity. However, for low-reflectivity surfaces such as deserts, a 0.15 increase in reflectivity would produce unacceptable SMC values, an effect that was not detected in their study. Wan et al. [25] proposed an interesting calibration approach with water reflectivity observations, but the authors had to normalize their reflectivity data to estimate the scale factor parameters, which were fitted using decibel units. However, estimating the scale factor in decibel units produces an exponential fit to the reflectivity range, and no bias parameter can be included; in addition, a second calibration scheme was required to achieve "reasonable" correlations with ancillary SMC estimates. Other authors opted to employ regression approaches to ancillary SMC data, but the actual reflectivity values sensed by GNSS-R still remain unknown.
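The two objections above can be made concrete with a short worked calculation. This is an illustrative sketch only: it assumes that the 0.15 bias of [21] is added in decimal units, it uses the −12 dB dry-sand reference adopted in this paper, and the symbol s for a scale factor applied in decibel units is hypothetical.

```latex
% (1) Effect of a 0.15 bias on a dry-sand reflectivity of -12 dB
\Gamma_{dry} = 10^{-12/10} \approx 0.063,
\qquad
\Gamma_{dry} + 0.15 \approx 0.213
\;\Rightarrow\;
10\log_{10}(0.213) \approx -6.7\ \mathrm{dB}

% (2) A scale factor s applied in dB acts as an exponent in decimal units
\Gamma^{cal}_{dB} = s\,\Gamma_{dB}
\;\Longleftrightarrow\;
10\log_{10}\Gamma^{cal} = s \cdot 10\log_{10}\Gamma
\;\Longleftrightarrow\;
\Gamma^{cal} = \Gamma^{\,s}
```

Under these assumptions, the bias alone shifts a desert surface by more than 5 dB, which would be read as a much wetter soil. A scale factor fitted in decibels leaves no room for an additive term in decimal units.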
In this work, we developed a calibration method that employs the reflectivity values of water bodies and dry sand. The initial idea was to identify areas on the Earth's surface that exhibit extreme theoretical dielectric conditions. In addition to well-known factors, such as surface roughness or SMC for soil surfaces, the reflectivities of water and dry sand are also affected by other factors that must be considered. When water is used for calibration purposes, its temperature must be taken into account, as it is a driving factor that strongly modifies the theoretical reflectivity values. In this sense, a proper set of calibration references must be chosen, such as water bodies, since these can exhibit thermally homogeneous behavior. It was verified that the water dielectric properties differ substantially with temperature. Another difficulty was the collection of a statistically sufficient number of specular points over water bodies, because few suitable water bodies comply with the CYGNSS specular point requirements. Furthermore, saline waters are not appropriate for this purpose. In this work, we opted for a strategy that allows a higher number of specular points to be extracted, based on the 99% quantile of the histogram of a wetland area. The resulting samples correspond to the maximum reflectivity values and are assumed to agree with the specular points over water bodies and highly wet soils. Therefore, to ensure a representative number of specular points over water, two reference areas with similar temperatures were selected: a wetland area in the Bolivian Amazon and a flooded area in Bangladesh. This methodology differs from other studies in which the water reflectivity values used for calibration were collected from rivers and other water bodies that could exhibit distinct temperature conditions, leading to different water permittivity values (see, e.g., [25]). In this way, a large sample of observed water reflectivity values could be extracted from the two designated wetlands.
The possibility of using the reflectivity of dry sand for calibration has not yet been fully explored. It has been demonstrated that, in addition to SMC, bulk density is an important variable that affects its dielectric properties. In turn, temperature is not a driving factor altering these properties. Our starting hypothesis for considering this abiotic variable for calibration purposes was that, under drought conditions, some large deserts are constituted by dry sand at 0% SMC. Under these circumstances, the main variable that governs the dielectric properties of dry sand is the bulk density. This variable can be accessed through different databases, such as the Global Gridded Surfaces of Selected Soil Characteristics database (2005). The advantage of using desert areas is the minimal influence of SMC, VOD, and BSR, which may attenuate the reflectivity sensed by GNSS-R. These test areas can be treated as quasi-specular surfaces. For both the Mali and Arabian deserts used in this study, the observed reflectivity values were very homogeneous. After calibration, the derived CYGNSS SMC estimates in the four test areas were compared to SMAP products (Figure 13a-d), and the results are excellent.
Our suggested calibration model is based on a linear least-squares adjustment using observed and theoretical reflectivity values, which differs substantially from previous calibration procedures [17,20,25,32]. When the SMC values derived from the calibrated CYGNSS reflectivities of the four test areas are validated against SMAP (Figure 13), the results show excellent agreement between both datasets in all test areas. This agreement is also confirmed by the regression analysis depicted in Figure 14, where a very good fit is observed between SMAP and CYGNSS SMC; Pearson's correlation coefficient is 0.896. This method differs completely from the UCAR procedure to retrieve SMC in the "CYGNSS/DDMI Level 3 soil moisture" product [23,24], where a direct regression to match SMAP SMC was performed from the uncalibrated CYGNSS reflectivity. Although the authors obtained good correlations because of the tight coupling with SMAP SMC, the resulting product was not recovered from a physical inversion model. In this work, the CYGNSS reflectivities are calibrated with test areas that provide reference reflectivities. The conversion to SMC is carried out independently from the SMAP SMC product, after a physical transformation of the calibrated CYGNSS reflectivities using the Fresnel coefficients, as described in [21].
Based on the presented results, this methodology has some shortcomings that must be reviewed and addressed in future work. First, improvements must be made in the selection of test areas. A better identification of wetlands exhibiting similar water temperature properties must be ensured. For desert areas, a closer examination is needed to ensure the complete absence of VOD and BSR effects. It would be worth having better bulk density datasets for desert areas; however, for these zones, this may not be feasible in the near future. Better knowledge of this variable would nevertheless improve the characterization of dry sand in these environments. Regarding the SMC validation approaches, additional reliable reference datasets are necessary, such as the International Soil Moisture Network [70].
This work is based on the coherent component of the signal without considering the incoherent effects of BSR and VOD. This contribution should therefore also be reviewed in future work. As a concluding remark, this work demonstrates the capacity and suitability of dielectric properties from deserts and wetlands for calibrating GNSS-R reflectivity data.

Conclusions
Calibrating GNSS-R land surface reflectivity data is necessary to accurately estimate geophysical variables, such as SMC, biomass, or freeze/thaw, which are essential for monitoring Earth's climate and hydrological cycle. In this work, we calibrated the CYGNSS reflectivity data by applying both a bias and a scale parameter estimated from the theoretical reflectivity values of different calibration areas that provide maximum range limits of reflectivity, such as deserts and wetlands. We used the wetlands of Bolivia and Bangladesh, and the deserts of Sahara and Saudi Arabia, under convenient altitude, BSR, and VOD conditions. Our calibration scheme set the bias parameter with the lowest possible reflectivity of deserts, and the scale factor was estimated from the 99% quantile of the wetlands data to match the highest possible reflectivity response, such as that of water bodies. The resulting CYGNSS scale factor and bias parameter were a = 3.77 and b = 0.018, respectively. Finally, the calibrated CYGNSS reflectivity values were used to directly obtain SMC estimates through the inversion of the Fresnel coefficients, including the attenuation corrections due to BSR and VOD influences. Our CYGNSS SMC results show an excellent correspondence with the SMAP SMC products (standard linear regression: R-square = 0.803, RMSE = 0.084; Pearson's correlation = 0.896). These results indicate that the proposed scheme is a physically based calibration method for GNSS-R reflectivity data that can be used for accurate SMC estimation.

Author Contributions: I.M. and A.C. materialized the initial concept through investigation of previous research, formulation of new theory and methodology, draft preparation, supervision of achieved results, and provided international academic exchange; X.W. and K.E. provided collaboration, revision tasks, and draft preparation; S.J. provided supervision, mentorship, funding support, and undertook revision tasks. All authors have read and agreed to the published version of the manuscript.
heights from NASA's Shuttle Radar Topography Mission (SRTM) data are shown in Figure 2.

Figure 3. The SMAP BSR data for July and August of 2021.

Figure 4. The SMAP VOD data for July and August of 2021. Units are dimensionless.

Figure 5. Flowchart of the methodology for calibration of GNSS-R reflectivity. The validation is performed after direct SMC inversion with SMAP estimates. The method to obtain the theoretical reflectivity (Γteo) from the dielectric constant (ε) is provided in Section 4.1.

In Figure 7, wet soils provide theoretical reflectivity values ($\Gamma^{w}_{teo}$) above −4 dB, and dry sandy soils ($\Gamma^{d}_{teo}$) are approximately between −15 dB and −8 dB. On the one hand, we use a theoretical dry sand reflectivity value of −12 dB. This value is in accordance with the permittivity value of 2.904 within the 0° to 25° incidence angle range. On the other hand, we employ −1.96 dB for the theoretical reflectivity of water, which corresponds to the theoretical reflectivity of pure water at 20 °C. Note that these values are for the GNSS frequency of CYGNSS (L1-band frequency at 1.57542 GHz).
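The reference values above can be checked with a short Fresnel calculation. The sketch below uses the standard horizontal and vertical Fresnel coefficients and the usual circular cross-polarized combination; whether this matches Equation (4) term by term is an assumption. The permittivity of 2.904 for dry sand comes from the text, whereas the real value of 80 used for pure water at 20 °C is an approximate, assumed figure that neglects dielectric losses. The function names are hypothetical.

```python
import cmath
import math

def fresnel_lr_reflectivity(eps, incidence_deg):
    """Cross-polarized (LR) reflectivity of a smooth surface.

    eps may be real or complex; incidence_deg is measured from nadir.
    """
    theta = math.radians(incidence_deg)
    cos_t = math.cos(theta)
    root = cmath.sqrt(eps - math.sin(theta) ** 2)

    # Standard Fresnel coefficients for H and V polarization.
    r_hh = (cos_t - root) / (cos_t + root)
    r_vv = (eps * cos_t - root) / (eps * cos_t + root)

    # Circular cross-polarized combination (assumed form).
    return abs((r_vv - r_hh) / 2.0) ** 2

def to_db(gamma):
    """Convert a decimal reflectivity to dB."""
    return 10.0 * math.log10(gamma)

# Dry sand: permittivity quoted in the text.
dry_sand_db = to_db(fresnel_lr_reflectivity(2.904, 0.0))

# Pure water at 20 degC: ASSUMED real permittivity of about 80.
water_db = to_db(fresnel_lr_reflectivity(80.0, 0.0))

# Dry sand at the edge of the 0-25 degree incidence range.
dry_sand_25_db = to_db(fresnel_lr_reflectivity(2.904, 25.0))
```

Under these assumptions the dry sand value is about −11.7 dB and changes by less than 0.1 dB between 0° and 25°, consistent with the −12 dB reference. The water value is about −1.95 dB, close to the −1.96 dB adopted in the text.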

Figure 6. Dielectric properties for water and soils. In (a), water dielectric constant in terms of frequency and temperature [67]. In (b), the effects of soil bulk density on soil dielectric constant.

Figure 7. Reflectivity for different dielectric constant values, in terms of incidence angle. Simulation performed using Equation (4). Rhh and Rvv are calculated as specified in [17,27,36]. Usual values for dry and wet soils are shown in blue and yellow arrows, respectively. SMC units are cm³/cm³.

Figure 8. Non-calibrated reflectivity estimates in terms of (a-d) GPS PRN and (e-h) CYGNSS satellite IDs for the (a,b,e,f) wetlands and (c,d,g,h) deserts. Data are for July and August of 2021. The central mark in each box indicates the median value, and the bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points that are not considered outliers, and the outliers are plotted individually using the '+' symbol.

Figure 9. Calibrated reflectivity estimates in terms of (a-d) GPS PRN and (e-h) CYGNSS satellite IDs for the (a,b,e,f) wetlands and (c,d,g,h) deserts. Data are for July and August of 2021. The central mark in each box indicates the median value, and the bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points that are not considered outliers, and the outliers are plotted individually using the '+' symbol.

Figure 10. Histogram of CYGNSS reflectivity estimates before and after calibration for the (a,b) wetlands and the (c,d) deserts. Data are for July and August of 2021.

Figure 11. SMC estimates for the deserts from (a,b) CYGNSS and from (c,d) SMAP. Data are for July and August of 2021. SMC units are cm³/cm³.

Figure 12. SMC estimates for the wetlands from (a,b) CYGNSS and from (c,d) SMAP for the calibration areas. Data are for July and August of 2021. SMC units are cm³/cm³.

Figure 13. Histogram of SMC from (a,b) CYGNSS and from (c,d) SMAP for the calibration areas. Data are for July and August of 2021. SMC units are cm³/cm³.

Figure 14. Correlations and linear fit (in blue) between SMC from CYGNSS and from SMAP for all the calibration areas. Data are for July and August of 2021. Standard linear regression: R-square = 0.803, RMSE = 0.084, Pearson's correlation = 0.896. SMC units are cm³/cm³.

Table 1. Location of test areas used in this study.

Table 2. CYGNSS reflectivity statistics from the calibration areas.