Enhancing GNSS-R Soil Moisture Accuracy with Vegetation and Roughness Correction

Abstract: Spaceborne Global Navigation Satellite System-Reflectometry (GNSS-R) has been proven to be a cost-effective and efficient tool for monitoring the Earth's surface soil moisture (SSM) with unparalleled spatial and temporal resolution. However, the accuracy and reliability of GNSS-R SSM estimation are affected by surface vegetation and roughness. In this study, the sensitivity of delay-Doppler map (DDM)-derived effective reflectivity to SSM is analyzed and validated. The individual effective reflectivity values are projected onto the 36 km × 36 km Equal-Area Scalable Earth-Grid 2.0 (EASE-Grid2) to form the observation image, which is used to construct a global GNSS-R SSM retrieval model with the SMAP SSM serving as the reference value. To improve the accuracy of SSM retrieved from CYGNSS, the effective reflectivity is corrected using the vegetation opacity and roughness coefficient parameters from SMAP products. Additionally, the impacts of vegetation and roughness on the estimated SSM are comprehensively evaluated. The results demonstrate that the accuracy of SSM retrieved by GNSS-R improves when vegetation attenuation is corrected over different types of vegetation-covered areas. The retrieval algorithm achieves an RMSD of 0.046 cm³ cm⁻³, a mean improvement of 4.4%. Validation of the retrieval algorithm against in situ measurements confirms its stability.


Introduction
Soil moisture plays a crucial role in Earth's surface water cycle and influences various hydrological, meteorological, and ecological processes. Accurately measuring soil moisture can aid in forecasting floods, droughts, and other extreme weather events and improve the understanding of climate change [1]. L-band microwave signals have physical properties that make them well suited to remote sensing of surface soil moisture (SSM): they have a longer wavelength and can penetrate clouds and vegetation. Microwave remote sensing has thus become a critical tool for measuring SSM, enabling accurate and consistent measurements over large areas. Satellite-based microwave remote sensing has made significant advances in recent years in global-scale SSM monitoring. Dedicated missions such as the Soil Moisture and Ocean Salinity (SMOS) mission [2] and the Soil Moisture Active and Passive (SMAP) mission [3] use monostatic L-band microwave scatterometer and radiometer instruments to provide valuable insights into the dynamics of SSM on a global scale. However, these traditional satellite-based sensors have limited spatial resolution, typically around 40 km, and a revisit time of 2-3 days. These limitations prevent their use in high-resolution applications, such as precision agriculture and drought monitoring. Innovative technologies are needed to overcome these limitations and enable high-resolution measurements of SSM for various applications.
The development of Global Navigation Satellite Systems (GNSS) [4] has led to the emergence of reflectometric remote sensing using signals of opportunity transmitted by GNSS satellites and reflected by the Earth's surface, known as GNSS-Reflectometry (GNSS-R) [5]. This technology has rapidly gained attention for its great potential in remote sensing. Because a GNSS-R receiver only collects GNSS signals reflected from the Earth's surface, it has lower power consumption and lighter mass, making it easily deployable on microsatellite platforms. This bridges the spatial-temporal gap in observations left by dedicated monostatic SSM satellite-based sensors [6]. Since the reflected signal is influenced by the geophysical properties of the reflecting interface and exhibits distortions in shape or reflected power, geophysical parameter retrieval algorithms can be developed by identifying the characteristics of the distortion and relating them to the relevant geophysical parameters.
The feasibility of GNSS-R technology for detecting various geophysical parameters in the geosciences has been validated through numerous demonstration experiments [7]. The use of the GNSS-R technique to monitor terrestrial SSM was first suggested in [8], inspired by GNSS-R sea surface wind speed retrieval. Following this, several ground, tower, and airborne experiments have been conducted to prove the viability of GNSS-R in detecting changes in SSM [9][10][11].
At first, researchers focused on studying the correlation between delay-Doppler map (DDM) observables and SSM changes. These studies explained the variation in time delay waveforms under different wetness levels. However, owing to the lack of adequate airborne data, GNSS-R-based soil moisture detection was primarily limited to ground-based studies for the next decade. Single- and dual-antenna pattern receivers have been used to detect changes in SSM through interferometric reflectometry (IR) [12]. A soil moisture monitoring network can be established using ground-based GNSS-IR by utilizing qualified International GNSS Service Network (IGS) stations. However, the coverage of each ground station is limited to a few hundred meters in its immediate vicinity, resulting in sparse global coverage for the application of soil moisture data in related studies. Moreover, the fundamental retrieval algorithms for spaceborne GNSS-R and ground-based GNSS-IR are different. Ground-based GNSS-IR systems commonly rely on measuring SNR observation data from a long arc on a single GNSS satellite frequency.
The successful launches of the UK Technology Demonstration Satellite-1 (TDS-1) in 2014 [13] and NASA's Cyclone GNSS (CYGNSS) mission in 2016 [14] have made spaceborne GNSS-R more accessible. In recent years, using satellite-based GNSS-R measurements to retrieve SSM has gained considerable interest due to the freely available observations provided by the two missions. Several studies have validated the sensitivity of the DDM-derived observables to changes in land SSM [15][16][17]. The DDM signal-to-noise ratio, or the DDM-derived effective reflectivity, has been verified as an indicator to detect changes in SSM [18]. Ref. [15] demonstrated that the DDM observables can detect spatial and temporal variations in land SSM and exhibit consistency over similar land surfaces. The sensitivity of DDM observables to SSM compared with other monostatic microwave observation systems was also studied. The sensitivity of DDM observables is influenced by the amount of vegetation cover. Lower Normalized Difference Vegetation Index (NDVI) values indicate higher sensitivity and Pearson's correlation [16]. The attenuation of GNSS-R signals by vegetation is mainly caused by branches and trunks in dense forests [19]. Additionally, the sensitivity of GNSS-R to SSM can be significantly altered by surface roughness and inland water bodies [20]. After these exploratory studies, the key issues in satellite-based GNSS-R SSM detection have become clearer. The effective reflectivity, used as the feature quantity, responds to changes in soil moisture, but it is also influenced by factors such as vegetation, surface roughness, and terrain. Nevertheless, this knowledge provides a solid foundation for the further development of inversion algorithms.
Modeling the complexity of land surface scattering is challenging due to the presence of multiple factors such as surface roughness, vegetation, and topographic relief. Previous studies have assumed that the coherent scattering component dominates over the land surface and have calibrated the effective reflectivity accordingly [21]. This assumption holds when the land surface has a small-scale roughness compared with the GNSS carrier wavelength, such as bare soil. However, as the surface roughness increases, the contribution of incoherent scattering becomes more prominent. In regions with vegetation cover, the scattering field should additionally account for both volume scattering from plants and interactions between vegetation and the surface [22]. Previous studies on spaceborne GNSS-R SSM retrieval established empirical statistical models between DDM effective reflectivity and reference SSM using linear regression and spatial averaging methods. In addition, the effects of vegetation cover and surface roughness were considered in the algorithms in order to obtain accurate estimates of SSM. In [23], the changes in effective reflectivity and SSM were regressed pixel-by-pixel using a linear model, whereas [24] established a trilinear regression model between effective reflectivity, vegetation opacity, roughness coefficient, and SSM from the matched parameters of all pixels. To improve SSM inversion accuracy, many studies have attempted to use machine learning and deep learning methods [25].
Although space-based GNSS-R SSM retrieval algorithms have been developed, few studies have assessed the impact and correction of vegetation and surface attenuation on SSM retrieval from existing models. Vegetation cover and surface roughness can affect soil moisture measurements by influencing the amount of intercepted rainfall and the rate of water infiltration into the soil. Different land cover types, such as forest, cropland, and grassland, have different vegetation cover and surface roughness and can therefore have different soil moisture characteristics. This study aimed to retrieve land surface soil moisture from CYGNSS data with correction for vegetation cover and surface roughness over different land cover types and to evaluate the impact of these factors on GNSS-R SSM retrieval. The data and method adopted for spaceborne GNSS-R SSM remote sensing are introduced in Section 2. Section 3 presents the results and the effects of vegetation and roughness. A discussion is presented in Section 4. Finally, the main conclusions are given in Section 5.

CYGNSS Data
The CYGNSS mission consists of eight microsatellites, which were initially designed to detect ocean surface winds, and the specular points of the observed DDMs are located between 38° S and 38° N latitude. The study reported in this paper utilized the V2.1 version of CYGNSS Level 1 data, which was made available in March 2017. To retrieve soil moisture parameters, only samples with specular points over land were used. The dataset covers the entire year of 2018, with the first six months of data used to develop the SSM retrieval model and the remaining data used to evaluate the accuracy of the GNSS-R-derived SSM. Data were downloaded from the Physical Oceanography Distributed Active Archive Center (PO.DAAC).

SMAP Product
The reference SSM data used in this study are the SMAP v008 Level-3 SSM product acquired from the National Snow and Ice Data Center, which is updated daily and has a spatial resolution of 36 km × 36 km on the Equal-Area Scalable Earth-Grid 2.0 (EASE-Grid2). While data from satellite descending passes (a.m.) and ascending passes (p.m.) are stored separately in the product files, they were averaged to provide a daily SMAP SSM product in this study. Figure 1 displays the SMAP SSM data on 1 January 2018, showing the data coverage and typical spatial variation of the SSM. The vegetation opacity and roughness coefficient parameters included in the SMAP product were also averaged to facilitate the subsequent vegetation and surface roughness attenuation correction. To evaluate the accuracy of the derived SSM for different land cover types, land cover classification data were also used.

SSM Retrieval Method from GNSS-R
Spaceborne GNSS-R SSM remote sensing is based on the sensitivity of effective reflectivity to surface permittivity, which is primarily influenced by SSM. Several semi-empirical models have been developed to estimate the dielectric constant of surfaces. Figure 2a shows the complex permittivity under different SSM conditions at the GPS L1 frequency using the Dobson model [26]. The results indicate that the imaginary part of the complex permittivity is almost insensitive to SSM, whereas the real part is strongly influenced by it. For instance, when the mass fraction of sand was 0.8, the mass fraction of clay was 0.07, and the soil bulk density was 1.25 g cm⁻³, the change in SSM led to significant variations in the real part of the permittivity. Surface reflectivity can be derived as:

$$\Gamma_{LR}(\theta) = \left| \frac{R_{vv}(\theta) - R_{hh}(\theta)}{2} \right|^{2} \quad (1)$$

$$R_{vv}(\theta) = \frac{\varepsilon_r \cos\theta - \sqrt{\varepsilon_r - \sin^{2}\theta}}{\varepsilon_r \cos\theta + \sqrt{\varepsilon_r - \sin^{2}\theta}}, \qquad R_{hh}(\theta) = \frac{\cos\theta - \sqrt{\varepsilon_r - \sin^{2}\theta}}{\cos\theta + \sqrt{\varepsilon_r - \sin^{2}\theta}} \quad (2)$$

where the subscript $LR$ stands for left circular polarized scattering with an incoming right circular polarized signal, and $R_{vv}$ and $R_{hh}$ are the vertical and horizontal linear Fresnel reflection coefficients, which are functions of the surface complex dielectric constant $\varepsilon_r$ and the signal incidence angle $\theta$. By combining these expressions with the semi-empirical dielectric constant model, a physical relationship between SSM and the corresponding reflectivity can be obtained. Figure 2b shows the relationship at various signal incidence angles. The surface reflectivity increases monotonically with soil moisture content. Furthermore, a larger incidence angle has a more significant impact on the reflectivity value.
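The dependence of reflectivity on permittivity and incidence angle can be illustrated numerically. The following is a minimal sketch, assuming the usual smooth-surface Fresnel coefficients and the cross-polarized combination $|(R_{vv} - R_{hh})/2|^2$; the permittivity is passed in directly rather than computed with the Dobson model, and the function name `fresnel_lr_reflectivity` is hypothetical.

```python
import cmath
import math


def fresnel_lr_reflectivity(eps_r: complex, theta_deg: float) -> float:
    """Smooth-surface LR reflectivity for relative permittivity eps_r.

    Assumes a flat, homogeneous half-space below air. theta_deg is the
    incidence angle measured from the surface normal, in degrees.
    """
    theta = math.radians(theta_deg)
    cos_t = math.cos(theta)
    # Common square-root term of both linear Fresnel coefficients.
    root = cmath.sqrt(eps_r - math.sin(theta) ** 2)
    r_vv = (eps_r * cos_t - root) / (eps_r * cos_t + root)
    r_hh = (cos_t - root) / (cos_t + root)
    # Right-circular incident, left-circular reflected power ratio.
    return abs(0.5 * (r_vv - r_hh)) ** 2


if __name__ == "__main__":
    # Illustrative permittivities only (not values from this study):
    # a lower value for drier soil, a higher value for wetter soil.
    for label, eps in (("drier", 5 - 0.5j), ("wetter", 20 - 2j)):
        for angle in (10.0, 30.0, 50.0):
            gamma = fresnel_lr_reflectivity(eps, angle)
            print(f"{label:6s} theta={angle:4.1f} deg  Gamma_LR={gamma:.3f}")
```

Feeding the output of a dielectric mixing model into this function, one SSM value at a time, would yield curves of the kind described for Figure 2b.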
The DDM is a fundamental observable of spaceborne GNSS-R. It is generated by the cross-correlation of the reflected signals with the local replica code of the receiver, which maps the scattered power over a range of time delays and Doppler frequency shifts. During signal processing in the current CYGNSS mission, the coherent integration time of the DDM is typically set to 1 ms, followed by 0.5 s to 1 s of incoherent integration to reduce the speckle and thermal noise present in a short-time correlation. The scattering mechanisms over the sea surface can be approximately explained by the Z-V model [27]. The difference in surface scattering between the ocean and land is that the former is dominated by an incoherent component, whereas the latter is dominated by a coherent component. This implies that the power scattered over land comes primarily from the specular reflection direction, corresponding to the first Fresnel zone around the specular point on the real ground surface. Previous studies have demonstrated the effectiveness of both the DDM SNR and DDM-derived reflectivity in detecting variations in SSM. The DDM SNR is calculated as the ratio, expressed in decibels, between the highest value in a single DDM bin and the average raw noise counts per bin. Reflectivity, in turn, is a function of surface permittivity, which is mostly influenced by SSM [28]. Wet surfaces exhibit a greater dielectric constant and reflectivity than dry surfaces do. An inversion methodology was developed to establish a mapping relationship between the DDM SNR or reflectivity and the surface SSM. Typically, the scattering field over land surfaces contains both incoherent and coherent scattering components that can be simultaneously received by the spaceborne GNSS-R receiver. This study assumes that coherent reflection predominates throughout the land surface and that the first Fresnel zone around the specular point is homogeneous. Consequently, the received power asymptotically tends toward the value obtained using free-space propagation weighted by the reflection coefficient, and the total path length of the bistatic radar system equals the sum of the two path lengths [29]. Finally, the DDM reflectivity, also known as effective reflectivity, can be calibrated using the radar equation for a coherent signal [30]:

$$\Gamma_{eff}(\theta) = \frac{(4\pi)^{2}\, P_r\, (R_{ts} + R_{rs})^{2}}{\lambda^{2}\, P_t\, G_t\, G_r} \quad (3)$$

where $P_r$ is the received coherent power at the DDM peak, $P_t$ is the transmitted power, $G_t$ is the gain of the transmitter antenna, $G_r$ is the gain of the receiver antenna, $\lambda$ is the carrier wavelength of the GNSS signal, $R_{ts}$ and $R_{rs}$ are the distances from the GNSS transmitter to the specular point and from the specular point to the receiver, respectively, and $\theta$ is the incidence angle of the signal at the specular point. The GNSS-R receiver onboard the spacecraft directly generates the DDM in units of processing counts. To convert the counts into received power in watts, a series of precise calibrations is necessary [31]. Fortunately, all parameters required in (3) are included in the CYGNSS Level 1 data product. It is important to note that, under the coherent assumption, the spatial resolution of individual observations from spaceborne GNSS-R is primarily determined by the bistatic radar observing geometry and is approximately 0.6 times the size of the first Fresnel zone [28].
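The calibration step can be sketched in a few lines. This is an illustrative sketch only, assuming the common coherent bistatic radar form in which the reflectivity equals $(4\pi)^2 P_r (R_{ts}+R_{rs})^2 / (\lambda^2 P_t G_t G_r)$, with the product $P_t G_t$ supplied as a single transmitter EIRP value. The argument names are hypothetical and are not CYGNSS Level 1 variable names.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
GPS_L1_FREQUENCY_HZ = 1_575.42e6
GPS_L1_WAVELENGTH_M = SPEED_OF_LIGHT_M_S / GPS_L1_FREQUENCY_HZ  # about 0.19 m


def effective_reflectivity(
    peak_power_w: float,
    tx_eirp_w: float,
    rx_gain_linear: float,
    r_ts_m: float,
    r_rs_m: float,
    wavelength_m: float = GPS_L1_WAVELENGTH_M,
) -> float:
    """Effective reflectivity (linear units) under the coherent assumption.

    peak_power_w   : calibrated DDM peak power in watts
    tx_eirp_w      : transmitter EIRP (P_t * G_t) in watts
    rx_gain_linear : receiver antenna gain toward the specular point (linear)
    r_ts_m, r_rs_m : transmitter-to-specular and specular-to-receiver ranges
    """
    total_path = r_ts_m + r_rs_m
    numerator = (4.0 * math.pi) ** 2 * peak_power_w * total_path ** 2
    denominator = wavelength_m ** 2 * tx_eirp_w * rx_gain_linear
    return numerator / denominator


def to_decibels(value_linear: float) -> float:
    """Convert a positive linear power ratio to decibels."""
    return 10.0 * math.log10(value_linear)
```

Note that the path term is the square of the summed ranges, not the product of the squared ranges used for incoherent scattering; this is what the coherent assumption changes in the radar equation.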

Effects of Surface Vegetation and Roughness
The reflectivity of the terrain surface can be affected by various factors, including the roughness of the surface and the presence of vegetation. When the GNSS signal is transmitted towards an area with vegetation, the signal can be attenuated by the plants, as shown in Figure 3. The signal passes through the plant canopy twice, and its intensity is attenuated each time [32]. In addition to vegetation, the rough surface of the ground can scatter the GNSS signal in different directions, weakening the intensity in the specular direction in accordance with energy conservation. To minimize the impact of these perturbing factors, they must be carefully accounted for in the retrieval algorithm. In passive microwave radiometry, the widely used tau-omega model is a basic zero-order model [33] that accounts for vegetation attenuation with an exponential term and corrects for the effect of surface roughness [34].
In this framework, the effective reflectivity is modeled as

$$\Gamma_{eff}(\theta; m_v, \tau, k\sigma) = \Gamma_{LR}(\theta; m_v)\, \gamma(\theta; \tau)\, \rho(\theta; k\sigma) \quad (4)$$

$$\gamma(\theta; \tau) = \exp(-2\tau \sec\theta) \quad (5)$$

$$\rho(\theta; k\sigma) = \exp\!\left(-(2k\sigma\cos\theta)^{2}\right) \quad (6)$$

where $\theta$ is the incidence angle of the transmitted signal, $m_v$ is the soil moisture, $\Gamma_{LR}$ is the smooth bare-surface reflectivity, $k$ is the wavenumber, $\sigma$ is the standard deviation of the surface height, $k\sigma$ together represents the roughness coefficient of the land surface, $\tau$ is the vegetation optical depth (VOD), $2\tau$ is the two-way vegetation opacity, and $\gamma$ and $\rho$ indicate the power attenuation from land cover vegetation and surface roughness, respectively. Vegetation opacity can be stated as $\tau = b \cdot \mathrm{VWC}$, where $b$ is a proportionality value that depends on both the vegetation structure and the microwave frequency, and VWC is the vegetation water content. The method for calculating VWC employs a series of land cover-based equations to estimate the combined foliage and stem VWC from NDVI data [3]. The vegetation attenuation of the L-band microwave signal is mainly due to the trunks and branches, whereas the leaves are nearly transparent [35]. Therefore, the retrieval accuracy degrades in dense forest regions. The roughness coefficient depends on the polarization, frequency, and geometric characteristics of the ground surface and is parameterized by the standard deviation of the surface height. The roughness coefficient was obtained from a look-up table provided in the SMAP manual [3]. In this study, the two models were used directly to correct for the effects of vegetation and surface roughness attenuation in spaceborne GNSS-R SSM retrievals. The effects of these models for different land cover types were analyzed to evaluate their performance in CYGNSS SSM retrieval.

Although the relationship between soil moisture and reflectivity is nonlinear according to the dielectric constant and Fresnel equation models, in practice the range of annual soil moisture variation is limited for most land surfaces. Therefore, linear regression models are often used, as in [23]. In this study, a GNSS-R SSM retrieval model was proposed that also uses a linear model for each grid cell. This model directly regresses soil moisture on the aggregated effective reflectivity.

The flow chart in Figure 4 presents the data processing and inversion algorithm-building process. As part of the GNSS-R SSM retrieval process, the daily CYGNSS Level 1 product is first screened for valid data using specific criteria: the delay bin of the DDM peak power falls within a 7-10 bin interval, the DDM SNR is greater than or equal to 2, the receiver antenna gain in the specular point direction is greater than 0 dB, and the specular incidence angle is less than or equal to 60°. Once the nonconforming data are removed, the effective reflectivity is computed at each analog power DDM peak using Equation (3). It is worth noting that the noise floor was not considered in the computation of the effective reflectivity because the results were worse when the noise floor provided in the CYGNSS product was subtracted from the peak power. The incidence angle of the GNSS signal is also an important factor affecting GNSS-R reflectivity, and the sensitivity of reflectivity to SSM decreases as the incidence angle increases, as shown in Figure 2b. To correct for this, the approach proposed in [27] was applied. The individual reflectivity values were then gridded onto the 36 km × 36 km EASE-Grid2 using the mean value to match the SMAP SSM product on the same day, generating training data sample pairs. Correction for the attenuation effects of vegetation and surface roughness on effective reflectivity was made using the vegetation and roughness coefficient parameters provided in the SMAP product, as described in Equations (5) and (6), respectively. A grid point value was marked as invalid in the collocation process if the total number of specular effective reflectivity values in the same grid cell was less than five. The grid cells marked in the SMAP product as inland water, urban, and hilly areas were also filtered out. Once the masking process is complete, the linear model is fitted pixel-by-pixel using all collocated training datasets. The complete set of models can be represented as follows.

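$$m_v^{\mathrm{GNSS\text{-}R}} = A \cdot \Gamma_{\mathrm{gridded}} + B \quad (7)$$

The established linear model comprises a coefficient matrix $A$, an intercept matrix $B$, the gridded GNSS-R reflectivity $\Gamma_{\mathrm{gridded}}$, and the SSM predicted from GNSS-R, denoted $m_v^{\mathrm{GNSS\text{-}R}}$; the product is evaluated grid cell by grid cell. Once fitted, the model can be used to predict daily SSM values for later dates.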
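The screening, gridding, and per-cell regression steps described above can be sketched as follows. This is a simplified illustration rather than the processing chain used in the study: the function names and the `(cell_id, day, reflectivity)` sample layout are hypothetical, the 7-10 bin interval is assumed to be inclusive, and an ordinary least-squares fit stands in for whatever fitting routine was actually used.

```python
from collections import defaultdict
from statistics import mean


def passes_quality_control(peak_delay_bin, ddm_snr_db, rx_gain_db, inc_angle_deg):
    """Apply the four screening criteria listed in the text."""
    return (
        7 <= peak_delay_bin <= 10      # DDM peak within the expected delay bins
        and ddm_snr_db >= 2.0          # minimum DDM SNR
        and rx_gain_db > 0.0           # positive receiver gain toward the specular point
        and inc_angle_deg <= 60.0      # exclude large incidence angles
    )


def grid_daily_mean(samples, min_count=5):
    """Average reflectivity per (cell, day); drop cells with too few samples.

    samples: iterable of (cell_id, day, reflectivity) tuples.
    """
    groups = defaultdict(list)
    for cell_id, day, reflectivity in samples:
        groups[(cell_id, day)].append(reflectivity)
    return {key: mean(vals) for key, vals in groups.items() if len(vals) >= min_count}


def fit_cell(reflectivity, ssm):
    """Least-squares fit ssm = a * reflectivity + b for one grid cell.

    Returns (a, b), or None when the fit is not defined.
    """
    if len(reflectivity) != len(ssm) or len(reflectivity) < 2:
        return None
    mean_x, mean_y = mean(reflectivity), mean(ssm)
    sxx = sum((x - mean_x) ** 2 for x in reflectivity)
    if sxx == 0.0:
        return None  # constant reflectivity: slope is undetermined
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reflectivity, ssm))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_ssm(reflectivity, coefficients):
    """Evaluate the fitted per-cell model for a new gridded reflectivity."""
    slope, intercept = coefficients
    return slope * reflectivity + intercept
```

Storing one `(slope, intercept)` pair per grid cell corresponds to the coefficient and intercept matrices of the model; cells for which `fit_cell` returns `None` would simply remain masked.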

SSM Retrieval from GNSS-R
To evaluate the performance of the SSM retrieval algorithm, we conducted an experiment in which the GNSS-R-derived effective reflectivity, without vegetation and surface roughness correction, was regressed against SSM using the training dataset. Regression was performed by generating a linear model for each EASE-Grid2 grid cell using the half-year of spatially averaged effective reflectivity and SSM from the SMAP product. Figure 5 displays the resulting slope and intercept of the linear model. The slope was relatively small in arid regions, where effective reflectivity was very low and soil moisture content was small and stable, whereas the intercept tended to be negative in such regions. Figure 5b clearly illustrates this trend. For the testing dataset (Figure 6a), the total number of matched SSM data pairs was 7,736,769. The results indicated that the total bias and RMSD of the CYGNSS-derived SSM were 0.009 cm³ cm⁻³ and 0.048 cm³ cm⁻³, respectively. The slope of the linear regression equation between the reference SSM and the inverted SSM was 0.981, and the scatter points were closely aligned with the linear regression line, suggesting a good match between the two datasets. The bias mainly falls between −0.05 cm³ cm⁻³ and 0.05 cm³ cm⁻³, confirming the accuracy of the retrieval model. Figure 7a,b show spatial maps of the errors between the CYGNSS-retrieved SSM and the SMAP SSM for each EASE-Grid2 grid cell from the testing dataset. The blue areas indicate where GNSS-R underestimates the SSM, whereas the red areas indicate where it overestimates the SSM. In Figure 7a, the average bias is small and mainly falls between −0.05 cm³ cm⁻³ and 0.05 cm³ cm⁻³. Figure 7b shows that the soil moisture inversion error is less than 0.06 cm³ cm⁻³ in most areas, except for a few wetter regions such as the Sudanian Savanna and Peninsular India. However, the standard deviations (STD) of the CYGNSS-derived SSM and the SMAP SSM over the second half of 2018, shown in Figure 7c,d, respectively, are in good agreement, demonstrating that CYGNSS can detect daily variation in soil moisture. Nonetheless, large deviations still exist in some areas, such as the central and eastern parts of South America and central Africa.
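For reference, the two summary statistics quoted above can be computed from matched pairs as in the short sketch below. It assumes the conventional definitions (bias as the mean of retrieved minus reference, RMSD as the root of the mean squared difference); the function name is hypothetical.

```python
import math


def bias_and_rmsd(retrieved, reference):
    """Return (bias, RMSD) of retrieved SSM against reference SSM.

    Both inputs are equal-length sequences of matched values in cm^3 cm^-3.
    A positive bias means the retrieval overestimates the reference.
    """
    if len(retrieved) != len(reference) or not retrieved:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [r - s for r, s in zip(retrieved, reference)]
    bias = sum(diffs) / len(diffs)
    rmsd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmsd
```

Applied per grid cell instead of over all pairs, the same function would produce maps of the kind shown in Figure 7a,b.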

Effects of Vegetation and Roughness
This section evaluates the performance of three retrieval configurations under different land cover types defined by the International Geosphere-Biosphere Programme (IGBP) to assess the impact of vegetation and surface roughness on the quality of CYGNSS-derived SSM. The first configuration uses the effective reflectivity directly to construct retrieval models, ignoring the effects of vegetation and roughness, as described in Section 3.1. The second configuration corrects the specular effective reflectivity for vegetation attenuation with the vegetation transmissivity, using Equation (5). The third configuration further corrects for surface roughness attenuation using the roughness coefficient in Equation (6).
Compared to the first configuration, the accuracy of the retrieved SSM improved marginally when the effective reflectivity was corrected for the vegetation effect. The bias and RMSD of the GNSS-R-derived SSM were 0.009 cm³ cm⁻³ and 0.046 cm³ cm⁻³, respectively. However, the third configuration brought no significant improvement over the second.
The retrieval accuracy of the GNSS-R SSM retrieval model over different land cover types was evaluated by comparing the results with those of the SMAP product. The statistical results are presented in Table 1. The highest retrieval errors were found for cropland/natural vegetation mosaics and woody savannas, whereas barren land and open shrublands showed the best inversion performance. The discrepancy in retrieval accuracy among different land cover types is partly due to the variation in local SSM content. After correcting for vegetation attenuation, the performance of the retrieval algorithm improved on vegetated terrain covered with deciduous broadleaf forest, mixed forest, woody savannas, savannas, croplands, grasslands, and cropland/natural vegetation mosaics. The mean improvement was 4.4%. It is worth noting that the spatial resolution of the CYGNSS observations is finer than 36 km. Therefore, using spatially averaged surface geophysical parameters instead of the true values at the GNSS-R specular point may explain the lack of significant improvement with vegetation and roughness correction. Additionally, the uncertainty of the SMAP SSM product itself could have influenced the results.
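The three configurations differ only in which attenuation factors are divided out of the measured effective reflectivity. The sketch below illustrates this under assumed functional forms: a two-way vegetation transmissivity $\exp(-2\tau\sec\theta)$ and a roughness factor $\exp(-(2k\sigma\cos\theta)^2)$. These forms, the function names, and the use of $k\sigma$ as the roughness input are assumptions made for illustration; the SMAP roughness coefficient may be defined with a different scaling.

```python
import math


def vegetation_transmissivity(tau: float, theta_deg: float) -> float:
    """Two-way canopy power transmissivity for optical depth tau."""
    return math.exp(-2.0 * tau / math.cos(math.radians(theta_deg)))


def roughness_attenuation(k_sigma: float, theta_deg: float) -> float:
    """Coherent power loss from small-scale roughness (assumed form)."""
    return math.exp(-((2.0 * k_sigma * math.cos(math.radians(theta_deg))) ** 2))


def corrected_reflectivity(gamma_eff, theta_deg, tau=None, k_sigma=None):
    """Remove attenuation from a measured effective reflectivity (linear units).

    Configuration 1: tau=None, k_sigma=None  -> no correction
    Configuration 2: tau given               -> vegetation correction only
    Configuration 3: tau and k_sigma given   -> vegetation and roughness
    """
    attenuation = 1.0
    if tau is not None:
        attenuation *= vegetation_transmissivity(tau, theta_deg)
    if k_sigma is not None:
        attenuation *= roughness_attenuation(k_sigma, theta_deg)
    return gamma_eff / attenuation
```

Because both factors are at most one, each correction can only raise the reflectivity, and the increase is largest where the optical depth or roughness is largest.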

Validation of Soil Moisture by In Situ Observation
The reliability of the GNSS-R SSM retrieval methodology and the performance of attenuation correction were evaluated by comparing the results with in situ SSM data from the International Soil Moisture Network (ISMN). The in situ SSM measurements have a raw time resolution of 10 min and a probe depth of 0 to 0.05 m. To align the SSM data from different sources, the in situ measurements were resampled to daily average values. For each in situ station, the GNSS-R SSM value at the nearest grid point was selected. Figure 8 compares the time series of the in situ SSM, SMAP SSM, and GNSS-R-derived SSM for both the training and testing data. The Yuma_27_ENE, Knox_City, and Vernon sites exhibited high consistency between the measured and predicted SSM. The SSM predicted by the developed linear model remained stable over six months of extrapolation, although there was a significant systematic difference between the predicted SSM and in situ measurements at the Newton_8_w station. This discrepancy may be due to representativeness errors arising from the different spatial resolution scales.
Table 2 presents the validation results at the four sites using various correction strategies. The Yuma_27_ENE station, located in a very arid area, exhibited relatively stable SSM levels throughout the year, and the CYGNSS-derived SSM also showed high stability. For the grassland stations of Knox_City and Vernon, the results were somewhat contradictory, with the retrieval model underestimating SSM during the last season of 2018 at Vernon. The model performed well at Knox_City, whereas the SMAP SSM exhibited the opposite behavior. The addition of attenuation corrections had an impact on the derived SSM at the two stations. At the Newton_8_w station, the vegetation correction improved the overall RMSD of the retrieved SSM, which was 0.082 cm³ cm⁻³ compared to 0.076 cm³ cm⁻³ for the SMAP SSM. Although these values differ from the global statistics owing to the different spatial resolutions of the two datasets and the relatively small testing sample, the in situ validation confirms the ability of spaceborne GNSS-R reflectivity to sense changes in SSM compared to the SMAP radiometer. The systematic error in absolute SSM retrieval can be addressed using rescaling methods with larger datasets [36].
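The two alignment steps, daily averaging of the 10 min probe data and selection of the nearest grid point, can be sketched as below. This is a minimal illustration with hypothetical function names; it assumes timestamps are already in a common time zone and uses the great-circle distance to the cell centers, which may differ from the exact collocation rule used in the study.

```python
import math
from collections import defaultdict
from datetime import datetime


def daily_mean(records):
    """Average sub-daily SSM readings per calendar day.

    records: iterable of (datetime, ssm) pairs; returns {date: mean ssm}.
    """
    buckets = defaultdict(list)
    for timestamp, ssm in records:
        buckets[timestamp.date()].append(ssm)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * 6371.0 * math.asin(math.sqrt(a))


def nearest_cell(station_lat, station_lon, cell_centers):
    """Return the id of the grid cell whose center is closest to the station.

    cell_centers: dict mapping cell_id -> (lat, lon) in degrees.
    """
    return min(
        cell_centers,
        key=lambda cid: haversine_km(station_lat, station_lon, *cell_centers[cid]),
    )
```

A point probe compared with a 36 km cell average will generally show a representativeness offset, which is one reason a systematic difference can persist even when the temporal variations agree.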

Discussion
This study reports on the results of the CYGNSS SSM retrieval algorithm and analyzes the performance of vegetation and surface roughness attenuation correction across different land cover types. The findings indicate that spaceborne GNSS-R can be used for SSM estimation, with the accuracy of CYGNSS-derived SSM being relatively stable. The daily CYGNSS-derived SSM shows small temporal changes in bias and RMSD against SMAP SSM during the testing period, as illustrated in Figure 9, with RMSD values of less than 0.06 cm³ cm⁻³. This indicates the stability and reliability of the model. However, on 13 December 2018 there was a rapid increase in the bias and RMSD metrics, which is thought to be related to the CYGNSS observation quality on that day. The inversion error in various SSM bins shows that CYGNSS increasingly overestimates the ground SSM when the SSM is greater than 0.22 cm³ cm⁻³. While the RMSD across bins increases steadily, the uncertainty remains stable above 0.05 cm³ cm⁻³ when the reference value is greater than 0.14 cm³ cm⁻³. Based on these assessments, it can be inferred that the accuracy of soil moisture estimates derived from CYGNSS is comparable to that of satellite-based radiometers.

In the retrieval of SSM using spaceborne GNSS-R, the dielectric properties of the land interface are affected by SSM, vegetation, and surface roughness. The effective reflectivity of GNSS-R is directly related to these dielectric properties. In regions with dense vegetation, vegetation attenuation can significantly impact SSM retrieval. While the tau-omega model has been used in previous research, it is not suitable for terrain with dense vegetation. To improve the accuracy of GNSS-R estimates, a better understanding of the scattering process over rough surfaces is necessary, and more reliable reference geophysical parameters must be selected. Furthermore, the entire GNSS-R inversion algorithm must be improved to accurately quantify the impact of other factors on retrieval accuracy. Vegetation correction improved the RMSD of the SSM inversion from 0.048 cm³ cm⁻³ to 0.046 cm³ cm⁻³. However, when the correction method was applied to various land cover types, the impact on CYGNSS SSM retrieval was not significant, which is inconsistent with previous simulation results [37]. The limited improvement observed with the SMAP roughness parameter suggests that factors such as the correction model, uncertainty in the SMAP parameters, and representation errors may have contributed to this outcome.

Conclusions
This study examines the potential of spaceborne GNSS-R for estimating land SSM. An empirical statistical model was developed using SMAP Level-3 SSM products as reference data. The accuracy of the CYGNSS-based SSM retrievals was adversely affected by the attenuation effects of vegetation and surface roughness. After correcting for these effects, modest improvements were observed in SSM retrieval over vegetated areas. Nonetheless, the surface roughness correction was found to be of limited benefit owing to the absence of reliable small-scale surface roughness data and refined correction methods. The derived SSM had an RMSD of 0.046 cm³ cm⁻³ with vegetation correction. The retrieval model developed in this study demonstrated stable performance, as assessed using in situ and SMAP data. The current state of spaceborne GNSS-R SSM remote sensing requires the use of external auxiliary data for statistical modeling, and the impact of additional terrestrial geophysical factors must be carefully quantified. Future research should focus on developing models for vegetation and surface roughness correction in satellite-based GNSS-R to improve our understanding of the effects of vegetation cover and surface roughness on soil moisture variability over different land cover types. This study contributes to the advancement of remote sensing technology for soil moisture measurements.

Figure 1. Averaged SMAP Level-3 soil moisture from satellite descending passes and ascending passes on 1 January 2018.


Note to Equations (1) and (2): $R_{vv}$ and $R_{hh}$ are the vertical and horizontal linear Fresnel reflection coefficients; they are functions of the surface complex dielectric constant $\varepsilon_r$.

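Figure 2. Complex permittivity under different soil moisture conditions (a) and the relationship between soil moisture and reflectivity at different signal incidence angles (b).
Note to Equation (3): $G_r$ is the gain of the receiver antenna, $\lambda$ is the carrier wavelength of the GNSS signal, and $R_{ts}$ and $R_{rs}$ are the distances from the GNSS transmitter to the specular point and from the specular point to the receiver.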

Figure 5. The slope (a) and intercept (b) of the linear SSM retrieval model.

Figure 6 depicts the density scatterplot of the matched SMAP SSM and CYGNSS-derived SSM from the testing dataset and the retrieval error distribution. The black dashed line represents the 1:1 diagonal and the red line shows the linear regression line in Figure 6a.

Figure 6. Density scatterplot of GNSS-R-derived SSM and SMAP SSM (a) and probability density distribution of the inverted SSM deviation (b).

Figure 9. Daily error statistics for the testing data (a) and total retrieval error in different soil moisture bins (b).

Table 2. The CYGNSS soil moisture retrieval accuracy at different in situ stations (unit: cm³ cm⁻³).