Retrieval of High-Resolution Soil Moisture through Combination of Sentinel-1 and Sentinel-2 Data

Estimating soil moisture based on synthetic aperture radar (SAR) data remains challenging due to the influences of vegetation and surface roughness. Here we present an algorithm that simultaneously retrieves soil moisture, surface roughness and vegetation water content by jointly using high-resolution Sentinel-1 SAR and Sentinel-2 multispectral imagery, with an application directed towards the provision of information at the precision agricultural scale. Sentinel-2-derived vegetation water indices are investigated and used to quantify the backscatter resulting from the vegetation canopy. The proposed algorithm then inverts the water cloud model to simultaneously estimate soil moisture and surface roughness by minimizing a cost function constructed by model simulations and SAR observations. To examine the performance of VV- and VH-polarized backscatters on soil moisture retrievals, three retrieval schemes are explored: a single channel algorithm using VV (SCA-VV) and VH (SCA-VH) polarizations and a dual channel algorithm using both VV and VH polarizations (DCA-VVVH). An evaluation of the approach using a combination of a cosmic-ray soil moisture observing system (COSMOS) and Soil Climate Analysis Network measurements over Nebraska shows that the SCA-VV scheme yields good agreement at both the COSMOS footprint and single-site scales. The features of the algorithms that have the most impact on the retrieval accuracy include the vegetation water content estimation scheme, parameters of the water cloud model and the specification of initial ranges of soil moisture and roughness, all of which are comprehensively analyzed and discussed. Through careful consideration and selection of these factors, we demonstrate that the proposed SCA-VV approach can provide reasonable soil moisture retrievals, with RMSE ranging from 0.039 to 0.078 m³/m³ and R² ranging from 0.472 to 0.665, highlighting the utility of SAR for application at the precision agricultural scale.


Introduction
Soil moisture plays a central role in both climate and hydrological systems [1][2][3] and represents a key link between the processes governing surface and atmosphere exchange. The spatial distribution and temporal evolution of soil moisture can vary significantly [4] and as a consequence, traditional point-based measurements often provide limited insight into spatiotemporal patterns of behavior and response. On the other hand, remote sensing data provide an opportunity for characterizing the spatial and temporal structure of soil moisture dynamics across a range of scales [5] but can be limited in terms of providing high-resolution detail.
Among various remote sensing-based measurement approaches [6][7][8], Synthetic Aperture Radar (SAR) data have shown much potential in providing high-resolution soil moisture estimates at both

Ground-Based Evaluation Data
To develop and evaluate the soil moisture retrieval algorithm, ground-based soil moisture measurements from two distinct observation networks are used. Specifically, four COSMOS sites and a single SCAN site (Rogers Farm #1) in Nebraska, United States are identified for soil moisture validation. Vegetation water content measurements from the SMAPVEX16-MB dataset are used to develop empirical relations between VWC and Sentinel-2-derived NDVI and NDWI (see Table 1). Further details of each of these data sets are provided in the following paragraphs.
Table 1. Details of the cosmic-ray soil moisture observing system (COSMOS) and Rogers Farm #1 sites over Nebraska that provide ground soil moisture measurements for evaluating the retrieval algorithm, together with the SMAPVEX16-MB data set for developing relations between vegetation water content (VWC) and normalized difference vegetation index (NDVI)/normalized difference water index (NDWI).
SMAPVEX16-MB was designed to support the SMAP post-launch calibration/validation and was conducted in Manitoba, Canada from mid-June to late-July 2016 [62]. The main objective of the experiment was to understand and seek to reduce the errors in SMAP soil moisture retrievals [63,64]. The experimental region was mainly covered by agricultural fields, with crops including forage, pasture, canola, flaxseed, soybean, maize and wheat. The ground campaign selected around 50 fields for soil and vegetation sampling. Vegetation sampling was conducted at each field during the experiment. The wet biomass of the different plant organs, i.e., root, stem, leaf, flower and fruit (if available), was weighed individually soon after collection. After around 2 weeks of natural drying, until the weights no longer changed, and an oven-dry correction, the samples were reweighed to determine the dry biomass.
In this study, we use the wet and dry biomass to determine the VWC per unit area, which is then used to establish an empirical relation with Sentinel-2-derived NDVIs and NDWIs, and in turn to estimate VWC and canopy transmissivity for the retrieval algorithm (see details in Section 2.3.3).
In addition, surface roughness collected during the experiment is used to determine valid ranges of the surface roughness for the soil moisture retrievals. In-situ and fixed station soil moisture were collected during the SMAPVEX16-MB experiment, which could have been useful for algorithm validation. Unfortunately, valid Sentinel-1 and Sentinel-2 pairs (i.e., Sentinel-1/2 data with an acquisition difference not larger than 3 days) were only available on June 13 and July 20. Only five soil moisture measurements were available on June 13, which would not be statistically significant for retrieval algorithm validation, while the Sentinel-2 image on July 20 was strongly contaminated by cloud. Thus, the available soil moisture measurements could not be used in this study. Additional information on the datasets used herein, as well as details of the instrumentation and related measurement protocols, can be found at the National Snow & Ice Data Center (https://nsidc.org/data/smap/validation/val-data.html) along with the website of the SMAPVEX16-MB experiment (http://smapvex16-mb.espaceweb.usherbrooke.ca/).

COSMOS and SCAN Site Soil Moisture Measurements
Based on details from the International Soil Moisture Network (ISMN, https://ismn.geo.tuwien.ac.at/), more than 67 COSMOS stations are currently operating in 7 countries, with 59 of these located in the United States. Here we utilize measurements from four COSMOS sites installed in northeastern Nebraska to evaluate the soil moisture retrieval algorithm. These sites cover various surface types with different crops and a long time span, allowing additional Sentinel-1/2 pairs to be examined. From previous analysis of the COSMOS systems [60,65], the effective depth of the system's measurement ranges from 12 to 76 cm, overlapping the effective penetration depth of C-band SAR (~2 cm) [66,67]. Montzka et al. [68] demonstrated that the effective radius footprint of COSMOS can vary from 150 to 250 m, while Kohli et al. [69] found that it ranges from 130 to 240 m. The COSMOS measurements selected in this study have an average depth of 0-15 cm and a footprint radius of ~200 m according to the ISMN description.
In addition to the COSMOS systems, soil moisture measured at the Rogers Farm #1 SCAN site is also used for soil moisture evaluation. The SCAN data provide a range of meteorological and associated information from more than 200 stations throughout the United States. Here, we use the soil moisture at 5 cm depth collected via a Stevens Hydra Probe II (www.stevenswater.com). Detailed information on the site can be found at https://wcc.sc.egov.usda.gov/nwcc/site?sitenum=2001, as well as in the short descriptive summary in Table 1.

Sentinel-1 SAR Data
The European Space Agency's Sentinel mission [70] represents an integrated earth observation effort specifically designed for global environmental monitoring and security. The mission currently comprises three series (Sentinels 1-3) for land and ocean monitoring, with each consisting of two satellites in the same orbital plane. Sentinel-1A and -B each carry a C-band SAR with a center frequency of 5.405 GHz. The SAR operates in four imaging modes, namely the interferometric wide-swath, wave, strip map and extra wide-swath modes. The interferometric wide-swath mode, which provides a combination of a large swath width (250 km) and a high spatial resolution (5 × 20 m), is used in this study.
The ground range detected product of Sentinel-1A used in this study was acquired between early 2016 and mid-May 2019. The full preprocessing chain (see Figure 1) comprises radiometric calibration, speckle filtering to reduce the speckle noise and range-Doppler terrain correction. All of these steps are implemented using the Sentinel application platform software [71]. The processed images are projected to Universal Transverse Mercator coordinates and resampled to a spatial resolution of 10 m using a nearest neighbor technique. The ascending orbit images covering the selected COSMOS and Rogers Farm #1 SCAN sites are selected because many of the descending orbit data were not available at these sites. Approximately 1600 Sentinel-1 pixels located within a 200-m radius of the COSMOS site are used for retrieving soil moisture. To reduce the uncertainty in normalization of local incidence angle, we set the incidence angle in the forward modeling (see Section 2.3) equal to the Sentinel-1 incidence angle at the site location. Although we collect a long-term time series of Sentinel-1 data, not all images are ultimately used for soil moisture retrieval. Each Sentinel-1 image must be matched with coincident Sentinel-2 imagery, which reduces the overall collection set. For this application, only Sentinel-1 and Sentinel-2 data with an acquisition difference less than or equal to 3 days are used. Further, only ground measurements that were acquired on the same day as Sentinel-1 are used to evaluate the retrievals. Overall, this reduces the data series from 78 Sentinel-1 images to a maximum of 45 coincident Sentinel-1/Sentinel-2 image pairs.
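The pairing rule described above (an acquisition difference of at most 3 days) can be illustrated with a short sketch. The function name `pair_acquisitions`, the choice of the closest Sentinel-2 date when several qualify, and the example dates are assumptions made for illustration only; the text specifies only the 3-day threshold.

```python
from datetime import date

MAX_GAP_DAYS = 3  # acquisition difference threshold stated in the text


def pair_acquisitions(s1_dates, s2_dates, max_gap_days=MAX_GAP_DAYS):
    """Pair each Sentinel-1 date with the closest Sentinel-2 date within the gap.

    Picking the closest candidate is an assumption; Sentinel-1 dates without
    any Sentinel-2 acquisition inside the window are dropped.
    """
    pairs = []
    for s1 in sorted(s1_dates):
        candidates = [s2 for s2 in s2_dates if abs((s2 - s1).days) <= max_gap_days]
        if candidates:
            # Ties are broken in favor of the earlier Sentinel-2 date.
            best = min(candidates, key=lambda s2: (abs((s2 - s1).days), s2))
            pairs.append((s1, best))
    return pairs


# Hypothetical acquisition dates, for illustration only.
s1 = [date(2018, 6, 1), date(2018, 6, 13), date(2018, 6, 25)]
s2 = [date(2018, 6, 3), date(2018, 6, 18), date(2018, 6, 25)]
pairs = pair_acquisitions(s1, s2)
for s1_date, s2_date in pairs:
    print(s1_date, s2_date, abs((s2_date - s1_date).days))
```

In this toy example the 13 June Sentinel-1 date is discarded because the nearest Sentinel-2 date is 5 days away.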

Sentinel-2 Multispectral Data
Sentinel-2 consists of two satellites (2A and 2B), each operating a multispectral instrument (MSI) that provides 13 spectral bands, with four bands at 10 m, six bands at 20 m and three bands at 60 m resolution [55]. The Sentinel-2A and -B satellites were launched in June 2015 and March 2017, respectively. The available Level 1C images (geometrically orthorectified products at top-of-atmosphere) acquired within 3 days (before or after) of the Sentinel-1 acquisition are used in this study. An atmospheric correction procedure is performed using the sen2cor plugin [72] of the Sentinel application platform software, and the Level 2A bottom-of-atmosphere reflectance product is obtained. To match the spatial scales of Sentinel-2 to Sentinel-1, the bands with 20-m and 60-m resolution are uniformly resampled to 10 m with a nearest neighbor resampling technique. The processed band 4 (665 nm), band 8 (833 nm), band 8A (865 nm), band 11 (1614 nm) and band 12 (2202 nm) are selected to calculate NDVI and NDWI. NDVI and NDWI have been shown to be good predictors of VWC [40,41,73] at low levels of VWC, but the performance can be affected at higher levels, where NDVI tends to saturate with increasing VWC. NDVI is computed from the reflectance in the near infrared ($\rho_{NIR}$) and red ($\rho_{R}$) bands, and NDWI from the reflectance in the near infrared and short wave infrared ($\rho_{SWIR}$) bands, i.e., $\mathrm{NDVI} = (\rho_{NIR} - \rho_{R})/(\rho_{NIR} + \rho_{R})$ and $\mathrm{NDWI} = (\rho_{NIR} - \rho_{SWIR})/(\rho_{NIR} + \rho_{SWIR})$ [73]. The Sentinel-2 MSI has two NIR (band 8 and 8A) and two SWIR (band 11 and 12) band configurations, which enables more options for NDVI and NDWI computation, e.g., $\mathrm{NDVI}_{833\text{-}665} = (\rho_{833} - \rho_{665})/(\rho_{833} + \rho_{665})$. Here, the two numbers identified in a specific index refer to the central wavelengths (in nm) of the bands that are used for computing the index. Thus, we have two specific NDVIs (NDVI_833-665, NDVI_865-665) and four NDWIs (NDWI_833-1614, NDWI_865-1614, NDWI_833-2202, NDWI_865-2202).
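As a minimal sketch of how the two NDVIs and four NDWIs could be computed for a single pixel, the following uses the band central wavelengths quoted above as dictionary keys. The function names and the reflectance values are hypothetical and serve only to show the band combinations.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else None


# Central wavelengths (nm) of the Sentinel-2 bands named in the text.
RED = 665
NIR_BANDS = (833, 865)
SWIR_BANDS = (1614, 2202)


def vegetation_indices(rho):
    """Compute all NDVI/NDWI variants for one pixel.

    rho maps central wavelength (nm) to bottom-of-atmosphere reflectance.
    """
    indices = {}
    for nir in NIR_BANDS:
        indices[f"NDVI_{nir}-{RED}"] = normalized_difference(rho[nir], rho[RED])
        for swir in SWIR_BANDS:
            indices[f"NDWI_{nir}-{swir}"] = normalized_difference(rho[nir], rho[swir])
    return indices


# Hypothetical reflectances for a vegetated pixel, for illustration only.
pixel = {665: 0.05, 833: 0.40, 865: 0.42, 1614: 0.20, 2202: 0.10}
indices = vegetation_indices(pixel)
for name in sorted(indices):
    print(name, round(indices[name], 3))
```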
To select an optimal index for VWC estimation, we analyze the response of multiple NDVI/NDWI spectral combinations to the VWC measured during SMAPVEX16-MB. Based on the comparison of multiple indices, we ultimately choose the best index to estimate VWC, hence providing data for computing the backscatter contribution from the vegetation canopy in the WCM.
Figure 1. Flowchart of the soil moisture retrieval algorithm. The processed Sentinel-2 multispectral images are used to compute various NDVIs and NDWIs and hence to fit empirical relations with VWC for the WCM input, while the processed Sentinel-1 synthetic aperture radar (SAR) images and coupled forward model are used to construct the cost function. The shuffled complex evolution University of Arizona (SCE-UA) algorithm [74][75][76] is used for minimizing the cost function and searching for the optimal solutions of soil moisture and surface roughness.

Description of the Soil Moisture Retrieval Algorithm
Several elements are involved in the development of the soil moisture retrieval algorithm employed here: (1) the feasibility of Sentinel-1 backscatters for soil moisture retrieval is evaluated by observing the response of observed and simulated backscatters to the variation of surface parameters, including soil moisture, roughness and vegetation water content. Meanwhile, Sentinel-2-derived NDVIs and NDWIs and ground measured VWC are used to construct relations between NDVI/NDWI and VWC, which in turn are used to estimate the vegetation descriptor in the water cloud model; (2) the bare soil backscattering model of Oh [35] is coupled into the water cloud model to simulate the backscatters at the top-of-canopy for both VV- and VH-polarizations. In the coupled model, the backscatter contributions from the soil and the vegetation canopy are simulated by the Oh model and the NDVI/NDWI-derived VWC, respectively; and (3) the Shuffled Complex Evolution University of Arizona (SCE-UA) algorithm [74][75][76] (see optimization section below) is applied to minimize the cost function constructed from the model simulations and SAR observations, and hence to retrieve both the soil moisture and the surface roughness simultaneously. The overall structure of the retrieval is shown in Figure 1.

The Backscattering Model for Bare Soil
For simplicity, the commonly used empirical Oh model (referred to hereon as Oh-2004), which constructs a relationship between the backscatter and soil moisture, is applied here as the bare soil forward backscattering model. The surface roughness is described using a single parameter, the root mean square height (RMSH) of the surface, without considering the correlation length. From the aspect of soil moisture inversion, reducing the number of unknowns reduces the uncertainty of the inversion [77]. The Oh-2004 model improves the ratio of cross-polarization (q) compared to the previous version proposed in 2002 [78] and is expressed in Equation (1) as:
$$q = \frac{\sigma^0_{VH}}{\sigma^0_{VV}} = 0.095\,(0.13 + \sin 1.5\theta)^{1.4}\left[1 - e^{-1.3\,(ks)^{0.9}}\right] \quad (1)$$
where $\sigma^0_{VH}$ and $\sigma^0_{VV}$ are the VH- and VV-polarized backscatters, respectively. $\sigma^0_{VH}$ is defined as:
$$\sigma^0_{VH} = 0.11\,SM^{0.7}\,(\cos\theta)^{2.2}\left[1 - e^{-0.32\,(ks)^{1.8}}\right] \quad (2)$$
where SM is the soil moisture; $\theta$ is the incidence angle; and k and s are the wave number and RMSH, respectively. Based on Equations (1) and (2), the VV-polarized backscatter can be computed as
$$\sigma^0_{VV} = \frac{\sigma^0_{VH}}{q} \quad (3)$$
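The bare soil forward model can be sketched in a few lines. The coefficients below are those of the published Oh (2004) expressions for q and the VH backscatter; the function names, the default frequency (the Sentinel-1 center frequency of 5.405 GHz) and the example inputs are assumptions for illustration, and the sketch is not the authors' implementation.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light, m/s


def oh2004(sm, rmsh_cm, theta_deg, freq_ghz=5.405):
    """Bare-soil backscatter (sigma_vv, sigma_vh) in linear power units.

    sm: volumetric soil moisture (m3/m3); rmsh_cm: RMS height (cm);
    theta_deg: incidence angle (degrees); freq_ghz: radar frequency (GHz).
    """
    theta = math.radians(theta_deg)
    wavelength_cm = C_LIGHT / (freq_ghz * 1e9) * 100.0
    ks = 2.0 * math.pi / wavelength_cm * rmsh_cm  # wave number times RMSH
    sigma_vh = (0.11 * sm ** 0.7 * math.cos(theta) ** 2.2
                * (1.0 - math.exp(-0.32 * ks ** 1.8)))
    q = (0.095 * (0.13 + math.sin(1.5 * theta)) ** 1.4
         * (1.0 - math.exp(-1.3 * ks ** 0.9)))
    sigma_vv = sigma_vh / q  # from the cross-polarization ratio q = VH / VV
    return sigma_vv, sigma_vh


def to_db(sigma_linear):
    """Convert linear backscatter to decibels."""
    return 10.0 * math.log10(sigma_linear)


# Example inputs: SM = 0.2 m3/m3 and RMSH = 0.8 cm; the 40 degree angle is assumed.
vv, vh = oh2004(sm=0.2, rmsh_cm=0.8, theta_deg=40.0)
print(round(to_db(vv), 2), round(to_db(vh), 2))
```

Because both expressions depend only on SM, the incidence angle and ks, fixing the angle to the Sentinel-1 value leaves soil moisture and RMSH as the two bare-soil unknowns.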

The Backscattering Model for Vegetation Canopy
The water cloud model of Attema and Ulaby [37] formulates a description of the backscattering behavior of the vegetation canopy. We utilize the modified version of Bindlish and Barros [38], in which a radar-shadow coefficient was introduced to describe the effect of a vegetation layover. In the model, the total backscatter ($\sigma^0_T$) received by the radar is the sum of the canopy and soil scatterings without considering the interaction between the soil and vegetation (see Equation (4)):
$$\sigma^0_T = \sigma^0_{veg} + \tau^2 \sigma^0_{soil} \quad (4)$$
where $\tau^2$ is the two-way vegetation transmissivity; and $\sigma^0_{veg}$ and $\sigma^0_{soil}$ are the backscatters of vegetation and underlying soil, respectively. The $\sigma^0_{soil}$ is calculated using Oh-2004, while $\sigma^0_{veg}$ is calculated as:
$$\sigma^0_{veg} = A\,m_V \cos\theta\,(1 - \tau^2)\,(1 - e^{-\alpha}) \quad (5)$$
with
$$\tau^2 = e^{-2 B m_V/\cos\theta} \quad (6)$$
where A and B are empirical parameters depending on the canopy type; $\alpha$ is the radar-shadow coefficient depending on vegetation type and land use; and $m_V$ is the vegetation water content as computed from NDVI and/or NDWI (see details in Section 2.3.3).
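A minimal sketch of the water cloud model follows. The transmissivity matches Equation (6) as given in the text; the form of the vegetation term, A·m_V·cos θ·(1 − τ²)·(1 − e^(−α)), is an assumption based on the usual Bindlish and Barros formulation and should be checked against the original equation. The function name `water_cloud` and the soil backscatter value are hypothetical; B and α are the values quoted in the text for the sensitivity tests, and A is taken at the upper end of the calibration range quoted later in the text.

```python
import math


def water_cloud(sigma_soil, vwc, theta_deg, a, b, alpha):
    """Total top-of-canopy backscatter and two-way transmissivity (linear units).

    sigma_soil: bare-soil backscatter (linear); vwc: vegetation water content
    (kg/m2); a, b: empirical canopy parameters; alpha: radar-shadow coefficient.
    """
    theta = math.radians(theta_deg)
    tau2 = math.exp(-2.0 * b * vwc / math.cos(theta))  # Equation (6)
    # Assumed form of the vegetation term (see lead-in).
    sigma_veg = a * vwc * math.cos(theta) * (1.0 - tau2) * (1.0 - math.exp(-alpha))
    return sigma_veg + tau2 * sigma_soil, tau2


# Hypothetical soil backscatter of -12 dB, converted to linear units.
sigma_soil = 10.0 ** (-12.0 / 10.0)
for vwc in (0.0, 1.5, 3.0):
    total, tau2 = water_cloud(sigma_soil, vwc, theta_deg=40.0,
                              a=0.0018, b=0.091, alpha=2.12)
    print(vwc, round(tau2, 3), round(10.0 * math.log10(total), 2))
```

With zero vegetation water content the transmissivity is 1 and the model reduces to the bare-soil backscatter.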

Empirical Relationship between Vegetation Water Content and NDVI/NDWI
Much effort has been directed towards the estimation of VWC using NDVI and NDWI [40,41,73]. Gao et al. [40] summarized most of the existing relations between VWC and NDVI or NDWI for different crops and recommended a new formulation (e.g., $m_V = 0.098\,e^{4.225\,\mathrm{NDVI}}$ and $m_V = 7.84\,\mathrm{NDWI} + 0.6$, where $m_V$ is VWC) for maize. Here, we estimate VWC using NDVI and NDWI for the purpose of testing the performance of indices from the different NIR and SWIR bands that Sentinel-2 provides, and also to evaluate the impact of the derived VWC on soil moisture retrieval.
The ground measurements of VWC collected during the SMAPVEX16-MB experiment that lie within a Sentinel-2 pixel are averaged and used for constructing relations between VWC and NDVIs/NDWIs. We find that the VWC shows a similar response to almost all of the NDVIs and NDWIs (see Figure 2). That is, with increasing NDVIs and NDWIs, the VWC also increases. The relationship between NDVIs and VWC is well approximated by a power function, while that between NDWIs and VWC is well approximated by an exponential function. Both relationships show high correlations, with R² ≥ 0.84 (see Table 2). We also compare the relations proposed in Table 2 with the formulations for maize proposed by Gao et al. [40]. For the Sentinel-2 and SMAPVEX16-MB pairs, all of the indices in Table 2 present a higher R² than the relationships identified in Gao et al. [40], thus the relations proposed here in Table 2 are preferentially employed for further analysis. Based on the coefficients of determination, it is difficult to identify the best index for VWC estimation among those proposed in Table 2, as all of the indices show high and similar performance. As such, they are all used to estimate VWC in the first step. The influence of the different VWC relationships on soil moisture retrieval is analyzed in Section 3.2.
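The two functional forms named above (a power function for NDVI and an exponential function for NDWI) can be fitted with ordinary least squares after a log transform. The sketch below is one simple way to do this and is an assumption, since the fitting procedure is not described in the text; the data are synthetic, and the coefficients are not those of Table 2.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_power(index, vwc):
    """Fit VWC = a * index**b in log-log space (requires index > 0, vwc > 0)."""
    b, ln_a = linear_fit([math.log(v) for v in index], [math.log(v) for v in vwc])
    return math.exp(ln_a), b


def fit_exponential(index, vwc):
    """Fit VWC = a * exp(b * index) in semi-log space (requires vwc > 0)."""
    b, ln_a = linear_fit(list(index), [math.log(v) for v in vwc])
    return math.exp(ln_a), b


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from hypothetical coefficients.
ndvi = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
vwc_from_ndvi = [2.5 * x ** 3.0 for x in ndvi]
ndwi = [-0.1, 0.0, 0.1, 0.2, 0.3, 0.4]
vwc_from_ndwi = [0.6 * math.exp(4.0 * x) for x in ndwi]

a_pow, b_pow = fit_power(ndvi, vwc_from_ndvi)
a_exp, b_exp = fit_exponential(ndwi, vwc_from_ndwi)
r2_pow = r_squared(vwc_from_ndvi, [a_pow * x ** b_pow for x in ndvi])
print(round(a_pow, 3), round(b_pow, 3), round(a_exp, 3), round(b_exp, 3))
```

Fitting in log space weights relative rather than absolute errors; a direct nonlinear fit would be a reasonable alternative when large VWC values matter most.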

Sensitivity of Parameters in Water Cloud Model on Backscatters
To reduce the influence of parameters A, B and α of the WCM on soil moisture estimation, several studies have conducted experiments to estimate them. For instance, Ma et al. [14] estimated the parameters by using probabilistic inversion in advance of soil moisture estimation, while Baghdadi et al. [79] calibrated the parameter values by fitting the ground measurements against the radar observation. To examine the influence of the parameters on soil moisture retrieval, we first conduct a synthetic experiment to observe the response of simulated backscattering coefficient to the variation of the parameters whose ranges are set according to the findings in Bindlish and Barros [38]. The experiment consists of three individual tests. For each test, only one parameter, e.g., A, varies within its range, i.e., 0-0.2, while other parameters (B and α) are fixed (B = 0.091, α = 2.12; according to Bindlish and Barros [38]). Similarly, we change B while fixing A and α (A = 0.012, α = 2.12) to test the response of backscatter to parameter B. In all the three tests, soil moisture and RMSH have fixed values, i.e., soil moisture = 0.2 m³/m³, RMSH = 0.8 cm. The value of VWC ranges from 0 to 3.0 kg/m² with an interval of 0.5 kg/m².
The response of the backscatter to the parameters changes markedly with the VWC level (Figure 3). As can be seen in Figure 3a, as parameter A increases from its minimum to its maximum value, almost no change is observed in the VV-polarized backscatter when VWC is smaller than 1.5 kg/m², whereas changes become apparent when VWC is larger than 1.5 kg/m². The VH-polarized backscatter, in contrast, changes dramatically. At low VWC the changes have a smaller amplitude, but at a VWC of 3.0 kg/m² the backscatter ranges from −27 dB to −7 dB. In Figure 3b, both VV- and VH-polarized backscatters decrease with increasing values of parameter B, with the VV-polarized backscatter ranging from approximately −26 dB to −12 dB. The amplitude of the variation grows with increasing VWC. Figure 3c shows that neither polarized backscatter changes with increasing values of parameter α, although the backscatters do differ between VWC levels. These observations demonstrate that: (1) parameter A should be carefully calibrated when the VH-polarized backscatter is simulated and used for soil moisture retrieval; (2) parameter B should always be carefully calibrated for both polarized backscatters; and (3) canopy backscatter modeling and soil moisture retrieval are insensitive to parameter α.

The Global Optimization Algorithm
Among various classic global optimization algorithms, the shuffled complex evolution (SCE-UA) [74,75] has been shown to be an effective and efficient approach [80,81]. The algorithm combines the simplex procedure with the concept of a complex shuffling [76], controlled random search and competitive evolution. The key steps include: (1) initialization and computation of sample size; (2) generation of sample and calculation of the cost function values; (3) sorting the points in order of increasing values of cost function; (4) partitioning of the array into complexes; (5) evolution of each complex; (6) shuffling the complexes and convergence checking and determination of a new iteration loop or stop.
Here, due to its higher effectiveness and efficiency compared to traditional algorithms, such as the genetic evolution algorithm [82] and simulated annealing [83], the SCE-UA is used to minimize the constructed cost function and to simultaneously estimate soil moisture and surface roughness. Specifically, a least-squares type cost function (Equation (7)) is constructed based on the forward model simulations and SAR observations, and the cost function is integrated into the SCE-UA algorithm to search for its minimum value. When the minimum value is reached, the optimal values of soil moisture and RMSH are obtained.
$$J = \sum_{i=1}^{n}\left(\sigma^0_{obs,i} - \sigma^0_{sim,i}\right)^2 \quad (7)$$
where J is the cost function; $\sigma^0_{obs}$ and $\sigma^0_{sim}$ are the radar observed and model simulated backscatters, respectively; and n is the number of channels of the radar observations. To examine the performance of Sentinel-1 VV- and VH-polarized backscatter on soil moisture estimation, we explore three different retrieval schemes, i.e., SCA-VV, SCA-VH and DCA-VVVH, which use a single VV channel, a single VH channel and a dual channel combining VV and VH, respectively. Thus, for SCA-VV and SCA-VH, n = 1; and for DCA-VVVH, n = 2.
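To make the three retrieval schemes concrete, the sketch below builds a sum-of-squares cost over the selected channels and minimizes it with an exhaustive grid search. The grid search is a deliberately simple stand-in and is not SCE-UA; the linear `toy_forward` model is a hypothetical placeholder for the coupled Oh-WCM model, and the unnormalized sum of squares is an assumed form of the least-squares cost. The search ranges are the soil moisture and RMSH ranges quoted in the text.

```python
SCHEMES = {"SCA-VV": ("VV",), "SCA-VH": ("VH",), "DCA-VVVH": ("VV", "VH")}


def cost(obs_db, sim_db):
    """Sum of squared differences between observed and simulated backscatter."""
    return sum((o - s) ** 2 for o, s in zip(obs_db, sim_db))


def retrieve(obs, forward, scheme, sm_range, rmsh_range, steps=100):
    """Grid search over soil moisture and RMSH; returns (J, sm, rmsh).

    A simple stand-in for a global optimizer such as SCE-UA.
    """
    channels = SCHEMES[scheme]
    best = None
    for i in range(steps + 1):
        sm = sm_range[0] + (sm_range[1] - sm_range[0]) * i / steps
        for j in range(steps + 1):
            rmsh = rmsh_range[0] + (rmsh_range[1] - rmsh_range[0]) * j / steps
            sim = forward(sm, rmsh)
            j_val = cost([obs[c] for c in channels], [sim[c] for c in channels])
            if best is None or j_val < best[0]:
                best = (j_val, sm, rmsh)
    return best


def toy_forward(sm, rmsh):
    """Hypothetical linear placeholder (dB), NOT the coupled Oh-WCM model."""
    return {"VV": -20.0 + 20.0 * sm + 5.0 * rmsh,
            "VH": -30.0 + 25.0 * sm + 2.0 * rmsh}


SM_RANGE = (0.143, 0.462)    # m3/m3, range quoted in the text
RMSH_RANGE = (0.196, 1.04)   # cm, range quoted in the text

obs = toy_forward(0.30, 0.60)  # synthetic "observation" with known truth
results = {name: retrieve(obs, toy_forward, name, SM_RANGE, RMSH_RANGE)
           for name in SCHEMES}
for name, (j_val, sm, rmsh) in results.items():
    print(name, round(j_val, 5), round(sm, 3), round(rmsh, 3))
```

In this toy setup a single channel gives one equation for two unknowns, so many (soil moisture, RMSH) pairs reach a similar cost, while the dual-channel case recovers the synthetic truth to within the grid spacing.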

Implementation of the Retrieval Algorithm
When applying the SCE-UA, the ranges and probability distributions of soil moisture and RMSH are required. Determining the ranges of the variables to be estimated is a challenging, but critically important task [84], as the ranges influence the sensitivity of the variable to the SAR backscatters [18] and may affect whether the globally optimal value can be found [14,85]. The soil moisture and RMSH are usually physically measurable [18,86], which makes it easier to establish their ranges. For example, the valid soil moisture range is theoretically between zero and the porosity of the soil. Furthermore, the actual range of soil moisture can be narrowed by past investigations using real soil samples. For a wet soil sample, such as one that is recently irrigated, most soil moisture values will fall into the upper range of the distribution (i.e., closer to saturation), while for a sample under prolonged drying, it may fall into the first quartile (i.e., closer to wilting point or zero). The determination of the soil moisture range in the present retrievals is based on the long-term time-series of COSMOS measurements (which show a range of approximately 0.143-0.462 m³/m³).
As is often the case, direct measurements of RMSH were not available at any of the retrieval sites. In addition to estimating the valid range of RMSH with physical and empirical backscattering models, such as the Advanced Integral Equation Model (AIEM) [19] and Oh model [35], we investigate ranges of RMSH measured in previous experiments in other regions, such as SMAPVEX12 [87] and SMAPVEX16-MB [88]. This study empirically determines the RMSH range by considering its physical bounds and using data from the SMAPVEX16-MB dataset (about 0.196-1.04 cm) for the main retrieval procedure. The distributions of soil moisture and RMSH are assumed to be uniform, following several previous efforts [14,89]. Notably, the investigated ranges of soil moisture and RMSH are used as the baseline for determining their initial ranges in the SCE-UA (see details in the first row of Table 5 in Section 3.4), and the impact of the predefined soil moisture and RMSH ranges on the soil moisture retrieval algorithm is further analyzed in Section 3.4.

Averaging Strategies for Determining the Soil Moisture
To match with the scales of the COSMOS measurements and retrievals, two strategies are considered in this study: (1) the backscatter (Sentinel-1) and vegetation index (Sentinel-2) pixels that fall within the COSMOS footprint are averaged and then used to calculate the soil moisture, which is then compared with the COSMOS measurements (i.e., average-then-calculate strategy, hereinafter); and (2) the Sentinel pixels that fall within the COSMOS footprint are used to estimate soil moisture at the pixel scale, and are then averaged for comparison with the COSMOS measurements (i.e., calculate-then-average strategy, hereinafter). Both approaches have their positives and drawbacks. The average-then-calculate strategy is simple to implement because the iterative calculation is not required for each pixel, but only for the averaged values. It may also act to reduce random errors (especially the SAR speckle noise) and uncertainties caused by surface heterogeneity. However, it inevitably loses the details of surface characteristics within the backscatter and NDVI/NDWI signals after averaging, and thus fails to take advantage of the high spatial resolution of the Sentinel images. The calculate-then-average strategy may preserve the spatial details but is very time-consuming when iterative computing is performed pixel-by-pixel. Considering the computational efficiency and also the possibility of reducing SAR speckle noise, the average-then-calculate strategy is utilized as the baseline strategy for analyzing the impacts of various factors in soil moisture retrieval (see details from Section 3.2 to Section 3.5). The two strategies are compared and analyzed in detail in Section 3.5.
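The difference between the two strategies can be sketched with a generic retrieval function. Here `toy_retrieve` is a hypothetical nonlinear placeholder for the iterative inversion, the pixel values are invented, and averaging the backscatter in dB is an assumption, since the text does not state whether averaging is done in dB or linear units.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def average_then_calculate(backscatter, vwc, retrieve):
    """Average the inputs over the footprint, then run a single retrieval."""
    return retrieve(mean(backscatter), mean(vwc))


def calculate_then_average(backscatter, vwc, retrieve):
    """Run one retrieval per pixel, then average the retrieved soil moisture."""
    return mean([retrieve(b, v) for b, v in zip(backscatter, vwc)])


def toy_retrieve(sigma_db, vwc):
    """Hypothetical nonlinear retrieval, NOT the inversion used in the study."""
    return 0.1 + 0.002 * (sigma_db + 20.0) ** 2 + 0.01 * vwc


# Invented pixel values inside one footprint, for illustration only.
backscatter_db = [-14.0, -10.0, -12.0]
vwc = [1.0, 2.0, 3.0]

sm_avg_first = average_then_calculate(backscatter_db, vwc, toy_retrieve)
sm_calc_first = calculate_then_average(backscatter_db, vwc, toy_retrieve)
print(round(sm_avg_first, 4), round(sm_calc_first, 4))
```

The two results differ whenever the retrieval is nonlinear in its inputs, which is why the choice of strategy can matter over heterogeneous footprints; the first strategy needs one retrieval per date, the second needs one per pixel.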
The spatial variability of the retrievals within the COSMOS footprint on a specific acquisition date is represented by two metrics. The first is the root mean standard deviation (RMSD) of all the pixel retrievals, defined as
$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}$$
with N = 1600 the total number of pixels, $X_i$ the ith pixel-scale retrieval and $\bar{X}$ the mean value of all pixel-scale retrievals. The second is the root mean squared error at the pixel scale (RMSEp), defined as
$$\mathrm{RMSEp} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - X_{obs}\right)^2}$$
with $X_{obs}$ the COSMOS measurement and the other symbols defined as for RMSD. The RMSD is centered on the mean value of the retrievals, while the RMSEp is centered on the COSMOS measurement. The RMSEp also differs from the RMSE calculated at the COSMOS footprint scale, defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{D}\sum_{d=1}^{D}\left(\bar{X}_d - X_{d,obs}\right)^2}$$
with $\bar{X}_d$ and $X_{d,obs}$ the mean value of the retrievals within the COSMOS footprint and the COSMOS measurement on the dth acquisition day, respectively, and D the total number of acquisition days. RMSEp reflects the error of the pixel-scale retrievals against the COSMOS measurements, while RMSE reflects the error of the retrievals at the COSMOS footprint scale.
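The three quantities can be written as short functions. The function names and the sample values are hypothetical; the definitions follow the description above (RMSD centered on the pixel mean, RMSEp centered on the COSMOS measurement, footprint RMSE computed over acquisition days).

```python
import math


def rmsd(pixels):
    """Spread of pixel-scale retrievals around their own mean."""
    mean_x = sum(pixels) / len(pixels)
    return math.sqrt(sum((x - mean_x) ** 2 for x in pixels) / len(pixels))


def rmse_pixel(pixels, cosmos_obs):
    """Error of pixel-scale retrievals against one COSMOS measurement."""
    return math.sqrt(sum((x - cosmos_obs) ** 2 for x in pixels) / len(pixels))


def rmse_footprint(daily_means, daily_obs):
    """Error of footprint-mean retrievals against COSMOS over all days."""
    n_days = len(daily_obs)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(daily_means, daily_obs)) / n_days)


# Invented values for one acquisition date, for illustration only.
pixels = [0.20, 0.30, 0.40]
cosmos = 0.25
spread = rmsd(pixels)
pixel_error = rmse_pixel(pixels, cosmos)
footprint_error = rmse_footprint([0.30, 0.28], [0.25, 0.30])
print(round(spread, 4), round(pixel_error, 4), round(footprint_error, 4))
```

For a single date the squared RMSEp equals the squared RMSD plus the squared difference between the pixel mean and the COSMOS measurement, which makes explicit how the two metrics relate.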

Evaluating Response of Sentinel-1 to Surface Parameters
The response or sensitivity of Sentinel-1 backscatters (especially the VH-polarized backscatter) to soil moisture remains unclear. For this reason, a simple regression analysis is first conducted to examine the response of Sentinel-1 at co- and cross-polarization to the variation of surface parameters, including soil moisture, RMSH and VWC, to ensure that the parameters are retrievable from Sentinel-1 observations. The analysis is conducted both on backscattering model simulations and Sentinel-1 observations. First, a Markov Chain Monte Carlo sampling strategy is utilized to generate a set of 1000 samples of soil moisture, RMSH and VWC. The parameters are uniformly distributed within their physical ranges of the forward model inputs to ensure that various surface conditions (in terms of roughness and vegetation conditions) are taken into account. The coupled Oh-WCM model is applied to produce the corresponding simulations and to allow the response of simulated backscatter to soil moisture to be observed. In the coupled Oh-WCM model, the values of parameters A, B and α are set to the "all land uses" values in Bindlish and Barros [38]. In parallel, the COSMOS soil moisture and corresponding Sentinel-1 backscatters are used to analyze the observed response. The ranges of observed RMSH and VWC are determined from the experimental investigation of SMAPVEX16-MB. By comparing the observed and simulated databases, we can assess the feasibility of retrieving soil moisture from Sentinel-1 and the capability of the selected forward model.
The responses of backscatter to the surface parameters are presented in Figure 4. Both simulated and observed datasets show that VV- and VH-polarized backscatters increase with increasing soil moisture (Figure 4a). For the observed data, the soil moisture is more strongly correlated to VV- (R² = 0.41) than to VH-polarized backscatter (R² = 0.18), with the observed VV-SM (representing VV-polarized backscatter against soil moisture) relationship similar in function to the simulated VV-SM. The observed VH-polarized backscatters show a larger range, with many data points larger than the simulated backscatters under the same soil moisture values. Thus, larger uncertainty and overestimation may be introduced into soil moisture retrieval if using the VH-polarized backscatter. The simulated dataset shows that the backscatters increase with increasing RMSH (Figure 4b). This behavior has been recognized in many previous studies [34,35,90,91], and thus we do not discuss it further. Apart from demonstrating that both VV- and VH-polarized backscatters are sensitive to RMSH [18], it also implies that the RMSH should be carefully estimated prior to, or simultaneously with, soil moisture retrievals (as is done here). The simulated VV/VH-VWC (representing VV- or VH-polarized backscatter against vegetation water content) relations (Figure 4c) show that the VV-polarized backscatter decreases slightly as VWC increases, but that the VH-polarized backscatter shows a non-monotonic trend with increasing VWC. As shown in Figure 4d, with an increasing NDVI, a very weak increasing trend is observed in both VV- and VH-polarized backscatter.

Influence of Vegetation Water Content Index on Soil Moisture Retrieval
Based on the vegetation indices described in Table 2, here we explore the influence of different VWC values on soil moisture retrieval. Prior to performing this analysis, a comparison among the three different schemes was undertaken, with SCA-VV performing best for soil moisture estimation. Thus, the SCA-VV scheme is explored in this particular section, using data from the four different COSMOS sites to compare soil moisture retrievals. The impacts of VWC schemes on soil moisture retrievals under the two other schemes (SCA-VH and DCA-VVVH) were the same as those under the SCA-VV scheme. Details on the comparison among the three schemes are examined further in Section 3.5. Table 3 shows the impacts of VWC schemes, illustrating the significant variability in error metrics (especially R² and RMSE) that results from use of the different formulations, and even from using different bands within the same ratio. Although all the selected vegetation indices present similar performance in VWC estimation (see R² in Table 2), they show different performance in soil moisture retrievals. First, it can be observed that the NDWI-based VWC estimates result in higher correlations between estimated and observed soil moisture than the NDVI-based schemes do. For other metrics, there are marginal differences (although occasionally NDVI schemes actually perform better than their comparable NDWI indices). Overall, based on R², MAE, RMSE and ubRMSE, the NDWI_865-1614 derived VWC consistently leads to the best soil moisture retrievals. This observation suggests that the combination of Sentinel-2 bands 8A and 11 can provide an optimal estimation of VWC for an improved soil moisture retrieval.
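The four error metrics named above can be computed as in the following sketch. The definitions used here are the conventional ones and are assumptions, since the text does not define them: R² is taken as the squared Pearson correlation, and ubRMSE as the RMSE after removing the mean bias. The sample values are invented.

```python
import math


def evaluation_metrics(retrieved, observed):
    """Return bias, MAE, RMSE, ubRMSE and R2 for paired soil moisture values."""
    n = len(observed)
    diffs = [r - o for r, o in zip(retrieved, observed)]
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Unbiased RMSE: RMSE with the mean bias removed (guarded against round-off).
    ubrmse = math.sqrt(max(rmse ** 2 - bias ** 2, 0.0))
    mean_r = sum(retrieved) / n
    mean_o = sum(observed) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(retrieved, observed))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r2 = cov ** 2 / (var_r * var_o)  # squared Pearson correlation (assumed definition)
    return {"bias": bias, "MAE": mae, "RMSE": rmse, "ubRMSE": ubrmse, "R2": r2}


# Invented values: retrievals that are perfectly correlated but biased by 0.02.
observed = [0.20, 0.25, 0.30, 0.35]
retrieved = [o + 0.02 for o in observed]
metrics = evaluation_metrics(retrieved, observed)
for name, value in metrics.items():
    print(name, round(value, 4))
```

The example shows why the metrics can rank schemes differently: a constant offset leaves R² and ubRMSE unaffected while MAE and RMSE both equal the offset.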

Calibration of Water Cloud Model
As demonstrated in Section 2.3.4, parameters A and B in the water cloud model (WCM) have a significant impact on the backscatter (especially on VH-polarized backscatter). To test the impact of the parameter values on soil moisture retrieval, we adopt the values given for different land uses in Bindlish and Barros [38]. The resulting error metrics of the soil moisture retrievals are listed in Table 4. Significant differences are observed across the sites due to the differences in the WCM parameter values. Overall, the "all land uses" values perform best in soil moisture retrievals. Indeed, most of the COSMOS footprints cover more than a single land cover type, so the values of "single crops" result in larger errors in soil moisture retrievals. It is worth noting that, as our retrievals span 2-3 years, crop types alternated during the study period. For example, COSMOS090 was planted with maize in 2016 and 2018, and with soybean in 2017 and 2019 according to the Cropland Data Layer data (https://nassgeodata.gmu.edu/CropScape/). Thus, a larger error may be obtained in soil moisture retrieval if we only employ the parameter values of a single land use. Given these results, the "all land uses" parameter values are used in subsequent analyses. To ensure that the selected parameter values are suitable for soil moisture retrieval, a calibration procedure is performed. To do this, soil moisture, surface roughness and vegetation water content are required to drive the model and construct an objective function, given that backscatter is sensitive to these parameters [18]. This kind of calibration procedure usually consists of two phases: one for calibration and one for validation. In the calibration phase, a certain percentage (e.g., 25%) of the ground measurements (i.e., soil moisture, surface roughness and VWC) and SAR observations are used to identify the optimal parameters (i.e., A, B and α).
In the validation phase, the identified parameters are used to retrieve soil moisture and surface roughness using the remaining observation data, while the ground measurements are used to validate the retrievals. As no surface roughness measurements are available in our study area, we perform an iterative procedure to search for optimal parameter values. Specifically, we take the initial parameter values from Table 4 of Bindlish and Barros (2001) [38], which cover a wide range, i.e., 0.0009 ≤ A ≤ 0.0018, 0.032 ≤ B ≤ 0.138 and 1.26 ≤ α ≤ 10.6. Based on these ranges, we iteratively run the SCA-VV and SCA-VH soil moisture retrieval procedures with steps of 0.0002 for A, 0.0008 for B and 0.4 for α. The calibration presented here retrieves soil moisture and roughness under these different parameter values, and the resulting soil moisture retrievals are evaluated against the soil moisture measurements using the error metrics (R², RMSE). Given the relatively limited sample size of the available observation data, we use 75% of the soil moisture measurements for the calibration experiments, which are randomly selected from the COSMOS029 and COSMOS099 sites (along with the corresponding Sentinel-1 observations).
We find that under fixed A and B values and α ranging from 1.26 through 10.6, the error metrics (R², RMSE) do not change, which is consistent with the results shown in Figure 3c. Thus, we focus on searching for the optimal values of A and B for our soil moisture retrieval. Figure 5a,b show the R² and RMSE of soil moisture retrievals based on the SCA-VV scheme using data from the COSMOS029 site, while Figure 5c,d show those from the COSMOS099 site. Both R² and RMSE of the VV-polarized backscatter derived soil moisture retrievals vary dramatically with changes in parameter B, but change minimally when parameter A varies from 0.0009 to 0.0018. These observations are consistent with those from Figure 3a,b. The optimal parameter values leading to soil moisture retrievals with the highest R² (≥0.47) and smallest RMSE (≤0.065 m³/m³) differ between the two sites. For example, the highest R² (>0.47) at the COSMOS029 site is obtained with a B value between 0.04 and 0.06, while at the COSMOS099 site, B values of 0.09, 0.05 or 0.039 can produce the highest R² (0.44). Furthermore, the highest R² and the lowest RMSE can be obtained with different parameter values: at COSMOS029, the highest R² (0.47) is obtained with B between 0.04 and 0.06, while the lowest RMSE (0.077 m³/m³) is obtained with a B value larger than 0.13. Even with the relatively small parameter space explored here, we see evidence of the equifinality principle at play [92], highlighting the challenge of identifying any single set of optimum parameters. We do note that at the COSMOS099 site, B = 0.09 produces the highest R² (0.44) and lowest RMSE (0.065 m³/m³), indicating that the value of B used in the study (B = 0.091) is close to optimal.
However, in contrast to the results obtained with the SCA-VV scheme, the SCA-VH scheme shows that soil moisture retrieval is slightly sensitive to parameter A, although A has a smaller impact on soil moisture retrievals than B. This observation is consistent with that from Figure 3a.
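The parameter sweep described above can be sketched as a grid search over A and B. The code below is an illustration under stated assumptions, not the study's implementation: `stepped_range`, `sweep` and `toy_score` are hypothetical names, and `toy_score` is a made-up response surface standing in for a full retrieval run that would return the R² and RMSE obtained with a given parameter pair.

```python
import itertools
import math

def stepped_range(lo, hi, step):
    """Values lo, lo+step, ... that do not exceed hi (robust to float rounding)."""
    n = int(math.floor((hi - lo) / step + 1e-9)) + 1
    return [lo + i * step for i in range(n)]

def sweep(a_values, b_values, score):
    """Evaluate score(a, b) -> (r2, rmse) on every (A, B) combination."""
    return [(a, b, *score(a, b)) for a, b in itertools.product(a_values, b_values)]

def toy_score(a, b):
    # Made-up surface: R2 peaks near B = 0.05 while RMSE keeps falling with B,
    # and neither depends on A. Replace with an actual retrieval and evaluation.
    r2 = 0.47 - 40.0 * (b - 0.05) ** 2
    rmse = 0.10 - 0.2 * b
    return r2, rmse

a_values = stepped_range(0.0009, 0.0018, 0.0002)  # ranges and steps from the text
b_values = stepped_range(0.032, 0.138, 0.0008)
results = sweep(a_values, b_values, toy_score)

best_r2 = max(results, key=lambda row: row[2])    # row = (A, B, R2, RMSE)
best_rmse = min(results, key=lambda row: row[3])
print(len(results), best_r2[1], best_rmse[1])
```

With this toy surface the best R² and the best RMSE fall at different B values, which mirrors the equifinality issue noted above: ranking the same sweep by different metrics can point to different parameter sets.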

Impact of Ranges of Soil Moisture and Roughness
As demonstrated in Section 2.3.6, initial ranges of soil moisture and RMSH are needed for driving the SCE-UA algorithm, and they can have a significant influence on soil moisture retrieval. Here we test their impacts on the retrievals. The ranges of soil moisture and RMSH are first set to 0.15-0.45 m³/m³ and 0.25-0.85 cm, respectively, based on the experimentally investigated ranges (as described in Section 2.3.6). As can be seen in Table 5, soil moisture retrieval accuracy changes with small changes in the ranges of soil moisture and RMSH. Specifically, when the soil moisture range is narrowed by 0.1 m³/m³ (minimum increased by 0.05 m³/m³ and maximum decreased by 0.05 m³/m³) and the RMSH range is narrowed by 0.1 cm (minimum increased by 0.05 cm and maximum decreased by 0.05 cm), smaller RMSE and MAE of soil moisture retrievals are observed, but the R² decreases (from 0.597 to 0.496). In addition, when the ranges of soil moisture (from 0.15-0.45 to 0.05-0.50 m³/m³) and RMSH (from 0.25-0.85 to 0.05-1.05 cm) are expanded, the R² decreases considerably from 0.597 to 0.354, the MAE increases from 0.053 to 0.1 m³/m³ and the RMSE increases from 0.065 to 0.115 m³/m³. These observations indicate that the optimization process and soil moisture retrieval accuracy are influenced by the ranges of soil moisture and surface roughness. Notably, the ranges listed in the second and third rows of Table 5 are artificially preset. In an actual agricultural environment, the soil moisture and roughness may vary within different ranges, and the soil moisture retrieval accuracy may change if the previous ranges are used. Thus, carefully investigating these ranges is important for improving soil moisture retrievals.

Table 5. Influence of initial range on soil moisture retrieval accuracy. The retrievals are based on the SCA-VV scheme, with VWC estimated using NDWI_865-1614 and parameter values for "all land uses" employed (i.e., the best performing scenario). SM denotes soil moisture.

Based on the above analyses regarding the impacts of various factors on soil moisture retrievals, this section systematically examines soil moisture retrievals under the three schemes (SCA-VV, SCA-VH and DCA-VVVH), while also exploring the two different strategies (average-then-calculate and calculate-then-average). In this application, the VWC is determined using NDWI_865-1614, as this produced the best result in the analysis presented in Section 3.2. The retrievals are evaluated across the four COSMOS soil moisture measurement sites.

Results for the average-then-calculate strategy are presented in Figure 7. Generally, the soil moisture retrievals from all three schemes show reasonable correlations with the ground measurements, with R² ranging from 0.331 to 0.655 across all sites. Both the soil moisture retrievals and the ground measurements are obtained from a 2-3-year-long time series, representing different seasons and various surface conditions in terms of vegetation types, surface roughness and agronomic events. As such, significant variation in the root mean square error (RMSE) is to be expected, with values ranging from 0.065 m³/m³ to 0.132 m³/m³ across the sites. Quite distinct error metrics are observed for individual sites, depending on the retrieval scheme employed.
As shown in the left panel (A) of Figure 7, the soil moisture retrieved by the SCA-VV scheme at the four sites has R² values between 0.47 and 0.655, with RMSE (ubRMSE) less than 0.078 (0.076) m³/m³. COSMOS099 and COSMOS102 have the highest R² and smallest RMSEs, while COSMOS029 and COSMOS090 show a larger RMSE and smaller R². These differences may be caused by surface heterogeneity. COSMOS029 is located in cropland where soybean and maize alternate, and the different crop types may influence the accuracy of the retrievals. COSMOS090 is located in a quarter sector of a center pivot that is mainly planted with maize and soybean in yearly alternation. Surface heterogeneity inside and outside the sector, together with variability in irrigation, may also lead to errors in the retrievals. The COSMOS099 site is also covered by interannually alternating soybean and maize, while COSMOS102 is planted with grass. Grass has a smaller influence on SAR penetration than maize, and hence yields better soil moisture estimates. The retrievals over COSMOS090 and COSMOS102 slightly overestimate the ground measurements, with biases of 0.025 m³/m³ and 0.039 m³/m³, respectively.
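The error metrics quoted in this section can be computed as in the following sketch. It assumes the conventional definitions (bias as the mean of retrieval minus observation, ubRMSE as the RMSE after removing the bias, and R² as the squared Pearson correlation); the paper's own definitions, given elsewhere, take precedence if they differ. The function name and sample values are illustrative.

```python
import math

def error_metrics(retrieved, observed):
    """Return bias, MAE, RMSE, ubRMSE and R2 for paired soil moisture series."""
    if len(retrieved) != len(observed) or len(observed) < 2:
        raise ValueError("need two equally long series with at least two values")
    n = len(observed)
    diffs = [r - o for r, o in zip(retrieved, observed)]
    bias = sum(diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Guard against a tiny negative value caused by floating-point rounding.
    ubrmse = math.sqrt(max(rmse ** 2 - bias ** 2, 0.0))

    mean_r = sum(retrieved) / n
    mean_o = sum(observed) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(retrieved, observed))
    var_r = sum((r - mean_r) ** 2 for r in retrieved)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    r2 = cov ** 2 / (var_r * var_o) if var_r > 0 and var_o > 0 else float("nan")
    return {"bias": bias, "mae": mae, "rmse": rmse, "ubrmse": ubrmse, "r2": r2}

# Made-up values in m3/m3: every retrieval sits 0.05 above its observation.
metrics = error_metrics([0.20, 0.30, 0.40], [0.15, 0.25, 0.35])
print({name: round(value, 3) for name, value in metrics.items()})
```

In this constructed case the error is a pure offset, so the RMSE equals the bias and the ubRMSE is essentially zero, which is why a positive bias and a small ubRMSE can coexist at a site.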
Relative to the SCA-VV retrievals (A: left panel of Figure 7), the SCA-VH retrievals present larger errors and more regular overestimation (B: middle panel of Figure 7). In this case, the R² at the four COSMOS sites ranges from 0.327 to 0.412, with RMSEs between 0.084 and 0.132 m³/m³. As can be seen in Figure 4, the SAR-observed VH-polarized backscatter is clearly larger than the simulated VH-polarized backscatter, and the correlation between the observed VH-polarized backscatter and soil moisture is poor, with an R² of 0.18. This discrepancy between SAR observations and model simulations (observations greater than simulations) results in overestimated soil moisture due to the positive correlation between backscatter and soil moisture. Thus, the mismatch between the COSMOS-observed and Sentinel-retrieved soil moisture is likely driven by the limited capacity of the VH-polarized backscatter to sense soil moisture under densely vegetated surfaces. The evaluation of the soil moisture retrievals derived from the DCA-VVVH scheme is shown in the right panel (C) of Figure 7. Based on R² and RMSE, the retrieval accuracy of the DCA-VVVH scheme is better than that of SCA-VH, but worse than that of SCA-VV. Taking the results from the three retrieval schemes together, we find a common feature: the biases for most sites are larger than zero and more points lie above the 1:1 line than below it, meaning that all three schemes overestimate the soil moisture to a certain degree. In addition to the possible reason mentioned above (i.e., SAR observations being larger than model simulations at the same soil moisture values, as seen in Figure 4a), the heterogeneity of the surface within the COSMOS footprint and the difference between the averaging methods of the COSMOS measurements and the remote sensing data (backscatter and NDWI) may also be factors to consider.
The COSMOS footprint covers a circular domain with a radius of around 200 m, within which the soil moisture is heterogeneously distributed. However, this heterogeneity cannot be captured by COSMOS, nor by the area-averaged backscatter and NDWI. Furthermore, the difference in averaging methods yields different area-averaged soil moisture values. The COSMOS soil moisture is an inverse distance weighted areal value. According to the findings of Kohli et al. [69] and Schron et al. [93], the area within a radius of less than 50 m contributes more than the area beyond. However, the backscatter and NDWI within the COSMOS footprint are averaged with equal weight before they are used for soil moisture retrieval.
The calculate-then-average strategy provides another perspective on soil moisture estimation. The benefit of this strategy is that it can more accurately quantify the spatial variability and the error of each pixel against the COSMOS measurements. These are illustrated using vertical (RMSD) and horizontal (RMSEp) error bars in Figure 8. Here the SCA-VV scheme is examined, based on the previous comparison among the three schemes. As can be seen in Figure 8, the retrievals from the calculate-then-average strategy show significantly improved accuracies, with MAE less than 0.049 m³/m³, RMSE less than 0.062 m³/m³ and ubRMSE less than 0.048 m³/m³ at all sites. However, a comparison of R² values indicates that this strategy results in marginally smaller values than the average-then-calculate strategy, except at COSMOS029 (0.508 compared to 0.472 in Figure 7).
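The difference between the two aggregation strategies can be illustrated with a toy example. In this sketch, `toy_retrieve` is a made-up, bounded stand-in for the actual inversion, and `spatial_spread` and `pixel_rmse` are assumed readings of the spatial variability and per-pixel error described above; the exact definitions of RMSD and RMSEp are given elsewhere in the paper.

```python
import math

def toy_retrieve(backscatter_db, sm_min=0.15, sm_max=0.45):
    # Made-up linear relation clipped to a preset range; not the paper's model.
    sm = (backscatter_db + 18.0) / 20.0
    return min(max(sm, sm_min), sm_max)

def average_then_calculate(pixels, retrieve):
    """Average the footprint observations first, then retrieve once."""
    # Values are averaged as supplied; the choice between dB and linear
    # units for averaging is left to the caller.
    return retrieve(sum(pixels) / len(pixels))

def calculate_then_average(pixels, retrieve, reference_sm):
    """Retrieve per pixel, then average; also report spread and pixel error."""
    sm = [retrieve(p) for p in pixels]
    mean_sm = sum(sm) / len(sm)
    spatial_spread = math.sqrt(sum((s - mean_sm) ** 2 for s in sm) / len(sm))
    pixel_rmse = math.sqrt(sum((s - reference_sm) ** 2 for s in sm) / len(sm))
    return mean_sm, spatial_spread, pixel_rmse

pixels_db = [-14.0, -10.0, -4.0]   # made-up pixel backscatter values (dB)
atc = average_then_calculate(pixels_db, toy_retrieve)
cta, spread, rmse_p = calculate_then_average(pixels_db, toy_retrieve, reference_sm=0.33)
print(round(atc, 3), round(cta, 3))  # 0.433 0.35
```

Because the retrieval is nonlinear (here only through the clipping), the two strategies give different footprint values from the same pixels, and only the per-pixel route exposes the within-footprint spread.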

Soil Moisture Retrieval at the Point Scale
Given that the soil moisture measurements at the Rogers Farm #1 Nebraskan SCAN site are only representative of the point scale, the original 10-m resolution backscatter values corresponding to this location are used to derive soil moisture. As shown in Figure 9, the retrievals present good agreement with the ground measurements, with an R² of 0.597 and RMSE of 0.069 m³/m³. In contrast to the evaluations against the COSMOS measurements, the retrievals tend to underestimate soil moisture relative to the measurements at the Rogers Farm #1 Nebraskan SCAN site. This discrepancy may be attributed to the difference in the horizontal and vertical footprints of the retrievals and measurements. The HydraProbe is installed inside an irrigated crop field at a fixed depth of 5 cm and provides soil moisture that represents the point-scale value, while the remote sensing retrieval represents an area average of 10 m × 10 m at the superficial surface layer, where dry surface conditions from edge effects may be included. The COSMOS measurements, in turn, represent a dynamic effective measurement depth that depends on the moisture status of the soil [94], further complicating a diagnosis of the under- versus overestimation behavior.

Discussion
The present work provides a general soil moisture retrieval framework that can physically interpret the backscatter from both the vegetation canopy and soil surface and does not explicitly require any in-situ soil moisture measurements to calibrate the algorithm. To this end, a soil moisture retrieval algorithm based on optimization and using combined Sentinel-1 and Sentinel-2 data is proposed and evaluated against multiscale ground measurements. Overall, the retrieval accuracy of soil moisture from Sentinel-1 varies with the channels used, the aggregation approach and the ground conditions of the evaluation data. Such outcomes have been found in related efforts. For example, Bai et al. [59] obtained soil moisture based on an iterative algorithm over the Tibetan Plateau with a correlation coefficient (R) of 0.6 and RMSE of 0.073 m³/m³ for ascending data, and an R of 0.8 and RMSE of 0.055 m³/m³ for descending data. Amazirh et al. [56] obtained soil moisture retrievals with an RMSD of 0.03 m³/m³ from combined Sentinel-1 and Landsat 7/8 and with an RMSD of 0.16 m³/m³ from Sentinel-1 VV-polarized backscatter only, while Bauer-Marschallinger et al. [23] found that a change detection based approach resulted in soil moisture with variable error metrics at different locations: for example, an R of 0.49 and RMSD of 0.032 m³/m³ at a COSMOS station in Emilia-Romagna and an R of 0.11 and RMSD of 0.085 m³/m³ at the Torre Dell'Olmo station (see details in Bauer-Marschallinger et al. [23]). These results fall broadly in line with those presented herein, where R² ranged from 0.472 to 0.655 and RMSE from 0.039 m³/m³ to 0.078 m³/m³ under the SCA-VV scheme. Recent efforts exploring machine learning based soil moisture retrieval, including the work of Attarzadeh et al. [58] and Holtgrave et al. [95], are data-driven approaches that do not explicitly interpret the backscattering process and that need ground measurements to train the model. Most importantly, the machine learning based approaches do not estimate multiple variables simultaneously, whereas the approach presented in this study estimates both soil moisture and surface roughness while accounting for vegetation canopy backscattering.
The evaluation of the proposed retrieval technique is based upon high-resolution SAR images and is undertaken at scales on the order of 100 m and 10 m. One of the reasons for undertaking high-resolution retrievals is that they can provide insight into soil moisture dynamics and behavior at the within-field scale. However, this scale also presents complexities, and retrieving accurate soil moisture values at it has proven difficult, as demonstrated by Bauer-Marschallinger et al. [23]. For example, speckle noise [96] is an intrinsic property of SAR images that cannot be completely eliminated, although a filter has been employed to reduce its impact. While detailing the impact of SAR image de-noising is beyond the scope of this study, directing future efforts towards de-noising strategies is likely to result in improved soil moisture retrievals.
In the following paragraphs, the main sources of error in the soil moisture retrievals are discussed, with the aim not only of revealing the impacts of these factors on retrieval accuracy, but also of providing suggestions for SAR-based soil moisture estimation and future versions of cost-function based retrieval algorithms. First, consistent with many previous findings that Sentinel-1 VV-polarized backscatter can be used to derive soil moisture, the presented SCA-VV scheme results in acceptable retrievals at both the COSMOS footprint and Rogers Farm #1 SCAN site point scales. The SCA-VH scheme generally performs worst under the same evaluation strategy, while the DCA-VVVH based estimates lie between those of SCA-VV and SCA-VH. The difference in the accuracy of these three schemes is mainly attributed to the polarization effect of backscattering. The VH polarization is more sensitive to volume scattering and may contain double scattering and interactions between soil and canopy [97,98], while the VV polarization is sensitive to both surface and volume scattering [99] and is dominated by the direct contribution from the ground. These results are borne out in our analysis of the relations of observed backscatter with soil moisture and NDVI. As such, we recommend the VV-polarized backscatter for soil moisture retrieval over the other polarization combinations. Additionally, the discrepancy between model simulations and SAR observations [90] is also observed in our work, and is particularly obvious for VH-polarized backscatter (see Figure 4a); this could also influence the soil moisture retrievals and should be carefully considered.
Second, in order to estimate VWC for different crops, various empirical approaches have been proposed and applied [41,73]. Following an extensive collection of past studies [40], we explore the development of empirical relations between ground-measured VWC and Sentinel-2 based NDVIs and NDWIs. One of the innovative characteristics of the Sentinel-2 MSI is that it has two NIR bands at 833 and 865 nm and two SWIR bands at 1614 and 2202 nm, which provide an opportunity to monitor the status of green crops, and also to construct more specific NDVIs and NDWIs. Overall, our results indicate that the vegetation/water index constructed with the bands at 865 nm and 1614 nm performs best for soil moisture retrieval. However, consistent with previous studies [41,100,101], we find that both NDVIs and NDWIs may become saturated when the vegetation is dense. As such, the vegetation/water indices become less sensitive to VWC at large values, leading to underestimation of VWC and ultimately influencing the accuracy of the soil moisture retrieval. Although it is not explored here, the use of a radar vegetation index (RVI) for VWC estimation has been positively reported in several works, such as Huang et al. [102] and Kim et al. [103], and may provide an alternative for vegetation canopy transmissivity calculation and subsequent soil moisture retrieval.
The empirical parameter values in the WCM have a varying influence on the estimation of soil moisture, highlighting that selecting correct parameter values is a key step in the retrieval process [32,104]. Interestingly, previous studies, such as the work of Paloscia et al. (2013) and Bai et al. (2017), obtained different parameter values at different geographic locations for Sentinel-1-based assessments. Indeed, we retrieved soil moisture using the calibrated values for VV-polarization of Paloscia et al. (2013), which showed poorer retrieval performance, with an R² < 0.23 and RMSE > 0.093 m³/m³. In practice, if retrievals are performed on a heterogeneous surface or using a time series data set in which the land uses change, it is difficult to determine the values of the parameters. Calibration is also challenging because of the difficulty in determining the parameter ranges. Under this condition, using generalized values, such as the "all land uses" values of Bindlish and Barros (2001), is a useful alternative, as demonstrated herein. Our calibration experiments demonstrate that the "optimal" parameter values are unlikely to be a single parameter combination. Rather, there are many parameter combinations that can produce equally good statistical responses, all depending on the frequency, polarization and surface conditions of the sites being studied. Of course, for the purposes of model application, parameter values still need to be identified, and the parameter values used in this study are among the optimally calibrated values. However, site-specific calibration is usually just that: specific to a particular location and unlikely to be applicable more generally to other locations. Given the findings from previous studies and the observations from our calibration assessment, we suggest that the generalized parameter values can be applied, particularly when time series data are used at a heterogeneous site.
A comparative calibration (as done in this study), in which soil moisture retrievals are examined under different parameter values within larger ranges, is necessary to ensure that the applied parameter values are among the optimal values.
The initial ranges of soil moisture and RMSH also have a direct influence on the soil moisture retrieval accuracy. Section 3.4 demonstrated this influence; here we explore why and how the initial ranges influence the retrievals, and provide guidance on determining the initial ranges to improve soil moisture retrieval. We have observed from Figure 4a that some of the SAR-observed backscatters are larger than the model-simulated backscatters at a given soil moisture value. In those cases, the cost function of the retrieval algorithm can hardly reach its global minimum. Under this condition, the soil moisture retrievals are inevitably overestimated because backscatter is positively correlated with soil moisture. However, we have predetermined the initial range (0.15-0.45 m³/m³) of soil moisture in the retrieval algorithm, which forces the retrievals to be smaller than or equal to the upper boundary value. This is one of the reasons why the soil moisture retrievals tend to asymptote at 0.45 m³/m³ in Figure 7. Similarly, SAR observations that are smaller than the model simulations may lead to underestimated soil moisture, but the underestimates cannot fall below the lower boundary of the initial range. Thus, the initial ranges influence the soil moisture retrieval by constraining abnormally over- and underestimated retrievals to a preset range. In our case, the range of 0.15-0.45 m³/m³ proved to be better than the range of 0.20-0.40 m³/m³ in R² and better than the range of 0.05-0.5 m³/m³ in both R² and RMSE of soil moisture retrievals (Table 5). The range of 0.15-0.45 m³/m³ was determined from actual observations, which contributes to the more accurate soil moisture retrievals.
Relative to the range of 0.15-0.45 m³/m³, the range of 0.05-0.5 m³/m³ relaxes the constraints on the over- and/or underestimations, leading to larger errors in soil moisture retrieval, with R² decreasing and RMSE increasing considerably, while the range of 0.20-0.40 m³/m³ over-constrains the retrievals: although a decreased RMSE is achieved, R² decreases simultaneously. Thus, establishing reasonable ranges of soil moisture and RMSH requires careful consideration, especially for cost function-based algorithms. We suggest that the initial ranges be preset according to the actual conditions of the study area, which requires an investigation of past records.
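The boundary effect described above can be reproduced with a one-dimensional toy inversion. This sketch is not the study's algorithm: a brute-force grid search stands in for SCE-UA, only soil moisture is inverted (roughness is ignored), and `simulate_vv_db` is a made-up linear forward model.

```python
def simulate_vv_db(sm):
    # Made-up forward model: backscatter (dB) rises linearly with soil moisture.
    return -18.0 + 20.0 * sm

def retrieve_sm(obs_db, sm_min, sm_max, n_steps=600):
    """Grid-search the soil moisture in [sm_min, sm_max] with the smallest squared misfit."""
    best_sm, best_cost = sm_min, float("inf")
    for i in range(n_steps + 1):
        sm = sm_min + (sm_max - sm_min) * i / n_steps
        cost = (obs_db - simulate_vv_db(sm)) ** 2
        if cost < best_cost:
            best_sm, best_cost = sm, cost
    return best_sm

# An observation brighter than anything the model can simulate inside the range
# is pushed to the upper bound; a consistent observation is unaffected.
for lo, hi in ((0.15, 0.45), (0.05, 0.50)):
    bright = retrieve_sm(-6.0, lo, hi)       # the toy model would need SM = 0.60
    consistent = retrieve_sm(-12.0, lo, hi)  # the toy model needs SM = 0.30
    print(f"range {lo:.2f}-{hi:.2f}: {bright:.3f}, {consistent:.3f}")
```

In both ranges the unmatched observation ends up exactly at the upper bound, so the chosen bound, rather than the data, sets the retrieved value. This is the mechanism behind the clustering of retrievals at the upper limit and the sensitivity to the preset ranges.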
The accuracy of the soil moisture retrieval is also affected by the scale of the comparison being undertaken. For example, remote sensing retrievals at the 10-m scale tend to underestimate relative to the point-scale measurements at the Rogers Farm #1 SCAN site, while retrievals at the 100-m scale are marginally higher than the COSMOS measurements. The COSMOS footprint-scale measurements, the remote sensing retrievals at 10 m and the point-scale measurements at the Rogers Farm #1 SCAN site all possess different spatial representativeness, which is closely related to scaling effects. Thus, scaling effects represent an important source of error and remain one of the more challenging issues in the remote sensing community.

Conclusions
To date, most Sentinel-1 SAR soil moisture retrievals have tended to focus on either empirical or machine learning methods, neither of which directly interprets the backscattering process or works well without calibration of the algorithms against ground measurements. To this end, we present an algorithm that can simultaneously estimate soil moisture and RMSH within the framework of a cost function based on a forward backscattering model. The algorithm is able to estimate soil moisture over a vegetated field while accounting for the canopy backscattering contribution quantified by Sentinel-2 data. The proposed algorithm is evaluated at scales on the order of 100-m and 10-m spatial resolution. Through comparison against in situ soil moisture at the scale of a number of COSMOS stations (O~100 m) and at the single-site scale (O~1-5 cm), our algorithm shows an R² ranging from 0.472 to 0.655 and RMSE ranging from 0.039 to 0.078 m³/m³ at the COSMOS scale and an R² of 0.597 and RMSE of 0.069 m³/m³ at the single point scale.
Three retrieval schemes (SCA-VV, SCA-VH and DCA-VVVH), in combination with multi-configuration options for estimating the VWC, the influence of WCM parameter values, the specification of initial value ranges and the spatial representativeness of the ground measurements, are all explored. Overall, the SCA-VV (utilizing VV-polarized backscatter) provides the best performance, while the SCA-VH is the least accurate of the examined polarization schemes. In terms of the choice of vegetation index, the NDWI_865-1614 based VWC estimation provides the best soil moisture retrievals. These findings provide useful directions for further SAR-based soil moisture estimation, illustrating both the sensitivities and uncertainties that can influence accurate retrieval. Additional analysis and investigation of these factors are likely to improve the capacity for soil moisture retrieval from Sentinel-1 SAR. Towards this end, a more comprehensive investigation into improving the forward backscattering model and inversion procedure, as well as the range of evaluation data used, will be undertaken in future work. Such tasks would include calibrating the forward model to better predict the radar observations and further reducing the speckle noise of radar data using advanced filtering methods. The spatial representativeness and scaling effects should also be carefully addressed in both the retrievals and the validation dataset. SAR-based soil moisture retrieval remains challenging, especially at high resolution, but this is the precise scale required for both agricultural and water management applications, as well as for driving further advances in land surface modeling and hydrological process description.