Retrieving Surface Soil Moisture over Wheat-Covered Areas Using Data from Sentinel-1 and Sentinel-2

Abstract: Surface soil moisture (SSM) is a major factor that affects crop growth. Combined microwave and optical data have been widely used to improve the accuracy of SSM retrievals. However, the influence of vegetation indices derived from the red-edge spectral bands of multi-spectral optical data on retrieval accuracy has not been sufficiently analyzed. In this study, we retrieved soil moisture from wheat-covered surfaces using Sentinel-1/2 data. First, a modified water cloud model (WCM) was proposed to remove the influence of vegetation from the backscattering coefficient of the radar data. The vegetation fraction (FV) was then introduced into this WCM, and the vegetation water content (VWC) was calculated using a multiple linear regression model. Subsequently, the support vector regression (SVR) technique was used to retrieve the SSM. This approach was validated using in situ measurements of wheat fields in Hebi, located in northern Henan Province, China. The key findings of this study are: (1) Based on vegetation indices obtained from Sentinel-2 data, the proposed VWC estimation model effectively eliminated the influence of vegetation; (2) Compared with vertical transmit and horizontal receive (VH) polarization, vertical transmit and vertical receive (VV) polarization was better for detecting changes in SSM during key phenological phases of wheat; (3) The validated model indicates that the proposed approach successfully retrieved SSM in the study area using Sentinel-1 and Sentinel-2 data.


Introduction
Surface soil moisture (SSM) is a key variable that couples the land and the atmosphere, as well as energy and water cycles. SSM, therefore, plays an essential role in hydrology, climatology, meteorology, ecology, and agronomy [1][2][3]. SSM is particularly important in arid and semi-arid agricultural regions, where its spatiotemporal distribution affects crop growth and development [4,5]. Despite its importance, it is difficult to accurately retrieve SSM over large scales due to the complexity of natural surfaces [6,7].
Retrieving SSM using remote sensing technology has been investigated for more than 30 years. Among the remote sensing methods, most optical methods estimate SSM by using the spectral reflectance indices, which are easy to implement but can be easily affected by weather [8]. In thermal infrared methods, SSM is mainly estimated from thermal inertia [9]. It is important to note that the vegetation canopy can conceal soil radiation information, thereby affecting the accuracy of SSM retrievals in areas with high vegetation coverage [10]. Microwaves have stronger penetrating ability and are not influenced by clouds due to their longer wavelengths.
Previous quantitative studies showed that microwave remote sensing methods could effectively estimate SSM. Passive sensors measure the intensity of naturally emitted microwaves from the Earth's surface. However, the atmospheric effect is not negligible
Water 2021, 13, x FOR PEER REVIEW
corn, cotton, and canola, and wheat cycles occur between late September (emergence) and the middle of June (harvest) of the following year. A total of 28 sampling sites in this region were selected on slopes between 0% and 5% (Figure 1c). According to the statistics for Hebi in the 2019 National Economic and Social Development Communique, the wheat fields in the study area covered approximately 897.4 km² in 2019.

Sentinel-1 Data
Although these two satellites pass over the study area at least 2-3 times per month, the sensitivity of wheat differs among growth stages; this study therefore only analyzed the SSM when wheat was in key phenological stages (jointing, heading, and filling stages). Accordingly, Sentinel-1 images from 9 April, 3 May, and 27 May 2019 were used. The acquired SAR images were single look complex (SLC) and level-1B data, which need to be pre-processed prior to retrieving SSM. The Sentinel-1 images were pre-processed with a common software platform, namely the Sentinel Application Platform (SNAP). The specific processes were conducted as follows:

(1) The SAR images were corrected using radiometric calibration so that their pixel values truly represented the radar backscatter of the reflecting surface. The calibration relates the backscattering coefficient $\sigma^0$ (dB) of the pixel in the i-th row and the j-th column to $DN_{ij}$, the digital number of the SAR image, and $A_{ij}$, the calibration parameter;
(2) Image mosaics were constructed, and geometric corrections were made;
(3) The refined Lee filter method was applied to the multi-look images for speckle noise removal [25]. In this method, the weights of neighboring pixels were first estimated using kernel density, after which the value of the central pixel was calculated using linear weighting. Following this process, the speckle was effectively eliminated while preserving the edge information of the image.
Figure 2a shows the pre-processing results of Sentinel-1 SAR data from the study area for 9 April 2019.
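As a rough illustration of step (1), the sketch below converts a digital number into a backscattering coefficient in dB. It assumes the commonly documented Sentinel-1 relation, in which the linear backscatter is $DN^2 / A^2$ and the dB value is ten times its base-10 logarithm; this form is an assumption rather than the expression used in this study. The function name and the sample values are hypothetical, and in practice the SNAP calibration operator performs this step.

```python
import math


def sigma0_db(dn: float, a: float) -> float:
    """Backscattering coefficient in dB from a digital number.

    ASSUMPTION: sigma0 (linear) = DN^2 / A^2, which is the usual
    Sentinel-1 calibration relation, not a formula quoted from the text.
    """
    if dn <= 0 or a <= 0:
        raise ValueError("dn and a must be positive")
    sigma0_linear = (dn ** 2) / (a ** 2)
    return 10.0 * math.log10(sigma0_linear)


# Hypothetical pixel: DN = 10 with calibration parameter A = 100.
example_db = sigma0_db(10.0, 100.0)
```

The conversion to dB is kept in the same function so that a non-positive input fails loudly instead of producing an undefined logarithm.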

Sentinel-2 Data
Three Sentinel-2 optical images over the study sites were used; they were acquired as close in time as possible to the SAR data (depending on cloud coverage; Table 1). These optical images were level-2A data, in which bottom-of-atmosphere reflectance was corrected in cartographic geometry. We resampled the data to a 10 m resolution. Figure 2b shows the Sentinel-2 optical image of the study area on 3 April 2019.

Table 1. List of satellite-based and in situ data used over the study area.
Contemporaneous with the satellite acquisitions, gravimetric soil moisture (GSM) measurements at depths of 0-10 cm were conducted during the key phenological phases of wheat (jointing, heading, and filling stages). Data collection was divided into three periods, with 28 sampling sites selected for each period. At each sampling site, three sampling points were selected that were separated by approximately 10 m. In addition, the location of each in situ measurement was recorded using a global positioning system (GPS) device. As soon as the soil samples were returned to the laboratory, they were weighed with an electronic balance (accuracy of 0.1 g) and dried in an oven at 105 °C for 20-24 h, until a constant weight was reached. GSM was obtained using the following formula:
$$\mathrm{GSM} = \frac{W_{wet} - W_{dry}}{W_{dry}} \times 100\%$$
where $W_{wet}$ and $W_{dry}$ are the weights of the soil samples before and after drying, respectively. The volumetric soil moisture (VSM) was then obtained by multiplying the GSM by the soil bulk density. Therefore,
$$\mathrm{VSM} = \mathrm{GSM} \times \rho_b$$
where $\rho_b$ is the soil bulk density.
The SSM used in this study was the VSM.
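The conversion from oven-drying weights to SSM can be sketched as follows. The sketch assumes that GSM is the water mass relative to the dry soil mass, expressed in percent, and that the bulk density is given in g/cm³ so that the product is a volumetric percentage. The function names and sample weights are hypothetical.

```python
def gravimetric_soil_moisture(w_wet: float, w_dry: float) -> float:
    """GSM in percent: water mass relative to the oven-dry soil mass."""
    if w_dry <= 0:
        raise ValueError("w_dry must be positive")
    return (w_wet - w_dry) / w_dry * 100.0


def volumetric_soil_moisture(gsm: float, bulk_density: float) -> float:
    """VSM in percent: GSM multiplied by the soil bulk density (g/cm^3)."""
    return gsm * bulk_density


# Hypothetical sample: 120 g wet, 100 g dry, bulk density 1.4 g/cm^3.
gsm = gravimetric_soil_moisture(120.0, 100.0)
vsm = volumetric_soil_moisture(gsm, 1.4)
```

With three sampling points per site, one option is to average the three VSM values before pairing them with the pixel value; the text does not state how the points were aggregated.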

Vegetation Parameters
To obtain the vegetation parameters, wheat samples were collected at the same times as the soil moisture samples. The main vegetation parameters were plant height, plant density, and VWC. The height of the wheat was measured using a meter scale over a 20 × 20 cm area; the plant density was determined for 1 m² of each wheat field. The fresh weight ($W_F$) was measured after the wheat samples were brought back to the laboratory, and the dry weight ($W_D$) was obtained by drying the wheat in an oven. The VWC was then calculated from $W_F$, $W_D$, and $\rho$, the plant density of wheat. The details of the experimental sites are summarized in Table 1.

Methods
In this study, we propose an SSM retrieval method for wheat-covered fields based on the data described in Section 2. After a variety of tests, the decomposed scattering model and SVR were implemented for SSM retrieval in this region. A flowchart of the processing steps for SSM estimation is shown in Figure 3. The SSM retrieval method used in this study comprised three main phases. The first phase was obtaining the parameters for the model from remote sensing; i.e., the backscattering coefficient images of the study area from Sentinel-1 SAR were obtained by pre-processing, and the backscattering coefficients for each sample were extracted according to their latitudes and longitudes. The vegetation spectral indices obtained from Sentinel-2 underwent a similar process. The second phase was the central part of the processing chain. To eliminate the influence of vegetation on radar backscattering, a modified WCM with the vegetation fraction was constructed. VWC, one of the important parameters in the WCM, was expressed using a multiple linear regression model combined with the vegetation indices obtained from the Sentinel-2 data. Finally, to achieve an efficient and robust SSM retrieval algorithm, a machine learning approach was used. In detail, an SVR technique was applied that allowed for non-linear relationships between a target variable and several input features. Further, the coefficient of determination ($R^2$) and root mean square error (RMSE) were calculated to evaluate the accuracy of the SSM estimates, and the SSM values were mapped throughout the study area. More details on each part of the soil moisture retrieval algorithm are given below.

Modified WCM
Vegetation canopies reduce the sensitivity of radar measurements to soil moisture, thereby affecting the accuracy of soil moisture estimations. Addressing this issue was the main aim of this study. The WCM is based on the radiative transfer model and was proposed by Attema and Ulaby in 1978 [26]. According to the model, the total backscattering term ($\sigma^0_{pp}$) over vegetated fields is simply divided into two parts: the backscatter contribution from the vegetation canopy ($\sigma^0_{veg}$) and the backscatter contribution from the soil surface ($\sigma^0_{soil}$). To better describe the backscattering of the soil and vegetation during different periods in wheat-covered areas, the vegetation fraction was introduced into the WCM. For a given incidence angle $\theta$, the modified model relates the co-polarized total backscattering coefficient $\sigma^0_{pp}$ to $\sigma^0_{veg}$ and $\sigma^0_{soil}$ through the vegetation fraction $f_v$ and the double attenuation factor $L^2$. $V_1$ and $V_2$ denote the vegetation descriptors. These can be VWC, NDVI, the leaf area index (LAI), or other vegetation descriptors [27][28][29]. In this study, $V_1$ and $V_2$ refer to VWC (kg/m²). The empirical parameters a and b depend on the vegetation type and the incidence angle. Bindlish and Barros [30] proposed values for a and b for different land cover types. As the crop in the study area was wheat, winter wheat was selected as the land cover type, with values of 0.0018 and 0.138 for a and b, respectively.
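To make the role of each WCM term concrete, the sketch below removes the canopy contribution from a total backscattering coefficient. It uses the standard WCM canopy terms, $\sigma^0_{veg} = a V_1 \cos\theta (1 - L^2)$ and $L^2 = \exp(-2 b V_2 / \cos\theta)$, in linear power units. The way $f_v$ enters, $\sigma^0_{pp} = f_v(\sigma^0_{veg} + L^2 \sigma^0_{soil}) + (1 - f_v)\sigma^0_{soil}$, is an assumed mixing rule chosen for illustration and may differ from the modified model used in this study. All function names and sample values are hypothetical.

```python
import math

# Winter-wheat parameters quoted in the text (Bindlish and Barros [30]).
A_WHEAT = 0.0018
B_WHEAT = 0.138


def db_to_linear(value_db: float) -> float:
    return 10.0 ** (value_db / 10.0)


def linear_to_db(value: float) -> float:
    return 10.0 * math.log10(value)


def canopy_terms(vwc, theta_deg, a=A_WHEAT, b=B_WHEAT):
    """Return (sigma_veg, L2) in linear units with V1 = V2 = VWC (kg/m^2)."""
    cos_theta = math.cos(math.radians(theta_deg))
    l2 = math.exp(-2.0 * b * vwc / cos_theta)
    sigma_veg = a * vwc * cos_theta * (1.0 - l2)
    return sigma_veg, l2


def total_backscatter_db(soil_db, vwc, fv, theta_deg):
    """Forward model under the ASSUMED mixing rule described in the lead-in."""
    sigma_veg, l2 = canopy_terms(vwc, theta_deg)
    sigma_soil = db_to_linear(soil_db)
    total = fv * (sigma_veg + l2 * sigma_soil) + (1.0 - fv) * sigma_soil
    return linear_to_db(total)


def soil_backscatter_db(total_db, vwc, fv, theta_deg):
    """Invert the assumed model for the soil contribution (dB)."""
    sigma_veg, l2 = canopy_terms(vwc, theta_deg)
    numerator = db_to_linear(total_db) - fv * sigma_veg
    if numerator <= 0.0:
        raise ValueError("vegetation term exceeds the total backscatter")
    return linear_to_db(numerator / (1.0 - fv + fv * l2))


# Round trip with hypothetical values: -10 dB soil, VWC 2 kg/m^2,
# vegetation fraction 0.6, incidence angle 40 degrees.
simulated_total = total_backscatter_db(-10.0, 2.0, 0.6, 40.0)
recovered_soil = soil_backscatter_db(simulated_total, 2.0, 0.6, 40.0)
```

Working in linear units and converting to dB only at the boundaries keeps the additive structure of the model intact; with a vegetation fraction of zero the inversion returns the input unchanged.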
Moreover, $f_v$ is an additional parameter in the vegetation scattering model used to distinguish the proportions of vegetation coverage and bare soil in pixels. $f_v$ can be calculated using the mixed pixel decomposition model [31]:
$$f_v = \frac{\mathrm{NDVI} - \mathrm{NDVI}_{min}}{\mathrm{NDVI}_{max} - \mathrm{NDVI}_{min}}$$
where $\mathrm{NDVI}_{min}$ denotes a bare soil pixel, which is theoretically close to zero, and $\mathrm{NDVI}_{max}$ denotes a pure vegetation pixel, which is theoretically close to one. In order to reduce the influence of weather conditions, a 0.5% confidence level was used to obtain the thresholds for $\mathrm{NDVI}_{min}$ and $\mathrm{NDVI}_{max}$.
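The vegetation fraction calculation can be sketched as below. The sketch assumes the linear mixed-pixel form of $f_v$ and interprets the 0.5% confidence level as taking the NDVI values at the 0.5% and 99.5% points of the sorted NDVI distribution; that interpretation, the clipping to [0, 1], and the function names are assumptions made for illustration.

```python
def ndvi_thresholds(ndvi_values, tail=0.005):
    """Pick NDVI_min and NDVI_max at the lower and upper cumulative tails.

    ASSUMPTION: the thresholds are nearest-rank quantiles at `tail` and
    `1 - tail` of the NDVI values in the scene.
    """
    ordered = sorted(ndvi_values)
    last = len(ordered) - 1
    low = ordered[round(tail * last)]
    high = ordered[round((1.0 - tail) * last)]
    return low, high


def vegetation_fraction(ndvi, ndvi_min, ndvi_max):
    """Linear mixed-pixel estimate of f_v, clipped to the range [0, 1]."""
    if ndvi_max <= ndvi_min:
        raise ValueError("ndvi_max must exceed ndvi_min")
    fv = (ndvi - ndvi_min) / (ndvi_max - ndvi_min)
    return min(1.0, max(0.0, fv))


# Hypothetical scene with NDVI spread evenly between 0 and 1.
scene = [i / 1000 for i in range(1001)]
ndvi_min, ndvi_max = ndvi_thresholds(scene)
fv_mid = vegetation_fraction(0.5, ndvi_min, ndvi_max)
```

Clipping keeps pixels outside the two thresholds (for example, water or cloud-contaminated pixels) from producing fractions below zero or above one.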

Building the VWC Model
Based on previously published studies, vegetation indices can be used to estimate VWC [32]. The vegetation indices commonly used for VWC estimation include NDVI [33] and NDWI [34]. These two vegetation indices are calculated as follows:
$$\mathrm{NDVI} = \frac{R_{NIR} - R_{Red}}{R_{NIR} + R_{Red}}$$
$$\mathrm{NDWI} = \frac{R_{NIR} - R_{SWIR}}{R_{NIR} + R_{SWIR}}$$
where $R_{NIR}$ is the reflectivity in the near-infrared band, $R_{Red}$ is the reflectivity in the red band, and $R_{SWIR}$ is the reflectivity in the shortwave infrared band. Sentinel-2 data have three shortwave infrared bands: SWIR 1 (central wavelength = 1.374 µm), SWIR 2 (central wavelength = 1.610 µm), and SWIR 3 (central wavelength = 2.190 µm). As the spatial resolution of SWIR 1 is only 60 m, SWIR 2 and SWIR 3 were selected to calculate the vegetation indices ($\mathrm{NDWI}_{1610}$ and $\mathrm{NDWI}_{2190}$) in this study.
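Both indices share the same normalized-difference form, which the following sketch makes explicit. The reflectance values are hypothetical, and the variable names stand for the near-infrared, red, and two shortwave infrared (1.610 µm and 2.190 µm) reflectances; which Sentinel-2 band supplied the near-infrared reflectance is not stated in the text.

```python
def normalized_difference(band_a: float, band_b: float) -> float:
    """Generic normalized difference (a - b) / (a + b) of two reflectances."""
    total = band_a + band_b
    if total == 0:
        raise ValueError("band sum is zero; index undefined")
    return (band_a - band_b) / total


def ndvi(r_nir: float, r_red: float) -> float:
    return normalized_difference(r_nir, r_red)


def ndwi(r_nir: float, r_swir: float) -> float:
    return normalized_difference(r_nir, r_swir)


# Hypothetical surface reflectances for one wheat pixel.
r_nir, r_red, r_swir_1610, r_swir_2190 = 0.45, 0.05, 0.20, 0.10
indices = {
    "NDVI": ndvi(r_nir, r_red),
    "NDWI_1610": ndwi(r_nir, r_swir_1610),
    "NDWI_2190": ndwi(r_nir, r_swir_2190),
}
```

Because the 1.610 µm and 2.190 µm bands are delivered at 20 m, they need to be on the same 10 m grid as the red and near-infrared bands before the pixel-wise division, which matches the resampling step described for the Sentinel-2 data.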
In recent years, studies have focused on the red-edge band between the red and near-infrared bands because of the abrupt changes in leaf reflectivity that occur within this band. This focus has produced good applications for identifying surface types, calculating parameters, distinguishing vegetation growth states, and estimating vegetation leaf area indices. The normalized difference red-edge index (NDRI) is calculated from $R_{Red-edge}$, the reflectivity in the red-edge infrared band. The Sentinel-2 data have three red-edge infrared bands, and the central wavelengths of red-edge bands 1 and 2 are located at the valley value (0.705 µm) and peak value (0.740 µm) of the red-edge band range, respectively. Therefore, these two red-edge infrared bands were used to calculate the NDRI in this study.
In this study, the model used to estimate VWC was built in two steps. The first step established the exponential relationship between the VWC and the vegetation index. This exponential expression is as follows:
$$C = \alpha e^{\beta x}$$
where C is the VWC, α and β are the parameters to be solved, and x is the vegetation index (NDVI, $\mathrm{NDWI}_{1610}$, $\mathrm{NDWI}_{2190}$, or NDRI).
In the second step, the VWC obtained in the first step was used as the characteristic parameter of the multiple linear regression equation. Thus, the modified VWC can be expressed as
$$m_{veg} = \gamma_0 + \gamma_1 C_1 + \gamma_2 C_2 + \dots + \gamma_k C_k + \varepsilon$$
where $m_{veg}$ is the modified VWC, k is the number of vegetation indices, ε is the bias that obeys the normal distribution $N(0, \sigma^2)$, and γ is a parameter. If $(y_1; x_{11}, x_{21}, \dots, x_{k1}), \dots, (y_n; x_{1n}, x_{2n}, \dots, x_{kn})$ is a sample of size n, then:
$$y_j = \gamma_0 + \gamma_1 x_{1j} + \gamma_2 x_{2j} + \dots + \gamma_k x_{kj} + \varepsilon_j, \quad j = 1, 2, \dots, n$$
The value of the parameter $\gamma_i$ (i = 0, 1, 2, . . . , k) can then be estimated.
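The two-step VWC model can be sketched with ordinary least squares. The sketch assumes the exponential form $C = \alpha e^{\beta x}$ for the first step and fits it by linear regression on $\ln C$, which is a convenient substitute and not necessarily the fitting procedure used in this study. The second step solves the normal equations of the multiple linear regression directly. All function names and sample values are hypothetical.

```python
import math


def fit_exponential(x, c):
    """Fit c = alpha * exp(beta * x) by least squares on ln(c)."""
    n = len(x)
    ln_c = [math.log(value) for value in c]
    mean_x = sum(x) / n
    mean_y = sum(ln_c) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ln_c))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_multilinear(rows, y):
    """Return [gamma_0, gamma_1, ..., gamma_k] from the normal equations.

    rows: one list of single-index VWC estimates (C_1..C_k) per sample.
    """
    design = [[1.0] + list(row) for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Step 1 on synthetic data generated from alpha = 0.5, beta = 2.0.
ndvi_samples = [0.2, 0.4, 0.6, 0.8]
vwc_samples = [0.5 * math.exp(2.0 * v) for v in ndvi_samples]
alpha, beta = fit_exponential(ndvi_samples, vwc_samples)
```

For the second step, each row passed to `fit_multilinear` would hold the single-index VWC estimates from step 1 for one sample, and `y` would hold the measured VWC.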

SVR Estimation of SSM
SVR is a supervised regression technique that allows the modeling of multi-dimensional and non-linear relationships between input features and a target variable [35]. According to functional theory, any kernel function that satisfies Mercer's theorem corresponds to an inner product in some high-dimensional feature space. Owing to its accurate estimation, good intrinsic generalization ability, and its ability to deal with complex non-linear problems, the SVR technique can be applied for soil moisture estimation [36].
In this study, the SVR technique was chosen to retrieve SSM from the backscattering coefficients, and the entire procedure can be divided into two main phases: the training and estimation phases.
During the training phase, the field measurements, coupled with the features extracted from remote sensing data, were exploited to determine the underlying relationship between the input features and the output target value. According to our previous analysis, the features of the remote sensing data were estimated using the modified WCM, which reduced the ambiguity in the SAR signal due to the presence of vegetation. The relationships between the backscattering coefficient and the field measurements were implemented using MATLAB, and these reference samples were divided into two subsets, that is, 75% (63 samples) for training and 25% (21 samples) used for the quantitative assessment of the estimation performance.
Typically, the SVR technique has different configurations of the free model parameters, namely hyper-parameters that can control the learning process of the estimation method. These hyper-parameters are composed of the regularization parameter C, the tolerance to errors ε, and the kernel parameters. The Gaussian radial basis function (RBF) kernel function was chosen due to its limited computational overhead [37], and a grid search strategy was adopted to drive the selection of the best parameter configuration.
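The three kinds of hyper-parameters can be made concrete with a short sketch: the RBF kernel (controlled by a width parameter, written here as `gamma`), the ε-insensitive loss that defines the error tolerance, and the list of candidate configurations that a grid search evaluates. The study performed this step in MATLAB; the Python below is only an illustration, it does not train an SVR, and the candidate values are hypothetical.

```python
import math
from itertools import product


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger ones grow linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def parameter_grid(c_values, epsilon_values, gamma_values):
    """Enumerate every (C, epsilon, gamma) combination for a grid search."""
    return [
        {"C": c, "epsilon": e, "gamma": g}
        for c, e, g in product(c_values, epsilon_values, gamma_values)
    ]


# Hypothetical candidate values; the grid used in the study is not given.
grid = parameter_grid([1.0, 10.0, 100.0], [0.1, 0.5], [0.01, 0.1])
```

In a full grid search, each configuration in `grid` would be used to train a regressor on the training subset, and the configuration with the lowest validation error would be kept.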
After the above learning phase (carried out off-line), the trained SVR regressor was ready for the estimation phase. During the estimation phase, independent test samples were used to quantitatively assess the estimation performance using common metrics such as the RMSE and $R^2$.
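The two accuracy metrics can be computed as in the sketch below. It assumes that $R^2$ is the coefficient of determination, one minus the ratio of the residual to the total sum of squares; some studies instead report the squared correlation coefficient, and the text does not say which definition was used. The sample values are hypothetical.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    n = len(measured)
    squared_errors = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    return math.sqrt(squared_errors / n)


def r_squared(measured, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical SSM values in percent for four test samples.
measured_ssm = [20.0, 25.0, 30.0, 35.0]
estimated_ssm = [21.0, 24.0, 31.0, 34.0]
test_rmse = rmse(measured_ssm, estimated_ssm)
test_r2 = r_squared(measured_ssm, estimated_ssm)
```

Both metrics are computed on the held-out test samples only, so they describe how well the trained regressor generalizes and not how well it fits the training data.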

Estimating VWC from Sentinel-2 Data
A total of 84 in situ measurements were acquired, of which 68 random samples were used to build the VWC models, while the remaining 16 samples were used to validate the performance of the models. According to Equation (12), the fitting relationship between the measured VWC and the estimated VWC was based on a multiple linear regression model under different combinations of vegetation indices, as shown in Figure 4. The $R^2$ and RMSE between the estimated and measured VWC values based on the 16 validation data points are shown in Table 2.
For the multiple linear regression models whose characteristic parameters included VWC from two vegetation indices (Figure 4a-f), NDVI + NDRI yielded the best result, with $R^2$ and RMSE values of 0.917 and 0.162, respectively. The $R^2$ gradually increased with an increasing number of characteristic parameters, whereas the RMSE gradually decreased. For the multiple regression models that included VWC from three vegetation indices (Figure 4g-j), NDVI + $\mathrm{NDWI}_{2190}$ + NDRI yielded the best result, with $R^2$ and RMSE values of 0.963 and 0.108, respectively. NDVI + $\mathrm{NDWI}_{1610}$ + $\mathrm{NDWI}_{2190}$ + NDRI yielded the best result of all combinations. The $R^2$ between the estimated and measured VWC was 0.965, and the correlation between them was statistically significant at the 0.01 level. Compared with the models using three characteristic parameters, the $R^2$ values for model No. k were 0.079, 0.027, 0.057, and 0.002 higher, respectively, and the RMSE decreased by 0.084, 0.035, 0.065, and 0.006, respectively. Therefore, model No. k was used as the modified VWC estimation model in this study.

SSM Retrieval Results Using the Modified WCM
As stated in Section 4.1, the VWC values from the Sentinel-2 vegetation indices were used in the formula of the WCM to obtain the soil backscattering coefficient. The backscattering coefficients of the 84 samples were extracted from the images according to their coordinates; the VV and VH polarization backscattering coefficients before and after removing the vegetation influence are shown in Figure 5. The backscattering coefficients of VV polarization (Figure 5a) were higher than those of VH polarization (Figure 5b) at the same sampling points. After removing the influence of vegetation, the value of the soil backscatter coefficient was generally lower than that of the total backscatter coefficient. The variation in the VV polarization backscattering coefficient was −2.35 dB, while that of the VH polarization was −2.92 dB. The variation in the VH polarization backscattering before and after the correction was greater than that of the VV polarization. This indicates that VH polarization was more easily affected by the vegetation layer during the transmission process.
After extracting the most relevant features using SVR, training was performed, as shown in the flowchart in Figure 6. The $R^2$ and RMSE values are presented in Table 3. In order to evaluate the accuracy of the modified model, we compared the performances with the original WCM (where VWC is composed of NDVI) and the radar backscatter coefficient, which ignores the influence of vegetation.
As shown in Figure 6 and Table 3, for the VV polarization, the $R^2$ between the estimated and measured SSM was 0.86 for the modified WCM, and the RMSE of the estimated SSM was 2.119%, while the $R^2$ and RMSE values of SSM were 0.801 and 2.992%, and 0.661 and 3.314%, respectively, for the original WCM and radar backscatter coefficient. The correlation between the estimated and measured SSM was statistically significant at the 0.01 level.
In contrast, the scattered VH polarization points deviated more from the 1:1 line than those of the VV polarization. For the VH polarization, the $R^2$ between the estimated and measured SSM was 0.667 for the modified WCM, and the RMSE of the estimated SSM was 3.629%, while the $R^2$ and RMSE values of SSM were 0.586 and 3.994%, and 0.451 and 4.192%, respectively, for the original WCM and radar backscatter coefficient. The correlation between the estimated and measured SSM was statistically significant at the 0.05 level.
From the results, this study confirms previous findings regarding the significance of using vegetation indices in the soil moisture retrieval process in vegetated areas. Further, VV polarization had good accuracy and stability for retrieving SSM in the study area, and the modified WCM yielded satisfactory results in retrieving SSM with Sentinel-1 and Sentinel-2 data.
The spatial distribution of SSM retrievals and the frequency diagram of SSM in the study area are shown in Figure 7. Non-wheat areas, such as towns, rivers, and other non-agricultural areas, were removed from the Sentinel-1 SAR image of the study area using supervised classification based on threshold segmentation in the Environment for Visualizing Images (ENVI) software.
As seen in Figure 7, the retrieved SSM values in the study area were mainly distributed in the range of 25-40%. The SSM retrieved on 3 April indicated slightly dry conditions, but the SSM in a small part of the central and southern areas was significantly higher than that in other areas (Figure 7a), possibly because the wheat fields there were irrigated during the dry spring. Because of continuous rainfall in the south of the study area, the retrieved SSM was relatively high in the south on 3 May (Figure 7b). According to the meteorological data, there was no rainfall in the study area before the satellite's transit, which may explain why the SSM was slightly lower on 23 May than on 3 May (Figure 7c). The SSM retrieval results were basically consistent with the measured SSM. Therefore, the method proposed in this study was well suited to the study area.

Discussion
As a classical model, the WCM has been widely used to retrieve soil moisture information from areas with vegetation cover. In recent years, some studies have used Sentinel-1 and Sentinel-2 data to estimate SSM. Guo et al. [38] estimated farmland SSM using multisource remote sensing data from Sentinel-1 radar and Sentinel-2 optical images. They applied the Oh model [39], SVR, and a generalized regression neural network (GRNN) to retrieve SSM, and used the WCM to remove the influence of vegetation. The inputs of SVR included the dual-polarization radar backscattering coefficient, altitude, local incidence angle, and vegetation indices (NDVI, the modified soil adjusted vegetation index (MSAVI), and the difference vegetation index (DVI)). Their results indicated that combining multi-characteristic parameters based on SVR delivered the best retrieval accuracy, with an $R^2$ of 0.903 and an RMSE of 0.014 cm³/cm³. Zhao et al. [40] estimated the SSM for winter wheat fields using Sentinel-1 and Sentinel-2 data. Based on near-infrared, red, and shortwave infrared bands, they proposed a new fusion vegetation index (FVI) to estimate VWC. They used the Maclaurin series to improve the WCM and considered that the single-polarization backscattering coefficients could be replaced by VV/VH. Their retrieval analysis yielded an $R^2$ of 0.7642 and an RMSE of 0.0209 cm³/cm³ for VV/VH; the $R^2$ and RMSE values were 0.6791 and 0.0249 for VV polarization, and 0.5151 and 0.0289 for VH polarization, respectively. In the present study, vegetation indices including NDVI, $\mathrm{NDWI}_{1610}$, $\mathrm{NDWI}_{2190}$, and NDRI were used (NDRI was composed of two red-edge bands). To the best of the authors' knowledge, this study is the first to propose removing the influence of vegetation on SSM estimation by using the red-edge bands in Sentinel-2 data.
Baghdadi et al. [7] estimated the SSM of crop fields and grasslands from Sentinel-1/2 data. They combined the WCM with the integral equation model (IEM) [41] using real data composed of a C-band radar backscattered signal, NDVI, soil moisture, and surface roughness values. Their results indicated that the soil contribution to the total radar backscatter signal was lower in VH polarization than in VV polarization. Zeng et al. [17] studied SSM under different vegetation covers based on Sentinel-1A and SVR techniques and concluded that VV polarization could achieve high retrieval accuracy. Wang et al. [42] combined full polarization Radarsat-2 SAR data and SVR techniques to estimate soil moisture in sparsely vegetated arid areas. They determined that the inversion accuracy of the co-polarization data (VV or HH polarization) was higher than that of the cross-polarization data (VH or HV polarization). Comparing the inversion results reported in this study with those of the previous studies mentioned above, it is possible to conclude that VV polarization is more sensitive to SSM than VH polarization.
One of the limitations of this study is that it only focused on wheat fields. In addition, only a small number of sampling sites were measured. Furthermore, as wheat is a drought-resistant crop, it is also planted on hills and mountains in China, whereas this study only examined wheat on plains. Soil moisture retrieval methods that cover other kinds of crops and a wider range of terrain conditions may therefore have more practical significance than the method reported here.

Conclusions
Using Hebi, a representative wheat planting area in Henan Province, China, as the study area, we investigated the potential for synergy between C-band Sentinel-1 SAR and Sentinel-2 optical data for SSM retrieval in wheat fields. To extract the soil backscattering coefficient ($\sigma^0_{soil}$) from the Sentinel-1 SAR data, the WCM was selected to remove the influence of the vegetation layer from the radar backscattering coefficient. Then, by combining the WCM and SVR algorithms, the SSM of the wheat-covered fields was retrieved and analyzed under different polarization modes (VV, VH). The main conclusions of this study can be summarized as follows: (1) A modified WCM was constructed using FV and VWC values calculated from Sentinel-2 data. The FV was used to distinguish the proportions of vegetation coverage and bare soil in pixels, and the VWC model, which was based on the combination of four vegetation indices (NDVI, $\mathrm{NDWI}_{1610}$, $\mathrm{NDWI}_{2190}$, and NDRI), was able to effectively remove the influence of the vegetation canopy on the backscattering coefficient of the Sentinel-1 SAR data; (2) Compared with Sentinel-1 VH polarization data (after removing the vegetation influence using the WCM), VV polarization data produced higher SSM retrieval accuracy. This result indicates that the VV polarization contains more soil backscattering information and is more sensitive to changes in SSM than VH polarization; (3) C-band Sentinel-1 SAR and Sentinel-2 optical data were used to study wheat-covered fields. The results indicate that SSM estimates based on these two kinds of satellite data are applicable to agricultural environments for wheat. To further this research, we intend to examine the proposed algorithm with regard to other crops and different regions. Furthermore, to estimate SSM more accurately on a large scale, some advanced algorithms, such as deep learning, should be implemented.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study will be made available on request from the corresponding author.