Surface Soil Moisture Retrieval on Qinghai-Tibetan Plateau Using Sentinel-1 Synthetic Aperture Radar Data and Machine Learning Algorithms

Abstract: Soil moisture is a key factor in the water and heat exchange and energy transformation of ecological systems, and accurately obtaining the soil moisture content is of critical importance for supervising water resources and protecting regional and global eco-environments. In this study, we selected the soil moisture monitoring networks of Naqu, Maqu, and Tianjun on the Qinghai–Tibetan Plateau as the research areas, and we established a database of surface microwave scattering with the AIEM (advanced integral equation model) and the mathematical expressions relating the backscattering coefficient, soil moisture, and surface roughness for the VV and VH polarizations. We proposed soil moisture retrieval models based on an empirical model and on machine learning algorithms (backpropagation neural network (BPNN), support vector machine (SVM), K-nearest neighbors (KNN), and random forest (RF)) for the ascending and descending orbits using Sentinel-1 and measurement data, and we validated the accuracies of the retrieval models in the research areas. According to the results, there is a substantial logarithmic correlation among the backscattering coefficient, soil moisture, and combined roughness. Generally, we can use empirical models to estimate the soil moisture content, with an R² of 0.609, RMSE of 0.08, and MAE of 0.064 for the ascending orbit model and an R² of 0.554, RMSE of 0.086, and MAE of 0.071 for the descending orbit model. The soil moisture contents are underestimated when the volumetric water content is high. The soil moisture retrieval accuracy is improved with machine learning algorithms compared to the empirical model, and the performance of the RF algorithm is superior to those of the other machine learning algorithms. The RF algorithm also achieved satisfactory performances for the Maqu and Tianjun networks. The accuracies of the inversion models for the ascending orbit in the three soil moisture monitoring networks were better than those for the descending orbit.


Introduction
Soil moisture is a key factor in the water and heat transfer and energy exchange in land-atmosphere systems [1], and it is also vital to linking surface water, groundwater, and the carbon cycle of terrestrial ecosystems [2]. As a crucial parameter in hydrology, meteorology, ecology, and agriculture, soil moisture is used in hydrologic modeling [3], numerical weather forecasting [4], and overland flow predictions [5]. Therefore, the accurate and dynamic monitoring of soil moisture is critical for environmental protection. The main advantages of microwave remote sensing are its real-time detection, high penetrating power, and the fact that it is not easily influenced by cloudy weather. The soil volumetric water content has a substantial effect on the soil dielectric constant, and the soil dielectric properties are closely related to the brightness temperature and backscatter coefficient of the microwaves [6]. Consequently, microwave remote sensing technology is a promising method for soil moisture monitoring [7]. According to the energy source, we can divide microwave remote sensing into two types: active and passive. The resolution of passive microwave radiometers is generally coarser than 10 km, which is helpful for monitoring surface ecological environmental elements on a global scale and obtaining essential data for global change research [8]. However, passive microwave remote sensing cannot represent the changes in soil moisture at the local scale. Active microwave remote sensing makes up for this deficiency with its high resolution. Sentinel-1 provides C-band SAR data with repeated observations, a revisit period of 6 days, and a spatial resolution of 10 m, and it has considerable potential for soil moisture inversion [9,10].
The establishment of the microwave surface scattering model and an understanding of the influence of the soil volumetric moisture content on the SAR parameters are the prerequisites for soil moisture inversion using SAR data [11]. The interaction between the electromagnetic waves scattered by random surfaces and ground objects primarily depends on the system factors (frequency, polarization mode, and incidence angle) of the microwave sensor, and it is also closely related to the ground roughness and dielectric properties. Therefore, researchers have proposed empirical and theoretical models to reveal the relationship between the soil moisture content and SAR factors. The Oh model [12], Dubois model [13], and Shi model [14] are common empirical models. However, they are only suitable for specific environments and lack universality due to their dependence on observation data. Researchers widely use theoretical models based on the electromagnetic wave radiation transfer equation to describe surface scattering due to their good physical basis. The early theoretical models include the SPM (small perturbation model) [15], GOM (geometrical optics model) [16], and POM (physical optics model) [17]; however, we can only apply these models within a certain ground roughness range. Fung developed the IEM (integral equation model) using the Maxwell equations of electromagnetic waves to broaden the model's application [18]. We can use the model to simulate surface scattering within a large ground roughness range. Chen [19] proposed the AIEM (advanced integral equation model), which has a higher accuracy and more compact form, by improving the IEM. We can use the model to simulate surface scattering due to the advantages of its sounder theoretical foundation, clearer structure, and stronger universality. Baghdadi [20] proposed a semiempirical calibration of the IEM to better reconstruct the surface scattering characteristics of bare farmland. According to the results, the backscattering coefficient measured in the experiment coincided with that simulated by the semiempirical model. Other researchers validated the performance of the AIEM through different correlation length parameterizations [21]. According to the results, we can retrieve the soil moisture from SAR images based on the AIEM in semiarid districts. However, the IEM and AIEM perform well only over bare soil, and there are obvious errors in vegetation-covered areas [19].
Reducing the impacts of the roughness and vegetation on the surface backscattering is a critical issue in the process of soil moisture inversion using microwave images. Zribi [22] modified the geometrical features of the local soil framework based on the fractional Brownian model to estimate the backscattering coefficients of farmlands. A theoretical study of the generation and propagation of the roughness error showed that the profile extent, profile morphology, number of profile measurements, and profile measurement precision in different directions are the major factors that affect roughness errors [23]. However, there are still uncertainties in the research on the roughness parameterization scheme.
Machine learning algorithms can describe the complicated relationships among variables and have been introduced to monitor soil moisture at different scales. A method using an artificial neural network (ANN) has been put forward to model, test, and validate soil moisture for GMES Sentinel-1 [24]. Kolassa [25] employed the brightness temperature, soil moisture, surface soil temperature, and vegetation water content to simulate global soil moisture with a neural network technique. Three machine learning algorithms, random forest (RF), support vector machine (SVM), and K-nearest neighbors (KNN), were used for soil moisture downscaling, which revealed the presence of seasonal differences [26]. Data fusion and random forest were used to generate surface soil moisture over an agricultural field [27]. A new approach combining machine learning and multi-sensor data was put forward to predict soil moisture in Australia [28], and the proposed model generated satisfactory performance compared to random forest regression, support vector machine, and CatBoost gradient boosting regression. Although machine learning algorithms can effectively handle non-linear problems, the lack of a physical foundation and the excessive dependence on training samples are their main disadvantages. Therefore, combining physical models and machine learning algorithms is a valid approach for improving soil moisture inversion precision.
The Qinghai-Tibetan Plateau (QTP) is the highest and largest plateau in the world [29]. The QTP directly affects the local climate and environment via atmospheric circulation and hydrological processes, and it also impacts climate change not only in China and Asia but also around the globe [30]. The soil moisture, as a key surface element of the QTP, is of critical importance to predicting the atmospheric circulation and climate change of the plateau because it adjusts the ground evaporation and infiltration, controls the surface energy allocation, and influences the soil freezing and thawing. The soil moisture also influences the monsoon climate and rainfall forms of the plateau. Therefore, the use of the active microwave technique to obtain accurate local soil moisture information for the QTP is essential for understanding the energy exchange of this district and its impacts on the environments of the surrounding areas.
Therefore, in this study, we selected three soil moisture observation networks in the QTP as the research areas: Naqu, Maqu, and Tianjun. We used the soil moisture measurement and Sentinel-1 data with the VV and VH polarizations of the ascending and descending orbits to model and retrieve the soil moisture. First, we analyzed the response of the soil moisture and surface roughness to the backscattering coefficient based on the AIEM, and we established the mathematical expressions for the backscattering coefficient, soil moisture, and surface roughness of the VV and VH polarizations. Subsequently, we proposed empirical and machine learning models for the soil moisture retrieval for the ascending and descending orbits by using the soil moisture measurement data and Sentinel-1 images from 2017-2019 of the Naqu station. Finally, we obtained the 2020 soil moisture results of the Naqu station based on the empirical model and machine learning models, and we also evaluated the accuracies of these models with measurement data. We also obtained the soil moisture results for Maqu in 2018 and Tianjun in 2020 to further verify the precision and applicability of the soil moisture retrieval models.

Soil Moisture Monitoring Networks
To obtain more accurate local soil moisture measurements in the QTP, we selected three soil moisture monitoring networks as the research areas: Naqu, Maqu, and Tianjun (Figure 1). The Naqu network is on the central QTP, the Maqu network is on the eastern QTP, and the Tianjun network is on the northeastern QTP.
The Naqu network was established in Naqu (29°55′-36°30′N, 83°55′-95°5′E), Tibet Autonomous Region, China. The mean elevation is 4650 m, and the terrain is mountainous. The subfrigid semiarid climate is the dominant climate type in the observation area. The average annual precipitation is about 500 mm, with 75% of the precipitation falling from May to October. The surface vegetation is mainly alpine grassland. The Naqu network consists of 56 soil moisture and temperature measurement stations, which were installed in three different networks to meet different spatial scale needs. At each station, soil moisture/temperature sensors were inserted horizontally at soil depths of 5 cm, 10 cm, 20 cm, and 40 cm. The data collection interval is 30 min. The monitoring network uses EC-TM and 5TM capacitance probes manufactured by Decagon (United States). The sensors measure soil moisture according to the sensitivity of the soil dielectric permittivity to liquid soil water. To calibrate the sensors, 10 soil samples were collected from different stations; the soil moisture was measured by the gravimetric method while the soil dielectric permittivity was measured by the sensor. A calibrated conversion between the measured soil moisture and the measured dielectric permittivity was then developed. After the calibration, the measured soil moisture falls within the physical range [31]. The measurements of soil moisture and temperature at different depths in the Naqu network from 2015 to 2021 are shown in Figure 2. The mean values of soil moisture and temperature during the observation period were 0.16 m³/m³ and 3.45 °C, respectively, and their trends were relatively similar. To match the Sentinel-1 data, we selected 23 soil moisture station measurements from the Naqu network for the soil moisture modeling and validation for 2017-2020.

Maqu Soil Moisture Monitoring Network
The Maqu network was established in Maqu County (33°6′30″-34°30′15″N, 100°45′45″-102°29′E), Gannan Tibetan Autonomous Prefecture, Gansu Province, China. The terrain of Maqu County is high in the northwest and low in the southeast, with elevations ranging from 3300 m to 4800 m. Maqu has a subfrigid semihumid climate: the cold season is long and cold, and the warm season is short and mild. The average annual temperature and precipitation in the observation area are 2.9 °C and 611.9 mm, respectively. The surface vegetation is mainly low grassland. A total of 20 soil moisture and temperature measurement stations were installed in the Maqu network, and the soil moisture and temperature at depths of 5 cm, 10 cm, 40 cm, and 80 cm were observed at each station. The data collection interval is 60 min. Su [32] provides more detailed information on the Maqu soil moisture monitoring network. We selected 18 soil moisture station measurements from the Maqu network for the soil moisture validation for 2018.

Tianjun Soil Moisture Monitoring Network
The Tianjun network was established in Tianjun County (36°53′-48°39′12″N, 96°49′42″-99°41′48″E), Haixi Mongolian and Tibetan Autonomous Prefecture, Qinghai Province, China. The mean elevation of Tianjun County is more than 4000 m. This region has a plateau continental climate with low temperatures and an uneven precipitation distribution. Alpine meadow is the main land cover type. A total of 58 soil moisture and temperature measurement stations were installed in the Tianjun network. The soil moisture and temperature at depths of 5 cm, 10 cm, and 30 cm were observed at each station. The data collection interval is 30 min. We selected 19 soil moisture station measurements from the Tianjun network for the soil moisture validation for 2020.

Remote Sensing Data
Sentinel-1 is composed of two satellites (A and B), each carrying a C-band synthetic aperture radar that provides continuous images. In this paper, the ground range detected (GRD) products from the interferometric wide swath (IW) mode in the VV and VH polarizations were employed to retrieve the surface soil moisture. We performed the preprocessing steps (updating of orbit metadata, removal of border noise, removal of thermal noise, radiometric calibration, terrain correction, normalization of incident angle, and noise filtering) for Sentinel-1 on the Google Earth Engine (GEE) platform. We used the range-Doppler approach for the geometric terrain correction, and we applied a 7 × 7 Lee filter to remove the noise.

Remote Sens. 2023, 15, 153
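The noise-filtering step can be illustrated with a simplified Lee filter. The sketch below is plain Python written for illustration only and is not the GEE implementation used in this study. It assumes the additive-noise form of the filter, estimates the noise variance from the whole image when none is supplied, and shrinks the window at the image edges. The function name `lee_filter` and its parameters are hypothetical.

```python
from statistics import fmean, pvariance


def lee_filter(image, size=7, noise_var=None):
    """Simplified additive-noise Lee filter on a 2D list of backscatter values.

    Each pixel is replaced by mean + w * (pixel - mean), where the weight
    w = var / (var + noise_var) is computed from the local window.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    if noise_var is None:
        # Assumption: use the global image variance as the noise estimate.
        noise_var = pvariance([v for row in image for v in row])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # The window is clipped at the image borders.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = fmean(window)
            var = pvariance(window, mu=mean)
            total = var + noise_var
            weight = var / total if total > 0 else 0.0
            out[r][c] = mean + weight * (image[r][c] - mean)
    return out
```

In homogeneous areas the local variance is small, so the output approaches the window mean; near strong features the local variance dominates and the original value is largely preserved.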

Advanced Integral Equation Model (AIEM)
Although the IEM can simulate real surface backscattering characteristics within a broad range of ground roughness, its main disadvantages are the dependence on the local incident angle and the inaccurate description of the actual surface roughness. Therefore, Chen proposed the AIEM by modifying the IEM. In this study, we established the surface microwave scattering database with the AIEM. In the AIEM expression for the backscattering coefficient, $pq$ represents the polarization mode; $k_1$ is the free-space wave number; $s$ is the root-mean-square height; $W^{n}(k_{sx}-k_{x}, k_{sy}-k_{y})$ is the Fourier transform of the nth power of the surface correlation function, with $k_z = k\cos\theta_i$, $k_{sz} = k\cos\theta_s$, $k_x = k\sin\theta_i\cos\phi$, $k_{sx} = k\sin\theta_s\cos\phi_s$, $k_y = k\sin\theta_i\sin\phi$, and $k_{sy} = k\sin\theta_s\sin\phi_s$; $\theta_i$ and $\phi$ are the incident angle and incident azimuth, respectively; $\theta_s$ and $\phi_s$ are the scattering angle and scattering azimuth, respectively; and $F_{pq}$ and $f_{pq}$ are functions related to the Fresnel reflectance.
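For orientation, the symbols defined above match the single-scattering series commonly written for the IEM family of models. The block below is a sketch of that commonly cited form, given here as an assumption rather than as the exact expression used in this study. The term $I^{n}_{pq}$ is not defined in the text; in the IEM literature it is built from the Kirchhoff coefficient $f_{pq}$ and the complementary field coefficient $F_{pq}$, and its explicit form is omitted here.

```latex
\sigma^{0}_{pq} = \frac{k_1^{2}}{2}
  \exp\!\left[-s^{2}\left(k_z^{2}+k_{sz}^{2}\right)\right]
  \sum_{n=1}^{\infty} \frac{s^{2n}}{n!}
  \left|I^{n}_{pq}\right|^{2}
  W^{n}\!\left(k_{sx}-k_{x},\; k_{sy}-k_{y}\right)
```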

Machine Learning Algorithms
In this study, four machine learning algorithms, namely the backpropagation neural network (BPNN), support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF), are used to retrieve soil moisture.
The backpropagation neural network (BPNN) is one of the most common neural networks, and it is a multilayer feedforward network that is trained by an error backpropagation algorithm [33]. A complete BPNN consists of three parts: the input layer, hidden layer, and output layer. The input layer receives the external message and passes it to the hidden layer, where the message is transformed. The output layer outputs the result. The error backpropagation process is conducted when the actual output does not match the expected output. The weights of the BPNN are adjusted continuously until the error between the actual output and the desired output is minimized.
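The forward pass and error backpropagation described above can be sketched with a one-hidden-layer network trained by stochastic gradient descent. This is a minimal illustration in plain Python, not the network configuration used in this study: the hidden layer size, tanh activation, learning rate, number of epochs, and the function names `train_bpnn` and `predict_bpnn` are all assumptions.

```python
import math
import random


def train_bpnn(samples, targets, hidden=4, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer regression network with error backpropagation."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    history = []  # mean squared error per epoch
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(samples, targets):
            # Forward pass: input layer -> hidden layer -> output layer.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(hidden)]
            y = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = y - t
            total += err * err
            # Backward pass: propagate the output error to every weight.
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(total / len(samples))
    return (w1, b1, w2, b2), history


def predict_bpnn(model, x):
    """Run the forward pass of a trained network for one input vector."""
    w1, b1, w2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
         for j in range(len(w2))]
    return sum(w2[j] * h[j] for j in range(len(w2))) + b2
```

In practice the inputs (for example, the VV and VH backscattering coefficients) would be scaled to a small range before training so that the tanh units do not saturate.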
The support vector machine (SVM) is a supervised learning approach that researchers commonly employ for classification analyses and regression modeling [34]. The principle is to construct the optimal separating hyperplane in the feature space based on the structural risk minimization principle, which globally optimizes the algorithm and places an upper bound on the expected risk over the entire sample space. Generally, researchers use SVMs to solve problems that are not linearly separable: the linearly inseparable samples of the lower-dimensional input space are mapped to a higher-dimensional feature space by a kernel function. The commonly used kernel functions are the polynomial, Gaussian, and radial basis kernel functions. In this study, we used the radial basis kernel function (RBKF) for the analysis because, according to previous results, it achieves more satisfactory effects than the other kernel functions [35].
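As a small illustration of the radial basis kernel mentioned above, the sketch below evaluates the standard form exp(-gamma * ||x - y||²) and builds the kernel (Gram) matrix that an SVM solver would consume. The width parameter `gamma` and the function names are assumptions for illustration; the kernel settings used in this study are not stated in the text.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis kernel: similarity decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma=1.0):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]
```

A larger `gamma` makes the kernel more local, so each training sample influences a smaller neighborhood of the feature space.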
The K-nearest neighbor (KNN) is a theoretically mature machine learning algorithm [36]. The basic idea of this method is as follows: given a training dataset, the K examples that are closest to a new input example are found in the training data. If the majority of these K examples belong to a certain class, then the input example is also assigned to this class. In addition to classification, we can also use KNN for regression. In the regression process, the K nearest samples of the target sample are found, and the average value of these neighbor samples is assigned to the target sample.
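The regression variant described above can be sketched in a few lines. This is an illustrative plain-Python version that assumes Euclidean distance and an unweighted average; the function name `knn_regress`, the value of K, and the sample values in the usage note are made up for the example.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training samples."""
    # Rank training samples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)
```

For example, with hypothetical (VV, VH) backscatter pairs `[(-12.0, -20.0), (-11.0, -19.0), (-15.0, -24.0), (-16.0, -25.0)]` and soil moisture targets `[0.30, 0.28, 0.10, 0.12]`, a query at `(-11.5, -19.5)` with `k=2` averages the first two targets.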
Random forest (RF) is one of the typical ensemble algorithms. Samples are drawn from the raw data collection using the bootstrap resampling approach, and a decision tree is fitted to each bootstrap sample. Then, the prediction results of the multiple decision trees are combined, and the final prediction is obtained by majority voting for classification or by averaging for regression [37]. We can use the RF algorithm to handle multidimensional information and nonlinear problems without performing feature selection, and it is also robust to noise and less prone to overfitting in practical applications.
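The bootstrap-and-aggregate core of the algorithm can be sketched as follows. This is deliberately not a full random forest: it bags one-split regression trees (stumps) on a single feature and omits the random feature subsetting and deeper trees that a real RF uses. All function names and defaults are assumptions for illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All x values identical: fall back to predicting the sample mean.
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(xs, ys, n_trees=50, seed=0):
    """Fit one stump per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Aggregate by averaging, as is usual for regression ensembles."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)
```

Averaging over trees fitted to different resamples is what reduces the variance of the individual trees.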

Establishment of Surface Microwave Scattering Database with AIEM
The AIEM is regarded as a theoretical model that represents actual ground scattering well. Therefore, we conducted numerical simulations with the AIEM to establish the database of ground microwave scattering. The input parameters of the AIEM were as follows: a soil temperature of 20 °C; a frequency of 5.405 GHz; sand and clay contents of 40% and 10%, respectively; a soil moisture content range from 1% to 40%, with a step of 5%; an incident angle range from 20° to 50°, with a step of 5°; a root-mean-square height range from 0.1 cm to 2.9 cm, with a step of 0.4 cm; a correlation length range from 4 cm to 18 cm, with a step of 2 cm; and an exponential surface autocorrelation function. The AIEM simulation results are shown in Figure 3. According to the results, there is a substantial logarithmic correlation among the backscattering coefficient, soil moisture, and combined roughness. In addition, if the surface roughness is given, then this logarithmic relationship depends only on the incident angle. In the corresponding expression for each polarization pattern, $pq$ is the polarization pattern; $A_{pq}$ is the coefficient that is not related to the surface roughness when the incident angle is known; and $f(s, l)$ is the term for the known surface roughness.
Ground roughness is one of the critical parameters in the process of microwave surface scattering, and it mainly includes two unknown parameters: the correlation length ($l$) and root-mean-square height ($s$). The backscattering coefficient is affected by both $l$ and $s$, and it is difficult to distinguish between their influences on it. Therefore, researchers have proposed a new parameter that combines $l$ and $s$ [38,39] to decrease the error of the soil moisture inversion. Zribi [40] found that the model outputs and backscattering had good consistency under different experimental conditions when two parameters were used: the combined roughness $Z_s$ ($Z_s = s^2/l$) and the soil moisture. According to the result, there was a substantial logarithmic relationship between the backscattering coefficient and $Z_s$ in the VV and HH polarization patterns. In the corresponding expression, $pq$ represents the polarization pattern; $B_{pq}$ is the coefficient that is not related to the soil moisture when the incident angle is known; and $f(m_v)$ is the term for the known soil moisture content.
The relationships among the soil moisture, combined roughness, and backscattering coefficient in the VV and VH polarizations are shown in Figure 4. According to the results, there was a substantial logarithmic correlation among the backscattering coefficient, soil moisture, and combined roughness. If the combined roughness is known, then the backscattering coefficient increases as the soil moisture increases. When the soil moisture reaches 30%, the change becomes stable, and the sensitivity of the backscattering coefficient to the soil moisture decreases. When the soil moisture is known, the backscattering coefficient initially increases and then decreases as the combined roughness increases. Consequently, the backscattering coefficient increases with the increase in the soil moisture content. Moreover, the sensitivity of the backscattering coefficient to the soil moisture gradually decreases with the increase in the combined surface roughness.
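The parameter grid listed at the start of this section can be enumerated as shown below, together with the combined roughness Zs = s²/l for each roughness pair. This sketch only builds the inputs; it does not run the AIEM itself. It takes the stated ranges literally, so a soil moisture grid that starts at 1% with a 5% step ends at 36%; whether the simulation also included 40% is not stated. Variable names are hypothetical.

```python
from itertools import product

# Stated ranges, taken literally (start, stop inclusive, step).
soil_moisture_pct = list(range(1, 41, 5))            # 1, 6, ..., 36
incident_angle_deg = list(range(20, 51, 5))          # 20, 25, ..., 50
rms_height_cm = [mm / 10 for mm in range(1, 30, 4)]  # 0.1, 0.5, ..., 2.9
corr_length_cm = list(range(4, 19, 2))               # 4, 6, ..., 18


def combined_roughness(s_cm, l_cm):
    """Combined roughness Zs = s^2 / l (in cm when s and l are in cm)."""
    return s_cm ** 2 / l_cm


# One dictionary per simulated configuration.
grid = [
    {
        "mv_pct": mv,
        "theta_deg": theta,
        "s_cm": s,
        "l_cm": l,
        "zs_cm": combined_roughness(s, l),
    }
    for mv, theta, s, l in product(
        soil_moisture_pct, incident_angle_deg, rms_height_cm, corr_length_cm
    )
]
```

Building the grid with integer steps (millimetres for the root-mean-square height) avoids the accumulation of floating-point error that repeated addition of 0.4 would introduce.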

Construction of Empirical Model
Overall, the backscattering coefficients in the VV and VH polarizations are related to the soil moisture and combined roughness through Equations (5) and (6), where $\sigma$ represents the backscattering coefficient for the given polarization, and $A(\theta)$, $B(\theta)$, and $C(\theta)$ are coefficients that depend only on the incident angle (we obtained their values from the simulations in the AIEM database). Although surface roughness is a critical parameter in soil moisture retrieval, it is difficult to measure the ground roughness in actual applications. Moreover, the measurement accuracy of the surface roughness cannot be ensured. Therefore, if the surface roughness is replaced by other known parameters when establishing the empirical model, then the influence of this critical parameter on the soil moisture retrieval is substantially reduced. In other words, the precision of the soil moisture inversion is improved by reducing the number of unknown parameters or inaccurate factors in the model. According to the AIEM simulation results, the relationships between the backscattering coefficient, soil moisture, and surface roughness in the VV and VH polarizations are given by Equations (5) and (6). When the backscattering coefficients of the VV and VH polarizations are known, the surface roughness ($Z_s$) can be eliminated by combining Equations (5) and (6), which yields the final empirical model of the soil moisture retrieval (Equation (7)). In this model, $m_v$ is the soil moisture content; $\sigma_{VV}$ and $\sigma_{VH}$ are the backscattering coefficients of the VV and VH polarizations, respectively; and $A_{VVVH}$, $B_{VVVH}$, and $C_{VVVH}$ are the coefficients that are fitted to the modeling data.
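The elimination step can be written out explicitly. The block below is a sketch under an assumed form of Equations (5) and (6): a logarithmic soil moisture term, a logarithmic combined roughness term, and an angle-dependent constant, as the surrounding text describes. The intermediate symbol $D$ is introduced here for illustration only, and the published form of Equation (7) may be arranged differently.

```latex
% Assumed forms of Equations (5) and (6)
\sigma_{VV} = A_{VV}(\theta)\ln m_v + B_{VV}(\theta)\ln Z_s + C_{VV}(\theta)
\sigma_{VH} = A_{VH}(\theta)\ln m_v + B_{VH}(\theta)\ln Z_s + C_{VH}(\theta)

% Multiply the first by B_{VH}, the second by B_{VV}, and subtract to cancel \ln Z_s
B_{VH}\,\sigma_{VV} - B_{VV}\,\sigma_{VH}
  = \left(A_{VV}B_{VH} - A_{VH}B_{VV}\right)\ln m_v
  + \left(B_{VH}C_{VV} - B_{VV}C_{VH}\right)

% Solve for \ln m_v, with D = A_{VV}B_{VH} - A_{VH}B_{VV} \neq 0
\ln m_v = A_{VVVH}\,\sigma_{VV} + B_{VVVH}\,\sigma_{VH} + C_{VVVH},
\qquad
A_{VVVH} = \frac{B_{VH}}{D},\quad
B_{VVVH} = -\frac{B_{VV}}{D},\quad
C_{VVVH} = -\frac{B_{VH}C_{VV} - B_{VV}C_{VH}}{D}
```

Under this assumption the retrieval model is linear in the two backscattering coefficients once the soil moisture is log-transformed, which is why three coefficients suffice and why they can be fitted directly to paired observations.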

Soil Moisture Retrieval Using the Empirical Model
Although we collected soil moisture measurements from 2015 to 2021 at the Naqu station, the Sentinel-1 images from 2015 to 2016 at the Naqu station were missing. We therefore employed the Sentinel-1 synthetic aperture radar data from 2017 to 2019 and the soil moisture measurements from the corresponding period to establish the soil moisture retrieval models. In total, we used 240 Sentinel-1 images of the VV and VH polarizations to construct the soil moisture retrieval models, and we then applied the models to the 2020 Sentinel-1 images to obtain the soil moisture results for the Naqu station for 2020. In Section 3.2, we proposed the empirical models for the ascending and descending orbits based on the soil moisture measurement data at a depth of 5 cm and the backscattering coefficients of the VV and VH polarizations from the Sentinel-1 images from 2017 to 2019 at the Naqu station. We used the least-squares method to calculate the AVVVH, BVVVH, and CVVVH values. The detailed expressions are given in Equations (8) and (9), respectively. The backscattering coefficients of the VV and VH polarizations from the 2020 Sentinel-1 images of the Naqu station are input into the empirical models of the ascending and descending orbits to retrieve the soil moisture content. The inversion results of the soil moisture for the ascending and descending orbits at the Naqu station for 2020 are presented in Figures 5 and 6, respectively, and the comparisons between the measured and retrieved soil moisture values for the ascending and descending orbits are shown in Figure 7. According to the results, we can use the empirical models to retrieve the surface soil moisture, with an R² of 0.609, RMSE of 0.08, and MAE of 0.064 for the ascending orbit model, and an R² of 0.554, RMSE of 0.086, and MAE of 0.071 for the descending orbit model.
When the soil moisture is higher than 0.3 m³/m³, the empirical models markedly underestimate the soil moisture relative to the measured values. The simulation results of the ascending orbit are better than those of the descending orbit, which is consistent with the results of Dabrowska-Zielinska [41], who retrieved the soil moisture from Sentinel-1 imagery over wetlands and found that the retrieval achieved a satisfactory performance when data from the ascending orbit were used. The regions with high soil moisture are mainly distributed in mountainous areas, and the regions with low soil moisture are distributed among the flat terrain areas. The soil moisture contents in June, July, August, and September are substantially higher than in other months because the climate of the Naqu network is mainly influenced by the South Asian monsoon, and the precipitation falls between June and September.
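The fitting and validation steps of this section can be sketched in a few lines. The example below is only a sketch under stated assumptions: it assumes an empirical model of the form ln(mv) = A·σVV + B·σVH + C, it uses synthetic noise-free samples generated from made-up coefficients rather than the Naqu data, and it takes R² to be the coefficient of determination. The function names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_empirical(sigma_vv, sigma_vh, mv):
    """Least-squares fit of ln(mv) = A*sigma_vv + B*sigma_vh + C (assumed form)."""
    rows = [(vv, vh, 1.0) for vv, vh in zip(sigma_vv, sigma_vh)]
    y = [math.log(v) for v in mv]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)

def predict(coeffs, sigma_vv, sigma_vh):
    """Evaluate the assumed empirical model for paired VV and VH inputs."""
    a, b, c = coeffs
    return [math.exp(a * vv + b * vh + c) for vv, vh in zip(sigma_vv, sigma_vh)]

def metrics(observed, estimated):
    """Return (R2, RMSE, MAE); R2 is the coefficient of determination."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - e) for o, e in zip(observed, estimated)) / n
    return 1.0 - ss_res / ss_tot, rmse, mae

# Synthetic, noise-free samples generated from made-up coefficients (inputs in dB).
true_coeffs = (0.12, 0.05, 1.0)
sigma_vv = [-14.0, -12.5, -11.0, -10.0, -9.0, -8.5]
sigma_vh = [-22.0, -21.0, -19.5, -19.0, -17.5, -17.0]
mv_obs = predict(true_coeffs, sigma_vv, sigma_vh)

coeffs = fit_empirical(sigma_vv, sigma_vh, mv_obs)
r2, rmse, mae = metrics(mv_obs, predict(coeffs, sigma_vv, sigma_vh))
print("A, B, C =", [round(c, 4) for c in coeffs])
print(f"R2 = {r2:.3f}, RMSE = {rmse:.4f}, MAE = {mae:.4f}")
```

Because the synthetic samples contain no noise, the fit recovers the generating coefficients; with real observations the same three metrics would quantify the residual error, as in the values reported for the ascending and descending orbit models.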

Soil Moisture Retrieval Using Machine Learning Algorithms
To further improve the accuracy of the soil moisture retrieval, the machine learning algorithms of SVM, BPNN, KNN, and RF are introduced to invert the surface soil moisture of the Naqu network with the AIEM. In the machine learning modeling, the physically meaningful radar parameters in the AIEM are used as the inputs of the machine learning algorithms to establish the soil moisture retrieval model. The backscattering coefficients of the VV and VH polarizations and the incidence angle are the independent variables, and the soil moisture measurement data are the dependent variable. Therefore, the measurements (soil moisture, backscattering coefficients of the VV and VH polarizations, and incidence angle) from the Naqu station for 2017-2019 are employed as the ensemble of training and testing samples, with 70% of the samples used for training and 30% used for testing. The performances of the machine learning models for the ascending and descending orbits are presented in Table 1. According to the results, the RF performance is better than those of the other machine learning algorithms, with an R² of 0.753, RMSE of 0.045, and MAE of 0.034 in the ascending orbit, and an R² of 0.671, RMSE of 0.049, and MAE of 0.038 in the descending orbit. In addition, the accuracies of the machine learning approaches in the ascending orbit are better than those in the descending orbit. For the model application, the surface soil moisture contents for the ascending and descending orbits for 2020 from the Naqu network are retrieved by using the different machine learning algorithms. The inversion results of the soil moisture retrieval with the machine learning algorithms for the ascending and descending orbits of the Naqu station for 2020 are presented in Figure 8, and the comparisons of the soil moisture retrieval of the different models for the ascending and descending orbits are presented in Figure 9.
The results indicate that the performances of the machine learning algorithms are substantially superior to that of the empirical model. For the ascending orbit, the R² values of the BPNN, KNN, SVM, RF, and EM (empirical) models are 0.615, 0.666, 0.626, 0.714, and 0.609, respectively, and the RMSE values of these models are 0.076, 0.070, 0.078, 0.065, and 0.080, respectively. For the descending orbit, the R² values of the BPNN, KNN, SVM, RF, and EM models are 0.590, 0.612, 0.588, 0.677, and 0.554, respectively, and the RMSE values of these models are 0.080, 0.078, 0.083, 0.072, and 0.086, respectively. According to these results, the combination of the AIEM and machine learning algorithms can further enhance the precision of soil moisture retrieval. The inversion accuracies of the soil moisture with the different machine learning algorithms in the ascending orbit are also better than those in the descending orbit. In addition, the accuracy of the RF algorithm is better than those of the BPNN, SVM, and KNN models.
Although the soil moisture inversion results with the RF in the Naqu network indicate a satisfactory performance, we also retrieve the soil moisture contents of the Maqu network in 2018 and of the Tianjun network in 2020 using the RF algorithm to further validate the precision of the soil moisture retrieval. Figures 10 and 11 present the soil moisture inversion results for the ascending and descending orbits of the Maqu network for 2018, respectively. The soil moisture inversion results for the ascending and descending orbits of the Tianjun network for 2020 are shown in Figures 12 and 13, respectively. The validations of the soil moisture for the ascending and descending orbits in the Maqu and Tianjun networks are presented in Figures 14 and 15, respectively. The results indicate that the RF algorithm achieves a satisfactory performance for the ascending and descending orbits in both the Maqu and Tianjun networks. In the Maqu network, the R², RMSE, and MAE values for the ascending orbit are 0.696, 0.062, and 0.052, respectively, and the values for the descending orbit are 0.648, 0.075, and 0.064, respectively. In the Tianjun network, the R², RMSE, and MAE values for the ascending orbit are 0.709, 0.069, and 0.057, respectively, and the values for the descending orbit are 0.638, 0.074, and 0.063, respectively. Moreover, the inversion accuracies of the soil moisture for the ascending orbit are also higher than those for the descending orbit for both the Maqu and Tianjun networks.
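The training setup used in this section (VV and VH backscattering coefficients and incidence angle as inputs, soil moisture as the target, and a 70%/30% split into training and testing samples) can be sketched with the simplest of the four algorithms, KNN. This is a minimal standard-library sketch, not the implementation used in the study: the samples are synthetic, the relation between backscatter and moisture is made up, the features are left unscaled for brevity, and the function names and the choice of k = 3 are assumptions.

```python
import math
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples reproducibly and split them into training and testing sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def knn_predict(train, features, k=3):
    """Average the targets of the k training samples closest to `features`."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], features))[:k]
    return sum(target for _, target in nearest) / len(nearest)

def rmse(pairs):
    """Root-mean-square error over (observed, estimated) pairs."""
    return math.sqrt(sum((obs - est) ** 2 for obs, est in pairs) / len(pairs))

# Synthetic samples: ((sigma_vv [dB], sigma_vh [dB], incidence angle [deg]), mv).
rng = random.Random(42)
samples = []
for _ in range(200):
    vv = rng.uniform(-16.0, -8.0)
    vh = vv - rng.uniform(6.0, 9.0)
    angle = rng.uniform(30.0, 45.0)
    mv = 0.05 + 0.04 * (vv + 16.0)  # made-up monotonic relation, not from the study
    samples.append(((vv, vh, angle), mv))

train, test = train_test_split(samples)
pairs = [(mv, knn_predict(train, feats)) for feats, mv in test]
print(f"train = {len(train)}, test = {len(test)}, RMSE = {rmse(pairs):.3f}")
```

Evaluating only on the held-out 30% mirrors the way the testing accuracies are reported separately from the later application to the 2020 images; a random forest or any of the other algorithms could be substituted for `knn_predict` without changing the split or the error computation.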

Discussion
The surface roughness is an important parameter in the soil moisture inversion process. The measurement of the surface roughness is difficult in practical experiments for natural and man-made reasons, and the measured values can differ substantially from the actual conditions. Reducing the input of unknown or unobservable parameters is one of the major methods for optimizing the model. Therefore, the surface roughness is replaced by other known parameters, and the empirical models for the ascending and descending orbits are proposed by combining the equations of the VV and VH polarizations based on the AIEM to decrease the impact of the surface roughness. Four machine learning algorithms (BPNN, SVM, KNN, and RF) are used to further improve the soil moisture retrieval precision in the Naqu network; these algorithms are commonly applied but have different learning strategies. To further verify the model accuracy, the surface soil moisture for the ascending and descending orbits of the Maqu network for 2018 and the Tianjun network for 2020 is retrieved using the RF algorithm. We found that the retrieval results of these machine learning algorithms are more consistent with the measurements than those of the empirical model. However, due to the different learning schemes, there are still some minor differences in the results of the four algorithms. According to the results of this paper, the RF performance is superior to the other machine learning approaches because the RF model obtains independent regression trees by randomly sampling the training data [37]. Therefore, the model is robust to noise and less prone to overfitting in practical applications.
Chen [42] estimated the soil moisture of winter wheat farmlands during the vegetative season based on the machine learning algorithms of support vector regression, random forest (RF), and gradient boosting regression tree, and the results also indicated that the performance of the RF algorithm is better than those of the other algorithms. Twelve advanced statistical and machine learning algorithms were used to estimate soil moisture from Sentinel-1 data [43], and the result indicated that the RF algorithm has a satisfactory performance compared with those of the other models.
The AIEM is a forward model that is used to calculate the backscattering coefficient of bare ground with high estimation precision and low computational cost. In this study, we only considered single scattering and ignored multiple scattering, which is one of the main reasons for the errors in the AIEM simulation process. Zeng [44] compared numerical simulations and experimental measurements of scattering with the AIEM, and the results also indicated that multiple scattering has a certain effect on backscattering and that its influence on the HH polarization is higher than that on the VV polarization. Although the penetrability of the C band does not lead to intensive volume scattering, its influence on the actual scattering process cannot be neglected.
The vegetation water content is a substantial parameter that affects the soil moisture retrieval accuracy. In this study, we selected the Naqu, Maqu, and Tianjun soil moisture monitoring networks as the research areas to achieve surface soil moisture inversion using Sentinel-1 data. Sentinel-1 carries a C-band (5.405 GHz) synthetic aperture radar that provides dual-polarization data, and its signal can penetrate the sparse and low vegetation on the ground surface. Alpine meadow is the main land cover type in the research areas due to the climate, so the soil moisture inversion result is less affected by the surface vegetation. However, there is still interference from the vegetation water content. Moreover, plant growth is a dynamic process, and the structure and morphology of the plants change significantly over time; we could not use empirical constants to capture these dynamic changes in the vegetation information, which led to some uncertainty in the estimated results.
Although the influence of the surface roughness in this study is reduced by combining the empirical equations of the soil moisture and the VV and VH polarizations, the surface roughness is still a key factor in the soil moisture retrieval process. The issue of surface roughness has received broad attention in recent years, and researchers have proposed relevant models [38-40]. However, uncertainties still exist in the research on the roughness parameterization scheme. The main reason is that the different models are usually developed by using different experimental data; in other words, the soil type, soil texture, moisture content, and roughness conditions in each model are different, as are the assumptions of the model development (for example, the method selected to calculate the soil dielectric constant and the soil effective temperature). Overall, every roughness model has its comparative advantages and constraints, and no model can perform well in all circumstances. Further research on roughness parameterization schemes that can be applied to complex soil conditions with different roughnesses, moistures, soil types, and correlation functions is essential.
The radar response to the soil moisture content is closely related to critical parameters, such as surface roughness, microwave frequency, and incident angle. Ulaby [45] found that the radar response seems to be linear within a range of 15-30% moisture content for all angles, frequencies, polarizations, and surface conditions. Theoretically, the Sentinel-1 images could therefore be employed to invert the soil moisture content well over the range of 15-30% moisture content. When the soil moisture content is higher than 0.3 m³/m³, the empirical model markedly underestimates the soil moisture compared to the observation data, a result that Bruckler [46] also confirms. Although machine learning algorithms can improve the inversion results, how to further improve the inversion accuracy for soils with high water content is the next issue to be explored.

Conclusions
We select the Naqu, Maqu, and Tianjun soil moisture monitoring networks on the QTP as the research areas. The database of the surface microwave scattering is obtained using the AIEM to analyze the response of the radar signal to the surface parameters. The soil moisture retrieval models of the empirical and machine learning algorithms for the ascending and descending orbits are proposed by using the Sentinel-1 data and soil moisture measurements. Finally, the soil moisture retrieval accuracies of the different models are validated in these research areas.
The major conclusions of this study are summarized as follows:
(1) The empirical models for the ascending and descending orbits can estimate the surface soil moisture in the Naqu network, but the soil moisture content is markedly underestimated by the empirical models when the soil moisture is high. The simulation results of the ascending orbit are better than those of the descending orbit.
(2) The combination of the AIEM and machine learning algorithms can further enhance soil moisture inversion precision. The performances of the machine learning algorithms are substantially superior to that of the empirical model, and the accuracy of the RF model is higher than those of the BPNN, SVM, and KNN models. The inversion accuracies of the soil moisture with the different machine learning algorithms in the ascending orbit are also better than those in the descending orbit.
(3) The RF algorithm achieves a satisfactory performance for the ascending and descending orbits for both the Maqu and Tianjun networks. The rationality and accuracy of the RF algorithm at different locations and times on the QTP are further verified.
Author Contributions: The specific contributions of each author in the paper are listed. L.D. processed data and wrote the manuscript; W.W., R.J. and F.X. contributed important ideas and considerations; R.J. and Y.Z. carried out the soil moisture observation experiment. All authors have read and agreed to the published version of the manuscript.