1. Introduction
Soil moisture is an extremely important parameter influencing many hydrological and climatic processes. It influences infiltration rates, surface runoff, flooding, and evapotranspiration [1]. It is also a key factor in agriculture, determining crop yields and thus influencing food security.
The widespread importance of soil moisture makes it crucial to model this variable for large areas with high spatial and temporal resolution. There are a number of remote sensing methods for measuring soil moisture [2], primarily using passive radiometers [3,4] and active SAR sensors [5,6,7]. However, there are also other solutions based, e.g., on physics-informed machine learning (ML) [8]. None of these methods alone is currently sufficient to provide soil moisture information for very large areas with good spatial resolution and satisfactory accuracy. Methods based on radiometers and scatterometers, such as SMOS [4,9,10] or ASCAT [3], are very sensitive to soil moisture and can provide information even on a global scale, but are characterized by poor spatial resolution, on the order of kilometers. Soil moisture obtained by these methods is a very important element of hydrological and climate models on a national, continental, and global scale [11,12,13]; however, it is completely insufficient for many other purposes, such as agriculture, where accurate soil moisture with good spatial resolution at field scale is extremely important.
This domain is primarily filled by methods based on active SAR data. However, they only provide information from the thin top layer of the soil. In the case of C-band data, this is only 1–2 cm [14]. Problems with measuring soil moisture using SAR data include sensitivity of the signal to soil roughness and texture [15,16,17,18], as well as to vegetation covering the soil [19,20,21,22]. Current knowledge about these parameters is often insufficient. Because of their high spatial variability, SAR-based models are in many cases valid only locally or only for a certain range of soil parameters. There are methods that attempt to address all these challenges, such as the SSM Copernicus product [23]. It combines data from the active Sentinel-1 SAR sensor and the ASCAT scatterometer, resulting in a relatively high spatial resolution of 1 km. However, this is still far from the 10-m resolution of Sentinel-1 SAR data. Another example of a globally available soil moisture model with 1 km spatial resolution is the solution based on physics-informed machine learning [8]. The best results in combining high spatial resolution of soil moisture products with large coverage are presented in [24]. This solution combines high-resolution land surface modeling, radiative transfer modeling, machine learning, SMAP satellite microwave data, and in-situ observations. The resulting soil moisture dataset with 30 m resolution is available for the conterminous United States.
Many algorithms relating SAR backscattering to soil moisture consider soil roughness as an additional parameter [5,25]. These models follow three main approaches: physical, empirical, and semi-empirical. The Oh [26] and Dubois [27] models are semi-empirical, while the integral equation model (IEM) [28] is a physical model. Soil roughness itself is a rather difficult parameter to measure in the field. It can be measured with a mesh board profiler, laser profilometer, laser scanner, pin profiler, needle, or 3D photogrammetry [29,30]. However, obtaining a sufficiently precise soil micro-relief model using these methods is very time consuming and labor intensive. Soil roughness is also difficult to parameterize using models. The IEM uses relative height variability and correlation length to parameterize soil roughness. There is also a semi-empirical calibration of the IEM, which allows bare agricultural soils to be characterized solely by RMS height and soil moisture. A very detailed analysis of previous approaches to modeling soil roughness from SAR data was carried out by Lee [31]. Studies have shown that accounting for soil roughness can reduce uncertainty in soil moisture modeling; however, these models often have very limited applicability because they were validated only for a limited roughness range (usually not exceeding a few cm) or rely on very local studies. In many of these works, soil moisture is modeled using machine learning [32,33,34,35,36].
Most SAR soil moisture studies are based on the modeling of the SAR backscattering coefficient; however, there are also solutions that take into account polarimetric signal decomposition methods [19,28,37,38,39]. The aim of these studies is to determine the scattering mechanisms dominating in the radar signal and thus provide additional information on the structure of the objects, including their roughness [40].
Relatively few soil moisture modeling algorithms consider soil texture as an additional parameter. It has a significant impact on soil water retention and, therefore, on the ability of the SAR signal to penetrate deeper into the soil. However, there are a number of studies that examine the relationship between soil texture and microwave radiation [16,17,41,42,43] or microwave and optical radiation [44,45]. Some of these also incorporate soil texture into moisture modeling [18,46,47,48,49].
The aim of this study is to develop a soil moisture model for bare soils from Sentinel-1 C-band SAR data that would be characterized by high spatial resolution at field scale and would be universal enough to be applicable to large areas characterized by various soil types and various roughness, from relatively smooth to very rough, e.g., after ploughing. This goal can be achieved using Sentinel-1 SAR data by incorporating soil texture into soil moisture modeling. The solution proposed in this study utilizes publicly available soil maps at a scale of 1:5000 for the entire territory of Poland, containing information on soil species, which can be used to approximate soil texture. Analyses were conducted on nearly 600 points with widely varying soil textures, from sand to clay, for which soil moisture measurements were taken in the field. Modeling was performed using three independent machine learning methods: the random forest (RF) regressor, support vector regressor (SVR), and XGBoost (XGB) regressor. All three methods yielded similar results and demonstrated significant improvements in the accuracy of determining soil moisture, using only approximate soil texture information without the need to consider soil roughness in the modeling. This allows avoiding the laborious and time-consuming measurement of soil roughness or its determination from SAR data, which in many cases does not produce the expected results. The advantages of the proposed solution include its simplicity and ease of inversion for many areas for which soil maps containing approximate texture information are available, as well as good accuracy of results even for soils characterized by very high roughness. So far, the model has not been tested in areas covered with vegetation. This study takes into account only the total content of silt and clay fractions (particles < 0.02 mm) in soils. 
The work also analyzes whether Sentinel-1 dual polarimetric decompositions can be helpful in determining soil roughness and texture.
3. Results
3.1. Analysis of Soil Roughness Using Polarimetric Decomposition
For soil roughness analysis, the fields characterized by visually differentiated soil roughness and similar soil moisture measured in the field (19–25%) were selected to minimize the impact of this factor on radar penetration depth, backscattering coefficient, and polarimetric signatures. Because soil roughness was assessed only visually and divided into three classes, this study only provides a rough analysis of the correspondence of soil surface roughness with polarimetric signal decomposition channels. A more detailed analysis of relations between polarimetric channels and soil roughness using precise soil DTM derived from drones, taking into account also soil texture, is being prepared as a separate publication.
Visual comparison of soil roughness, based on images taken during field measurements, with polarimetric signal decomposition products revealed a lack of correspondence between soil surface roughness and polarimetric signal decomposition channels. In many cases, even very rough soil surfaces corresponded to polarimetric channel values characteristic of smooth surfaces (Bragg surfaces). Soils with relatively smooth surfaces could also be rough to radar (high volume scattering values). Examples of such situations are shown in Figure 5.
Analysis of these inconsistencies, together with the knowledge about the study area acquired during field measurements and soil maps, shows that soils that were heavily loosened (e.g., prepared for sowing) or characterized by a lower silt and clay fraction, were considered rough by radar, as expressed by an increase of volume scattering. This seems to indicate that radar soil roughness, expressed through polarimetric channels, is related more to the ability of microwave radiation to penetrate below the soil surface, than to the roughness of the soil surface itself. These are preliminary conclusions resulting from the initial data analysis and require further investigation, based on additional data on surface roughness, obtained, for example, from high resolution DTM from drones, and a more detailed analysis of soil parameters such as soil structure and texture, which is beyond the scope of this study.
3.2. Modeling Soil Moisture Without Soil Parameters
Figure 6 shows the results of soil moisture modeling using only the backscatter coefficient values. The best results were achieved by considering only the VV and VH channels and the local incidence angle, without the SPAN channel and the VH/VV ratio. The results are relatively poor. Clearly, there is an overestimation of values for low soil moisture and an underestimation for high soil moisture. The accuracy of moisture determination decreases with increasing soil moisture, which is likely due to, among other factors, the significantly smaller number of measurements with higher soil moisture. Due to the frequent droughts in Poland in recent years, such data are increasingly difficult to obtain. The best accuracy of soil moisture determination compared to field measurements was achieved using support vector regression: RMSE 6.65% at R2 = 0.49.
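The input features named above can be illustrated with a minimal sketch. It assumes that backscatter values are available as linear power, that the VH/VV ratio is expressed in dB, and that SPAN denotes the total power of both channels; the function names and the feature layout are hypothetical and are not taken from the study.

```python
import math


def to_db(linear):
    """Convert a linear-power backscatter value to decibels."""
    if linear <= 0:
        raise ValueError("backscatter must be positive to convert to dB")
    return 10.0 * math.log10(linear)


def backscatter_features(vv_lin, vh_lin, incidence_deg):
    """Build a feature dict for one sample point (hypothetical layout)."""
    vv_db = to_db(vv_lin)
    vh_db = to_db(vh_lin)
    return {
        "VV_dB": vv_db,
        "VH_dB": vh_db,
        # A ratio of linear powers becomes a difference in dB.
        "VH_VV_ratio_dB": vh_db - vv_db,
        # Assumption: SPAN is the total power of the two channels.
        "SPAN_dB": to_db(vv_lin + vh_lin),
        "incidence_deg": incidence_deg,
    }


features = backscatter_features(vv_lin=0.1, vh_lin=0.01, incidence_deg=38.5)
```

Because the ratio and SPAN are deterministic functions of VV and VH, they carry no independent information, which may be one reason why dropping them did not degrade the models.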
Similar results were achieved when dual-polarimetric signal decomposition was added to the modeling (Figure 7). In this case, the best results were achieved using C11, C22 matrix elements and Volume_g, Surface_r, Ratio_b channels from the dual-pol model-based decomposition and projected local incidence angle. The best accuracy of soil moisture determination compared to field measurements was also achieved using support vector regression (RMSE 6.71% at R2 = 0.486). The overestimation of low values and underestimation of high values is slightly greater when using polarimetry.
3.3. Modeling Soil Moisture with Soil Parameters
By far the greatest improvement occurred after including soil texture, expressed as the percentage of silt and clay (particles < 0.02 mm) in the soil, in the model. The modeling results are presented in Figure 8 for backscattering data and Figure 9 for polarimetric data.
It can be clearly seen that soil texture significantly improved modeling accuracy. The results obtained with and without polarimetric channels are similar, with the backscattering coefficient alone performing slightly better. The best accuracy, obtained using XGBoost, was an RMSE of 5.27% with R2 = 0.69 for backscattering channels and 5.36% with R2 = 0.67 for polarimetric channels.
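The two accuracy figures quoted throughout this section can be computed as in the sketch below. It assumes that R2 denotes the coefficient of determination (the squared Pearson correlation is another common reading, and the text does not say which was used); the sample values are invented for illustration only.

```python
import math


def rmse(measured, predicted):
    """Root mean square error, in the units of the inputs."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Invented soil moisture values in percent, not data from the study.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```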
Finally, a correlation analysis was conducted between the silt and clay content for selected points characterized by small diversity of soil moisture (19% < SM < 25%) and products of polarimetric decompositions. The analyses performed did not show any significant correlations between the content of silt and clay particles and any of the dual-polarimetric products.
3.4. RF Feature Importances
Feature importances for the RF regressor for models trained on the backscattering coefficient dataset are gathered in Table 3. VV_dB was the strongest predictor for both tested datasets, although its importance decreased after the inclusion of silt and clay content.
Table 4 presents feature importances for the RF regressor for models trained on the polarimetric channels dataset. Without silt and clay content, the dominant features were C11_dB, Surface_r_MB, and Volume_g_MB. The addition of the assumed silt and clay content uniformly reduced the importance of the polarimetric features, while silt and clay content became the strongest predictor.
3.5. Residual Analysis
Residual analysis was performed for ML models trained on datasets with soil parameters included. Residuals were calculated as the difference between the measured and predicted soil moisture. Figure 10 shows histograms of residual values for models trained on backscatter coefficient values.
In all cases, the distributions are centered around zero, but slightly shifted to the left. The spread of values for RF and SVR is slightly wider than for XGB. Occasional large under- and overpredictions are present in tails.
Histograms of residual values for regressors trained on the polarimetric channels data are shown in Figure 11.
For all regressors, the residual distributions are centered close to zero, but in general, the spread is wider compared with models trained on backscattering coefficient data. SVR shows the broadest distribution, with more frequent large residuals present on both tails. The narrowest and the most symmetrical distribution was obtained for the XGB regressor.
Table 5 presents residual statistics for the different regressors and input features.
In all cases, the mean residuals are negative and close to zero. Because residuals are defined as measured minus predicted values, this indicates a slight tendency to overestimate soil moisture, but without a substantial systematic bias. In all cases, MAE was lower than RMSE, which shows that some relatively large errors were present. A positive skew in all cases except SVR and XGB trained on backscattering coefficient data shows that underpredictions are typically larger than overpredictions. Overall, all models performed well, but the most accurate predictions (lowest MAE and RMSE for both sets of input features) were obtained using the XGBoost regressor.
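The residual statistics discussed above can be reproduced with a short sketch. It assumes the residual convention stated earlier (measured minus predicted) and uses the population form of skewness; the exact estimator used in the study is not given, and the sample values below are invented.

```python
import math


def residual_stats(measured, predicted):
    """Mean, MAE, RMSE and skewness of residuals (measured - predicted)."""
    # With this convention a negative residual means that the prediction
    # exceeds the measurement.
    res = [m - p for m, p in zip(measured, predicted)]
    n = len(res)
    mean = sum(res) / n
    mae = sum(abs(r) for r in res) / n
    rmse = math.sqrt(sum(r * r for r in res) / n)
    # Population skewness: third central moment over variance^(3/2).
    m2 = sum((r - mean) ** 2 for r in res) / n
    m3 = sum((r - mean) ** 3 for r in res) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "mae": mae, "rmse": rmse, "skew": skew}


# Invented values: four small negative residuals, one larger positive one.
measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 22.0, 32.0, 42.0, 47.0]
stats = residual_stats(measured, predicted)
```

In this toy case the mean residual is negative while the skew is positive, and MAE is below RMSE because one residual is larger in magnitude than the rest.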
4. Discussion
The results of the study show that soil surface roughness cannot be simply related to “radar” terrain roughness expressed using polarimetric channels. Dual-polarimetric products rather reflect “radar soil roughness”, which is not identical to surface roughness because microwave radiation can penetrate into the soil, especially in highly loosened soils with low moisture content. In such cases, volume scattering increases due to the penetration of microwave radiation below the soil surface, which can occur in both smooth and rough soil surfaces. These results do not, of course, mean that soil surface roughness is unimportant in modeling soil moisture from SAR data. Many authors point to improved accuracy in soil moisture modeling when roughness is taken into account for various radar wavelengths [5,25,27,28]. It should also be noted that the present study covers a much wider range of soil roughness than the roughness limits that form the boundary conditions for models described in the literature [21].
Visual analysis of the polarimetric decomposition products also revealed that there can be quite significant differences within a single field, e.g., in the Alpha parameter from the H-Alpha decomposition (Figure 5). However, analysis of soil moisture differences from available points for such fields and visual assessment of the roughness of these fields during field campaigns do not allow clear conclusions to be drawn about the causes of these differences. This is likely a result of various soil parameters, which, in addition to moisture, texture, and degree of loosening, probably also include differences in organic matter content and local differences in soil structure. Further detailed study is required based on precise field measurements of soil parameters and surface roughness using high-resolution DSM.
The results of the machine learning modeling also showed that, in the absence of knowledge about soil parameters, the availability of dual-polarization polarimetric channels cannot improve the accuracy of soil moisture determinations. It should be noted, however, that in these analyses, a relatively small number of points were available for wet soils compared to soils with low and medium moisture content. This is undoubtedly a significant factor, even though during the random forest analysis, points were weighted based on their number in specific moisture ranges. Significantly increasing the number of measurements from soils with high moisture content would be very beneficial for these analyses.
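The weighting of points by their number in specific moisture ranges can be sketched as follows. The exact scheme used in the study is not described, so this example assumes inverse-frequency weights per moisture bin; the bin edges and sample values are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def moisture_bin_weights(moisture, edges):
    """Return one weight per sample, inversely proportional to bin size.

    `edges` are ascending bin boundaries (hypothetical values). Weights are
    scaled so that they sum to the number of samples.
    """
    bins = [bisect_right(edges, m) for m in moisture]
    counts = Counter(bins)
    n_samples = len(moisture)
    n_bins = len(counts)
    return [n_samples / (n_bins * counts[b]) for b in bins]


# Invented soil moisture values in percent; wet samples are rare.
moisture = [5.0, 8.0, 12.0, 14.0, 18.0, 22.0, 35.0]
weights = moisture_bin_weights(moisture, edges=[10.0, 20.0, 30.0])
```

Weights of this kind could be passed, for example, as the `sample_weight` argument of the `fit` method of scikit-learn's `RandomForestRegressor`, so that sparsely sampled wet soils contribute as much to training as the more numerous dry and medium-moisture points.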
The fact that good soil moisture estimation can be achieved using texture rather than roughness is crucial for developing soil moisture models, as surface roughness can vary over time due to agricultural practices. Soil texture is a much more stable property over time, and as these studies demonstrate, even relatively imprecise knowledge of the sum of silt and clay content alone can significantly improve the results. It can be clearly seen that radar data alone are insufficient to accurately determine soil moisture with good spatial resolution for areas characterized by high environmental variability. This is because soil moisture is only one of many factors influencing the radar signal. Adding additional input parameters to the model, such as soil texture, significantly improves the accuracy of the obtained results. There appears to be potential for further improvement of these results by incorporating more precise texture information from field measurements, which separately consider clay, silt, and sand fractions in the soil. Other authors also point to the significant importance of texture in soil moisture modeling [18].
The analyses conducted in this study do not take into account another very important soil parameter: soil organic matter content. As many authors point out, soil organic matter also plays a significant role in soil water retention capacity [55], and thus can have a significant impact on the results of soil moisture modeling based on SAR data. It is unclear whether adding this parameter to a texture-based soil moisture estimation model could significantly improve the obtained results, as soil organic matter content is correlated with soil silt and clay content [56]. A separate problem is the relatively low accuracy of remote sensing methods for determining organic matter [57]. A comprehensive review of remote sensing methods for determining this parameter is provided by Vaudour [57]. These methods are primarily based on optical data, although there are also a few studies using SAR [58]. The relatively poor accuracy of current methods for determining organic matter content in soil complicates the use of this parameter in soil moisture modeling, particularly in the context of model inversion. This does not change the fact that developing remote sensing methods for modeling soil organic matter content is crucial, regardless of soil moisture modeling, as it is a key parameter determining soil fertility.
It should be noted that the conducted studies did not account for all existing soil types, and not all textures were represented to the same or sufficient degree. Therefore, these results cannot be extrapolated to other areas where these soil parameters are different. The proposed method appears to be a good alternative to soil moisture studies that take soil roughness into account. The conducted studies do not answer the question of why good accuracy in soil moisture modeling can be achieved using soil texture while completely ignoring soil roughness, although many previously mentioned studies demonstrate a very significant role of soil roughness in moisture modeling. Part of the answer to this question is probably the significant impact of silt and clay content in the soil on its ability to retain water. It is also important to note that actual surface roughness does not always reflect “radar” soil roughness. This appears to be related to the varying depth of radar wave penetration depending on soil moisture, texture, and loosening. This varying penetration depth may be the reason why, in many cases, soil roughness simulated using SAR imagery models poorly matches surface roughness. The interdependence of all these parameters makes a thorough understanding of this issue very difficult and requires analysis of the independent effects of soil texture, roughness, and moisture on the SAR signal. This is a difficult goal to achieve because it requires numerous, simultaneous measurements of these features in carefully selected study areas. Further research is needed to better understand this issue. It is also essential to incorporate the influence of vegetation cover into analyses.
These conclusions are consistent with other studies, which indicate that incorporating high-resolution spatial information on various environmental parameters, including soil texture [16,18,41] and/or various advanced modeling methods [24] into the determination of soil moisture from satellite data is a very promising direction of development. This allows for combining high spatial resolution with extensive coverage. This applies not only to SAR data but also to radiometers and other sensors [8,24]. Direct comparison of the results of this study with these solutions is not possible due to differences in environmental conditions, factors considered, such as vegetation, or validation methods. Nevertheless, their combined analysis leads to the conclusion that obtaining a high-quality, broadly applicable soil moisture model requires high-resolution spatial information, preferably derived from satellite data, on other environmental parameters, including soil. Soil texture appears to be a key element preferable to roughness, because it does not change dramatically over time, especially in relatively flat areas. Therefore, developing methods for determining it from satellite data appears to be an important step toward better and more accurate soil moisture models.
5. Conclusions
Surface soil roughness has a very limited relationship with the soil roughness observed by the SAR system, expressed using polarimetric signal decompositions. It appears that “polarimetric” soil roughness is related more to the ease and depth of microwave penetration into the soil than to real soil surface roughness. The penetration depth of this radiation appears to be primarily determined by soil moisture, soil species (texture), as well as agricultural practices that influence the degree of soil loosening. Silt and clay content in the soil is a more important driver than soil roughness and should be considered when modeling soil moisture with SAR. It is possible to obtain high-accuracy soil moisture estimates using a single model for a wide variety of soil species and soil roughness, from very smooth to even very rough, without any knowledge of roughness, provided at least approximate information about the sum of silt and clay content in the soil is available. This appears to be related to the significant impact of silt and clay content in soil on the soil’s ability to retain water, and thus on the observed moisture range for a given soil species under specific climatic conditions and water relations occurring in a given area, related to the topography and the amount and intensity of atmospheric precipitation. The novelty of the proposed solution lies in the use of publicly available materials, available, among others, for the whole of Poland, that contain information on soil species which can be used to approximate soil texture. The use of simple and readily available machine learning methods is also significant. These features, along with the absence of any need for surface roughness measurements, make this model convenient and easy to invert, while also producing very satisfactory results.
There is a need for further soil moisture studies using SAR, taking into account actual field measurements of soil texture, rather than just their approximate values obtained from soil maps based on soil species. It also seems important to explore remote sensing methods for determining soil texture.