Digital Soil Texture Mapping and Spatial Transferability of Machine Learning Models Using Sentinel-1, Sentinel-2, and Terrain-Derived Covariates

: Soil texture is an important property that controls the mobility of the water and nutrients in soil. This study examined the capability of machine learning (ML) models in estimating soil texture fractions using different combinations of remotely sensed data from Sentinel-1 (S1), Sentinel-2 (S2), and terrain-derived covariates (TDC) across two contrasting agroecological regions in Southwest Germany, Kraichgau and the Swabian Alb. Importantly, we tested the predictive power of three different ML models: the random forest (RF), the support vector machine (SVM), and extreme gradient boosting (XGB) coupled with the remote sensing data covariates. As expected, ML model performance was not consistent regarding the input covariates, soil texture fractions, and study regions. For example, in the Swabian Alb, the SVM model performed the best for the sand content with S2 + TDC (RMSE = 3.63%, R 2 = 0.42), and XGB best predicted the clay content with S1 + S2 + TDC (RMSE = 6.84%, R 2 = 0.64). In Kraichgau, the best models for sand (RMSE = 7.54%, R 2 = 0.79) and clay contents (RMSE = 6.14%, R 2 = 0.48) were obtained using XGB and SVM, respectively. Moreover, the results indicated that TDC were critical in estimating soil texture fractions, especially in Kraichgau, which indicated that topography plays an important role in deﬁning the spatial distribution of soil properties. In contrast, the contribution of remote sensing data better predicted the silt and clay content in the Swabian Alb. The transferability of a region-speciﬁc model to the other region was low as indicated by poor predictive performance. The resulting soil-texture-fraction maps could be a signiﬁcant source of information for efﬁcient land resource management and environmental monitoring. Nonetheless, further research to evaluate the added value of the Sentinel imagery and to better analyze the spatial transferability of machine learning models is highly recommended


Introduction
Precise information on the spatial variability of agricultural soil properties has a vital role in designing novel farming systems for efficient management to prevent soil degradation [1][2][3]. Soil texture, among other factors, controls the water retention, mobility, and dissolved chemicals in soil, which, therefore, influences crop yield and the nutrient equilibrium in the rhizosphere [4,5]. Soil texture is a critical input for environmental and agricultural modeling and assessment [6,7]. Existing soil texture maps often have a coarse resolution and lack sufficient detail for efficient resource management in croplands and modeling [3,8].
Ground-based soil texture measurements rely on labor-intensive and costly procedures. Therefore, many scientists have dedicated tremendous efforts to developing robust and cost-effective approaches to provide updated and improved soil texture maps [9,10]. A combination of remote sensing (RS) data linked to digital soil mapping (DSM) has been shown to be a strong rapid-throughput method for mapping different soil properties. RS technology is efficient in overcoming the difficulties raised in conventional soil mapping. It vastly reduces field and laboratory labor [10][11][12][13][14][15][16]. Space-borne multispectral satellite data in the optical and near-infrared ranges (VNIR-SWIR) increasingly provide opportunities to quantify different soil attributes through the appropriate empirical models with varying success [17][18][19]. The recent release of the high-resolution Sentinel-2 (S2) satellite presents several advantages for DSM. Several studies achieved acceptable accuracy in the quantitative estimation of some soil properties such as silt and clay contents using multispectral satellite data from S2, ASTER, Landsat-7, and Landsat-8 imagery [17][18][19][20]. Recently, Vaudour et al. [21] and Gomez et al. [15] investigated the spectral behavior of clay contents with varying degrees of accuracy using multispectral S2 data.
Moreover, there was considerable technical progress in developing new RS missions in the microwave spectral range. Because of their capability in data collection in all weather conditions, day and night, the microwave sensors are of great interest for monitoring the soil and vegetation properties in agricultural areas [22,23]. Due to longer wavelengths than sensors in the optical and infrared ranges, radar sensors allow information retrieval from the top few cm of bare soil and provide spectral information over partly vegetated soil surfaces [24]. Generally, the spectral response of soil varies based on the soil particle fractions, allowing the estimation of soil properties [24][25][26][27][28].
Despite the substantial number of studies of RS applications in estimating soil attributes, findings in the literature indicate that the applicability of both multispectral VNIR/SWIR and radar RS in accurately estimating soil texture fractions is in an early stage, and studies focusing on the synergistic use of multisource RS encompassing both optical and microwave domains are rare and mainly limited to the field or local scale. Due to the current and future demand for DSM, especially at large scales, the potential of explicitly spectral bands from recently released microwave Sentinel-1 (S1) and multispectral S2 sensors in predicting and mapping soil properties needs further investigation [3,15,21,29,30].
Machine learning (ML) models are more efficient than traditional statistical models when multisource datasets are applied [31][32][33][34]. According to Ma et al. [35], ML models quantify the relative importance of covariates from different sources that control soil variability. However, findings indicate that the prediction results from different models might vary substantially when the same data source is applied [31,32,36]. Therefore, when dealing with multisource datasets in areas offering meaningful intrinsic and extrinsic heterogeneity, it is necessary to assess different models' abilities to select the best-performing model with the highest accuracy and the lowest uncertainty [32]. Thus, an approach that simultaneously uses multisensor RS data combined with environmental covariates may allow soil texture estimates with a better accuracy at a higher spatial resolution.
This study examined the potential effects of RS imagery (S1 and S2) and terrain-derived covariates (TDC) in estimating soil texture fractions (clay, silt, and sand contents) across two contrasting agroecological regions, Kraichgau and the Swabian Alb, in Southwest Germany. In this context, the aims of the present study were (1) to evaluate the performance of ML algorithms, including the random forest-RF, the support vector machine-SVM, and extreme gradient boosting-XGB, in mapping soil texture fractions using S1 and S2 data individually and combined with terrain-derived covariates; (2) to explore the most effective covariates to estimate soil texture fractions; and (3) to evaluate the transferability of predictive models outside of the initial training region in predicting soil texture fractions.

Study Areas and Soil Sampling
The study sites were two agroecological regions, Kraichgau (K) and Swabian Alb (SA), in Southwest Germany ( Figure 1). Each study site covered an area of approximately 1600 km 2 centered on 49 • 04 N, 8 • 48 E and 48 • 25 N, 9 • 29 E, respectively. The Kraichgau region is located at 100-400 m a.s.l. with soils formed mainly from loess. It is a part of the Neckar River watershed with an annual mean temperature of more than 9 • C and mean annual precipitation of 720 to 830 mm. The Swabian Alb is a low mountain plateau in Southwest Germany located at an altitude of 500-850 m a.s.l. with an annual average temperature of 6-7 • C and mean annual precipitation between 800 and 1000 mm. Its soils developed mainly on Jurassic limestone and have a high clay content due to strong decalcification and weathering. Further details can be found in Fischer et al. [37], Ingwersen et al. [38], and Ali et al. [39]. Bulk soil samples were taken (0-30 cm) from agricultural fields during field campaigns via a probability-based sampling design (e.g., simple randomized and regular grid sampling) across the two regions [40]. A sample size of 75 (for K) and 105 (for SA) of these archived soils were analyzed for soil particle size fractions (e.g., sand, silt, and clay) based on German classification (2000-63 µm of sand, 63-2 µm of silt, and < 2 µm of clay) [41] following the pipette method [42]. Briefly, 20 g of < 2 mm soil samples were first checked for CaCO 3 by adding a few drops of 10% HCl. Soil organic carbon was removed by adding 6 mL of 30% hydrogen peroxide (H 2 O 2 ) solution and was heated to 80 • C. Then, samples were rinsed and centrifuged. Sand was first separated out using wet sieving with a 63 µm sieve size. The remaining silt and clay contents were then determined by measuring the rate of settling of these two separates from the suspension in water. The time required for silt and clay to settle was calculated using Stokes law. Next, sand, silt, and clay contents were converted into the World Reference Base particle size classification system (2000-50 µm of sand, 50-2 µm of silt, and < 2 µm of clay) [43] using the R package "soil texture" [44]. transferability of predictive models outside of the initial training region in predicting soil texture fractions.

Study Areas and Soil Sampling
The study sites were two agroecological regions, Kraichgau (K) and Swabian Alb (SA), in Southwest Germany ( Figure 1). Each study site covered an area of approximately 1600 km 2 centered on 49°04′N, 8°48′E and 48°25′N, 9°29′E, respectively. The Kraichgau region is located at 100-400 m a.s.l. with soils formed mainly from loess. It is a part of the Neckar River watershed with an annual mean temperature of more than 9 °C and mean annual precipitation of 720 to 830 mm. The Swabian Alb is a low mountain plateau in Southwest Germany located at an altitude of 500-850 m a.s.l. with an annual average temperature of 6-7 °C and mean annual precipitation between 800 and 1000 mm. Its soils developed mainly on Jurassic limestone and have a high clay content due to strong decalcification and weathering. Further details can be found in Fischer et al. [37], Ingwersen et al. [38], and Ali et al. [39]. Bulk soil samples were taken (0-30 cm) from agricultural fields during field campaigns via a probability-based sampling design (e.g., simple randomized and regular grid sampling) across the two regions [40]. A sample size of 75 (for K) and 105 (for SA) of these archived soils were analyzed for soil particle size fractions (e.g., sand, silt, and clay) based on German classification (2000-63 µm of sand, 63-2 µm of silt, and < 2 µm of clay) [41] following the pipette method [42]. Briefly, 20 g of < 2 mm soil samples were first checked for CaCO3 by adding a few drops of 10% HCl. Soil organic carbon was removed by adding 6 mL of 30% hydrogen peroxide (H2O2) solution and was heated to 80°C. Then, samples were rinsed and centrifuged. Sand was first separated out using wet sieving with a 63 µm sieve size. The remaining silt and clay contents were then determined by measuring the rate of settling of these two separates from the suspension in water. The time required for silt and clay to settle was calculated using Stokes law. Next, sand, silt, and clay contents were converted into the World Reference Base particle size classification system (2000-50 µm of sand, 50-2 µm of silt, and < 2 µm of clay) [43] using the R package "soil texture" [44].

Remote Sensing Data (RS)
In this study, RS data from two freely available missions operated by the European Space Agency (ESA), S1 and S2, were used for estimating soil texture in the 0-30 cm depth [45,46]. S1 allows full polarimetric imaging, i.e., horizontal transmit and receive (HH), horizontal transmit and vertical receive (HV), vertical transmit and horizontal receive (VH), and vertical transmit and receive (VV), at a resolution of 5-20 m every 12 days. This study used dual-polarized SAR level-1 ground range detected products (VH, VV) in 'IW' mode as the original data was at a resolution of 10 m. Multiploidization SAR allows a better understanding of soil surface properties compared only to the backscatter coefficients of a single polarization. Several preprocessing algorithms, including radiometric and geometric correction, speckle filtering, and thermal noise removal, were applied to each SAR image to draw out the backscatter coefficient at any polarization mode. These allow reducing error propagation in the following processes and analyses. S2 is a space-borne multispectral spectral imager with a five-day revisit time, which started its mission first in June 2015 as part of the "Copernicus" program. The S2 image comprises 13 spectral bands in the optical domain (VNIR/SWIR) at spatial resolutions of 10, 20, and 60 m. After excluding B1 (Coastal Aerosol), the remaining 11 bands (i.e., B2 (490 nm), B3 (560 nm), B4 (665 nm), B5 (705 nm), B6 (740 nm), B7 (783 nm), B8 (842 nm), B9 (945 nm), B10 (1375 nm), B11 (1610 nm), and B12 (2190 nm)) were extracted for further use ( Table 1). The present study acquired the median of S1 and cloud-free multitemporal S2 images for three periods of time, including April-May (T1), August-September (T2) and October-November (T3) 2015-2016 from Google Earth Engine [47]. Table 1. Terrain-derived covariates and remote sensing data used for predicting soil texture fractions.

Aspect
The down slope direction of the maximum rate of change Length-slope factor (LS) Combined slope length and slope angle

Terrain-Derived Covariates (TDC)
In this study, a free digital elevation model (DEM) from the Shuttle Radar Topography Mission (SRTM) [48] was used for deriving terrain-derived covariates (TDC). There are several TDC that can be extracted from DEMs which are directly or indirectly related to soil properties. According to this, the 10 terrain attributes involved in controlling the spatial distribution of texture fractions were derived from a preprocessed DEM with a 30 m resolution using SAGA GIS software [49] to explain the spatial variability of soil texture fractions (i.e., clay, silt, and sand). The included TDC were related to local-scale terrain morphology (e.g., elevation and aspect), hydrological characteristics (e.g., length-slope factor, topographic wetness index), and landscape-scale morphometry (Table 1).

Random Forest (RF)
Initially introduced by Breiman [50], RF is an ensemble learning algorithm used for regression and classification tasks. RF is an adjusted version of bagged decision trees made of large decorrelated trees that require a few tuning parameters [51]. RF is composed of a reimplementation of the classification and regression tree (CART) model based on the bootstrap sampling. Bootstrapping is a resampling method that generates several datasets equal to the size of the original data [52]. In bootstrapping, some records will be selected multiple times due to the allowance of sampling with replacement, whereas others will not be selected. Unselected records, known as out-of-bag errors, assess model performance. Within the RF algorithm, the mtry parameter was tuned, which denotes the number of explanatory variables for splitting at each tree node.

Support Vector Regression (SVR)
Introduced by Cortes and Vapnik [53], support vector machines (SVMs) are learning models frequently used for distribution estimation, regression, and classification tasks. SVMs transform original independent variables into a higher or infinite dimensional feature space using nonlinear techniques. They aim to create a better separation [54]. The regression task is called support vector regression (SVR); nevertheless, the goal of the classification and regression problem is to minimize error. The kernels used were linear (LN), polynomial (PL), radial basis function (RBF), and sigmoid (SIG); each requires specific parameters, and the proper selection and parameterization of these kernels control the accuracy of the SVR [53]. Tuning parameters were C, known as the penalty factor, to avoid overfitting and to regulate the trade-off between the margins and training errors [55]. The degree of nonlinearity was controlled by the kernel width or gamma (γ) parameter.

Extreme Gradient Boosting (XGB)
XGB is a scalable tree boosting method known for its performance and speed [56]. XGB constructs several consequent decision trees based on the criterion of residuals of the previous tree model or of prediction errors; thus, the algorithm principally marks the samples with higher uncertainty, and, lastly, the generated results are added to calculate the final output [57]. XGB has numerous tuning parameters that make the model complicated, most of which are similar to other tree-based models, plus some hyperparameters intended to lessen the risk of overfitting, reducing prediction variability, and thus increasing the accuracy [51]. Tuning parameters in XGB [51] used were: nrounds, to determine the maximum number of iterations; max_depth, to control the depth of the tree; eta, to control the learning rate for capturing the patterns in data; gamma, to control regularization to prevent overfitting; colsample_bytree, the number of variables provided for a tree; min_child_weight, the tree splitting stops when the leaf node has a minimum sum of instance weight lower than min_child_weight in the classification task; and the last tuning parameter is subsample which controls the number of observations provided for a tree.

Model Evaluation
Identifying the optimum parameters and model performance was conducted using a 10-fold cross-validation approach to assess the performance of the calibrated model on new data. Ten-fold cross-validation is a resampling process in which all data are randomly divided into ten equal folds; at each run, one-fold is set out for validation, and the remaining k − 1 folds are used for calibration [58]. Then, the final accuracy is computed from the average accuracy of all folds. Three performance indicators were used to evaluate the prediction accuracy of models: R 2 , root-mean-square error (RMSE), and the mean absolute error (MAE) of cross-validation. All machine learning modeling was done in R Studio with the "Caret" package [59].

Summary Statistics of Soil Texture Data
The descriptive characteristics of the measured soil texture fractions for each study region are shown in Table 2. In the K region, the silt contents were highest, with a mean value of 56.7%, whereas the clay and sand fractions were 23.1% and 20.3%, respectively. The sand content varied from 3.4 to 84.3% with a standard deviation (SD) of 22.8%, while the clay and silt contents varied from 10.2 to 76.6% (SD = 18.7%) and 5.6 to 65.4% (SD = 10.2%), respectively. According to the WRB soil texture classes, considering the high silt contents, the soils would typically be classified as silt loam and silty clay loam in the K region. In contrast, the clay and silt sizes dominated in the SA region, representing an average of 46% and 45.2%, respectively, followed by a sand content of 8.8%. Unlike the K region, the silt and clay contents showed the largest variability with SDs of 10.39 and 10.2%, respectively, while the sand contents ranged from 4.1% to 38.4% with an SD of 5.3%. Given that the clay content of the field observation in the entire SA region was over 40%, the soils would typically be classified as clay and silty clay textures. There was a wide range of soil texture classes in the SA region, which included 9 out of the 12 possible texture classes. However, the dominant texture classes across both regions were silt loam (27%), silty clay (25%), silty clay loam (22%), and clay (17%). The remaining 10 percent of measured samples represented other soil texture types ( Figure 2).

Model Performance
Three different ML models were used for predicting soil texture fractions and were evaluated using a 10-fold cross-validation strategy. The model accuracy and predictor selections for the SA and K regions are shown in Table 3 and Table 4, respectively. The prediction accuracy varied among different models and selected predictors in terms of the RMSE, R 2 , and MAE. As shown, the application of TDC and RS data (S1 + S2) alone re- The variability of the texture within the two regions was mainly due to both intrinsic (e.g., parent material, climate, mineralogy, and soil-forming processes) and extrinsic (e.g., Remote Sens. 2022, 14, 5909 7 of 17 land use type and management practices) factors [60]. The Rhine Valley typically had sandy soil texture classes surrounded by soils with higher silt and clay contents in the K region. Farming with different crop types demands differing plowing practices, resulting in variability in texture fractions within the two regions. Topography, erosion, and deposition also may result in high variability in texture fractions [61].

Model Performance
Three different ML models were used for predicting soil texture fractions and were evaluated using a 10-fold cross-validation strategy. The model accuracy and predictor selections for the SA and K regions are shown in Tables 3 and 4, respectively. The prediction accuracy varied among different models and selected predictors in terms of the RMSE, R 2 , and MAE. As shown, the application of TDC and RS data (S1 + S2) alone resulted in the lowest accuracies with some exceptions. In the SA region, the synergic application of all covariates (S1 + S2 + TDC) generally increased model performance. However, an exception was the SVM model where the S2 + TDC predictor selection strategy was most precise for sand predictions, resulting in an RMSE = 3.6%, R 2 = 0.42, and MAE = 2.6%. Although the S1 + S2 + TDC predictors yielded an improved accuracy with the XGB model than other predictor selection strategies, the model performance was still significantly lower (RMSE = 4.4; R 2 = 0.33) compared to the RF and SVM models for sand prediction.
Regardless of the ML model, the S1 + S2 + TDC predictor selection strategy resulted in the best model performance when predicting silt contents. Indeed, again, the SVM model using S1 + S2 + TDC was most accurate, with an RMSE of 7.3% and an R 2 of 0.54. For clay estimates, the XGB model was the most accurate, with an RMSE = 6.8% and an R 2 = 0.64, using the S1 + S2 + TDC predictors.  Relatively similar trends in the model accuracy of the K region were observed by changing the model input predictors, where the S1 + S2 + TDC predictors increased clay and sand model performance using the SVM and XGB models, respectively. For sand prediction, the XGB model was most robust with S1 + S2 + TDC predictors, yielding an RMSE = 7.5 and an R 2 = 0.79. In terms of the clay predictions, the SVM with S1 + S2 + TDC predictors was most accurate, resulting in an RMSE = 6.1 and an R 2 = 0.48. For the silt content, XGB showed the highest performance again but with S1 + TDC predictors and RMSE and R 2 values of 7.9 and 0.85, respectively. Figure 3 gives the relative importance of predictor covariates for the best-fitting ML models used for texture fraction prediction. Only the top 10 predictors for model fitting are shown. The relative importance of predictor covariates differed both by region and by model type. The selected predictors were ordered in the selection process by their influence or by the order in which they were contributed. S2 and TDC explained 17.5% and 82.5% of the sand variation in the SA region, respectively. The elev. (27.7%) and CNBL (23.7%) were the most strongly relevant covariates in fitting the learning models, followed by less relevant predictors such as CI, CND, PLC, B2, B3, and B4. S2 covariates accounted for 88.8% of the total variation for predicting the silt content, followed by TDC with a relative importance of only 11%. The S2 features in SWIR, such as B11 and B12, were the most essential variables in model fitting. The TWI was the only terrain-derived covariate contributing to model fitting, accounting for only 11% of silt variation. The best-fit models for clay and silt contents had relatively similar selected variables. As shown in Figure 3, S1, S2, and TDC accounted for 6.5%, 69%, and 24% of clay variation. The model identified mainly B12 (41%) and TWI (19%) as the most relevant covariates for clay predictions. Remote Sens. 2022, 14, x FOR PEER REVIEW 10 of 19 Figure 3. Importance of covariates for predicting the best-fitted models for soil texture fractions in SA (a-c) and K (d-f) regions. (Refer to Table 1 for TDC.)

Mapping of Soil Textural Classes within the Training Region
The best model, described in Section 3.2, was applied to create the spatial prediction map of texture fractions and, later, the texture classes for both training regions separately. Figure 4 shows the spatial prediction of texture fractions and texture maps in the SA region. The visual inspection of the continuous maps displayed a relatively homogenous distribution of the sand content in the SA region, ranging mainly from 5.8% to 25.8%. For the silt content, a mosaic pattern with three different classes was obtained: silt content < 30%, between 30 and 60%, and >60%. Clay contents between 30 and 50% were dominating, covering mainly the center and partially covering the southwestern region. Interestingly, the areas with a higher clay content corresponded with relatively high altitudes. The SA region was classified mostly into three soil texture classes, including silty clay, clay and silty clay loam, corroborating the field observations. The silty clay texture was dominant, covering roughly 70% of the entire region. The silty clay loam texture dominated mainly in the eastern areas of the study region. The clay texture class was observed mainly in the central part of the region.
In the K region, the soils in the western regions represented high sand contents >70%. While the silt contents between 30 and 60% were dominant in the SA, there was a higher silt content (>60%) in the K region. The clay content could be split into three different classes as follows: <30%, 30-50%, and >50%. Clay contents of less than 30% dominated the region. Silt loam was the dominant texture class, covering roughly 67% of the entire area. A small area at the western edge of the study region in the Rhine Valley represented a In the K region, the contributions of S1, S2, and TDC in describing sand variation were 5%, 12%, and 82%, respectively. The CNBL had the highest importance, accounting for 67% of the total sand variation. Likewise, the CNBL (57%) and elev. (30%) were the most important covariates for silt prediction, followed by the RSP, VV, and VH. Comparably, TDC accounted for 64% of the clay variation, while S1 and S2 explained 28 and 8% of clay variation, respectively. The results showed that the LS (13%), VV (13%), CNBL (12%), Elev. (11%), and CL (11%) were the most critical covariates describing clay variation.

Mapping of Soil Textural Classes within the Training Region
The best model, described in Section 3.2, was applied to create the spatial prediction map of texture fractions and, later, the texture classes for both training regions separately. Figure 4 shows the spatial prediction of texture fractions and texture maps in the SA region. The visual inspection of the continuous maps displayed a relatively homogenous distribution of the sand content in the SA region, ranging mainly from 5.8% to 25.8%. For the silt content, a mosaic pattern with three different classes was obtained: silt content < 30%, between 30 and 60%, and >60%. Clay contents between 30 and 50% were dominating, covering mainly the center and partially covering the southwestern region. Interestingly, the areas with a higher clay content corresponded with relatively high altitudes. The SA region was classified mostly into three soil texture classes, including silty clay, clay and silty clay loam, corroborating the field observations. The silty clay texture was dominant, covering roughly 70% of the entire region. The silty clay loam texture dominated mainly in the eastern areas of the study region. The clay texture class was observed mainly in the central part of the region. heterogenous textures with relatively high sand contents, including sandy loam, loam and clay loam texture classes ( Figure 5).  In the K region, the soils in the western regions represented high sand contents >70%. While the silt contents between 30 and 60% were dominant in the SA, there was a higher silt content (>60%) in the K region. The clay content could be split into three different classes as follows: <30%, 30-50%, and >50%. Clay contents of less than 30% dominated the region. Silt loam was the dominant texture class, covering roughly 67% of the entire area. A small area at the western edge of the study region in the Rhine Valley represented a heterogenous textures with relatively high sand contents, including sandy loam, loam and clay loam texture classes ( Figure 5).

Spatial Transferability of Best-Fitted Models Outside the Training Region
To test the model transferability, the best-fit models using RS data and the terrainderived predictors in each training region (shown in Section 3.2) were further applied separately to predict the texture fractions for croplands in the other region with the prediction accuracies in Table 5. The soil-texture-fraction prediction accuracy for the K region dramatically decreased when the best-fit SA region models were used. Generally, the per- Figure 5. Estimated soil texture fractions (% of sand, silt, and clay) and WRB soil texture classes in 0-30 cm based on best-fitted machine learning models for the K region.

Spatial Transferability of Best-Fitted Models Outside the Training Region
To test the model transferability, the best-fit models using RS data and the terrainderived predictors in each training region (shown in Section 3.2) were further applied separately to predict the texture fractions for croplands in the other region with the prediction accuracies in Table 5. The soil-texture-fraction prediction accuracy for the K region dramatically decreased when the best-fit SA region models were used. Generally, the performance of all the best fit models was poor in predicting texture fractions in a new geographic region. For sand prediction in the K region, the RMSE increased to 25.2%, which was six times higher than in the transferred model (SA_Sand_SVM_S2 + TDC). The accuracy also declined for silt and clay prediction in K with an increased RMSE of 24.3% and 19.8% compared to those in the original models, i.e., SA_Silt_SVM_S1 + S2 + TDC and SA_Clay_XGB_S1 + S2 + TDC, respectively. Likewise, the R 2 decreased, and the MAE increased. In the SA region, in sand prediction with the K_Sand_XGB_S1 + S2 + TDC model, the RMSE value increased to 6.1%, and the R 2 declined to 0.003. For silt and clay contents, similar trends in the RMSE were observed again with an increase to 19.6 and 10.2, respectively. Thus, the models performed best for both regions when the calibration points were located within the respective application area (within the training region; Section 3.4).

Variable Importance for ML Models
Our study evaluated the potential use of TDC and RS data as covariates in five combinations (S1 + S2, TDC, S1 + TDC, S2 + TDC, and S1 + S2 + TDC) to predict soil texture fractions using three different ML models (RF, the SVM, and XGB) across two contrasting agroecological regions (SA and K). Our results indicated that, due to the environmental and soil differences as well as the RS data quality, the performance of the trained models was different in predicting texture fractions within the individual regions (Tables 3 and 4), which was consistent with the results of previous studies dealing with estimating soil properties using RS data [31,32,36,62]. Importantly, the addition of TDC to RS data (i.e., S1 + S2 + TDC) outperformed the use of any of TDC or RS data alone. Apart from the specific ML model type, the prediction accuracies in the combined mode of RS and TDC for sand, silt, and clay contents varied with an R 2 ranging from 0.49 to 0.64, with roughly 20 and 22% improvements compared to the application of TDC or RS data alone, respectively. The results corroborated previous studies focusing on the combined use of RS data in estimated soil carbon as well as texture fractions [28,62,63]. However, it should be highlighted that the sensitivity of optical RS to the atmospheric effects (clouds, particles, and gas molecules) and soil surface conditions affecting spectral reflection might be the leading causes of their lower accuracies than those of combined modes. Furthermore, the spectral reflection from the soil is primarily affected by surface crop residues after harvesting or plowing as well as the soil moisture condition [3,15,28,64,65]. Thus, a diverse range of estimation accuracies has been reported in the literature depending on both climate and farming practices.
Furthermore, our results indicated the significant contribution of SWIR bands (B11 and B12) in the optical domain and TDC data (CNBL, Elev., TWI, and LS) in model training for silt and clay prediction. However, variable selection was dramatically affected by the training regions. While RS data had a significant role in model training in the SA region, TDC contributed markedly to predicting silt and clay contents in the K region. Similar to our findings are those reported by Meyer et al. [66], Marzahn and Meyer [24], and Taghizadeh-Mehrjardi et al. [62] who found TDC to be the most important predictors in soil texture fraction prediction. Bousbih et al. [28] and Gholizadeh et al. [29] revealed that the SWIR bands of S2 (B11 and B12) had a considerable potential to estimate clay contents, which also corroborates the findings in the current study. It was concluded that the Sentinel SWIR bands assigned to soil minerals provide a spectral predictor for different mineral constituents, including texture fractions [67]. In general, the contribution of S1 data was relatively lower than S2 and TDC in predicting texture fractions, ranging from 0 to 28% depending on the best-performing models and representing the relatively moderate importance of VV and VH polarizations in model fitting. Several studies have already reported that VH and VV polarizations are sensitive to soil moisture dynamics and surface roughness and that they mainly contribute to describing different soil properties [68]. In contrast, Matti et al. [69] reported that the HH polarization has a higher sensitivity to soil properties than VV and VH.
Despite the limited studies focusing on radar data in this context, it seems to improve the accuracy in estimating soil surface properties, specifically when combined with either optical or TDC data. It can collect spectral data through the clouds and give other signals from the surface soil due to its longer wavelength [23,28]. Furthermore, backscattered radar signals are assigned to the soil moisture content as a robust representative of soil moisture conditions. Soil with a higher clay content tends to retain moisture longer and to dry slowly. Thus, the estimation of soil textures, specifically clay content, could improve when backscattered radar signals are applied [26,28]. In our study, the sensitivity of SAR data was observed by other authors when quantifying different soil properties. Marzahn and Meyer [24] estimated the soil texture precisely in sandy loam soil at the field scale with a mean RMSE of 2.42 (Mass-%) using SAR data at 1.3 GHz linked to geostatistics. An RMSE of 1.2% for the clay content was also obtained by Zribi et al. [26]. The best-fit models in the current study for soil texture had an RMSE of 5.1 and 7.2% in the SA and K regions, respectively. The higher RMSE in our study might be due to the large-scale sample collection across the two regions on farms, representing different surface conditions in terms of soil moisture, retained crop residues, and partial vegetation cover, which further affected the quality of the acquired RS data.

Accuracy Assessment of ML Models
Usually, ML models outperform simple models and increase the predictive power of models in understanding the specification of the soil horizon in a soil profile [70], explaining soil variability [31], and helping to understand the causes of soil variability [35]. However, ML model robustness demands more data and meaningful predictors to avoid overfitting or misleading results and to improve the interpretability of the model and, consequently, our understanding of soil. Thus, it is worth adopting more predictor covariates from multiple data sources, including categorical maps, climatic data, terrain attributes, and remote sensing data. In the current study, texture fractions were predicted with the best model, with R 2 values of 0.49-0.64 and 0.48-0.85 for the SA and K regions, respectively (Tables 3 and 4). In comparison, using optical S2 data resulted in clay fraction predictions in the model, with R 2 values of 0.39-0.42 and R 2 < 0.22 for silt and sand using the partial least squares regression [21]. Bousbih et al. [28], in contrast, using derived soil moisture contents from S1 combined with S2, provided improved estimates of clay contents with an overall accuracy of R 2 = 0.55, even in areas covered by vegetation and in areas under different climatic conditions. Moreover, Castaldi et al. [71] demonstrated that the prediction of clay contents was more accurate when the soil moisture content increased using clay indices in the optical domain. More robustly, Rosero-Vlasova et al. [72] reported R 2 values of 0.54 and 0.7 in predicting clay contents using Landsat and S2 data, respectively.
Generally speaking, the range of the prediction accuracies for the best-performing models within any region (Figures 4 and 5) indicated a comparable performance to those studies previously cited. Additionally, in the same study regions, lab-based proximal sensing (midinfrared spectroscopy) resulted in an RMSE of 2.7, 4.6, and 3.1% for sand, silt, and clay contents, respectively [73]. But the transferability of the region-specific model to the other region strongly decreased the predictive performance. The possible reason might be attributed to the environmental, climatic, terrain, and soil differences between the two study regions, resulting in different variable contributions in model developments ( Figure 3). Thus, a region-specific trained model may not encompass a new, unknown area's intrinsic and extrinsic spatial heterogeneity. Therefore, the trained models within each training region could not describe the soil texture fraction variability outside their training region. Our findings were in line with the findings of the other researchers [74][75][76] who indicated that ML models have a low accuracy in spatial extrapolations due to the complexity of the spatial variation in soil and the difficulty of matching soil-forming factors exactly between two areas. It is worth noting that the similarity of the environmental covariates between two regions (i.e., soils of two areas are mainly controlled by similar soil-forming factors) plays an important role in achieving the transferable ML models for predicting soil properties and soil classes. Nevertheless, splitting the regions into finer geographic scales, such as the slope-aspect class, may improve the predictive results of transferability [77,78]. An alternative option is the use of more advanced ML models, such as semi-supervised learning which is a branch of ML that benefits from both supervised and unsupervised learning, which can be more effective in the spatial extrapolation of soils [79,80].

Conclusions
This paper examined the capability of satellite imagery (S1 and S2) and terrain-derived covariates (TDC) in estimating soil texture fractions and evaluated the spatial transferability of three machine learning models. Due to the complexity of the influential factors of soil properties (e.g., intrinsic and extrinsic) across the study regions, model performance was not consistent, even for different models in the same area. Taking sand prediction as an example, the SVM performed the best in the SA, while XGB was most accurate in the K region. Thus, the soil texture was mapped in each training region based on the most accurate model for each texture fraction. Regardless of the region or model, RS (S1 and S2) data alone did not have an adequate capability in estimating the texture fractions in the training regions. Due to the spectral response of soil, which is controlled by several factors (such as soil moisture, organic matter, surface roughness, atmospheric effects, structural effects, soil management, and vegetative coverage) and the technical constraints on the acquisition of RS data (specifically S2), using RS data alone to estimate soil texture fractions is still challenging. However, the model derived from combined data (RS and TDC) outperformed the models developed according to their sole application. This confirmed the dependency of texture fractions on both terrain-derived covariates and intrinsic soil attributes. While the ML models improved the texture fraction accuracy with RS data, the best-performing models in each training region led to poor performances in predicting texture fractions when transferred to the other region where the models were not trained. To increase the prediction accuracy, further studies should involve other sources of covariates such as climatic variables and multipolarization L-band SAR data in the microwave domain. To conclude, first, our results indicated that ML models are highly acknowledged because they enhance the prediction accuracy of soil while reducing the margin of error. Nonetheless, due to the heterogeneity of landscapes and influential factors, it is necessary to combine RS data with TDC for soil texture mapping at the regional scale. Second, region-specific calibrated models cannot be applied to other regions without a considerable loss of prediction accuracy. Thus, further research to evaluate the added value of Sentinel imagery, to better analyze the spatial transferability of the machine learning models, and to reveal the unclear problem is highly recommended.