Machine Learning to Estimate Surface Roughness from Satellite Images

: We apply the Support Vector Regression (SVR) machine learning model to estimate surface roughness on a large alluvial fan of the Kosi River in the Himalayan Foreland from satellite images. To train the model, we used input features such as radar backscatter values in Vertical–Vertical (VV) and Vertical–Horizontal (


Introduction
Surface soil roughness is an important parameter in many environmental applications, such as: agronomy, geomorphology, hydrology, meteorology, and climate change modeling [1,2]. By definition, surface roughness is a non-zero Gaussian random process that is parameterised by the Root Mean Square (RMS) height (s), and correlation length proposed a semi-empirical model to estimate surface soil roughness over a heterogeneous terrain. They validated their model using the results of the IEM single-scattering model. They found a good correlation between these models for small-or medium-range surface roughness with incidence angle (>35 • ). Their model accurately estimates the surface roughness over the homogeneous surface and overestimates in the region characterised by a high degree of roughness. Rahman et al. [38], applied the IEM model on Envisat ASAR images to estimate surface roughness and soil moisture. Baghdadi et al. [39] proposed an inversion model based on multi-layer perceptron to estimate soil parameters using C-band SAR data over bare agricultural plots. They trained the neural network model using simulated datasets generated from IEM models over a valid range of input parameters. They reported a precision of 0.5 cm (RMSE) for surface soil roughness below 2 cm. Sawada et al. [40] proposed a novel algorithm to retrieve the surface soil roughness through a fusion of SAR and optical data. They fused the Moderate Resolution Imaging Spectroradiometer (MODIS) and Advanced Microwave Scanning Radiometer 2 (AMSR2) data to determine surface roughness through the radiative transfer model. Baghdadi et al. [4] proposed an inversion technique to estimate the surface soil roughness from Sentinel-1 images. They generated synthetic data by training a neural network model through IEM model simulations. More recently, Mirmazloumi et al. [41] proposed a new empirical model to estimate surface soil attributes (i.e., soil moisture and surface roughness) from AIRSAR and RADARSAT-1 images.
Zribi and Dechambre [42] proposed an empirical model to estimate surface soil roughness by using two SAR images acquired at different incidence angles. They selected a small incidence angle, i.e., 23 • and a large incidence angle, i.e., 39 • to estimate the surface soil roughness. Srivastava et al. [43] proposed an empirical regression model to estimate surface soil roughness from SAR (Envisat-1) images. They observed that the linear data fusion of VH and VV in the form of VH-VV polarisation is more sensitive to surface soil roughness. Later, Marzahn et al. [24] proposed a novel approach to estimate surface roughness over an agricultural field using a photogrammetric acquisition system. They generated the surface models from digital images. They reported that RMS roughness of s ≤ 2 could reliably be estimated from a 2 m 2 acquisition areas. Recently, Ullmann and Stauch [44] evaluated the relationship between the various mono-and multi-temporal Sentinel-1 (SAR) features with the surface soil roughness. They concluded that the surface soil roughness is more sensitive to the vertical variation of the profile than the horizontal. More recently, Azizi et al. [45] developed a computerised approach to estimate surface soil roughness based on the stereo vision technique. They computed the elevation component to reconstruct the 3-D model of the images taken from the field.
Accurate estimates of surface roughness through backscattering models are mainly limited by the model assumptions. All the studies discussed above assume ideal soil characteristics. Under this assumption, the soil roughness is explainable by a single-scale stationary process (i.e., parameters do not change over time). Such assumptions are overruled due to the complex geometry of inherent soil surface [3]. Furthermore, most of the backscattering models are validated at fine scales under a controlled environment [46,47]. They do not accord well over a region that exhibits large intra-field variability.
To overcome the issues with the existing methods, we propose a novel machinelearning approach to predict surface soil roughness solely from the publicly available remote-sensing data. We trained thirty-five variants of seven different machine-learning algorithms using relevant features and in-situ-measured surface roughness. We extract these features from Sentinel-1, Sentinel-2, and SRTM data. Once the models are trained, we evaluated their performance using robust performance metrics in terms of their accuracy and computational time complexity. Finally, we used the best model to generate a surface roughness map.

Site Characteristics
We conducted this study on a large alluvial fan of the Kosi River on the Himalayan Foreland in north Bihar plain, India. This fan has been active since the Holocene, and resulted from the gradual migration of the Kosi River. During this process, the Kosi River deposited its sediment, carried from the Himalayas, and built a large conical sedimentary structure, the Kosi Fan [48][49][50][51]. The Kosi Fan is one of the largest fluvial fans, built over an area approximately 10, 351 km 2 , and has a radius of about 115-150 km [48,51]. Its surface is composed of homogeneous quartz sediments with a median grain size, varying in a range from 300 µm in its proximal to about 100 µm in the distal part [51,52]. The dominant soil types are sandy, sandy loam, loam, and silty loam. The aerial view of the Kosi fan appears nearly conical. Elevation of the Kosi Fan from the mean sea level varies between 110 m and 80 m in the proximal and 50 m and 30 m in the distal part. The surface slope varies gently from 8 × 10 −4 at the apex to 6 × 10 −5 near the toe [51].
About 84% of the total fan area is agricultural lands, 9% wetlands and water bodies, and 7 % built-up [53]. The Kosi Fan is very fertile for agriculture. The agriculture is practiced in two crop seasons; the autumn (last weak of May-October) and spring (December-April), also called "Kharif" and "Rabi", respectively. The landuse, geology, grain sizes, slope, and flat terrain, together, make the Kosi Fan an ideal field site for this study.

Satellite Data
We have used Sentinel-1, Sentinel-2 satellite images, and SRTM Digital Elevation Model (DEM) to set up the machine-learning models to estimate surface roughness (Table 1). These data are freely available; they can be downloaded from the official website of European Space Agency (https://scihub.copernicus.eu/; accessed on 17 December 2020) and US Geological Survey (https://earthexplorer.usgs.gov; accessed on 17 December 2020) respectively. The European Space Agency (ESA) launched the Sentinel-1A (March 2014) and −1B (March 2016) satellite missions under the Copernicus program.
The Sentinel-1 (1A & 1B) satellites consist of C-band SAR. They operate at a frequency of 5.405 GHz and measure the uninterrupted backscattered signals from the earth's surface in all weather conditions. Depending on the soil type and moisture conditions, at this frequency, the SAR signals can penetrate up to 5 cm of the topsoil surface [54,55]. Sentinel-1 satellites have a temporal resolution of 12 days, that jointly (1A and 1B) result in a 6-day repeat pass over the equator [56,57]. Sentinel-1 acquires images in four different modes: Stripmap (SM), Interferometric Wide swath (IW), Extra-Wide swath (EW), and Wave (WV). Based on the acquisition mode, they record the signals in co-polarisation (i.e., VV) or crosspolarisation (i.e., VH) at 10 m × 10 m cell size with 250 km swath. The incidence angle ranges between 29 • and 46 • in near-and far-range, respectively [56]. For our purpose, we have downloaded the dual polarised (VV & VH) Ground Range Detected (GRD) product (Table 1). Finally, we processed the Sentinel-1 images using the Sentinel Application Platform (SNAP) v8.0 Earth Observation processing tool to obtained the backscatter values (VV and VH). The processing steps include the radiometric calibration, multi-looking (with a multi-look factor of 6), speckle noise removal/minimising using refined Lee filter, and terrain correction. After processing, the resulting backscatter image has the cell size 60 × 60 m.
We downloaded Sentinel-2 images of level-2A processing (Table 1). These images are atmospherically corrected for Bottom-Of-Atmosphere (BOA) [58]. Sentinel-2, mission is a constellation of two satellites: Sentinel-2A and Sentinel-2B. They acquire images of the earth's surface in 13 spectral bands at different spatial resolutions (10-60 m) in the optical region of the electromagnetic spectrum. Sentinel-2 (2A and 2B) satellites have the temporal resolution 10 days that together result in a 5-day revisit period [59]. We used band 4 (0.64-0.68 µm) and band 8 (0.77-0.90 µm) of Sentinel-2 to obtain the NDVI. To do this, we subtracted band 5 from band 4 and divided it by the summation of band 4 and band 5. The resulting NDVI image has spatial resolution 10 × 10 m and its value ranges from −1 to +1. Larger values of the NDVI represent healthy vegetation condition [60]. To know the elevation and topography of the study area, we used a DEM obtained from the SRTM. This mission was launched in February 2000 in a joint venture between NASA, the National Geospatial-Intelligence Agency, and the German and Italian Space Agencies, with the objective to generate a high-resolution DEM of the Earth [61]. SRTM employed two synthetic aperture radars, C (λ = 5.6 cm) and X -band system (λ = 3.1 cm). It provides the earth surface elevation sampled over a grid of 1 × 1 arc sec (30 × 30 m) and about 15 m vertical accuracy [62,63]. These data are freely available in the public domain and can be downloaded from the official website of the US Geological Survey (https://earthexplorer.usgs.gov/; accessed on 17 December 2020).

In-Situ Measurement
In a field campaign during 11-20 December 2019, we measured the RMS surface roughness (s) at 78 different locations on the Kosi Fan ( Figure 1). To measure surface roughness, we designed a one-dimensional pin-profiler ( Figure 2a). This is a rectangular iron frame of length = 115 cm, width = 106 cm, and height = 105 cm. At one end along the width, a whiteboard (width = 106 cm and height = 65 cm) is attached with the help of two thin metal strips welded on either side to the arms of the frame. These strips have 100 holes (each 6 mm) at a regular spacing of 1 cm. They are used to place the aluminium pins (diameter = 5 mm and length = 60 cm) so that they can freely move vertically. The top end of these pins is painted in red, and the other end is made flat. This helps to clearly detect the position of pins on the board and also prevent them from pricking into the ground at the bottom end. A metal scale is attached vertically at the margin of the whiteboard by calibrating the instrument over a perfectly smooth surface. When this instrument is placed on the earth surface, these pins will adjust themselves according to the surface undulation. The amount of undulation can be measured by reading the position of each pin on the scale.   In the field, we used the random grid sampling approach to measure the in-situ surface roughness. We divided the study area into several square grids of size 4 km × 4 km each ( Figure 1). The circles marked in different colors ( Figure 1) represent an average surface roughness value from 3 to 5 sampling sites (within a pixel size of the Sentinel-1) that are separated at least 20 m from each other. This is done to incorporate the effect of any spatial heterogeneity (variation) present at the length scale of a two-dimensional satellite pixel (i.e., 60 m). This minimises the measurement uncertainty and enables direct pointto-pixel comparison for training and testing the machine learning model [64,65]. Further, to measure surface roughness, we placed the pin-profiler at the location on the ground where surface roughness has to be recorded. We leveled the instrument properly using spirit levels on the two arms and on the top of the whiteboard to avoid any unintentional tilt (Figure 2a). While conducting measurements, we ensure that the pin-profiler is placed parallel to the acquisition direction of the Sentinel-1 satellite (Figure 2b). This ensures that we are measuring the same surface that is illuminated and recorded by the satellite sensors. This process ensures the qualitative measurements of surface roughness in terms of measurement directions. We have reported the directions for all the in-situ sites in Table A1. Once the instrument is laid over the surface, the pins are gently released until they touch the top surface. We take photographs by keeping the camera horizontal at the frame arm located in front of the whiteboard to capture the undulation of the pin's position. We record the coordinates (latitude and longitude) at each sampling location using a Garmin-64s handheld GPS. At each sampling location, we have also measured the surface soil moisture using TDR-Probe (Theta-Probe). Before taking the measurements, we have calibrated the Theta-Probe (ML3 sensor) for our field using the procedure described in Singh et al. [56].
We then process the photograph that was automatically taken to detect surface roughness in MATLAB ® considering the red tip of the pin as a reference ( Figure 3). In doing so, we first calibrate the image to its natural size using the scale embedded on the instrument. We identify the red tips of the pins and digitise them. Once the photographs are processed, we compute the RMS surface roughness according to: where n p is the number of vertical pins, x p is the recorded height of p th pin, and x is the average height. The average values of the surface roughness and measurement directions are listed in Table A1 (Appendix A). Additionally, based on these values, we have classified the roughness into four major classes on the Kosi fan: stubble field, harrowed field, ploughed field, and furrow field ( Figure 4).

Support Vector Regression
We use SVR algorithms to estimate soil roughness from satellite images. We preferred the interpretable regression-based machine-learning algorithms over the black-box models [66]. In black-box machine learning models, the prediction processes are not clear, whereas in the interpretable models, it is clear how predictions are made. Recently, the use of interpretable models has increased in machine learning [67].
The objective of a regression-based machine learning model is to obtain mapping functions that can predict the response variables. Parameters of such mapping functions are obtained from the training data. They are the initial data used to train the algorithms by fitting and tuning the parameters of a mapping function. This is usually complemented by a set of unseen datasets called the testing data and used to validate the trained machine-learning model. The SVR models are widely used to solve various problems in earth sciences, such as real-time flood stage forecasting, snow-depth retrieval, drought prediction, and landuse/landcover change analysis. [68][69][70][71][72][73][74][75][76]. The SVR has an excellent generalisation capability with optimal accuracy that makes it applicable to the solution of various problems in earth sciences, image processing, wireless sensor networks, and blockchain [77][78][79][80].  Vapnik [81] introduced the support vector machine in statistical learning theory. Support vectors that deal with the regression are known as support vector regression [82]. A mathematical explanation of the SVR is provided below.
Given a sample set, S = [(x 1 ,y 1 ), (x 2 ,y 2 ), ..., (x i ,y i ), ..., (x N ,y N )], where x i is the Ndimensional input data, y i is the corresponding output variable, and N is the total samples.
Using SVR, we can estimate the dependent variable, y(x) for a given independent variable, x, according to Equation (2); where φ(x) is the high-dimensional feature spaces that are non-linearly mapped with respect to the independent variable, x. The coefficients w and b are the weight vector and bias term, respectively. The value of w and b are obtained by minimisation of the empirical risk function R(C) according to Equation (3): where L (y(x), y i ) is the loss function ( Figure 5b) given by; Error-support vector Decision boundary The risk function (Equation (3)) is modified by introducing the relaxation variables ξ and ξ * according to: where is the loss factor and C is the penalty factor. To extend the SVR for non-linear function (Figure 5a), the risk function and its constraints can be rewritten in their dual form by introducing the dual set of variables using the Lagrange function.
where α i and α * i are the Lagrange multipliers and K is the kernel function given according to; We used the Homogeneous Polynomial kernel because it is a non-stationary kernel that is well-suited to standardised training data.
where γ and d represent the structural parameter and degree of the Polynomial function respectively. The final risk function for the non-linear SVR reads;

Model Setup
We setup the SVR model to estimate surface roughness. This includes feature generation from the input features, selection of training and validation data, optimisation of the training model, and finally, model evaluation. A detailed flow chart of the model setup is illustrated in Figure 6.
We identified five input features, namely incidence angle, backscatter values (σ 0 ) Sentinel-1 for both (VH and VV) polarisation, NDVI from Sentinel-2, and elevation from the SRTM DEM. The spatial resolution of these features are different, for example; spatial resolution of the processed backscatter value (σ 0 ) of the Sentinel-1 image is available at 60 × 60 m, NDVI at 10 m, and the surface elevation at 30 m. We applied the nearestneighbor algorithm to resample the NDVI and elevation grids at 60 × 60 m; a grid size comparable to the backscatter images. Finally, these input features were used to train the model and predict surface roughness. We evaluated the in-sample error (i.e., MSE) in the input features. They constitute an in-sample error of about 4 × 10 −15 . Furthermore, from a linear combination/fusion of input features, we generated two more features by taking the ratio of VH/VV and difference of VH-VV polarisation [83]. All seven features together reduce the in-sample error (MSE) to 1.8 × 10 −22 . Now, we individually evaluate the relative importance of input features through regression tree ensemble in the prediction of surface roughness. We applied the Least-Squares Boosting (LSBoost) ensemble aggregation method with tree learner for 100 learning cycles to train the regression ensemble. We then calculated the predictor importance by adding changes in the MSE (created due to a split in the tree learner) and normalised it using the total branch nodes. The higher value of this ratio corresponds to high importance for the ensemble. Furthermore, we estimated predictive measures of association (i.e., feature association matrix) through Pearson's linear correlation approach. The input features of a machine learning model should not be highly correlated. This makes the machine learning models unstable and highly sensitive [84].  We applied the Principal Component Analysis (PCA) to select the most uncorrelated information from the feature data. We selected the first five principal components of the feature data. This explains about 99% variance and reduces the time and space complexity required for the training and evaluation of the model. We then standardised the features to find the optimal scaling methods to predict surface roughness. We applied different methods, i.e., Not Standardised (NS), Center Mean (CM), Z-score Mean (ZM), Min-Max (MM), and Scale (S), to standardise our input features. Table 2

Method Formulation Description
Not Standardised It converts the features to a common scale with zero mean and unit standard deviation. It has same skewness and kurtosis as that of original data. Note: x s is the standardised data, x is the original data, x is the mean of x, σ is the standard deviation of x, x min is the minimum value of x and x max is the maximum value of x.

Hyperparameter Optimisation
The hyperparameters ( and C) of the SVR model determine the predictive efficiency and training error in the model. A condition where the overall residual is greater than , the hyperparameter C penalises the model output. A lower value of C results in computational complexity and a higher value of C in model under-fitting. To overcome these problems, we use the universal grid search algorithm to optimise the hyperparameter (C) by keeping the fixed in the SVR model. The Gird search algorithm needs an objective function to estimate the optimal value of the hyperparameter. In a regression model, the MSE is the most commonly used objective function [85]. The objective function (MSE) reads; where n is the size of testing samples, Y i is the observed values, Y i is the predicted values. The grid search algorithm minimises the objective function to find the optimal value of C. Table 3 reports the optimal values of hyperparameters.

Parameters Values
Penalty factor (C) 0. The bar plot (Figure 8a) shows relative importance of our input features. A feature that has higher relative importance score is considered more important in the model for predicting surface roughness. Among all the input features we used for training SVR models, the incidence angle has the lowest, and DEM has the highest feature importance score. This indicates incidence angle has less impact, and DEM has more impact in predicting the surface soil roughness. Additional features (VH/VV and VH-VV) generated from the linear data fusion have more importance than the native features. It is important to highlight that the feature importance only calculates the relative importance, without the segregation of irrelevant and relevant features [86]. It is a common practice in machine learning to eliminate noisy, redundant, and irrelevant features, as they can reduce the prediction accuracy and increase the computational cost.
We applied the backward elimination to identify the irrelevant features in our model [87][88][89][90][91]. It is an iterative approach that eliminates the least important features and re-calculates the model loss (i.e., MSE) in each iteration. If the model loss for a feature decreases, we consider that feature irrelevant and eliminate it from the computation. This process continues until the model loss is constant. Alternatively, if the model loss increases, that feature is relevant and the process is terminated. As the incidence angle has the least feature importance score, we eliminated it using the backward elimination approach and re-calculated the model loss. We observed that, after the elimination of incidence angle, the model loss increased (MSE = 7.9 × 10 −16 ). This indicates that the incidence angle is not an irrelevant feature in the prediction of soil roughness. Similarly, the elimination of other features degrades the result. This suggests that all seven features are relevant in predicting surface soil roughness. Among them, DEM is the most important. Figure 8b shows the Pearson correlations of the features. All the input features are uncorrelated, which suggests the good reliability of our model.

Feature Sensitivity
Feature importance does not examine if a feature is positively or negatively affecting the models. To evaluate this, we performed a sensitivity analysis of our features when predicting the surface roughness. We generated the Partial Dependence Plot (PDP) [92] of each feature with their corresponding histogram (Figure 9). PDP measures the average marginal effect of all features on the predicted variable. The PDP of DEM shows a high partial dependency for surface roughness (Figure 9). The incidence angle and DEM are positive, whereas the NDVI has a negative effect on the surface roughness. The backscatter values (σ 0 ) for VV and VH/VV have a fluctuating positive, and VH-VV has a fluctuating negative effect. We did not observe any clear trend for VH. Furthermore, to explore the localised explanation of individual features at each instance, we plotted the Individual Conditional Expectation (ICE) in the same plot [93]. This is considered a non-linear sensitivity analysis that disaggregates the averaging effects and evaluates the model at each instance. The average of all the ICE lines provides the PDP plot [94][95][96]. The averaging effect of PDP conceals any heterogeneous relationship present at any particular instance. For example, some instances in the ICE of DEM (between 50 and 55 m) behave differently compared to the majority instances (i.e., PDP).

Surface Roughness
We used the different variants of our SVR model to estimate surface roughness and compared the result with the ground measurement. We plot the model surface roughness against the in-situ values ( Figure 10

Comparison with the Benchmark Machine Learning Models
Our results show that the PCA-MM-SVR predicts surface soil roughness more accurately compared to the other variants of SVR models. However, any conclusion based solely on a comparison of the different variants of the same machine-learning model may lead to a biased result. To overcome this, we compare the performance of the SVR model with the benchmark machine-learning algorithms (i.e., GPR, GRNN, BDT, Bragging Ensemble Learning, and Boosting Ensemble Learning). Apart from these benchmark algorithms, we also compared the performance of the SVR models with the recently emerged automated machine learning (AutoML) algorithms [97]. An AutoML module is embedded in MATLAB ® and can be accessed through fitrauto library. It automatically selects the regression model (i.e., SVR, GPR, linear regression, BDT, and ensemble learning) with optimised hyperparameters. This uses Bayesian optimisation to iteratively tune the model through parallel computing by assuming log(1 + MSE) as an objective function.
In machine learning, it is customary to use R and RMSE values to evaluate the model performance. These metrics are suitable for estimating the accuracy of a single model, but not for comparing the performance of different machine-learning models [98][99][100]. To ensure a fair evaluation, we use performance metrics such as: Akaike's Information Criterion (AIC), corrected AIC (AICc), and Bayesian Information Criterion (BIC) [101] (Appendix B). These metrics penalise the model for a higher number of model parameters to select the best model [102,103]. The model with a lower value of AIC, AICc, and BIC is preferred. Table 4 reports the performance of other machine learning models, evaluated using the same training and testing datasets. The SVR performs relatively well compared to the other machine-learning models. The PCA-MM-SVR exhibits the lowest values of AIC, and BIC amongst all other methods; this indicates the best goodness-of-fit. We also evaluate the model performance in terms of computational time complexity (using CPU @ 3.3 GHz, 10 cores). We observed that the computational time complexity of PCA-MM-SVR is optimal. The time complexity of the non-standardised variant of SVR (i.e., PCA-NS-SVR) is relatively high.  Other than SVR, the GRNN performs well and ranks second in terms of performance evaluation. In GRNN, the mix-max scaling variant (i.e., PCA-MM-GRNN) performs better (R = 0.67, RMSE = 0.04 cm, AIC = 509.3, AICc = −356.7, and BIC = 1241). This model is computationally more efficient than its other variants. Among the AutoML variants, we found that PCA-ZM-AutoML ranks third in terms of performance evaluation by outperforming all other variants, with R = 0.59, RMSE = 0.20 cm, AIC = 291.05, AICc = −225.70, and BIC = 784.04. We observed that the time complexity of mix-max and non-standardised scaling variant behave in a similar way for all the machine-learning models. The performance of mix-max scaling is the best in terms of computational time complexity and relatively low for the non-standardised scaling.

Comparison with Backscatter Models
We also compared the PCA-MM-SVR model's result with the different empirical, semi-empirical backscatter, and regression models. We applied modified Dubois, modified Oh 2002, and modified Oh 2004 models [28][29][30] to estimate the surface roughness of the Kosi Fan from dual polarised Sentinel-1 images [56,104,105]. We inverted these models to obtain surface roughness from single co-polarised (i.e., VV) and single cross-polarised images (i.e., VH), and in-situ soil moisture. The inversion of different backscatter models is explained in Appendix C. Surface roughness can be estimated from SAR images by using calibrated regression curves. For example, Srivastava et al. [43] have proposed that the empirical coefficients of the linear regression model retrieve surface roughness for the Indian soils from the multi-polarized Envisat-1 ASAR images. We use this regression model to estimate the surface soil roughness of the Kosi Fan.
The surface roughness from backscatter and regression models is subjected to systematic errors and model biases. To obtain a fair comparison between the SVR, backscatter, and regression models, we use un-bias RMSE (ubRMSE) instead of RMSE. Table 5 reports a comparison of modelled surface roughness with the ground measurements. Among all the models discussed above, the PCA-MM-SVR machine learning performs better (R = 0.75, ubRMSE = 0.08 cm, and MSE = 0.04 cm 2 ). The accuracy of SVR is relatively high compared to the backscatter and regression models. This is probably because the SVR considers a higher number of input features to predict the soil roughness as compared to the backscatter and regression-based models.

Surface Roughness of the Kosi Fan
Finally, we used the PCA-MM-SVR model to predict the surface roughness of the Kosi Fan for two consecutive satellite passes (11-12 and 17-18 December 2019) of the Sentinel-1. Figure 11 shows the spatial and temporal variation in surface roughness and its anomaly on the Kosi Fan Surface. The time difference between the two consecutive passes of Sentinel-1 A and B satellites is six days; we do not expect much change in the surface roughness. This is clearly reflected by the cross-section profiles drawn at a common region of the surface roughness maps of two different dates ( Figure 11). We observed the negative surface roughness anomalies where surface roughness values were less than 1.5 cm and positive anomalies where the surface roughness values were greater than 1.5 cm.
Interestingly, we observed some spatial patterns in the surface roughness of the Kosi Fan. Visually, it appears that the surface roughness is high near the apex of the fan and decreases towards the toe. Based on the elevation variation, we categorised the fan surface into proximal (110-70 m) , middle (70-50 m), and distal (50-30 m) part. We drew the surface roughness profile along a longitudinal transect from the apex to the fan toe ( Figure 12). We can clearly see that the surface decreases non-linearly (approximated using a nonlinear second-order polynomial equation) along the transect. The histogram of surface roughness in the proximal, middle, and distal parts appears to be normally distributed. We found a mild decrease in the average surface roughness from the proximal (1.7 ± 0.5 cm), middle (1.2 ± 0.3 cm), to the distal (0.9 ± 0.2 cm) part of the Kosi Fan. This is consistent with the values measured in the field. Further, it is important to note, on the Kosi fan, that the elevation (110-30 m) and median grain size (300-100 µm) gradually decreases from the proximal to distal part. This indicates a possible control of the grain size of the soil sediments and elevation on the surface roughness.
Further, we observed that the dependency of the surface soil roughness on different features is highly dynamic and unclear. We observed no clear trend between the surface soil roughness and the features (Figure 9). However, an overall trend (or impact) is visible, with few features. For example, we observed a positive impact of DEM and incidence angle with the surface soil roughness. This observations are in consistent with the recent studies [40,106].

Sensitivity Analysis
Finally, we performed a sensitivity analysis of the PCA-MM-SVR machine learning model to assess the impact of individual features on the surface soil roughness. At every iteration, we estimated surface roughness by introducing a small uncertainty (±5% and ±10%) in any one input feature at a particular time and keeping the remaining features constant. This was carried out for all the input features and the results were compared. We observed that an introduction of ±10% errors in the input features of the PCA-MM-SVR model resulted in an approximately ±1% change in the output (Figure 13).

Conclusions
We compared the accuracy of surface roughness estimated from SVR models with six different benchmark machine-learning algorithms (i.e., GPR, GRNN, BDT, BAGG, BOOST, and AutoML) and three backscatter models (modified Dubois, modified Oh 2002, andmodified Oh 2004). We conclude that the PCA-MM-SVR model outperforms all the different variants of SVR, different benchmark machine learning, backscatter, and empirical regression models in terms of accuracy and computational time complexity. The PCA-MM-SVR model is relatively more sensitive to uncertainly in the VV polarisation as compared to the other input features. On the Kosi Fan, the surface roughness appears to be more in the proximal and decreases gradually towards the distal part of the fan. Although it is not clear at this stage, we suspect that this could be associated with the elevation (110-30 m) and median grain size variation (300-100 µm) from proximal to the distal part of the fan.
This study provides a robust approach to estimate surface soil roughness from optical and SAR remote-sensing data. A comprehensive work using multiple SAR sensor data fusion may be examined in the future to assess the prediction of surface roughness using different machine-learning models. The result of this study can be used in various applications, such as: to study soil erosion, surface soil moisture, infiltration, overland flow, sediment detachment, and many other applications in earth sciences. Data Availability Statement: All the satellite images used in this study are publicly available. We have provided the in-situ measurements in Table A1. The algorithms used in this study can be made available upon a reasonable request to the corresponding and first author.

Appendix A
Surface roughness measurements on the Kosi Fan (Table A1).

Appendix B
The performance metrics are estimated according to; where, SSE is the sum of squares of errors, SST is the sum of squares of total, y obs is the observed or in-situ values, and y sat is the satellite derived or predicted values. AIC = n train · ln SSE n train + 2 · p (A5) BIC = n train · ln SSE n train + p · ln (n train ) (A6) AICc = n train · ln SSE n train + (n train + p) where n train is the number of training samples and p is the number of parameters that the machine learning model evaluates internally.
where A and B are the empirical constants.