Optimizing Soil Moisture Retrieval: Utilizing Compact Polarimetric Features with Advanced Machine Learning Techniques

Abstract: Soil moisture plays a crucial role in various environmental processes and is essential for agricultural management, hydrological modeling, and climate studies. Synthetic Aperture Radar (SAR) remote sensing presents significant potential for estimating soil moisture due to its ability to operate in all weather conditions and provide day-and-night imaging capabilities. Among the SAR configurations, the Compact Polarimetric (CP) mode has gained increasing interest as it relaxes system constraints, improves coverage, and enhances target information compared to conventional dual polarimetric SAR systems. This paper introduces a novel approach for soil moisture retrieval utilizing machine learning algorithms and CP SAR features. The CP SAR features are derived from a series of RADARSAT Constellation Mission (RCM) CP SAR imagery acquired over Canadian experimental sites equipped with Real-Time In Situ Soil Monitoring for Agriculture (RISMA) stations. This study employs a diverse dataset of compact polarimetric SAR features and corresponding ground truth soil moisture measurements for training and validation purposes. Using an optimized Ensemble Learning Regression (ELR) with a decision-tree-based model, we obtained a Root Mean Square Error (RMSE) of 6.88% with a coefficient of determination R² equal to 0.60, which corresponds to a correlation R between true and predicted soil moisture values of 0.75. These results improved to an RMSE of 5.67% and an R² equal to 0.73 (R = 0.85) using an optimized Gaussian Process Regression (GPR) model.


Introduction
Synthetic Aperture Radar (SAR) has emerged as a compelling technology for high-resolution soil moisture content observation [1][2][3][4]. SAR remote sensing has been utilized in soil moisture retrieval using conventional single, dual [1], or full polarimetric SAR imagery [4]. The observation of temporal and spatial patterns of soil moisture holds significant importance for agriculture, as it is linked to crop health, drought and flood risk, and water supply management. The RADARSAT Constellation Mission (RCM) is a Canadian SAR mission launched in 2019 as a successor to the RADARSAT-1 and RADARSAT-2 satellites. The RCM not only ensures C-band SAR data continuity for RADARSAT-2 users but also introduces new applications made possible through the constellation approach [5,6]. A unique feature of RCM is its Compact Polarimetric (CP) SAR configuration, offering operational CP SAR imagery in all imaging modes. The CP option in the RCM is achieved through the transmission of a right-circularly polarized radar signal and the reception of two mutually coherent orthogonal horizontal (RH) and vertical (RV) linear polarizations [7]. Limited research has explored the use of CP SAR imagery for soil moisture retrieval. The pioneering study in this field [8] proposed a two-component model of the polarimetric coherency matrix for estimating soil moisture over bare soil. Another study [9] simulated long-wavelength P-band CP measurements and found a slight degradation in soil moisture estimation compared to conventional co-polarized HH (horizontal transmit and receive) and VV (vertical transmit and receive) data.
In [10], the sensitivity of simulated RH and RV data to the soil moisture content was examined under constant surface roughness conditions. Evaluating the potential of simulated CP SAR data for soil moisture content estimation in the presence of vegetation, [11] developed and assessed a time series data cube retrieval algorithm. They found a minor degradation in the soil moisture content estimation using CP SAR data compared to the full polarimetric SAR. Furthermore, [12] developed a semi-empirical model for soil moisture estimation using CP SAR imagery acquired by the RISAT-1 mission. That study focused on soil moisture retrieval at a high radar incidence angle. The capability of RISAT-1 for soil moisture retrieval was also investigated in [13], where the retrieval was achieved through a methodology that combines a data decomposition method and a surface component inversion.
The potential of the RCM for soil moisture retrieval was investigated by [14]. They simulated CP SAR data from the ScanSAR medium resolution 30 m (SC30M) and 50 m (SC50M) imaging modes, and used the Integral Equation Model (IEM) calibrated for RH and RV for the soil moisture retrieval approach. The results showed a promising performance of the RCM, with a correlation of over 0.70 between the measured and predicted soil moisture and an unbiased Root Mean Square Error (ubRMSE) better than 6%. Confirming the potential of the RCM for soil moisture monitoring, [15] simulated and analyzed a set of CP features for their sensitivity to soil moisture. That study achieved a correlation of over 0.80 and an RMSE better than 6% between the measured and predicted soil moisture using CP features. The correlation further improved to over 0.90 (RMSE < 5%) when combining both linear and CP features. The first study to investigate soil moisture retrieval by means of RCM CP imagery was presented in [16]. The study focused on the potential of the primary RCM intensity products of RH and RV for soil moisture retrieval using several Machine Learning (ML) models. The results indicated that, with data augmentation, the Gaussian Process Regression (GPR) achieves the best prediction performance, with RMSE = 4.05% and R² = 0.81.
The innovative characteristic of this study lies in the fact that it is the first to explore the potential of numerous CP features extracted from RCM CP imagery for soil moisture retrieval. A framework has been developed for the optimal selection of CP features. Through the implementation of the framework, a subset of CP features is extracted, consisting of less correlated CP features significant for soil moisture retrieval. In our study, two ML models are developed for the soil moisture retrieval, based on GPR and Ensemble Learning Regression (ELR). The Bayesian optimization strategy is employed for fine-tuning the hyperparameters of both models. Multiple combinations of CP features are used as input features for the training and testing of both ML models. The performance of both ML models is evaluated repeatedly, and the subset of CP features with the lowest RMSE and the highest coefficient of determination R² is identified.

Theoretical Background
The scattering vector for a compact SAR configuration transmitting right-circularly polarized signals and coherently receiving linear (horizontal and vertical) backscattered signals is given by $\mathbf{k} = [E_{RH} \;\; E_{RV}]^{T}$ (1), where T denotes the transpose operator and $E_{RH}$ and $E_{RV}$ are the complex elements of the scattering vector, defined in (2). From (2), three CP features can be obtained: the backscattering coefficients $\sigma_{RH}$ and $\sigma_{RV}$ and the phase difference delta between the RH and RV channels [10]. From (2), one can also calculate the linear polarization ratio of the backscattering coefficients $\sigma_{RH}$ and $\sigma_{RV}$. Considering a right-circular transmission, the two opposite circular receptions (RR and RL) can be synthesized from (1), as given in (4) [9]. From (4), one can calculate the circular polarization ratio of the backscattering coefficients $\sigma_{RR}$ and $\sigma_{RL}$. In [17], the four Stokes elements s0, s1, s2, and s3 are defined from the elements of the scattering vector, where 〈…〉 denotes spatial ensemble averaging and * denotes the complex conjugate.
Re and Im are the real and imaginary parts of a complex number. s0 is equal to the total average received power, s1 is equal to the power in the linear horizontal (s1 > 0) or vertical (s1 < 0) polarized components, s2 is equal to the power in the linearly polarized components at a tilt angle of 45° (s2 > 0) or a tilt angle of 135° (s2 < 0), and s3 is equal to the average power received in left-circular (s3 > 0) or right-circular (s3 < 0) polarization. From the elements of the Stokes vector, the degree of polarization (m), the degree of linear polarization (ml), and the degree of circular polarization (mc) can be estimated [18]. The features m, ml, and mc take values between 0 and 1, indicating a completely depolarized and a completely polarized returned signal, respectively.
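As an illustration of the Stokes elements and polarization degrees described above, the following minimal Python sketch computes them from paired complex RH and RV samples. It assumes the commonly used convention s2 = 2 Re〈E_RH E_RV*〉 and s3 = −2 Im〈E_RH E_RV*〉, and it assumes mc = |s3|/s0; the function names and both conventions are illustrative assumptions, not necessarily the exact definitions used in [17,18].

```python
import math


def stokes_parameters(e_rh, e_rv):
    """Return (s0, s1, s2, s3) from paired complex RH/RV samples.

    The spatial ensemble average is approximated by the mean over the
    samples of one averaging window.
    """
    n = len(e_rh)
    if n == 0 or n != len(e_rv):
        raise ValueError("e_rh and e_rv must be non-empty and equally long")
    p_rh = sum(abs(a) ** 2 for a in e_rh) / n   # <|E_RH|^2>
    p_rv = sum(abs(b) ** 2 for b in e_rv) / n   # <|E_RV|^2>
    cross = sum(a * b.conjugate() for a, b in zip(e_rh, e_rv)) / n  # <E_RH E_RV*>
    s0 = p_rh + p_rv            # total average received power
    s1 = p_rh - p_rv            # horizontal (>0) vs. vertical (<0)
    s2 = 2.0 * cross.real       # 45 deg (>0) vs. 135 deg (<0)
    s3 = -2.0 * cross.imag      # assumed sign convention for circular power
    return s0, s1, s2, s3


def polarization_degrees(s0, s1, s2, s3):
    """Return (m, ml, mc); the form of mc is an assumption of this sketch."""
    m = math.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0
    ml = math.sqrt(s1 ** 2 + s2 ** 2) / s0
    mc = abs(s3) / s0
    return m, ml, mc


# Two samples whose cross-products cancel: a fully depolarized return (m = 0).
s = stokes_parameters([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j])
m, ml, mc = polarization_degrees(*s)
```

A single sample (no averaging) always yields m = 1, which is why the ensemble average over a window is needed before the degree of polarization becomes informative.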
Another parameter named alpha, which is related to the ellipticity of the compact scattered wave, can also be derived [19]; it ranges between 0° and 90°. Two methods are widely used to decompose CP SAR imagery into scattering mechanisms, namely the mchi and mdelta decompositions. The mdelta decomposition method is based on the degree of polarization m and the phase difference delta [10]. It yields three components: mdelta_vol is related to volume scattering, mdelta_surf is related to surface scattering, and mdelta_dbl is related to double-bounce scattering. The mchi decomposition method is based on m and the degree of circularity sin 2χ = −s3/(m s0). It likewise yields three components: mchi_vol is related to volume scattering, mchi_surf is related to surface scattering, and mchi_dbl is related to double-bounce scattering. One should note that the volume scattering formulas of the mchi and mdelta decompositions are identical.
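To make the mchi decomposition concrete, here is a minimal sketch built on the degree of circularity sin 2χ = −s3/(m s0) stated above. The square-root (amplitude) form of the three components follows a widely used formulation of the m-chi decomposition and is an assumption here; the source may define the components as powers instead. The function name is hypothetical.

```python
import math


def m_chi_decomposition(s0, s1, s2, s3):
    """Return (vol, dbl, surf) components from the Stokes elements.

    Assumed form: vol = sqrt(s0 (1 - m)),
                  dbl = sqrt(s0 m (1 + sin 2chi) / 2),
                  surf = sqrt(s0 m (1 - sin 2chi) / 2).
    """
    m = math.sqrt(s1 ** 2 + s2 ** 2 + s3 ** 2) / s0
    # Degree of circularity; clamp to [-1, 1] to absorb rounding errors.
    sin_2chi = -s3 / (m * s0) if m > 0 else 0.0
    sin_2chi = max(-1.0, min(1.0, sin_2chi))
    vol = math.sqrt(max(0.0, s0 * (1.0 - m)))
    dbl = math.sqrt(s0 * m * (1.0 + sin_2chi) / 2.0)
    surf = math.sqrt(s0 * m * (1.0 - sin_2chi) / 2.0)
    return vol, dbl, surf


# Partially polarized example with arbitrary illustrative Stokes values.
vol, dbl, surf = m_chi_decomposition(4.0, 1.0, 1.0, -1.0)
```

Under this assumed form the squared components sum to s0, so the decomposition partitions the total received power among the three scattering mechanisms.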
The coherency matrix T2 of the scattering vector in (1) can be used to define the Shannon Entropy (SE), which is the sum of an intensity component and a polarimetric component. The intensity component of the SE (SEI) [20] is a function of Tr(T2), where Tr(.) denotes the matrix trace. The SEI depends only on the total backscattered power and is therefore a rescaled version of s0. The polarimetric component of the SE (SEP) [20] is a function of det(T2) and Tr(T2), where det(.) denotes the determinant of the matrix. The SEP depends on the Barakat degree of polarization. A coherency parameter (mu) can also be extracted from the elements of the Stokes vector.
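For reference, a commonly used form of the Shannon entropy components for a two-channel (2 × 2) coherency matrix is written out below. It is a sketch under the assumptions that T2 is the unnormalized ensemble average of the outer product of the scattering vector and that the decomposition follows the standard intensity/polarimetric split; the exact normalization used in [20] may differ.

```latex
\begin{aligned}
T_2 &= \left\langle \mathbf{k}\,\mathbf{k}^{*T} \right\rangle, \qquad
\mathrm{Tr}(T_2) = s_0, \qquad
\det(T_2) = \tfrac{1}{4}\left(s_0^2 - s_1^2 - s_2^2 - s_3^2\right), \\
SE_I &= 2\log\!\left(\frac{\pi e\,\mathrm{Tr}(T_2)}{2}\right), \\
SE_P &= \log\!\left(\frac{4\det(T_2)}{\mathrm{Tr}(T_2)^2}\right) = \log\!\left(1 - m^2\right), \\
SE &= SE_I + SE_P = \log\!\left(\pi^2 e^2 \det(T_2)\right).
\end{aligned}
```

Under these assumptions, SEI depends only on s0, while SEP depends only on the degree of polarization m, which is the two-channel case of the Barakat degree of polarization.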

Experimental Sites and Data Availability
We selected two well-known Canadian experimental sites located in Ontario and Manitoba. Both sites are equipped with Real-Time In Situ Soil Monitoring for Agriculture (RISMA) stations. These stations include Stevens HydraProbe sensors that record the soil temperature and the real part of the dielectric constant, which is converted to a volumetric soil moisture value [21]. The first site is situated within the South Nation River watershed, in close proximity to the town of Casselman, southeast of Ottawa. This site encompasses one RISMA network with four stations (Figure 1). The second site comprises two RISMA networks located in southern Manitoba. The first network includes nine RISMA stations situated near the towns of Carman and Elm Creek, southwest of Winnipeg. The second network consists of three stations, located immediately northwest of Winnipeg within the Sturgeon Creek watershed (Figure 1). Both test sites share a common characteristic of intensive agricultural activity, predominantly focusing on annual crops [22].
In our study, we considered the integrated soil moisture from a depth of 0 to 5 cm measured by the RISMA stations. However, during the early spring thaw, we switched to the measured soil moisture at a 5 cm depth instead. The reason for this change is that the 0-5 cm sensor probes, which are inserted vertically into the soil surface, are affected by frost during the thaw, causing them to be partially pushed out of the ground. As a result, the exposed probe tines interact with the air, leading to lower dielectric values. This, in turn, causes an underestimation of the integrated soil moisture measured from a depth of 0 to 5 cm. To address this issue, Agriculture and Agri-Food Canada (AAFC) conducts the necessary maintenance of the stations annually by resetting the surface probes that might have been displaced, usually before the middle of May.
The experimental dataset consisted of 31 RCM images acquired using the SC30M Compact Polarimetric (SC30MCP) imaging mode with a 30 m spatial resolution over the two chosen sites. The RCM images were acquired during the spring (15 April-28 June) and fall (15 September-27 October) of 2022. Consequently, the fields were characterized by unvegetated bare soil with a relatively smooth random roughness state. Under these conditions, the backscattered radar signal should be associated with the real part of the soil's dielectric constant. Additionally, the weather data gathered by the RISMA stations were utilized to verify the absence of snow and frozen soil conditions during spring and fall.

SAR Processing
The acquired RCM images were processed as Multi-Look Complex (MLC) products, providing the backscattered signal amplitude and phase information. To minimize speckle noise, a 3 × 3 boxcar filter was applied to the acquired images. Next, we extracted the 25 CP features described in Section 2. All extracted features were sampled at the location of the RISMA stations. Subsequently, an experimental sample dataset was constructed consisting of 236 samples. Each sample corresponds to the values of the extracted CP features at the location of a RISMA station, as well as the local radar incidence angle and the average recorded soil moisture content at the time of the SAR image acquisition.
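The boxcar filtering step can be sketched as a plain moving average over a 3 × 3 window. The sketch below is illustrative only: the border handling (truncated windows) and the function name are assumptions, and in practice the averaging is applied to power-like or covariance-type quantities rather than to raw phase values.

```python
def boxcar_filter(image, size=3):
    """Mean filter over a size x size window.

    `image` is a list of equally long rows (real or complex values).
    Windows are truncated at the image borders (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


# A single bright pixel is spread over its 3 x 3 neighbourhood.
filtered = boxcar_filter([[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]])
```

Sampling a feature at a station then amounts to reading the filtered value at the pixel that contains the station coordinates.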
The constructed experimental dataset of our study is characterized by a variety of soil moisture conditions, ranging from 4.9% (very dry conditions) to 51.3% (very wet conditions). However, most of the dataset samples had medium soil moisture values in the range of 20-40%. The minimum radar incidence angle in the sample dataset is 21.9°, while the maximum is 41.4°. This range was chosen intentionally, following a recommendation by the RCM's calibration and validation team, which confirmed the minor impact of the imperfect emitted RCM circular polarization signal, triggered by a dissimilarity between the H and V antenna gains, for radar incidence angles between 20° and 43°. Within this range, the axial ratio of the transmitted signal is < 0.5 dB [16].

Feature Selection
In this study, we propose a strategy to select input features that have low correlation with each other and high importance for soil moisture retrieval. This strategy is depicted in the flowchart in Figure 2. At first, we calculated the absolute value of the Spearman Correlation Coefficient (SCC) for all pairs of input CP features. The SCC, a nonparametric correlation coefficient with a range of −1 to 1, measures the monotonic relationship between variables and therefore requires no assumptions about the statistical distribution of the CP features. Next, the Univariate Feature Ranking F-test was applied to examine the importance of the CP features. This statistical test compares the null hypothesis that the response values grouped by the predictor variable values are drawn from populations with the same mean against the alternative hypothesis that the population means differ. A test statistic with a small p-value indicates that the corresponding predictor is important. An Importance Score (IS) is calculated as IS = −log(p). Therefore, a high IS indicates that the associated predictor is significant. In this work, each CP parameter was treated as a predictor for soil moisture by testing whether the variations in soil moisture as a dependent variable were accounted for by the predictor as an independent variable. Figure 3 shows the ranking of the features according to the F-test statistics.
To implement the proposed feature selection strategy, we selected a threshold for the absolute SCC to categorize each pair of predictors as either strongly or weakly correlated. Herein, the absolute SCC threshold was set to 0.90. If the absolute SCC between two predictors was ≥ 0.90, we assumed the two predictors to be strongly correlated [23][24][25]. In this case, we compared their F-test importance scores and selected the feature with the higher importance score to be included in the output feature set (OF-Set). On the other hand, if the correlation value was < 0.90, we did not perform the IS test, and the feature was added directly to the OF-Set. The feature selection process started with two initial predictors and the radar Incidence Angle (IA) to form the first input feature set, denoted as IF-Set1. In each subsequent step, one additional input feature was incorporated. For instance, in the second step, we introduced IF-Set2, comprising the output feature set from the preceding step, OF-Set1, and the newly introduced input feature. This iterative procedure involved conducting correlation tests and, if necessary, applying the F-test, and it continued until all input features had been examined. Each OF-Set was used for the training and validation of the GPR and ELR models.
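The iterative selection procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the importance scores are passed in as precomputed values, the sample values are synthetic, and the rule that a new feature replaces its strongly correlated rivals only when it outscores all of them is an assumption about how ties among several rivals are resolved. The IA, which is always retained, is left out for brevity.

```python
import math


def _ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def select_features(features, importance, order, threshold=0.90):
    """Return the list of OF-Sets, one per step (OF-Set1 first).

    features: name -> list of sample values; importance: name -> IS;
    order: feature names in the order they are introduced.
    """
    kept, of_sets = [], []
    for step, name in enumerate(order):
        rivals = [k for k in kept
                  if abs(spearman(features[name], features[k])) >= threshold]
        if not rivals:
            kept.append(name)                      # weakly correlated: keep
        elif all(importance[name] > importance[k] for k in rivals):
            kept = [k for k in kept if k not in rivals] + [name]
        # otherwise the new feature is discarded
        if step >= 1:                              # the first step uses two features
            of_sets.append(list(kept))
    return of_sets


# Synthetic values and made-up importance scores, for illustration only.
features = {
    "s3": [1, 2, 3, 4, 5, 6],
    "mchi_surf": [1, 4, 9, 16, 25, 36],   # monotone in s3, so |SCC| = 1
    "mu": [3, 1, 6, 2, 5, 4],
}
importance = {"s3": 1.0, "mchi_surf": 2.0, "mu": 0.5}
of_sets = select_features(features, importance, ["s3", "mchi_surf", "mu"])
```

With these synthetic inputs, the first output set retains only the more important of the two strongly correlated features, and the weakly correlated third feature is then added unchanged.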

Machine Learning Implementation
In this study, the Gaussian Process Regression (GPR) and the Ensemble Learning Regression (ELR) were utilized for retrieving the soil moisture. The hyperparameters of these ML regressors were fine-tuned using the Bayesian optimization technique. Bayesian optimization determines the hyperparameter values that minimize a loss or objective function [26]. The Mean Squared Error (MSE) between the predicted and actual objective values was used as the loss function in the current study. The Bayesian optimizer uses the expected improvement per second as the acquisition function [27] to determine the next iteration's hyperparameter set. The set of model hyperparameters that minimized the upper confidence interval of the MSE objective function was deemed the optimal set, and the corresponding model was used to predict the soil moisture. An eight-fold cross-validation scheme was employed to train and evaluate the machine learning models. Consequently, the sample dataset was divided into eight subsets. During each iteration of this scheme, the models were trained using seven of these subsets, while the remaining one was held out for testing. The reported performance of the models represents the average performance metrics computed across the test subsets over the eight iterations of the cross-validation scheme. In the following sections, we provide a concise description of the GPR and ELR used in this study to retrieve soil moisture.
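The eight-fold scheme and the averaged metrics can be sketched as below. This is a generic illustration under stated assumptions: the shuffling seed, the helper names, and the `fit` callback interface are hypothetical, and averaging per-fold RMSE and R² is one of several ways the fold results could be aggregated.

```python
import math
import random


def kfold_indices(n_samples, k=8, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validate(x, y, fit, k=8, seed=0):
    """Average test RMSE and R^2 over k folds.

    fit(x_train, y_train) must return a callable predict(x_row).
    """
    scores = []
    for test_idx in kfold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_true = [y[i] for i in test_idx]
        y_pred = [predict(x[i]) for i in test_idx]
        scores.append((rmse(y_true, y_pred), r_squared(y_true, y_pred)))
    return (sum(s[0] for s in scores) / k, sum(s[1] for s in scores) / k)


# With 236 samples, eight folds hold 29 or 30 samples each.
fold_sizes = sorted(len(f) for f in kfold_indices(236))
```

Any regressor can be plugged in through `fit`, so the same loop serves both model families when comparing feature sets.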

Gaussian Process Regression
The GPR is a supervised nonparametric ML technique based on the formation of Gaussian process-based time series prediction models [28]. The GPR model requires several hyperparameters to be specified. These hyperparameters consist of the basis function of the prior mean of the GPR, the kernel function that models the correlation in the response variable, the kernel scale that determines the initial kernel parameters, and the standard deviation (Sigma) of the sample noise. In our research, the Bayesian optimization method selects the optimal hyperparameters from the ranges shown in Table 1. The optimization curve of the best GPR model and its optimal hyperparameters are presented in the next section.
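To show how these hyperparameters enter a prediction, the following minimal sketch implements the posterior mean of a zero-mean GPR with a nonisotropic exponential kernel (one length scale per predictor) and a noise standard deviation. It is a didactic stand-in rather than the toolbox implementation used in the study: all names and the sample values are hypothetical, and the hyperparameters are fixed by hand instead of being tuned.

```python
import math


def ard_exponential(a, b, length_scales, sigma_f):
    """Exponential kernel with one length scale per predictor."""
    r = math.sqrt(sum(((ai - bi) / l) ** 2
                      for ai, bi, l in zip(a, b, length_scales)))
    return sigma_f ** 2 * math.exp(-r)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_gpr(x_train, y_train, length_scales, sigma_f=1.0, sigma_n=0.1):
    """Zero-mean GPR; returns predict(x_row) giving the posterior mean."""
    n = len(y_train)
    gram = [[ard_exponential(x_train[i], x_train[j], length_scales, sigma_f)
             + (sigma_n ** 2 if i == j else 0.0)   # noise variance on the diagonal
             for j in range(n)] for i in range(n)]
    weights = solve(gram, y_train)                  # (K + sigma_n^2 I)^-1 y

    def predict(x_row):
        return sum(w * ard_exponential(x_row, xi, length_scales, sigma_f)
                   for w, xi in zip(weights, x_train))
    return predict


# Hypothetical two-predictor samples and soil moisture values (%).
x_train = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
y_train = [10.0, 20.0, 30.0]
predict = fit_gpr(x_train, y_train, length_scales=[1.0, 2.0], sigma_n=1e-6)
```

With a near-zero noise term the posterior mean interpolates the training targets, whereas far from the training data it falls back to the zero prior mean, which is why the choice of basis function matters for extrapolation.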

Ensemble Learning Regression
The ELR is an ML approach that employs multiple ML models instead of a single model to resolve nonlinear regression problems [29]. An ensemble of decision-tree-based models (weak learners) is generated and integrated to form a robust prediction model. In our study, we investigated Boosted trees and Bagged trees using Bayesian optimization for the regression problem. In the Boosted trees, the ensemble method is Least Squares Boosting (LSBoost) with Regression Tree (RT) learners. In contrast, the Bagged trees use Bootstrap aggregation (Bag) with RT learners. The minimum leaf size, learning rate, number of learners, and number of predictors to sample are the ensemble models' optimizable hyperparameters. The ranges of these hyperparameters to be searched by the Bayesian optimization method are presented in Table 2. The optimization curve of the best ELR model and its optimized hyperparameters are presented in the next section.
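The idea behind least-squares boosting can be sketched with single-split regression trees (stumps) as weak learners: each new learner is fitted to the current residuals and added with a shrinkage factor, the learning rate. This is a simplified illustration under stated assumptions; the study's learners are full regression trees, and the function names, data, and default values here are hypothetical.

```python
def fit_stump(x, residuals, min_leaf=1):
    """Best single-split regression tree under squared error.

    Returns (feature, threshold, left_mean, right_mean), or None when no
    split satisfies the minimum leaf size.
    """
    best = None
    for f in range(len(x[0])):
        values = sorted(set(row[f] for row in x))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(x, residuals) if row[f] <= thr]
            right = [r for row, r in zip(x, residuals) if row[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    return best[1:] if best else None


def fit_lsboost(x, y, n_learners=50, learning_rate=0.1, min_leaf=1):
    """Least-squares boosting: every stump is fitted to the current residuals."""
    base = sum(y) / len(y)                 # initial prediction: the mean
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_learners):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(x, residuals, min_leaf)
        if stump is None:
            break
        f, thr, lm, rm = stump
        stumps.append(stump)
        pred = [p + learning_rate * (lm if row[f] <= thr else rm)
                for p, row in zip(pred, x)]

    def predict(row):
        return base + learning_rate * sum(
            lm if row[f] <= thr else rm for f, thr, lm, rm in stumps)
    return predict


# Hypothetical one-predictor data with a nonlinear response.
x = [[float(i)] for i in range(10)]
y = [float(i * i) for i in range(10)]
model = fit_lsboost(x, y, n_learners=200, learning_rate=0.1)
```

A small learning rate combined with many learners trades training speed for a smoother fit, which is why these two hyperparameters are tuned jointly.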

Results and Discussion
The feature selection process started with two randomly selected features: s3 and mchi_surf. The absolute correlation between s3 and mchi_surf was 0.96, indicating a strong correlation between the two features. We therefore examined the importance scores, finding that the importance score of mchi_surf was higher than that of s3. Consequently, we selected mchi_surf to be included with the IA, forming the first output feature set (OF-Set1), as shown in Table 3. In the second step, the mu feature was added to OF-Set1 to form IF-Set2, which consisted of the mu and mchi_surf predictors, in addition to the IA. The SCC between mu and mchi_surf was low, which indicates low dependency between these two features. Therefore, both features were kept in OF-Set2. In the third step, we compared the absolute SCC values of the mc predictor with those of the features in OF-Set2, specifically mu and mchi_surf. The comparison showed a high correlation between mc and the other two features. Therefore, the IS of mc was assessed in comparison to that of mu and mchi_surf, revealing that mc was of lower importance than both features. Accordingly, mc was discarded, and OF-Set3 had the same features as OF-Set2 (Table 3). All the OF-Sets obtained from repeating this procedure for all CP features are presented in Table 3. Moreover, Table 3 shows the estimated RMSE and R² for each model during the training and validation of the GPR and ELR models in an eight-fold cross-validation scheme using each OF-Set. The highest prediction performance was recorded for the optimized GPR model when trained with OF-Set5. The RMSE and R² values of this model were 5.67% and 0.73, respectively, and the feature set was composed of mchi_surf, mu, delta, SE, and IA. The optimal hyperparameters of this model were a Zero basis function, a Nonisotropic Exponential kernel function, and a Sigma value of 107.69. These hyperparameters were selected at the observed minimum MSE value of 32.14.

Table 3. Performance of the ML soil moisture prediction models trained using the feature sets formed through the proposed feature selection approach. Entries of the best-performing GPR and ELR models are depicted in bold. Feature sets that stayed the same as in the previous step were not used for training or evaluating the ML models; therefore, their performance metric entries are shaded in grey. # indicates the number.

On the other hand, the optimized ELR model achieved its best performance when it was trained with OF-Set11, which includes the features mchi_surf, mu, alpha, σ , s2, s0, and IA. This prediction model recorded RMSE = 6.88% and R² = 0.60. The hyperparameter optimization of this model was obtained at an observed minimum MSE value of 49.3892. The optimal model emerged as an LSBoost ensemble, configured with a minimum leaf size of 1 and a total of 500 learners. The model learning rate was equal to 0.047032, and the number of predictors to sample was equal to 3.
Figure 4 presents the scatter plots, residual plots, and response plots with error bars for the top-performing GPR and ELR models. Additionally, it shows the minimum MSE plots across 30 iterations of the Bayesian-based hyperparameter optimization process. The scatter plots reveal the goodness-of-fit between the true response variable values and the ML prediction models. It is noticeable that a greater number of predicted soil moisture values from the GPR model align with the diagonal perfect-fit line, compared to those predicted by the ELR model. This is consistent with the R² values of the corresponding models, as reported in Table 3 (0.73 for the best GPR model against 0.60 for the best ELR model). The residual plots present the difference between the true and predicted soil moisture values for the input samples (Figure 4). The residual plot of the GPR model shows a random pattern with an error range of ±15% about the horizontal line at 0, with few outliers. This is also the case for the ELR residual plot (Figure 4). The scatter plots, along with the residual plots, give insight into the correlation between the true soil moisture values and the values predicted by both models. This is further indicated in the response plots in Figure 4, which show the true and predicted response values for the data samples. The error bars in the response plots show the difference between the true and predicted values generated by the corresponding ML models. The ELR model achieves a correlation R between true and predicted soil moisture values equal to 0.75, while the GPR achieves a correlation equal to 0.85. From the minimum MSE plots of the ML models, we can see that the optimum ELR hyperparameters recorded an observed minimum MSE of 49.4 at the 15th iteration, while the GPR optimal hyperparameters were recorded at the 19th iteration, with a corresponding observed minimum MSE value of 32.13.

Conclusions
In this study, we investigated the potential of the RCM CP mode for soil moisture retrieval over bare soil. A dataset comprising 25 CP features, along with the radar incidence angles, sampled from acquired RCM images at the location of RISMA stations in Ontario and Manitoba was used. A framework was developed for the optimal selection of CP features. Through the implementation of the proposed feature selection framework, subsets of CP features were extracted, consisting of less correlated CP features significant for soil moisture retrieval. Two ML models were examined for the soil moisture retrieval: GPR and ELR. The Bayesian optimization strategy was employed for fine-tuning the hyperparameters of both models. The results of our study reveal the encouraging performance of the optimized GPR model for soil moisture retrieval using four CP features: mchi_surf, mu, delta, and SE. The GPR model recorded an RMSE value of 5.67% and an R² value of 0.73. The optimized ELR model achieved its highest performance with a combination of six CP features: mchi_surf, mu, alpha, σ , s2, and s0, resulting in RMSE = 6.88% and R² = 0.60. Both models included the radar IA with the CP features. The findings of our study emphasize the potential for soil moisture retrieval through the utilization of the RCM SC30MCP mode in conjunction with ML techniques.

Figure 1. Map of the study area. Red dots indicate the locations of the RISMA stations.

Figure 2. Flowchart of the proposed feature selection approach for Soil Moisture (SM) retrieval.

Figure 3. Plot of the importance score ranking of the CP features according to the F-test statistics.

Figure 4. Scatter and residual plots for the top-performing GPR and ELR models. Right: GPR plots; Left: ELR plots; Upper row: Scatter plots between predicted and true Soil Moisture (SM) values; Second row: Residual plots; Third row: Response plots with error bars between predicted and true SM values; Fourth row: Plots of the minimum MSE versus optimization iterations.

Table 1. The ranges of the GPR optimizable hyperparameters.

Table 2. The ranges of the ensemble model's optimizable hyperparameters, expressed in terms of the number of samples and the number of input predictors.