Article

Soil Moisture Inversion Using Multi-Sensor Remote Sensing Data Based on Feature Selection Method and Adaptive Stacking Algorithm

1
College of Information and Communication Engineering, Dalian Minzu University, Dalian 116600, China
2
National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(9), 1569; https://doi.org/10.3390/rs17091569
Submission received: 22 January 2025 / Revised: 22 April 2025 / Accepted: 25 April 2025 / Published: 28 April 2025

Abstract

Soil moisture (SM) profoundly influences crop growth, yield, soil temperature regulation, and ecological balance maintenance and plays a pivotal role in water resources management and regulation. The focal objective of this investigation is to identify feature parameters closely associated with soil moisture through the implementation of feature selection methods on multi-source remote sensing data. Specifically, three feature selection methods, namely SHApley Additive exPlanations (SHAP), information gain (Info-gain), and Info_gain ∩ SHAP were validated in this study. The multi-source remote sensing data collected from Sentinel-1, Landsat-8, and Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTGTM DEM) enabled the derivation of 25 characteristic parameters through sound computational approaches. Subsequently, a stacking algorithm integrating multiple machine-learning (ML) algorithms based on adaptive learning was engineered to accomplish soil moisture prediction. The attained prediction outcomes were then juxtaposed against those of single models, including Random Forest (RF), Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). Notably, the adoption of feature factors selected by the Info_gain algorithm in combination with the adaptive stacking (Ada-Stacking) algorithm yielded the most optimal soil moisture prediction results. Specifically, the Mean Absolute Error (MAE) was determined to be 1.86 Vol. %, the Root Mean Square Error (RMSE) amounted to 2.68 Vol. %, and the R-squared (R2) reached 0.95. 
The multifactor integrated model that harnessed optical remote sensing data, radar backscatter coefficients, and topographic data exhibited remarkable accuracy in soil surface moisture retrieval, thus providing valuable insights for soil moisture inversion studies in the designated study area. Furthermore, the Ada-Stacking algorithm demonstrated its potency in integrating multiple models, thereby elevating retrieval accuracy and overcoming the limitations inherent in a single ML model.

1. Introduction

Soil moisture (SM) plays a crucial role in the Earth’s hydrological cycle, rendering it of immense significance. It not only influences the allocation and sustainable utilization of water resources but also exerts a profound impact on ecosystems and human livelihoods [1,2,3]. SM datasets have significant application value in water resources management and various other fields, including earth science, hydrology, and agriculture. By monitoring and analyzing SM, SM datasets facilitate a better understanding of surface hydrological processes, water resource distribution, and utilization. They have been instrumental in monitoring fluctuations in groundwater and surface water, as well as in monitoring SM levels in agricultural lands. These datasets play a pivotal role in helping farmers optimize irrigation practices to enhance crop yield and quality. Overall, the utilization of SM datasets has far-reaching implications and significant importance in the realm of water resources management and its associated disciplines [4,5,6,7]. Therefore, it is crucial to collect high-precision, large-scale SM data for efficient agricultural production. Remote sensing techniques in SM retrieval have demonstrated superior performance compared to traditional measurement methods due to their ability to provide rapid and long-lasting coverage of soil surface conditions [8,9,10]. Recent advances in Earth observation technology, including the utilization of active and passive remote sensing imagery, have been dedicated to addressing the challenge of dynamic inversion of SM in farmland [11,12,13,14]. However, in practical implementations, SM demonstrates a high degree of sensitivity to soil properties, land cover, and meteorological conditions, giving rise to substantial spatial heterogeneity. Depending solely on SM retrieval data from a single source proves inadequate for addressing the requirements of real-world applications. In essence, the inversion of SM should incorporate multiple characteristic variables.
In recent years, the utilization of ML techniques to extract feature factors for prediction purposes has gained significant traction. Unlike conventional approaches, ML and deep-learning methods are not reliant on specific assumptions, granting them exceptional capability in capturing nonlinear relationships [15]. This proves particularly valuable as traditional models often struggle to obtain complex parameters, leading to errors in inversion results. By leveraging the nonlinear information conveyed by relevant remote sensing signals, ML methods can yield more accurate SM inversion results. Over the past few years, the field of SM inversion has witnessed a significant increase in the application of ML methods and remote sensing signal multi-feature selection techniques [16]. These approaches not only offer a multitude of characteristic parameters crucial for inversion but also demonstrate a remarkable ability to accurately capture nonlinear relationships by leveraging advanced ML techniques [17]. This distinct advantage enables ML methods to outperform traditional models in the domain of SM inversion by effectively harnessing the nonlinearity inherent in remote sensing signals. Mehdi Jamei et al. employed a multi-variable integrated approach, utilizing Boruta Gradient Boosted Decision Tree (Boruta-GBDT) feature selection, Variational Mode Decomposition (VMD), and advanced ML models, including Bidirectional Gated Recurrent Unit (Bi-GRU), Cascaded Forward Neural Network (CFNN), Adaptive Boosting (AdaBoost), Genetic Programming (GP), and classical Multilayer Perceptron Neural Networks (MLP). They applied this framework to predict surface SM in arid and semi-arid regions of Iran, using NASA’s Soil Moisture Active Passive (SMAP) satellite dataset [18]. Thu Thuy Nguyen et al. presented a novel approach that combines advanced ML models and multi-sensor data fusion to enhance the performance of SM regression.
Specifically, the study utilized the Extreme Gradient Boosting Regression (XGBR) algorithm along with a genetic algorithm (GA) optimizer for feature selection and optimization. This approach aimed to improve SM predictions for bare land areas. The research compared and examined the performance of the proposed approach with various scenarios and ML models [19]. Lin Chen et al. conducted a comparison of three feature selection methods: Pearson correlation, support vector machine recursive feature elimination, and random forest (RF). Additionally, they evaluated the performance of three advanced ML models: support vector regression, RF, and gradient boosting regression tree. The objective of the study was to estimate SM during the winter wheat growing season [20]. Zhang et al. developed an SM retrieval method, Multi-MDA-RF, utilizing RF, which incorporated 29 features derived from passive microwave remote sensing data, optical remote sensing data, land surface models (LSMs), and other auxiliary data. The objective was to assess the significance of these features in SM retrieval. To achieve this, the researchers compared 10 different filter or embedded feature selection methods [21]. The feature selection method for a vegetation-covered area based on remote sensing images, developed by Gao et al., utilized multiple boosting algorithms, including Gradient Boosting Decision Tree (GBDT), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and CatBoost, for estimating SM [22]. Recent research conducted both domestically and internationally includes numerous studies on the retrieval of SM using single ML methods. However, the performance of these methods tends to vary. In order to enhance the accuracy of each method, we aim to propose an approach that combines multiple single ML models.
The integration of learning algorithms in stacking has gained widespread popularity due to its high-performance flexibility and generalization capability [23].
Suitable remote sensing eigen-factors play a crucial role in enhancing the accuracy of SM inversion estimations. However, it should be noted that not all remote sensing feature factors exhibit a strong correlation with SM, and certain features may introduce noise that affects estimation accuracy. Additionally, some feature factors may interact with each other, leading to a reduction in the accuracy of the model. Incorporating too many features may contribute to overfitting and hinder the generalization of the model, resulting in poor performance.
Therefore, the objective of this study is to identify and select feature factors that demonstrate a strong correlation with SM. This selection process ensures the accuracy of the estimation model before proceeding with modeling. In summary, the proposed methodology encompasses two steps: firstly, screening and selecting appropriate feature factors and, subsequently, developing the SM estimation model.

2. Materials and Methods

2.1. Study Area

The study area is situated within the Xiaoluanhe River Basin, an important agricultural region located in the eastern part of Hebei Province. The basin primarily cultivates crops such as wheat, corn, soybeans, and others. However, the Xiaoluanhe River basin also confronts environmental challenges, including issues of water scarcity and soil erosion. SM plays a crucial role in the overall water resources within the basin. Therefore, the inversion of SM holds significant applicability not only within the Xiaoluanhe River Basin but also in other similar regions. This inversion process yields valuable information that can support agricultural decision-making, facilitate water resource management, enable disaster monitoring, enhance scientific research, and ultimately contribute to the sustainable development and ecological preservation of the river basin. Figure 1 shows the study area.
At the same time, we selected the surface and soil temperature and humidity data measured synchronously during the 2018 Luanhe River Basin SM Remote Sensing experiment (SMELR) aviation flight test to verify the “true value” of remote sensing inversion. We chose 0–5 cm of SM. The surface sampling sites are distributed in the upper reaches of the Luan River (including the Lightning River Basin and the Xiaoluan River Basin), and the sampling date was 19 September 2018 [24,25].

2.2. Dataset

2.2.1. Sentinel-1

Sentinel-1 is a group of radar satellites launched by the European Space Agency (ESA) for Earth observation and monitoring. The Sentinel-1 satellites carry Synthetic Aperture Radar (SAR) sensors, which can acquire information about the Earth’s surface under any conditions, such as rainfall, cloud cover, or darkness. Because radar observation is unaffected by sunlight, clouds, and atmospheric interference, Sentinel-1 can provide all-weather, all-time remote sensing data. Sentinel-1 consists of two polar-orbiting satellites, 1A and 1B (https://search.asf.alaska.edu, accessed on 23 February 2023) [26,27]. Both satellites carry a C-band SAR, an active microwave sensor. We chose the Ground Range Detected (GRD) level-1 product of Sentinel-1, which provides multi-looked intensity data related to the backscattering coefficient and can therefore be used for SM inversion. We used SNAP 10.0.0 (https://step.esa.int/main/toolboxes/snap/, accessed on 23 February 2023) to apply orbit files, thermal noise removal, radiometric calibration, multilooking, speckle filtering, terrain correction, and data conversion to the Sentinel-1 scenes. The spatial resolution of the processed radar images is 30 m. In our study, we selected Sentinel-1 data for 19 September and 26 September 2018. Finally, we obtained the radar backscattering coefficient (σ0) in decibels for the VV and VH polarizations.
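The final conversion step, expressing the calibrated linear backscatter in decibels (σ0_dB = 10 · log10(σ0_linear)), can be sketched as follows; the function name and clipping threshold are illustrative choices, not part of the SNAP workflow:

```python
import numpy as np

def to_db(sigma0_linear):
    """Convert calibrated linear backscatter to decibels.

    Values are clipped to a small positive floor so that masked or
    no-data pixels (zeros) do not produce log-of-zero errors.
    """
    return 10.0 * np.log10(np.clip(sigma0_linear, 1e-10, None))
```

A backscatter value of 1.0 in linear units maps to 0 dB, and values below one map to negative decibels, which is the range typically observed over soil surfaces.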

2.2.2. Landsat-8

Landsat-8 is part of the esteemed Landsat satellite program jointly operated by the United States Geological Survey (USGS) and the National Aeronautics and Space Administration (NASA). Serving as the eighth satellite in the series, Landsat-8 plays a crucial role in collecting Earth observation data and offering remote sensing imagery and surface information of exceptional quality. Onboard Landsat-8, two remarkable sensors, namely the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), take center stage [28,29]. These sensors possess the capability to capture multi-band remote sensing data, spanning various wavelength ranges encompassing the visible, infrared, and thermal infrared spectrums. Their high sensitivity allows for the detection and analysis of subtle surface features across different wavebands, including but not limited to land cover classification, vegetation health assessment, and water body mapping, among others. Hence, for the purpose of this study, the dataset chosen consisted of bands b1–b7 from the Landsat-8 Operational Land Imager (OLI) data. It is important to note that the Landsat data undergo necessary geometric and topographic corrections, thus requiring only radiometric and atmospheric corrections for our analysis. The spatial resolution of all the bands utilized in the study was set at a uniform 30 m. Similarly, the calculated vegetation indices, computed through band arithmetic, also possessed a resolution of 30 m. The derivation of various vegetation indices was conducted to serve as feature selection parameters for this investigation. We selected the Landsat-8 data from 18 September and 25 September 2018 for our study.
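As a hedged example of the band arithmetic, two commonly used indices can be computed from the OLI reflectance bands (B2 blue, B4 red, B5 NIR). The exact index set and formulas used in this study are those listed in Table 2; the EVI coefficients below are the standard ones and are assumed rather than taken from the paper:

```python
import numpy as np

def ndvi(nir, red):
    # Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)
    return (nir - red) / (nir + red)

def evi(nir, red, blue):
    # Enhanced Vegetation Index with the standard coefficients
    # G = 2.5, C1 = 6, C2 = 7.5, L = 1
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

# Illustrative surface reflectance values for a vegetated pixel
nir, red, blue = np.array([0.45]), np.array([0.08]), np.array([0.05])
```

Because both functions operate element-wise on NumPy arrays, the same code applies per-pixel across a full 30 m resolution raster.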

2.2.3. ASTGTM DEM

The analysis carried out in this paper relies on data obtained from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) onboard NASA’s Terra satellite. One of the key datasets utilized is the ASTER Global Digital Elevation Model (ASTER GDEM), which provides precise elevation information on the Earth’s surface [30,31]. This valuable resource finds applications in various fields, including terrain analysis, geological studies, water resource management, environmental modeling, and more. The ASTER GDEM V2, available at http://www.gscloud.cn/, accessed on 23 February 2023, represents the most comprehensive global elevation dataset offering high precision, with a resolution of approximately 30 m. Specifically, for our study, we extracted Slope, Aspect, and Digital Elevation Model (DEM) values from the ASTER GDEM, all at a resolution of 30 m. A comprehensive summary of all the data employed in this research is presented in Table 1. Table 2 provides a comprehensive overview of the specific vegetation index types employed and their corresponding calculation formulas.

2.3. Method

2.3.1. Feature Selection Algorithm

Feature selection plays a pivotal role in the field of ML as it involves the meticulous identification and selection of essential features that exhibit high consistency, low redundancy, and strong relevance to the specific ML model under consideration. By carefully curating the feature set, the size and complexity of the dataset can be effectively reduced, thereby facilitating quicker training and more efficient inference. The inclusion of a smaller subset of features in simple ML models enhances comprehensibility and interpretability. Furthermore, feature selection serves as a preventive measure against overfitting, a phenomenon characterized by excessive complexity in the model due to an excessively large number of features. Such complexity leads to the curse of dimensionality, where errors grow with the feature count. Consequently, feature selection becomes instrumental in managing this issue. For this study, we have opted to employ the SHApley Additive exPlanations (SHAP) method and information gain (Info-gain) technique as our feature selection approaches.
SHAP methodology offers a valuable approach to addressing the interpretability of models. By leveraging the concept of Shapley values, SHAP serves as a feature selection method that quantifies the influence of individual features on the model’s output. Consequently, it allows for the assignment of importance scores to each feature. The underlying principle of SHAP feature selection revolves around calculating the Shapley value for every feature. This calculation enables the evaluation of the feature’s contribution to the model’s output within various combinations of features. More specifically, the Shapley value determines the average contribution of a feature when combined with different subsets of features, providing insights into its relative importance. By assessing the Shapley value for each feature, a ranking of their importance can be established [45,46].
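The averaging over feature subsets can be made concrete with a small, self-contained sketch. Here the value of a subset is taken to be the R2 of a linear model trained on it, which is an illustrative choice; the SHAP package uses efficient model-specific estimators rather than this brute-force enumeration, which is only tractable for a handful of features:

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.linear_model import LinearRegression

def subset_value(X, y, subset):
    """Value v(S) of a feature subset: R^2 of a linear model trained on it."""
    if not subset:
        return 0.0
    cols = list(subset)
    return LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)

def shapley_importance(X, y):
    """Exact Shapley value of each feature under the value function above."""
    n = X.shape[1]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Classic Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (subset_value(X, y, S + (i,)) - subset_value(X, y, S))
    return phi

# Synthetic demo: the first feature drives the target most strongly.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)
phi = shapley_importance(X, y)
```

By the efficiency property, the Shapley values sum to the value of the full feature set, and ranking features by phi reproduces the importance ordering described above.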
Info-gain is a widely utilized method for selecting features, providing a way to evaluate the predictive ability of a feature regarding the target variable. By calculating the mutual information, or information gain, between features and the target, we can determine the significance of the features and select those with the highest information gain. A higher information gain suggests a stronger explanatory ability of the feature for the target variable, thereby emphasizing its importance. Lastly, we select the features that are common to both feature selection methods in order to explore the results obtained by utilizing these intersecting elements as feature inputs [47,48].
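A minimal sketch of this step, assuming scikit-learn's mutual_info_regression as the information-gain estimator and synthetic data standing in for the 25 real feature parameters (the feature names below are a hypothetical subset):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(42)
# Hypothetical subset of feature names; the study uses 25 parameters.
features = ["DEM", "VV", "VH", "NDVI", "slope"]
X = rng.normal(size=(300, len(features)))
# Synthetic soil moisture driven mostly by the first two features
sm = 4.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=0.2, size=300)

# Estimate the information gain of each feature for the target
gain = mutual_info_regression(X, sm, random_state=0)
ranking = sorted(zip(features, gain), key=lambda pair: -pair[1])
top_k = [name for name, _ in ranking[:3]]  # keep the k highest-gain features
```

Features with higher estimated mutual information rank first, mirroring the selection rule described above.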

2.3.2. Ada-Stacking Algorithm

The stacking algorithm—also referred to as ensemble stacking—combines the predictions of multiple models using a meta-regressor and then performs prediction again. By replacing the weighted average of the bagging algorithm with a meta-learner, the stacking algorithm achieves further generalization, effectively reducing estimation bias and model variance. As a result, the algorithm’s robustness is enhanced. However, the stacking algorithm, despite being an ensemble learning algorithm that combines multiple models, does not fully exploit the advantages of each individual model. To address this limitation, we propose an adaptive learning approach based on the stacking algorithm. Our proposed method, termed adaptive learning stacking (Ada-Stacking), combines the strengths of both adaptive learning and stacking. The adaptive component is an iterative procedure that constructs multiple weak learners by reweighting the training dataset and then combines them into a strong learner, while the stacking component integrates the predictions of multiple base models as inputs and trains a meta-model to generate the final prediction. The steps involved in the Ada-Stacking algorithm are as follows:
  • Initialize the weights of the training data.
  • Utilize adaptive learning to iteratively train a set of weak classifiers. Each weak classifier is trained using the current sample weights.
  • Predict the training data using the set of weak classifiers and obtain the prediction results for each sample.
  • Aggregate the prediction results of the weak classifiers as new features and combine them with the original features.
  • Apply the stacking method to train a meta-model using the combined features as input. An adaptive weight adjustment mechanism is proposed in this study to enhance the performance of the base learner. This mechanism dynamically adjusts the weights assigned to the base learner based on the comparison between the predicted outcomes and the actual labels. Weight adjustments are made by increasing the weights for base learners that exhibit smaller prediction errors and decreasing the weights for those with larger prediction errors.
  • Utilize the trained meta-model to make predictions on the test data.
The Ada-Stacking algorithm improves the accuracy and robustness of the model in classification or regression tasks. It combines the weighted training of adaptive learning with the predictive power of stacking, allowing for better utilization of the data feature-model relationship and enhancing the model’s generalization ability.
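The six steps above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: scikit-learn's AdaBoostRegressor supplies the weighted weak learners, and a ridge regressor stands in for the meta-model:

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

# Steps 1-3: AdaBoost initializes the sample weights internally and trains
# depth-limited trees as weak learners on the iteratively reweighted data.
ada = AdaBoostRegressor(DecisionTreeRegressor(max_depth=3),
                        n_estimators=20, random_state=0).fit(X, y)

# Step 4: per-sample predictions of each weak learner become new features,
# concatenated with the original feature matrix.
weak_preds = np.column_stack([est.predict(X) for est in ada.estimators_])
X_stacked = np.hstack([X, weak_preds])

# Steps 5-6: a meta-model is trained on the combined features and used to predict.
meta = Ridge().fit(X_stacked, y)
r2 = meta.score(X_stacked, y)
```

The adaptive weight-adjustment mechanism described in step 5 is handled here implicitly: the ridge meta-model learns larger coefficients for weak learners whose predictions track the target closely.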

2.3.3. Overall Soil Moisture Retrieval Framework

The overall framework for retrieving SM in this study can be divided into two main parts. The first part focuses on feature selection, where we consider three forms of feature selection: (1) SHAP, (2) Info-gain, and (3) the intersection of the two (Info_gain ∩ SHAP). For each of the three feature selection methods, we define four input configurations containing 5, 10, 15, and 20 features, respectively.
The second part involves SM inversion using the Ada-Stacking algorithm. The specific implementation steps of the Ada-Stacking algorithm are outlined as follows:
  • Dataset preparation: Partition the original dataset into a training set and a test set, ensuring dataset balance and representativeness.
  • Initialization of training data weights: Initialize the weights of the training samples, either by evenly distributing them or adjusting them based on the sample categories.
  • Iterative training of adaptive weak classifiers: Employ adaptive learning to train weak classifiers iteratively. In each round of training, a weak classifier is trained using the current sample weights. Typically, weaker-performing models are chosen as weak classifiers. The training of each weak classifier involves adjusting the sample weights based on the classification error rate from the previous round. This ensures that more attention is given to the misclassified samples in the subsequent rounds.
  • Prediction using weak classifiers: Utilize the trained weak classifiers to predict both the training and test data.
  • Construction of new features: Incorporate the prediction results of the weak classifiers as new features and combine them with the original features. These new features can be utilized in subsequent meta-model training.
  • Meta-model training using the Ada-Stacking method: Employ the stacking method to train a meta-model, also known as a combined model, which takes the merged features as input for predicting the target variable. The first layer consists of six models: RF, AdaBoost, GBDT, LightGBM, XGBoost, and CatBoost. The second layer employs a multilayer perceptron for prediction. In our research, we have implemented an automated weight adjustment scheme within the stacking algorithm. This mechanism dynamically adjusts the weights assigned to each ML algorithm incorporated in the stacking ensemble. In the study of soil moisture inversion using ensemble learning algorithms, we employed a ten-fold cross-validation approach to evaluate the model performance. Specifically, we divided the entire dataset into training and validation sets. In total, 60% of the data were used as the training set to train the ensemble learning model, while the remaining 40% was allocated as the validation set to assess the model’s generalization ability and prediction accuracy.
  • Meta-model prediction: Utilize the trained meta-model to predict the test data, obtaining the final predicted SM.
These ensemble methods (RF, AdaBoost, GBDT, LightGBM, XGBoost, and CatBoost) demonstrate superior performance in soil moisture retrieval by effectively handling nonlinear relationships, mitigating overfitting in high-dimensional remote sensing data, and robustly integrating multi-source features under varying terrain conditions.
We selected RF, AdaBoost, GBDT, LightGBM, XGBoost, and CatBoost as base models based on the following considerations: First, all these algorithms belong to ensemble learning methods, exhibiting excellent predictive performance and anti-overfitting capabilities. Second, the stacked ensemble (stacking) algorithm constructed in this study is essentially a higher-order ensemble learning framework. By systematically comparing the performance of single ensemble algorithms versus the stacked ensemble algorithm in the study area, this research aims to address the following scientific question: Under complex surface conditions, which ensemble strategy (single model or meta-ensemble) can more accurately and stably retrieve soil moisture? This comparative study will provide methodological references for remote sensing-based soil moisture inversion.
Figure 2 illustrates the overall research framework of this study. Figure 3 shows the generated framework of SM retrieval using multi-source data.

2.4. Model Evaluation Indicators

In this study, three evaluation indices were employed to assess the accuracy of the SM inversion results, namely the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE), and the coefficient of determination (R2). These indices are calculated using the following formulas:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| y_i - \hat{y}_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left( y_i - \hat{y}_i \right)^2}$$

$$R^2 = 1 - \frac{\sum_i \left( \hat{y}_i - y_i \right)^2}{\sum_i \left( \bar{y} - y_i \right)^2}$$

where $y_i$ is the measured value of SM, $\hat{y}_i$ is the predicted value of SM, and $\bar{y}$ is the average of the measured SM values.
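The three formulas translate directly into NumPy; the function names below are our own:

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean Absolute Error: average of |y_i - y_hat_i|
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    # Root Mean Square Error: sqrt of the mean squared residual
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((np.mean(y_true) - y_true) ** 2)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives MAE = RMSE = 0 and R2 = 1; a constant offset leaves MAE and RMSE equal to that offset while lowering R2.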

3. Results

3.1. Correlation Analysis of Feature Selection Results and Measured SM

Figure 4 and Figure 5 illustrate the importance of input parameters derived from Sentinel-1, Landsat-8, and DEM datasets using the SHAP method and the information gain method, respectively. For the feature selection analysis, a total of 25 features were selected from the aforementioned datasets. It should be noted that the ordering of feature importance varies depending on the feature selection method employed. In this study, the top 20 features based on their importance rankings were selected for further analysis using each feature selection method. According to the feature importance ranking results shown in Figure 4 and Figure 5, DEM demonstrates the highest contribution to soil moisture retrieval. This phenomenon may be closely related to the unique topographic conditions of the study area: the region exhibits significant terrain undulations with distinct sunny and shady slopes. Areas with higher DEM values typically feature lower temperatures, reduced evaporation, and increased precipitation—microclimatic characteristics that collectively contribute to higher soil moisture content. Although the importance rankings of other features vary, their relative contributions remain comparable. Based on this, our study systematically evaluated the performance of different feature combinations and ultimately identified the most sensitive feature combination system for detecting soil moisture variations. Based on the findings in Figure 4, the selected features include DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI, B4, aspect, B2, B3, MSI, B5, NDWI2201, NDWI1640, Albedo, and SAVI. Conversely, Figure 5 suggests that the selected features include DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, SAVI, Albedo, MSR, NDVI, B3, and aspect.
Although different feature importance rankings were obtained using the two feature selection methods, it is essential to determine whether the selected features are closely related to SM and can yield more accurate predictions. Consequently, we grouped these features as inputs for the SM prediction method. Furthermore, in addition to combining different numbers of features from each feature selection method, we explored another approach, namely different combinations of feature variables from the intersection of the two methods. For clarity, we defined the following feature combination methods: M1 mode with five feature parameters, M2 mode with ten feature parameters, M3 mode with fifteen feature parameters, and M4 mode with twenty feature parameters. Please refer to Table 3 for the details of the specific feature combination.
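The four modes can be assembled mechanically from either ranking by truncation. The bookkeeping sketch below (not the authors' code) uses the Info-gain ordering quoted above:

```python
# Top-20 Info-gain ranking as listed in the text
info_gain_ranking = [
    "DEM", "DVI", "MSI", "VV", "NMDI", "NDWI1640", "EVI", "VH", "B5", "TVI",
    "NDWI2201", "B6", "RVI", "B1", "SAVI", "Albedo", "MSR", "NDVI", "B3", "aspect",
]

# M1-M4: the top 5, 10, 15, and 20 features of the ranking, respectively
modes = {f"M{i + 1}": info_gain_ranking[:k] for i, k in enumerate([5, 10, 15, 20])}
```

Because each mode is a prefix of the ranking, M1 ⊂ M2 ⊂ M3 ⊂ M4, so comparing the modes isolates the marginal effect of adding lower-ranked features.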

3.2. Evaluation and Comparison Under Different Feature Combinations and Different ML Methods

Our findings indicate that the selection of features as input sets for multiple ML algorithms varies depending on the employed feature selection methods. These variations aim to determine which feature selection method, along with the optimal number of selected features, can yield more accurate predictions of SM.
Figure 6 illustrates the results of SM inversion under different feature combinations using the Info_gain feature selection method with various ML algorithms. The MAE ranges from 1.86 to 4.03 Vol. %, RMSE ranges from 2.68 to 5.68 Vol. %, and R2 ranges from 0.46 to 0.95. It is evident that the optimal results can be obtained in the M4 mode across the four quantitatively defined combinations of the Info_gain feature selection method. However, the results differ between the different ML algorithms. Specifically, Figure 6A presents the MAE between the predicted and actual SM calculations. In the M4 mode, the MAEs obtained using the RF, GBDT, XGBoost, and Ada_Stacking algorithms are superior to other modes, with values of 1.86 Vol. %, 2.83 Vol. %, 3.08 Vol. %, and 1.86 Vol. %, respectively. In the case of the AdaBoost, CatBoost, and LightGBM algorithms, M3 achieves relatively low MAEs of 3.03 Vol. %, 2.72 Vol. %, and 3.83 Vol. %, respectively. Figure 6B showcases the RMSE of the SM inversion results. The RF, GBDT, CatBoost, XGBoost, and Ada_Stacking algorithms exhibit lower RMSE values under the M4 mode, with values of 3.65 Vol. %, 3.93 Vol. %, 4.68 Vol. %, and 2.68 Vol. %, respectively. Moreover, Figure 6C demonstrates the calculated R2 values. It is evident that using the Ada_Stacking algorithm consistently yields higher R2 values across all four modes, ranging between 0.94 and 0.95. Based on Figure 6, it can be concluded that when applying the Info_gain feature selection method, the Ada_Stacking algorithm consistently produces superior results. Additionally, the CatBoost algorithm yields slightly inferior results compared to the Ada_Stacking algorithm. Conversely, the SM inversion outcomes obtained with the LightGBM algorithm are the least accurate among the considered ML algorithms.
Figure 7 illustrates the results of SM inversion using the SHAP feature selection method in the M1, M2, M3, and M4 modes, employing multiple ML algorithms. The MAE ranges from 1.89 Vol. % to 3.95 Vol. %, the RMSE ranges from 2.77 Vol. % to 5.5 Vol. %, and the R2 results range from 0.48 to 0.94. Figure 7A–C present the calculation of MAE, RMSE, and R2, respectively. Notably, the accuracy of the SM inversion results obtained using the Ada_Stacking algorithm is consistently superior to several other ML algorithms. Conversely, the overall accuracy of the LightGBM algorithm is lower. However, it is worth mentioning that under the SHAP feature selection algorithm and utilizing different ML algorithms, the optimal results are not consistently obtained in the M4 mode.
Figure 8 displays the intersection set results obtained by combining the Info_gain and SHAP feature selection methods in M1, M2, M3, and M4 modes as the input dataset. This analysis examines whether combining the two feature selection algorithms can yield better results than using them separately when considering the intersection features. In this scenario, the MAE ranges from 2.05 Vol. % to 4.65 Vol. %, the RMSE ranges from 2.82 Vol. % to 7.12 Vol. %, and the R2 varies between 0.38 and 0.94. Specifically, in the M1 mode, relatively high values of MAE and RMSE are observed. This suggests that utilizing only the DEM and VV characteristics does not perform SM inversion as effectively as other models. Furthermore, it is interesting to note that the Ada_Stacking algorithm still produces the best results; however, the outcomes in the M3 case are superior to those in the M4 mode. This finding indicates that a higher number of characteristic parameters does not always result in higher precision outcomes.

3.3. SM Inversion Results Based on Ada-Stacking Algorithm

According to Section 3.2, it is evident that the Ada-Stacking algorithm yields higher accuracy in SM inversion. Scatter plots of predicted and measured SM are presented in Figure 9, Figure 10 and Figure 11. Table 4 provides a specific evaluation of the accuracy of SM content obtained with the Ada-Stacking algorithm under the different feature selection methods. Additionally, Figure 12 compares the SM content change trends based on Ada-Stacking under the different feature selection methods.
Figure 9 depicts scatter plots of measured and predicted SM using the Info_gain feature selection algorithm and Ada-Stacking. In Figure 9a, the Ada-Stacking input dataset consists of five features: DEM, DVI, MSI, VV, and NMDI. The MAE, RMSE, and R2 are 2.23 Vol. %, 3.06 Vol. %, and 0.94, respectively. In Figure 9b, ten feature parameters are selected: DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, and TVI. Figure 9c expands the feature set to fifteen parameters: DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, and SAVI. However, Table 4 reveals that the results under the M2 mode are superior to those under the M3 mode: the MAE in the M2 mode is 0.02 Vol. % lower than in the M3 mode, and the RMSE is 0.12 Vol. % lower, while the R2 value remains the same at 0.94. Figure 9d utilizes a set of twenty characteristic parameters, resulting in the optimal SM inversion results, with an MAE of 1.86 Vol. %, an RMSE of 2.68 Vol. %, and an R2 of 0.95.
Figure 10 presents scatter plots of the SM inversion results obtained using the SHAP feature selection algorithm and Ada-Stacking. In Figure 10a, the SHAP feature selection algorithm selects DEM, VV, slope, B1, and EVI as features; the MAE and RMSE in this case are 2.07 Vol. % and 2.89 Vol. %, respectively. In Figure 10b, ten features are selected: DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, and DVI. Figure 10c expands the feature set to fifteen parameters: DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI, B4, aspect, B2, B3, and MSI. Figure 10d utilizes twenty feature parameters: DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI, B4, aspect, B2, B3, MSI, B5, NDWI2201, NDWI1640, Albedo, and SAVI. The MAE values for Figure 10b–d are 2.08 Vol. %, 1.99 Vol. %, and 1.89 Vol. %, respectively; the corresponding RMSE values are 2.88 Vol. %, 2.9 Vol. %, and 2.77 Vol. %, while the R2 values remain constant at 0.94.
Figure 11 demonstrates the use of the intersection set of features as the input for the SM inversion algorithm, employing the Info_gain and SHAP feature selection algorithms in each mode. The M3 mode (DEM, VV, DVI, EVI, VH, TVI, B1) yields the optimal results, with an MAE, RMSE, and R2 of 2.05 Vol. %, 2.82 Vol. %, and 0.94, respectively. Conversely, the worst accuracy is obtained in the M1 mode, with an MAE of 2.7 Vol. %, which is 0.62 Vol. % higher than that of the M2 mode and 0.61 Vol. % higher than that of the M4 mode. Similarly, the RMSE is 3.78 Vol. %, which is 0.96 Vol. % higher than the M2 mode and 0.81 Vol. % higher than the M4 mode. The R2 is 0.88, 0.05 lower than the M2 mode and 0.06 lower than the M4 mode.
In summary, the Info_gain feature selection algorithm, coupled with the Ada-Stacking algorithm, yields optimal SM inversion accuracy with the inclusion of twenty parameters: DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, SAVI, Albedo, ARVI, NDVI, B3, and aspect.

4. Discussion

4.1. Comparison of Info_gain, SHAP and Info_gain ∩ SHAP for Variable Selection

In this study, two feature selection methods, namely Info_gain and SHAP, were employed to select the feature parameters for the SM inversion method. The objective was to investigate the accuracy of SM prediction results under different scenarios by selecting varying numbers of features as inputs through different feature selection methods. Additionally, the two feature selection methods were used to determine the feature parameters of the intersection sets in different modes, with the aim of identifying the selected features that were most strongly correlated with SM in these cases. The Info_gain algorithm was utilized to select features based on their information gain, indicating their ability to effectively differentiate the target variables in the prediction task. By incorporating these important features, both the accuracy and generalization capability of the prediction model could be enhanced [49,50,51]. The Info_gain algorithm’s ability to select features that are highly relevant to the target variable contributes to reducing model complexity and mitigating the risk of overfitting. Overfitting occurs when a model performs well on training data but fails to generalize to new data. By utilizing the Info_gain algorithm to select significant features, the interference caused by irrelevant or redundant features is minimized, thereby improving the model’s adaptability to new data. The features selected by the Info_gain algorithm are typically correlated with the target variable, which enhances the explanatory power of the prediction model. By identifying important features, a deeper understanding of the influence of different features on the target variables can be gained, resulting in more reliable and meaningful predictions [52]. It is important to acknowledge that while the Info_gain feature selection algorithm demonstrates favorable performance in various cases, it possesses certain limitations. 
Primarily, the algorithm considers only one-way relationships between features and target variables, thereby disregarding intercorrelations among features. Moreover, the Info_gain algorithm is not sensitive to feature distributions and might overlook potentially influential features. Consequently, when employing the Info_gain feature selection algorithm, it becomes paramount to comprehensively evaluate the data characteristics and align them with business requirements, integrating other feature selection methods for a comprehensive analysis.
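For a continuous SM target, scikit-learn's mutual-information estimator is a common stand-in for the discrete information-gain score described above. The sketch below ranks candidate predictors on synthetic data (the variables and coefficients are invented for illustration, not taken from the paper's dataset):

```python
# Hedged sketch: ranking predictors by an information-gain-style score.
# mutual_info_regression is used as a continuous analogue of Info_gain;
# the data are synthetic, with SM driven by "DEM" and "VV" only.
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
n = 300
dem = rng.uniform(0, 1, n)
vv = rng.uniform(0, 1, n)
noise_feat = rng.uniform(0, 1, n)                   # irrelevant feature
sm = 3.0 * dem - 2.0 * vv + rng.normal(0, 0.1, n)   # synthetic SM target

X = np.column_stack([dem, vv, noise_feat])
names = ["DEM", "VV", "noise"]
scores = mutual_info_regression(X, sm, random_state=0)

# Features with higher scores carry more information about the target.
for name, s in sorted(zip(names, scores), key=lambda t: t[1], reverse=True):
    print(f"{name}: {s:.3f}")
```

As the comparison notes, such a score is one-way per feature: the irrelevant column is correctly ranked last, but interactions between features are not captured.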
In contrast, the SHAP feature selection algorithm offers both global and local interpretations of feature importance. It quantifies the contribution of each feature to the overall model prediction, as well as the influence of each feature on the prediction outcome of an individual sample [53,54,55,56].
The SHAP feature selection algorithm is renowned for its fairness and stability: by considering all possible feature permutations, the computation of Shapley values ensures equitable contributions among the features [57]. The SHAP feature selection algorithm also demonstrates broad applicability across various ML models.
Nevertheless, it is important to note that the effectiveness of the SHAP feature selection algorithm heavily relies on the feature space. If there is a large number of features or a high correlation among them, it may result in longer computation times, increased memory consumption, and difficulties in interpreting feature importance.
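The permutation view of Shapley values mentioned above can be made concrete on a toy problem. The sketch below brute-forces the exact Shapley values of three features for a value function defined as the R2 of a least-squares fit on each feature subset; this enumeration is purely illustrative (and exponential in the number of features), whereas the SHAP library uses efficient model-specific estimators:

```python
# Hedged sketch: exact Shapley values by enumerating all feature orderings.
# The value function v(S) is the R^2 of an OLS fit on feature subset S.
# Data and value function are toy choices, not the paper's setup.
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0, 0.1, n)  # feature 2 is noise

def value(subset):
    """R^2 of a least-squares fit of centered y on the given feature subset."""
    subset = list(subset)
    if not subset:
        return 0.0
    Xs = X[:, subset]
    coef, *_ = np.linalg.lstsq(Xs, y - y.mean(), rcond=None)
    resid = y - y.mean() - Xs @ coef
    return 1.0 - resid.var() / y.var()

p = X.shape[1]
shapley = np.zeros(p)
perms = list(itertools.permutations(range(p)))
for perm in perms:                       # average marginal contribution
    seen = []                            # over every ordering of features
    for j in perm:
        before = value(seen)
        seen.append(j)
        shapley[j] += value(seen) - before
shapley /= len(perms)
print(np.round(shapley, 3))  # feature 0 dominates; feature 2 is near zero
```

The efficiency property holds by construction: the Shapley values sum exactly to the value of the full feature set, which is the "equitable contribution" guarantee cited above.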
In Figure 12, the different feature selection algorithms are employed to select specific numbers of features for SM prediction. The results show that in the M4 mode, which selects the largest number of features, SM can generally be predicted with higher accuracy. However, in scenarios involving limited data samples and data imbalance, the choice of feature selection method should be adapted to the specific circumstances, particularly when utilizing different sources of multi-source remote sensing data.

4.2. Comparison of Ada-Stacking and Other ML Algorithms for SM Inversion

In this study, we have developed a novel stacking algorithm for soil water prediction using adaptive learning techniques. The stacking algorithm incorporates six different ML models, and adaptive learning is implemented within the stacking algorithm. Additionally, we have conducted a comparative analysis using a single ML model to evaluate the performance of our newly proposed Ada-Stacking algorithm. The results show that our Ada-Stacking algorithm achieves higher precision in soil water prediction.
The stacking algorithm is designed to enhance model performance by integrating the prediction outputs of multiple base classifiers. This approach leverages the strengths of different classifiers, leading to improved overall model performance. By utilizing multiple classifiers, the stacking algorithm is able to capture the diversity among them, thereby accommodating various data characteristics and patterns more effectively [58,59,60].
Adaptive learning algorithms possess the ability to dynamically adjust model parameters and structure based on varying environments and data characteristics, thereby facilitating improved performance. The key advantage of adaptive learning algorithms lies in their robustness, as they can maintain superior performance even when faced with anomalies such as incomplete data, noise, and outliers. By autonomously adapting and fine-tuning model parameters and structure, adaptive learning algorithms effectively enhance the robustness and stability of the model, thereby handling abnormal situations adeptly [61,62,63].
Incorporating an adaptive stacking integration model involves training a meta-model that can dynamically adjust the weights and structures of individual base classifiers based on the data characteristics. This adaptive approach allows the stacking algorithm to automatically select the optimal combination of base classifiers for different data scenarios, thereby enhancing the model’s generalization ability and performance. Additionally, the Ada-Stacking algorithm introduces the use of diverse feature combinations fed into individual base classifiers, which facilitates more comprehensive feature representation and improved learning capabilities. Consequently, it enables the discovery of interactions between different features and better capture of nonlinear relationships in the data [64,65].
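As a point of comparison, a fixed (non-adaptive) stacked ensemble can be assembled in a few lines with scikit-learn. The sketch below is only a baseline analogue of the Ada-Stacking design: the Ridge meta-model learns static combination weights from out-of-fold base predictions, whereas the paper's algorithm additionally adapts those weights and structures to the data. The dataset is synthetic:

```python
# Hedged sketch: a plain stacked ensemble (not the paper's Ada-Stacking).
# Two tree-based base learners feed a Ridge meta-model via out-of-fold
# predictions; the regression dataset is synthetic.
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=8, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("gbdt", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=Ridge(),  # meta-model combining base predictions
    cv=5,                     # out-of-fold predictions avoid target leakage
)
stack.fit(X_tr, y_tr)
print(f"stacked R2 on held-out data: {stack.score(X_te, y_te):.3f}")
```

The `cv` argument is the key design point: the meta-model is trained on cross-validated base-learner outputs rather than in-sample fits, which is what keeps stacking from simply memorizing the strongest base learner.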
Figure 6, Figure 7 and Figure 8 clearly demonstrate the superiority of our proposed method, Ada-Stacking, in terms of accuracy compared to other single models. These results underscore the efficacy of the multi-source remote sensing data features selected in this study, as well as the effectiveness of the established Ada-Stacking algorithm for SM prediction.

5. Conclusions

In this study, we conducted analyses using various multi-source remote sensing data, including Sentinel-1, Landsat-8, and ASTGTM DEM. We calculated 25 different feature parameters based on these data. Subsequently, we employed three feature selection algorithms, namely Info_gain, SHAP, and Info_gain ∩ SHAP, to select different combinations of features as datasets. Through these different combinations, we explored the associations between the selected method, feature combinations, and SM. Additionally, we developed an adaptive stacking algorithm for SM inversion prediction, which we compared against a single algorithm to assess the feasibility and effectiveness of our approach.
Analysing our constructed method reveals the following advantages:
  • Combining Info_gain and SHAP for feature selection effectively identifies the most informative and physically meaningful predictors by balancing statistical relevance (Info_gain) with model-specific nonlinear interactions (SHAP), thereby enhancing feature interpretability while reducing redundancy.
  • The Ada-Stacking algorithm integrates adaptive boosting (AdaBoost) with stacked generalization, leveraging ensemble learning to mitigate overfitting and improve generalization across diverse vegetation and terrain conditions, where traditional single-model approaches often fail.
  • This hybrid framework synergizes robust feature selection with advanced ensemble modeling, achieving higher accuracy in soil moisture retrieval—particularly in complex landscapes—by dynamically weighting base learners and optimizing meta-learner performance through iterative error correction.
Our analysis revealed that using the Info_gain feature selection algorithm in the M4 mode (consisting of the DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, SAVI, Albedo, ARVI, NDVI, B3, and aspect features) yielded optimal results when employing the Ada-Stacking algorithm. In this case, we achieved an MAE of 1.86 Vol. %, an RMSE of 2.68 Vol. %, and an R2 of 0.95. Comparing these results with those obtained from single ML models, it is evident that the feature selection method and the established Ada-Stacking algorithm significantly improve inversion accuracy. Therefore, the algorithm proposed in this paper is capable of achieving higher-precision results than the other algorithms considered. However, it is important to note that our study has limitations. The selected multi-source remote sensing data in this research are not continuous in time, making them unsuitable for daily SM prediction. For future research, we plan to explore other multi-source remote sensing data for SM monitoring and prediction. In this study, the features selected were primarily based on the existing dataset and the specific research context, influenced by the measured data. A comprehensive and in-depth analysis of diverse climate and geographical scenarios has not yet been conducted. In future research, we will focus on collecting and studying data from different climate types and geographical conditions to thoroughly assess the applicability and limitations of the selected features, thereby further enhancing the model's generalization ability in a broader range of environments. Similarly, in the subsequent stages, we will apply more rigorous statistical methods to conduct an in-depth examination of the model error metrics.

Author Contributions

Conceptualization, L.W.; Methodology, L.W. and Y.G.; Software, Y.G.; Writing—original draft, Y.G.; Writing—review & editing, Y.G.; Funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Liaoning Provincial Science and Technology Plan Project (2024JH2/102600106), the National Natural Science Foundation of China (grant number 62071084).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The Leading Talents Project of the State Ethnic Affairs Commission. The data set is provided by the National Tibetan Plateau Data Center (http://data.tpdc.ac.cn, accessed on 23 February 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Babaeian, E.; Paheding, S.; Siddique, N.; Devabhaktuni, V.K.; Tuller, M. Estimation of root zone soil moisture from ground and remotely sensed soil information with multisensor data fusion and automated machine learning. Remote Sens. Environ. 2021, 260, 112434. [Google Scholar] [CrossRef]
  2. Peng, J.; Albergel, C.; Balenzano, A.; Brocca, L.; Cartus, O.; Cosh, M.H.; Crow, W.T.; Dabrowska-Zielinska, K.; Dadson, S.; Davidson, M.W.; et al. A roadmap for high-resolution satellite soil moisture applications-confronting product characteristics with user requirements. Remote Sens. Environ. 2020, 252, 112162. [Google Scholar] [CrossRef]
  3. Mayer, M.; Prescott, C.E.; Abaker, W.E.A.; Augusto, L.; Cécillon, L.; Ferreira, G.W.; James, J.; Jandl, R.; Katzensteiner, K.; Laclau, J.-P.; et al. Tamm Review: Influence of forest management activities on soil organic carbon stocks: A knowledge synthesis. Forest Ecol. Manag. 2020, 466, 118127. [Google Scholar] [CrossRef]
  4. Jamei, M.; Karbasi, M.; Malik, A.; Abualigah, L.; Islam, A.R.; Yaseen, Z.M. Computational assessment of groundwater salinity distribution within coastal multi-aquifers of Bangladesh. Sci. Rep. 2022, 12, 11165. [Google Scholar] [CrossRef]
  5. Senyurek, V.; Lei, F.; Boyd, D.; Gurbuz, A.C.; Kurum, M.; Moorhead, R. Evaluations of Machine Learning-Based CYGNSS Soil Moisture Estimates against SMAP Observations. Remote Sens.-Basel. 2020, 12, 3503. [Google Scholar] [CrossRef]
  6. Liu, L.; Gudmundsson, L.; Hauser, M.; Qin, D.; Li, S.; Seneviratne, S.I. Soil moisture dominates dryness stress on ecosystem production globally. Nat. Commun. 2020, 11, 4892. [Google Scholar] [CrossRef] [PubMed]
  7. Wang, L.; Gao, Y. Estimating and Downscaling ESA-CCI Soil Moisture Using Multi-Source Remote Sensing Images and Stacking-Based Ensemble Learning Algorithms in the Shandian River Basin, China. Remote Sens. 2025, 17, 716. [Google Scholar] [CrossRef]
  8. Lopes, C.L.; Mendes, R.; Caçador, I.; Dias, J.M. Assessing salt marsh loss and degradation by combining long-term Landsat imagery and numerical modelling. Land Degrad. Dev. 2021, 32, 4534–4545. [Google Scholar] [CrossRef]
  9. Tong, C.; Wang, H.; Magagi, R.; Goïta, K.; Zhu, L.; Yang, M.; Deng, J. Soil Moisture Retrievals by Combining Passive Microwave and Optical Data. Remote Sens. 2020, 12, 3173. [Google Scholar] [CrossRef]
  10. Delavar, M.A.; Naderi, A.; Ghorbani, Y.; Mehrpouyan, A.; Bakhshi, A. Soil salinity mapping by remote sensing south of Urmia Lake, Iran. Geoderma Reg. 2020, 22, e00317. [Google Scholar] [CrossRef]
  11. Muzalevskiy, K.; Zeyliger, A. Application of Sentinel-1B Polarimetric Observations to Soil Moisture Retrieval Using Neural Networks: Case Study for Bare Siberian Chernozem Soil. Remote Sens. 2021, 13, 3480. [Google Scholar] [CrossRef]
  12. Luo, M.; Wang, Y.; Xie, Y.; Zhou, L.; Qiao, J.; Qiu, S.; Sun, Y. Combination of Feature Selection and CatBoost for Prediction: The First Application to the Estimation of Aboveground Biomass. Forests 2021, 12, 216. [Google Scholar] [CrossRef]
  13. Al-Yaari, A.; Wigneron, J.P.; Dorigo, W.; Colliander, A.; Pellarin, T.; Hahn, S.; Mialon, A.; Richaume, P.; Fernandex-Moran, R.; Fan, L.; et al. Assessment and inter-comparison of recently developed/reprocessed microwave satellite soil moisture products using ISMN ground-based measurements. Remote Sens. Environ. 2019, 224, 289–303. [Google Scholar] [CrossRef]
  14. He, L.; Cheng, Y.; Li, Y.; Li, F.; Fan, K.; Li, Y. An Improved Method for Soil Moisture Monitoring With Ensemble Learning Methods Over the Tibetan Plateau. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2833–2844. [Google Scholar] [CrossRef]
  15. Lary, D.J.; Remer, L.A.; MacNeill, D.; Roscoe, B.; Paradise, S. Machine Learning and Bias Correction of MODIS Aerosol Optical Depth. IEEE Geosci. Remote Sens. Lett. 2009, 6, 694–698. [Google Scholar] [CrossRef]
  16. Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177. [Google Scholar] [CrossRef]
  17. Zhang, Y.; Liu, J.; Shen, W. Review of Ensemble Learning Algorithms Used in Remote Sensing Applications. Appl. Sci. 2022, 12, 8654. [Google Scholar] [CrossRef]
  18. Jamei, M.; Ali, M.; Karbasi, M.; Sharma, E.; Jamei, M.; Chu, X.; Yaseen, Z.M. A high dimensional features-based cascaded forward neural network coupled with MVMD and Boruta-GBDT for multi-step ahead forecasting of surface soil moisture. Eng. Appl. Artif. Intel. 2023, 120, 105895. [Google Scholar] [CrossRef]
  19. Nguyen, T.T.; Ngo, H.H.; Guo, W.; Chang, S.W.; Nguyen, D.D.; Nguyen, C.T.; Zhang, J.; Liang, S.; Bui, X.T.; Hoang, N.B. A low-cost approach for soil moisture prediction using multi-sensor data and machine learning algorithm. Sci. Total Environ. 2022, 833, 155066. [Google Scholar] [CrossRef]
  20. Chen, L.; Xing, M.; He, B.; Wang, J.; Shang, J.; Huang, X.; Xu, M. Estimating Soil Moisture Over Winter Wheat Fields During Growing Season Using Machine-Learning Methods. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3706–3718. [Google Scholar] [CrossRef]
  21. Zhang, L.; Zhang, Z.; Xue, Z.; Li, H. Sensitive Feature Evaluation for Soil Moisture Retrieval Based on Multi-Source Remote Sensing Data with Few In-Situ Measurements: A Case Study of the Continental, U.S. Water 2021, 13, 2003. [Google Scholar] [CrossRef]
  22. Gao, Y.; Wang, L.; Zhong, G.; Wang, Y.; Yang, J. Potential of Remote Sensing Images for Soil Moisture Retrieving Using Ensemble Learning Methods in Vegetation-Covered Area. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 8149–8165. [Google Scholar] [CrossRef]
  23. Wang, S.; Wu, Y.; Li, R.; Wang, X. Remote sensing-based retrieval of soil moisture content using stacking ensemble learning models. Land Degrad. 2023, 34, 911–925. [Google Scholar] [CrossRef]
  24. Zhao, T.; Shi, J.; Lv, L.; Xu, H.; Chen, D.; Cui, Q.; Jackson, T.J.; Yan, G.; Jia, L.; Chen, L.; et al. Soil moisture experiment in the Luan River supporting new satellite mission opportunities. Remote Sens. Environ. 2020, 240, 111680. [Google Scholar] [CrossRef]
  25. Zhao, T.; Yao, P.; Cui, Q.; Jiang, L.; Chai, L.; Lu, H.; Ma, J.; Lv, H.; Wu, J.; Zhao, W.; et al. Synchronous Observation Data Set of Soil Temperature and Soil Moisture in the Upstream of Luan River (2018); National Tibetan Plateau, Ed.; National Tibetan Plateau Data Center: Beijing, China, 2021. [Google Scholar]
  26. Amazirh, A.; Merlin, O.; Er-Raki, S.; Gao, Q.; Rivalland, V.; Malbeteau, Y.; Khabba, S.; Escorihuela, M.J. Retrieving surface soil moisture at high spatio-temporal resolution from a synergy between Sentinel-1 radar and Landsat thermal data: A study case over bare soil. Remote Sens. Environ. 2018, 211, 321–337. [Google Scholar] [CrossRef]
  27. Chaudhary, S.K.; Srivastava, P.K.; Gupta, D.K.; Kumar, P.; Prasad, R.; Pandey, D.K.; Das, A.K.; Gupta, M. Machine learning algorithms for soil moisture estimation using Sentinel-1: Model development and implementation. Adv. Space Res. 2022, 69, 1799–1812. [Google Scholar] [CrossRef]
  28. Zhang, Y.; Liang, S.; Zhu, Z.; Ma, H.; He, T. Soil moisture content retrieval from Landsat 8 data using ensemble learning. ISPRS J. Photogramm. Remote Sens. 2022, 185, 32–47. [Google Scholar] [CrossRef]
  29. Ghasemloo, N.; Matkan, A.A.; Alimohammadi, A.; Aghighi, H.; Mirbagheri, B. Estimating the Agricultural Farm Soil Moisture Using Spectral Indices of Landsat 8, and Sentinel-1, and Artificial Neural Networks. J. Geovisualization Spat. Analysis 2022, 6, 19. [Google Scholar] [CrossRef]
  30. Filippucci, P.; Brocca, L.; Quast, R.; Ciabatta, L.; Saltalippi, C.; Wagner, W.; Tarpanelli, A. High-resolution (1 km) satellite rainfall estimation from SM2RAIN applied to Sentinel-1: Po River basin as a case study. Hydrol. Earth Syst. Sci. 2022, 26, 2481–2497. [Google Scholar] [CrossRef]
  31. Khandelwal, S.; Goyal, R.; Kaul, N.; Mathew, A. Assessment of land surface temperature variation due to change in elevation of area surrounding Jaipur, India. Egypt. J. Remote Sens. Space Sci. 2017, 21, 87–94. [Google Scholar] [CrossRef]
  32. Naji, T.A. Study of vegetation cover distribution using DVI, PVI, WDVI indices with 2D-space plot. J. Phys. Conf. Ser. 2018, 1003, 012083. [Google Scholar] [CrossRef]
  33. Gurung, R.B.; Breidt, F.J.; Dutin, A.; Ogle, S.M. Predicting Enhanced Vegetation Index (EVI) curves for ecosystem modeling applications. Remote Sens. Environ. 2009, 113, 2186–2193. [Google Scholar] [CrossRef]
  34. Zhang, Y.; Tan, K.; Wang, X.; Chen, Y. Retrieval of soil moisture content based on a modified Hapke Photometric model: A novel method applied to laboratory hyperspectral and Sentinel-2 MSI data. Remote Sens. 2020, 12, 2239. [Google Scholar] [CrossRef]
  35. Wu, C.; Niu, Z.; Tang, Q.; Huang, W. Estimating chlorophyll content from hyperspectral vegetation indices: Modeling and validation. Agric. For. Meteorol. 2008, 148, 1230–1241. [Google Scholar] [CrossRef]
  36. Pettorelli, N.; Ryan, S.; Mueller, T.; Bunnefeld, N.; Jędrzejewska, B.; Lima, M.; Kausrud, K. The Normalized Difference Vegetation Index (NDVI): Unforeseen successes in animal ecology. Clim. Res. 2011, 46, 15–27. [Google Scholar] [CrossRef]
  37. Huang, J.; Chen, D.; Cosh, M.H. Sub-pixel reflectance unmixing in estimating vegetation water content and dry biomass of corn and soybeans cropland using normalized difference water index (NDWI) from satellites. Int. J. Remote Sens. 2009, 30, 2075–2104. [Google Scholar] [CrossRef]
  38. Meng, Q.; Xie, Q.; Wang, C.; Ma, J.; Sun, Y.; Zhang, L. A fusion approach of the improved Dubois model and best canopy water retrieval models to retrieve soil moisture through all maize growth stages from Radarsat-2 and Landsat-8 data. Environ. Earth Sci. 2016, 75, 1377. [Google Scholar] [CrossRef]
  39. Wang, L.; Qu, J.J. NMDI: A normalized multi-band drought index for monitoring soil and vegetation moisture with satellite remote sensing. Geophys. Res. Lett. 2007, 34, L20405. [Google Scholar] [CrossRef]
  40. Major, D.J.; Baret, F.; Guyot, G. A ratio vegetation index adjusted for soil brightness. Int. J. Remote Sens. 1990, 11, 727–740. [Google Scholar] [CrossRef]
  41. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
  42. Payero, J.O.; Neale, C.M.; Wright, J.L. Comparison of eleven vegetation indices for estimating plant height of alfalfa and grass. Appl. Eng. Agric. 2004, 20, 385–393. [Google Scholar] [CrossRef]
  43. Kaufman, Y.J.; Tanre, D. Atmospherically resistant vegetation index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Remote Sens. 1992, 30, 261–270. [Google Scholar] [CrossRef]
  44. Robinove, C.J.; Chavez, P.S., Jr.; Gehring, D.; Holmgren, R. Arid land monitoring using Landsat albedo difference images. Remote Sens. Environ. 1981, 11, 133–156. [Google Scholar] [CrossRef]
  45. Bugaj, M.; Wrobel, K.; Iwaniec, J. Model explainability using SHAP values for LightGBM predictions. In Proceedings of the 2021 IEEE XVIIth International Conference on the Perspective Technologies and Methods in MEMS Design (MEMSTECH), Polyana, Ukraine, 12 May 2021; pp. 102–106. [Google Scholar]
  46. Marcilio, W.E.; Eler, D.M. From explanations to feature selection: Assessing SHAP values as feature selection mechanism. In Proceedings of the 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), Porto de Galinhas, Brazil, 7–10 November 2020; pp. 340–347. [Google Scholar]
  47. Prasetyo, S.E.; Prastyo, P.H.; Arti, S. A cardiotocographic classification using feature selection: A comparative study. JITCE J. Inf. Technol. Comput. Eng. 2021, 5, 25–32. [Google Scholar] [CrossRef]
  48. Dey, S.K.; Raihan Uddin, M.; Mahbubur Rahman, M. Performance analysis of SDN-based intrusion detection model with feature selection approach. In Proceedings of the International Joint Conference on Computational Intelligence: IJCCI 2018, Seville, Spain, 18–19 September 2020; pp. 483–494. [Google Scholar]
  49. Uğuz, H. A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm. Knowl.-Based Syst. 2011, 24, 1024–1032. [Google Scholar] [CrossRef]
  50. Pereira, R.B.; Plastino, A.; Zadrozny, B.; Merschmann, L.H. Information gain feature selection for multi-label classification. J. Inf. Data Manag. 2015, 6, 48. [Google Scholar]
  51. Prasetiyowati, M.I.; Maulidevi, N.U.; Surendro, K. Determining threshold value on information gain feature selection to increase speed and prediction accuracy of random forest. J. Big Data 2021, 8, 84. [Google Scholar] [CrossRef]
  52. Gao, Z.; Xu, Y.; Meng, F.; Qi, F.; Lin, Z. Improved information gain-based feature selection for text categorization. In Proceedings of the 2014 4th International Conference on Wireless Communications, Vehicular Technology, Information Theory and Aerospace & Electronic Systems (VITAE), Aalborg, Denmark, 11–14 May 2014; pp. 1–5. [Google Scholar] [CrossRef]
  53. Yi, F.; Yang, H.; Chen, D.; Qin, Y.; Han, H.; Cui, J.; Bai, W.; Ma, Y.; Zhang, R.; Yu, H. XGBoost-SHAP-based interpretable diagnostic framework for Alzheimer's disease. BMC Med. Inform. Decis. 2023, 23, 137. [Google Scholar] [CrossRef] [PubMed]
  54. Mora, T.; Roche, D.; Rodríguez-Sánchez, B. Predicting the onset of diabetes-related complications after a diabetes diagnosis with machine learning algorithms. Diabetes Res. Clin. Pract. 2023, 204, 110910. [Google Scholar] [CrossRef] [PubMed]
  55. Pavon, J.M.; Previll, L.; Woo, M.; Henao, R.; Solomon, M.; Rogers, U.; Olson, A.; Fischer, J.; Leo, C.; Fillenbaum, G.; et al. Machine learning functional impairment classification with electronic health record data. J. Am. Geriatr. Soc. 2023, 71, 2822–2833. [Google Scholar] [CrossRef]
  56. Kim, J.; Lee, H.; Lee, J.; Rhee, S.Y.; Shin, J.I.; Lee, S.W.; Cho, W.; Min, C.; Kwon, R.; Kim, J.G.; et al. Quantification of identifying cognitive impairment using olfactory-stimulated functional near-infrared spectroscopy with machine learning: A post hoc analysis of a diagnostic trial and validation of an external additional trial. Alzheimer’s Res. Ther. 2023, 15, 127. [Google Scholar] [CrossRef] [PubMed]
  57. Ye, Z.; Zhang, T.; Wu, C.; Qiao, Y.; Su, W.; Chen, J.; Xie, G.; Dong, S.; Xu, J.; Zhao, J. Predicting the objective and subjective clinical outcomes of anterior cruciate ligament reconstruction: A machine learning analysis of 432 patients. Am. J. Sports Med. 2022, 50, 3786–3795. [Google Scholar] [CrossRef]
  58. Das, B.; Rathore, P.; Roy, D.; Chakraborty, D.; Jatav, R.S.; Sethi, D.; Kumar, P. Comparison of bagging, boosting and stacking algorithms for surface soil moisture mapping using optical-thermal-microwave remote sensing synergies. Catena 2022, 217, 106485. [Google Scholar] [CrossRef]
  59. Tao, S.; Zhang, X.; Feng, R.; Qi, W.; Wang, Y.; Shrestha, B. Retrieving soil moisture from grape growing areas using multi-feature and stacking-based ensemble learning modeling. Comput. Electron. Agric. 2023, 204, 107537. [Google Scholar] [CrossRef]
  60. Granata, F.; Di Nunno, F.; Najafzadeh, M.; Demir, I. A stacked machine learning algorithm for multi-step ahead prediction of soil moisture. Hydrology 2022, 10, 1. [Google Scholar] [CrossRef]
  61. Paramythis, A.; Loidl-Reisinger, S. Adaptive learning environments and e-learning standards. In Proceedings of the 2nd European Conference on E-Learning, Linz, Austria, 6–7 November 2003; pp. 369–379. [Google Scholar]
  62. Kerr, P. Adaptive learning. Elt J. 2016, 70, 88–93. [Google Scholar] [CrossRef]
  63. Zeiler, M.D. Adadelta: An adaptive learning rate method. arXiv 2012, arXiv:1212.5701. [Google Scholar]
  64. Frías-Blanco, I.; Verdecia-Cabrera, A.; Ortiz-Díaz, A.; Carvalho, A. Fast adaptive stacking of ensembles. In Proceedings of the 31st Annual ACM Symposium on Applied Computing, Pisa, Italy, 4 April 2016; pp. 929–934. [Google Scholar]
  65. Agarwal, S.; Chowdary, C.R. A-Stacking and A-Bagging: Adaptive versions of ensemble learning algorithms for spoof fingerprint detection. Expert Syst. 2020, 146, 113160. [Google Scholar] [CrossRef]
Figure 1. Research area, the red points are sampling points.
Figure 2. The general framework of the Ada-Stacking method.
Figure 3. The generated framework of SM retrieval using multi-source data.
Figure 4. The importance of all variables determined by the SHAP method.
Figure 5. The importance of all variables determined by the information gain method.
Figure 6. SM inversion results under different feature combinations using the Info_gain feature selection method with different ML methods; M1, M2, M3, and M4 represent four modes, with 5, 10, 15, and 20 features included, respectively. (A) MAE; (B) RMSE; (C) R2.
Figure 6. SM inversion results under different feature combinations using Info_gain feature selection methods with different ML methods; M1, M2, M3, and M4 represent four modes, respectively, and the number of features included is 5, 10, 15, and 20, respectively, e.g., (A) MAE; (B) RMSE; (C) R2.
Remotesensing 17 01569 g006
Figure 7. SM inversion results under different feature combinations selected by the SHAP method with different ML methods; M1, M2, M3, and M4 represent four modes containing 5, 10, 15, and 20 features, respectively: (A) MAE; (B) RMSE; (C) R2.
Figure 8. SM inversion results under different feature combinations selected by the Info_gain ∩ SHAP method with different ML methods; M1, M2, M3, and M4 represent four modes containing 5, 10, 15, and 20 features, respectively: (A) MAE; (B) RMSE; (C) R2.
Figure 9. Scatter plots of the measured SM and estimated SM using Ada_Stacking and the Info_gain feature selection method; (a–d) represent the four modes M1, M2, M3, and M4, respectively. The blue line represents the 1:1 line, while the red line denotes the fitted line.
Figure 10. Scatter plots of the measured SM and estimated SM using Ada_Stacking and the SHAP feature selection method; (a–d) represent the four modes M1, M2, M3, and M4, respectively. The blue line represents the 1:1 line, while the red line denotes the fitted line.
Figure 11. Scatter plots of the measured SM and estimated SM using Ada_Stacking and the Info_gain ∩ SHAP feature selection method; (a–d) represent the four modes M1, M2, M3, and M4, respectively. The blue line represents the 1:1 line, while the red line denotes the fitted line.
Figure 12. Comparative trends of SM retrieval accuracy based on Ada-Stacking under the different feature selection methods: (A) MAE; (B) RMSE; (C) R2.
Table 1. Multi-source features of SM retrieval.

| Satellite Name | Parameters | Description |
|---|---|---|
| Sentinel-1 | VV | Backscattering coefficient |
| Sentinel-1 | VH | Backscattering coefficient |
| Landsat-8 | B1 | Coastal |
| Landsat-8 | B2 | Blue |
| Landsat-8 | B3 | Green |
| Landsat-8 | B4 | Red |
| Landsat-8 | B5 | NIR |
| Landsat-8 | B6 | SWIR-1 |
| Landsat-8 | B7 | SWIR-2 |
| Landsat-8 | Multi-vegetation index | See Table 2 for details |
| ASTGTM DEM | Slope | Maximum rate of change in elevation |
| ASTGTM DEM | Aspect | Direction of the projection of the slope normal on the horizontal plane |
| ASTGTM DEM | DEM | Digital Elevation Model |
Table 2. Remote sensing multi-vegetation indices in this study.

| Parameters | Description | References |
|---|---|---|
| Difference Vegetation Index (DVI) | DVI = b5 − b4 | [32] |
| Enhanced Vegetation Index (EVI) | EVI = 2.5 × (b5 − b4)/(b5 + 6 × b4 − 7.5 × b2 + 1) | [33] |
| Modified Soil Index (MSI) | MSI = b6/b5 | [34] |
| Modified Simple Ratio (MSR) | MSR = (b5/b4 − 1)/Sqrt(b5/b4 + 1) | [35] |
| Normalized Difference Vegetation Index (NDVI) | NDVI = (b5 − b4)/(b5 + b4) | [36] |
| Normalized Difference Water Index (NDWI1640) | NDWI1640 = (b5 − b6)/(b5 + b6) | [37] |
| Normalized Difference Water Index (NDWI2201) | NDWI2201 = (b5 − b7)/(b5 + b7) | [38] |
| Normalized Multi-Band Drought Index (NMDI) | NMDI = (b5 − (b6 − b5))/(b5 + (b6 + b5)) | [39] |
| Ratio Vegetation Index (RVI) | RVI = b5/b4 | [40] |
| Soil-Adjusted Vegetation Index (SAVI) | SAVI = (b5 − b4) × (1 + 0.5)/(b5 + b4 + 0.5) | [41] |
| Transformed Vegetation Index (TVI) | TVI = 0.5 × (120 × (b5 − b3) − 200 × (b4 − b3)) | [42] |
| Atmospherically Resistant Vegetation Index (ARVI) | ARVI = (b5 − (2 × b4 − b2))/(b5 + (2 × b4 − b2)) | [43] |
| Albedo (reflectance) index (Albedo) | Albedo = 0.356 × b2 + 0.13 × b3 + 0.373 × b4 + 0.085 × b5 + 0.072 × b6 + 0.072 × b7 − 0.0018 | [44] |
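The band-algebra formulas of Table 2 translate directly into code. A minimal sketch for a subset of the indices, using illustrative (made-up) Landsat-8 surface-reflectance values; band names follow Table 1 (b2 = Blue, b3 = Green, b4 = Red, b5 = NIR, b6 = SWIR-1, b7 = SWIR-2):

```python
def vegetation_indices(b2, b3, b4, b5, b6, b7):
    """A subset of the Table 2 spectral indices, computed from Landsat-8
    reflectance bands (scalars or NumPy arrays of equal shape)."""
    return {
        "DVI": b5 - b4,
        "EVI": 2.5 * (b5 - b4) / (b5 + 6 * b4 - 7.5 * b2 + 1),
        "MSI": b6 / b5,
        "NDVI": (b5 - b4) / (b5 + b4),
        "NDWI1640": (b5 - b6) / (b5 + b6),
        "NDWI2201": (b5 - b7) / (b5 + b7),
        "RVI": b5 / b4,
        "SAVI": (b5 - b4) * (1 + 0.5) / (b5 + b4 + 0.5),
        "ARVI": (b5 - (2 * b4 - b2)) / (b5 + (2 * b4 - b2)),
    }

# Illustrative reflectances for a vegetated pixel (not from the study data).
idx = vegetation_indices(b2=0.05, b3=0.08, b4=0.10, b5=0.40, b6=0.20, b7=0.15)
print(round(idx["NDVI"], 3))  # (0.40 - 0.10)/(0.40 + 0.10) = 0.6
```

On real imagery the same function applies per pixel to reflectance arrays; division-by-zero guards would be needed for water or shadow pixels where band sums can approach zero.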
Table 3. Feature combinations obtained with the three feature selection methods.

| Feature Selection Method and Number of Features | Specific Characteristics |
|---|---|
| Info_gain_M1 | DEM, DVI, MSI, VV, NMDI |
| Info_gain_M2 | DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI |
| Info_gain_M3 | DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, SAVI |
| Info_gain_M4 | DEM, DVI, MSI, VV, NMDI, NDWI1640, EVI, VH, B5, TVI, NDWI2201, B6, RVI, B1, SAVI, Albedo, ARVI, NDVI, B3, aspect |
| SHAP_M1 | DEM, VV, slope, B1, EVI |
| SHAP_M2 | DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI |
| SHAP_M3 | DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI, B4, aspect, B2, B3, MSI |
| SHAP_M4 | DEM, VV, slope, B1, EVI, VH, ARVI, B6, TVI, DVI, B4, aspect, B2, B3, MSI, B5, NDWI2201, NDWI1640, Albedo, SAVI |
| Info_gain ∩ SHAP_M1 | DEM, VV |
| Info_gain ∩ SHAP_M2 | DEM, VV, DVI, EVI, VH, TVI |
| Info_gain ∩ SHAP_M3 | DEM, VV, DVI, EVI, VH, TVI, B1 |
| Info_gain ∩ SHAP_M4 | DEM, VV, DVI, EVI, VH, TVI, B1, ARVI, aspect, B3, MSI, B5, NDWI2201, NDWI1640, Albedo, SAVI |
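The nested M1–M4 subsets in Table 3 come from ranking the 25 candidate features and keeping the top 5, 10, 15, and 20. A hedged sketch of the Info_gain variant on synthetic data, using scikit-learn's `mutual_info_regression` as the information-gain estimate (the paper's exact estimator, feature names, and data are not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import mutual_info_regression

# Synthetic stand-in for the 25-feature design matrix.
X, y = make_regression(n_samples=200, n_features=25, n_informative=8,
                       random_state=0)
names = [f"f{i}" for i in range(X.shape[1])]

# Score each feature against the target and rank in descending order.
scores = mutual_info_regression(X, y, random_state=0)
ranked = [names[i] for i in np.argsort(scores)[::-1]]

# Build the nested M1..M4 subsets (top 5, 10, 15, 20 features).
modes = {f"M{k}": ranked[:n] for k, n in enumerate((5, 10, 15, 20), start=1)}
print(len(modes["M1"]), len(modes["M4"]))  # 5 20
```

The Info_gain ∩ SHAP subsets in Table 3 would then be the set intersections of the corresponding Info_gain and SHAP rankings, which is why M1 there contains only two features.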
Table 4. Accuracy evaluation of SM content based on Ada-Stacking with different feature selection methods.

| Model | MAE (Vol. %) | RMSE (Vol. %) | R2 |
|---|---|---|---|
| Ada_Stacking_Info_gain_M1 | 2.23 | 3.06 | 0.94 |
| Ada_Stacking_Info_gain_M2 | 2.06 | 2.77 | 0.94 |
| Ada_Stacking_Info_gain_M3 | 2.08 | 2.89 | 0.94 |
| Ada_Stacking_Info_gain_M4 | 1.86 | 2.68 | 0.95 |
| Ada_Stacking_SHAP_M1 | 2.07 | 2.89 | 0.94 |
| Ada_Stacking_SHAP_M2 | 2.08 | 2.88 | 0.94 |
| Ada_Stacking_SHAP_M3 | 1.99 | 2.90 | 0.94 |
| Ada_Stacking_SHAP_M4 | 1.89 | 2.77 | 0.94 |
| Ada_Stacking_Info_gain ∩ SHAP_M1 | 2.70 | 3.78 | 0.88 |
| Ada_Stacking_Info_gain ∩ SHAP_M2 | 2.20 | 3.15 | 0.93 |
| Ada_Stacking_Info_gain ∩ SHAP_M3 | 2.05 | 2.82 | 0.94 |
| Ada_Stacking_Info_gain ∩ SHAP_M4 | 2.09 | 2.97 | 0.94 |
Share and Cite

Wang, L.; Gao, Y. Soil Moisture Inversion Using Multi-Sensor Remote Sensing Data Based on Feature Selection Method and Adaptive Stacking Algorithm. Remote Sens. 2025, 17, 1569. https://doi.org/10.3390/rs17091569