Article

Influence of Soil Background Noise on Accuracy of Soil Moisture Content Inversion in Alfalfa Fields Based on UAV Multispectral Data

by Jinxi Chen, Yuanbo Jiang, Wenjing Yu, Guangping Qi *, Yanxia Kang *, Minhua Yin, Yanlin Ma, Yayu Wang, Jiapeng Zhu, Yanbiao Wang and Boda Li
College of Water Conservancy and Hydropower Engineering, Gansu Agricultural University, Lanzhou 730070, China
* Authors to whom correspondence should be addressed.
Soil Syst. 2025, 9(3), 98; https://doi.org/10.3390/soilsystems9030098
Submission received: 7 July 2025 / Revised: 4 September 2025 / Accepted: 8 September 2025 / Published: 12 September 2025
(This article belongs to the Special Issue Research on Soil Management and Conservation: 2nd Edition)

Abstract

Soil moisture plays a critical role in the global water cycle, the exchange of matter and energy within ecosystems, and the movement of water in plants. Accurate monitoring of soil moisture is essential for drought early warning systems, irrigation decision-making, and crop growth assessment. The use of drone-based multispectral remote sensing technology for estimating the soil moisture content offers advantages such as wide coverage, high accuracy, and efficiency. However, the soil background can often interfere with the accuracy of these estimations. In specific environments, such as areas with strong winds, removing soil background noise may not necessarily enhance the precision of estimates. This study utilizes unmanned aerial vehicle (UAV) multispectral imagery and employs a vegetation index threshold method to remove soil background noise. It systematically analyzes the response relationship between spectral reflectance, spectral indices, and the soil moisture content in the top 0–10 cm layer of alfalfa; constructs K-Nearest Neighbors (KNN), Random Forest Regression (RFR), ridge regression (RR), and XG-Boost inversion models; and comprehensively evaluates model performance. The results indicate the following: (1) The XG-Boost model validation set had the highest R² value (0.812) when spectral reflectance was used as the input variable, which was significantly better than the other models (R² = 0.465 to 0.770), and the RFR model validation set had the highest R² value when the spectral index was used as the input variable (0.632), which was significantly better than the other models (R² = 0.366 to 0.535). (2) After removing soil background noise, the accuracy of the soil moisture estimates for each model did not show significant changes; specifically, the R² value for the XG-Boost model decreased to 0.803 when using spectral reflectance as the input, and the R² value for the RFR model dropped to 0.628 when using spectral indices. (3) Before and after removing the soil background noise, the spectral reflectance can provide more accurate data support for the inversion of the soil moisture content than the spectral index, and the XG-Boost model is the most effective in the inversion of the soil moisture content when using the spectral reflectance as the input variable. The research findings provide both theoretical and technical support for the retrieval of the surface soil moisture content in alfalfa using drone-based multispectral remote sensing. Additionally, they offer evidence that validates large-scale soil moisture remote sensing monitoring.

1. Introduction

Soil moisture is a core component of the Earth’s hydrosphere, playing a vital role in key processes such as water cycling and plant moisture transport. Its spatial and temporal distributions profoundly influence the co-evolution of the biosphere, atmosphere, and lithosphere through the flow of matter and energy [1]. As a key indicator for drought monitoring, irrigation decision-making, and crop growth assessment, accurate measurement of the soil moisture content plays a crucial strategic role in enhancing the effectiveness of drought early warning systems and optimizing irrigation management [2]. Compared with the limitations of traditional soil moisture monitoring, such as limited coverage, weak sample representativeness, and time-consuming and labor-intensive acquisition, UAV remote sensing technology, with its advantages of a wide monitoring range, high spatial accuracy, low implementation costs, and high operational efficiency, has gradually become the core means of dynamically monitoring the soil water content and has provided more efficient and comprehensive technological support for grasping changes in soil moisture in real time.
Unmanned aerial vehicle (UAV) remote sensing mainly acquires soil surface spectral properties, thermal radiation, and electromagnetic wave signals using sensors such as hyperspectral, multispectral, thermal infrared, synthetic aperture radar (SAR), and electromagnetic wave sensors and realizes soil moisture content inversion by combining these data with physical models or machine learning algorithms [3,4]. Compared to other sensors, multispectral sensors are not affected by temperature, ensuring high data stability and easy data acquisition. They can directly reflect the differences in the spectral reflectance of surface features. Future research could fuse multispectral and thermal infrared remote-sensing data to construct multidimensional feature inputs, thereby substantially improving the accuracy of soil moisture estimation and the robustness of predictive models.
In recent years, numerous scholars have conducted in-depth research on the refinement of data processing technology, innovation and optimization related to inversion models, accuracy validation and uncertainty analysis, and crop type expansion. Research on improving the refinement of data processing technology mainly involves efficient rejection of soil background interference and accurate extraction of key features (spectral reflectance, texture features, temporal features, and multi-source features) [5,6]. The SHAP method utilizes cooperative game theory to quantify the contribution of each feature to the prediction outcomes, thereby enhancing the model’s interpretability. Applying the SHAP method to soil moisture prediction research enables quantitative identification of the contributions of input variables to the soil moisture content, providing a reference for variable selection when certain input factors are missing. Fu et al. [7] introduced the SHAP method to quantitatively assess the contributions of input variables in ensemble learning models. Innovation and optimization related to inversion models include the development from empirical models to the fusion of mechanistic and data-driven models, covering physical models of spectral/temperature signals and the soil water content based on radiative transfer/heat conduction mechanisms. The most commonly applied statistical methods are the traditional machine learning models (Random Forest Regression (RFR), Support Vector Machine (SVM), and K-Nearest Neighbors regression (KNN)) and the deep machine learning models (Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)) [8,9]. Random Forest Regression (RFR), XG-Boost, K-Nearest Neighbors (KNN), and ridge regression (RR) have all demonstrated good performance in UAV remote sensing inversion [10,11]. 
Compared to Support Vector Machines (SVMs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), models like RFR, XG-Boost, KNN, and RR offer key advantages, including higher computational efficiency, stronger inherent interpretability, better generalization capabilities and robustness against overfitting for medium-sized datasets, a relatively simple hyperparameter tuning process, and generally superior modeling efficiency and competitiveness in structured data prediction tasks.
Accuracy validation and uncertainty analysis mainly systematically evaluate inversion reliability through ground-truthing data validation, multi-model comparison (cross-validation, with an independent validation set comparing the generalization ability), and uncertainty source analysis (sensor error, meteorological conditions, crop phenology, soil heterogeneity, etc.) [12]. The research covers a wide range of crop types and has gradually expanded from bulk crops such as winter wheat and maize to cash crops such as cotton, as well as saline and alkaline crops and pasture grasses in alpine regions. The above series of studies have greatly improved model universality and practical application suitability [13,14].
Soil background noise refers to the soil-reflected signal that sensors receive together with the spectral signal reflected by vegetation. Consequently, canopy reflectance encompasses the combined spectral response of both the vegetation components and the soil background. Removing the soil background is a crucial step in image processing. By utilizing the spectral differences of surface features and the principles of radiative transfer, it is possible to separate the target features (such as vegetation) from the soil signals in mixed signals. This effectively reduces the effect of soil interference on the inversion of target parameters, allowing for more accurate characterization of the target features (Zhang et al. [15]). Almeida-Naunay [16] utilized spectral differences to calculate thresholds, implemented threshold optimization to remove soil background pixels, and found that reducing soil interference improved predictive performance. The current mainstream removal methods include the band selection method, the vegetation index method, and physical model inversion. The steps include screening sensitive bands, constructing a functional relationship, and decomposing total radiance. For example, Yang et al. [17] found that using the red-to-green ratio index (RGRI) method to remove the soil background resulted in a coefficient of determination for predicting corn soil moisture that was greater than that obtained with the Otsu threshold method, while the performance without removing the soil background was the worst. Zhang et al. [18] found that when an improved vegetation index threshold method was used to remove the soil background of winter wheat, vegetation index inversion with the soil background removed was always inferior to that without removal. Da et al. [19] combined supervised classification to remove the soil background with shading and concluded that the accuracy of inversion of the maize soil water content based on the CNN, BPNN, and PLSR models was significantly improved after removal of the soil background. Wang et al. [20] used a supervised classification method to remove the soil background for corn soil moisture inversion and found that data correlation and the accuracy of the regression model were consistently lower after removal than when the soil background was not eliminated. Although existing research generally supports the idea that removing the soil background can enhance accuracy, its general applicability is still controversial. The mechanism coupling the removal operation with the input variables and the inversion model has not been fully clarified, and the studies cover relatively limited vegetation types (e.g., maize, winter wheat, and cotton), so the scope of the conclusions needs to be expanded and verified. Research has found that deep learning, particularly convolutional neural networks (CNNs), can enhance agricultural functions such as soil classification, crop monitoring, and nutrient management. Additionally, Dhanya [21] employed YOLO and similar object detection algorithms to improve the agricultural image preprocessing stage by removing backgrounds and analyzing specific regions. Cetin [22] utilized CNNs and the integrated YOLOv5 model to remove backgrounds and enhanced agricultural applications through precise classification based on texture and color features while reducing background noise interference. The deep learning-based object detection framework can dynamically predict bounding boxes and confidence scores, ensuring effective separation of soil from diversified backgrounds that include complex conditions such as shadows or debris while also facilitating data preparation and enhancing feature stability.
Previous research utilizing UAV remote sensing combined with machine learning models has primarily focused on inversion of the soil moisture content for corn and winter wheat. However, studies examining alfalfa, and in particular the effect of soil background noise on inversion of the surface soil moisture content in alfalfa fields, are relatively scarce. Therefore, this study employed UAV remote sensing technology to monitor the surface soil moisture content in alfalfa. As the ‘King of Forages’ [23], alfalfa (Medicago sativa L.) has a high nutritional value (a crude protein content of 15–25%), has strong resistance to stress (drought, cold, and salinity), and provides ecological services (rhizobial nitrogen fixation and fertilization, reductions in soil erosion, and improvements in the soil structure). The canopy height of alfalfa typically ranges from 10 to 65 cm and varies significantly across growth stages. During the peak growth period, the canopy covers a relatively large area; however, post-harvest, canopy coverage declines significantly. Additionally, factors such as intercropping combinations, seeding methods, seeding density, and environmental conditions (such as soil moisture and light intensity) influence the canopy structure of alfalfa. Gansu is the core production area of high-quality alfalfa and the focal area of the national ‘grain-to-feed’ strategy. The development of its grass industry is of strategic significance in guaranteeing the supply for regional animal husbandry and promoting the optimization of the structure of the agricultural and animal husbandry industry [24]. However, traditional soil water content monitoring methods struggle to meet the precise management needs of large-scale alfalfa planting.
Based on this, this study took alfalfa as its research object; adopted UAV multispectral remote sensing technology, combined with the vegetation index threshold method, to eliminate soil background interference; systematically explored the response relationships between spectral features and the alfalfa surface soil water content; and constructed inversion models using machine learning algorithms. The objectives of this study were as follows: (1) To quantify the impact of the soil background on the spectrum–moisture relationship and clarify the degree of interference caused by soil noise. (2) To analyze the effects of different combinations of input variables on inversion accuracy, using the original spectral reflectance, the spectral indices, and the corresponding spectral features after soil background removal, in order to determine the core spectral parameters related to the moisture content. (3) To compare, based on four machine learning algorithms, namely Random Forest Regression (RFR), XG-Boost, K-Nearest Neighbors (KNN), and ridge regression (RR), the inversion accuracy of the models before and after the removal of the soil background and to screen the optimal model so as to provide technological support for real-time monitoring of the soil moisture content and accurate irrigation decision-making in alfalfa planting areas in Gansu.

2. Materials and Methods

2.1. An Overview of the Study Area

This experiment was conducted from April to October 2024 at the Irrigation Experiment Station of the Jingtai Chuan Electricity Lift Irrigation Water Resources Utilization Center in Gansu Province (37°12′59″ N, 104°05′10″ E, mean elevation: 1572 m, Figure 1). The soil moisture deficit in the Jingtai region significantly constrains local ecological sustainability and agricultural development, which gives this study critical scientific and practical significance. This region has a temperate continental arid climate with abundant light and sparse rainfall, and the annual average sunshine hours, frost-free period, radiation, air temperature, precipitation, and evaporation are 2652 h, 191 d, 6.18 × 10⁵ J·cm⁻², 8.6 °C, 201.6 mm, and 2761 mm, respectively (Figure 2). The soil in the test area was loamy, and its physical and chemical properties are listed in Table 1.

2.2. Experimental Design

By studying the soil moisture content of alfalfa fields, this research can provide a scientific basis for optimizing planting management, improving water resource utilization efficiency, and supporting ecological protection. Alfalfa (Gannong 3) primarily grows from April to October. During the bud and flowering stages, canopy coverage is high and the impact of the soil background can be neglected; therefore, this study focused on the branching stage of alfalfa. The experimental field measured 27 m from east to west and 40 m from north to south. To enhance the universality of the inversion model in heterogeneous artificial grasslands, the field was divided into 16 alfalfa subplots, each covering an area of 54 m² (6 m × 9 m). Drip irrigation was used in the study area. The drip-tape spacing was 40 cm, the emitter spacing was 30 cm, and the emitter flow rate was 2.0 L·h⁻¹. Irrigation volumes were monitored with a water meter (precision: 0.0001 m³) and regulated by ball valves. During the experiment, field management of all plots followed local conventional practices. The experimental plots were set up in a completely randomized design with four groups of irrigation gradients. (An irrigation gradient refers to the spatial variability or continuous changes in water availability or soil moisture resulting from irrigation activities. Such gradients can form naturally or can be artificially designed, with the core aspect being an uneven distribution of moisture and its resulting impacts.) Each irrigation gradient is expressed as a percentage of the soil moisture content relative to the field capacity (θf). (Field capacity refers to the maximum amount of water that soil can retain in its capillary pores after gravitational water has drained away. It is typically expressed as a percentage of the dry weight of the soil.) The four irrigation levels were as follows: 45–55% θf, 55–65% θf, 65–75% θf, and 75–85% θf.

2.3. UAV Multispectral Remote Sensing Data Acquisition

2.3.1. Multispectral Remote Sensing Image Acquisition

For this experiment, remote sensing data collection was conducted on 19, 20, and 21 April 2024 during clear and windless weather with sufficient light. Multispectral remote sensing images were collected using a DJI Matrice 300 RTK quadcopter UAV (Shenzhen DJI Innovation Technology Co., Ltd., Shenzhen, China) (Figure 3). The UAV was equipped with an MS 600 Pro multispectral camera (Changguang Yuchen Information Technology Equipment (Qingdao) Co., Ltd., Qingdao, China) (Figure 4). The flight altitude was set to 30 m, with a forward overlap rate of 80% and a side overlap rate of 70%. The flight time was from 12:00 to 13:00, and the flight speed was 2.7 m/s. The multispectral camera lens was pointed vertically downward. Each flight followed a predetermined route, and the center wavelength, wave width, and diffuse reflector plate reflectance of each band are provided in Table 2. Images of the calibration plates were taken on the ground before and after each flight. These calibration plate images were used for reflectance calibration to compensate for variations in illumination conditions and to ensure data consistency. The ground resolution was 2.16 cm.

2.3.2. Multispectral Remote Sensing Image Processing

In this study, a total of four ground control points (GCPs) were established within the experimental area, and their coordinates were precisely determined using Real-Time Kinematic (RTK) positioning technology. Subsequently, geographic registration and radiometric correction were carried out using Pix4D Mapper version 4.8.0, and manual marking of the ground control points was employed to enhance location accuracy. The root-mean-square errors (RMSEs) of the generated digital orthophoto in the x-axis, y-axis, and z-axis directions were 0.23 m, 0.31 m, and 0.27 m, respectively, indicating that the orthophoto exhibited high spatial accuracy.

2.3.3. Removal of Soil Background Noise

During the drone data collection process, the canopy reflectance received by the sensor represented a mixed spectral response from both the vegetation and the soil background. To eliminate the interference of soil background noise, masking was performed on the multispectral images of the alfalfa and soil within each cell using a vegetation index [25] threshold in ENVI 5.3 software. Given that the NDVI values of crops are greater than 0 and the NDVI values of soil are less than 0, the NDVI method was employed for threshold segmentation, with a threshold set at 0, effectively removing soil background noise [26]. The vegetation index threshold method is based on the advantages of the vegetation index [27] in distinguishing between crop and non-crop pixels, thereby facilitating noise removal.
The vegetation index threshold method significantly outperforms supervised classification and spectral unmixing methods in terms of operational simplicity and computational efficiency when removing soil background noise [28]. Supervised classification relies on a large number of high-quality training samples to distinguish between complex land cover categories, including various soil types. This process involves feature selection, classifier training, and optimization, resulting in high computational costs and susceptibility to sample representativeness and labeling subjectivity. Spectral unmixing, for its part, requires accurate acquisition and validation of pure spectral signatures for vegetation and soil endmembers, as well as solving complex mixing models (such as linear or nonlinear models). The accuracy of this approach is highly sensitive to endmember variability and model assumptions, making the computational process relatively complex. In contrast, the vegetation index threshold method requires only an empirical or statistical determination of a single (or limited) vegetation index threshold for binary segmentation. The algorithm is straightforward, easy to implement, and highly efficient, making it particularly suitable for large areas and near-real-time processing scenarios. For areas with high vegetation cover, the threshold method can effectively distinguish between vegetation and non-vegetation. Due to its standardized process and direct results (binary masks), it usually provides stable and repeatable preliminary outcomes, thereby avoiding the additional errors associated with vague category definitions in supervised classification and the uncertainties of endmember extraction in spectral unmixing.
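The NDVI thresholding step described in this section can be illustrated with a minimal sketch. The study performed the masking in ENVI 5.3; the Python below only illustrates the same idea and is not the authors' workflow. The function names (`ndvi`, `mask_soil`, `masked_mean`) and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for a single pixel."""
    total = nir + red
    # Guard against division by zero for completely dark pixels.
    return (nir - red) / total if total != 0 else 0.0


def mask_soil(nir_band, red_band, threshold=0.0):
    """Boolean mask: True = pixel kept as vegetation, False = removed as soil."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def masked_mean(band, mask):
    """Mean reflectance of one band over the pixels kept by the mask."""
    kept = [v for row, mrow in zip(band, mask) for v, keep in zip(row, mrow) if keep]
    return sum(kept) / len(kept) if kept else float("nan")


# Hypothetical 2 x 3 reflectance tiles (illustrative values, not study data).
nir = [[0.45, 0.40, 0.10], [0.50, 0.12, 0.42]]
red = [[0.05, 0.08, 0.15], [0.06, 0.20, 0.07]]

mask = mask_soil(nir, red)          # threshold of 0, as stated in the text
plot_nir = masked_mean(nir, mask)   # plot-level NIR after soil removal
```

With a threshold of 0, any pixel whose red reflectance exceeds its NIR reflectance is discarded, and plot-level reflectance is then averaged over the remaining pixels only.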

2.4. Ground Data Acquisition and Processing

While the UAV acquired the multispectral remote sensing images, alfalfa topsoil samples (0–10 cm depth) were collected manually and synchronously on the ground in the 16 sample areas using the five-point sampling method. The soil collected at each point was placed in an aluminum box, and the soil mass water content was measured by the oven-drying method at a constant temperature of 105 °C for 24 h. The mean of the five points was taken as the soil water content of each sample area. The total number of soil samples was 48, and the measured soil water contents followed a normal distribution. Finally, 70% of the data were randomly selected for modeling, and 30% were selected for validation.
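As a minimal sketch of the ground-data steps above, the code below computes a mass (gravimetric) water content from oven-drying weights and performs a random 70/30 split. It assumes the usual oven-drying formula (mass of water divided by mass of dry soil, with the aluminum box tare subtracted) and assumes that the 70% share is rounded to the nearest whole sample; neither detail is spelled out in the text. All names and weights are hypothetical.

```python
import random


def gravimetric_water_content(wet_with_box, dry_with_box, box):
    """Mass water content (g water per g dry soil) from oven-drying weights."""
    water = wet_with_box - dry_with_box
    dry_soil = dry_with_box - box
    return water / dry_soil


def five_point_mean(values):
    """Plot-level value as the mean of the five sampling points."""
    return sum(values) / len(values)


def split_70_30(samples, seed=0):
    """Random 70/30 split into modeling and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.7 * len(shuffled))  # rounding rule is an assumption
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical weights in grams (not data from the study).
smc_point = gravimetric_water_content(wet_with_box=152.0, dry_with_box=140.0, box=60.0)

# 48 sample identifiers, matching the number of samples reported in the text.
modeling, validation = split_70_30(range(48))
```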

2.5. Spectral Index Selection and Calculation

Spectral indices can be used to extract feature information utilizing the ratios or differences between certain bands in remote sensing data. In remote sensing data processing, spectral indices are widely used in the extraction of feature information related to vegetation, soil, and water bodies. The vegetation index of the crop canopy in the experimental area was calculated by a specific algebraic combination of the spectral reflectance of each band [29]. Six relevant spectral indices were determined using the related literature, and the six spectral reflectances and six spectral indices used in this study include the red band (red, R), blue band (blue, B), green band (green, G), near-infrared band (NIR), red edge 1 band (red edge 1), red edge 2 band (red edge 2), normalized difference vegetation index (NDVI), ratio vegetation index (RVI), difference vegetation index (DVI), green index (GI), simple ratio pigment index (SRPI), and red-to-green ratio index (RGRI), which were calculated according to the equations presented in Table 3. Under alfalfa cover conditions, these vegetation indices are important tools for estimating the soil moisture content. In particular, the normalized difference vegetation index (NDVI) and the difference vegetation index (DVI) demonstrate significant inversion effects [8].
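For illustration, four of the indices named above can be computed from band reflectances as sketched below. Table 3 is not reproduced here, so the formulas are the commonly cited definitions (an assumption that may differ in detail from the authors' table). GI and SRPI are left out because their band definitions vary across the literature. The function name and input values are hypothetical.

```python
def spectral_indices(red, green, nir):
    """Commonly used index definitions; assumed, not copied from Table 3."""
    return {
        "NDVI": (nir - red) / (nir + red),  # normalized difference vegetation index
        "RVI": nir / red,                   # ratio vegetation index
        "DVI": nir - red,                   # difference vegetation index
        "RGRI": red / green,                # red-to-green ratio index
    }


# Hypothetical plot-mean reflectances (not data from the study).
idx = spectral_indices(red=0.05, green=0.10, nir=0.45)
```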

2.6. Soil Moisture Content Inversion Model Construction

In this study, the model input variables were divided into two groups: a spectral reflectance variable group and a spectral index variable group. Four machine learning algorithms, namely RFR, XG-Boost, KNN, and RR, were used to build alfalfa surface soil water content inversion models with and without the removal of soil background noise, for a total of 16 soil moisture content inversion models (two input groups × four algorithms × two background treatments). Different models exhibit varying performances on the same data due to their unique algorithmic characteristics. RFR is an ensemble learning algorithm that performs regression tasks by aggregating multiple decision trees, which can effectively deal with complex datasets and nonlinear relationships among features [36]. KNN is an instance-based learning algorithm for classification and regression that relies on the distances between samples in feature space. It makes regression predictions from the nearest neighbors of the input samples and performs well in scenarios with complex data distributions or unclear boundaries [37]. RR, as an enhanced linear regression method, is designed to address issues of multicollinearity [38]. XG-Boost, short for Extreme Gradient Boosting, is a machine learning algorithm implemented in the gradient boosting framework [39]. It is efficient and flexible, emphasizes speed, performance, and regularization, and is suitable for applications involving large-scale datasets and complex problems.
The four machine learning algorithms were implemented in Python using the scikit-learn (sklearn) library, and the specific steps were data loading and preprocessing, data partitioning, and model evaluation. First, the original dataset was divided into a training set (70% of the samples) and an independent test set (30% of the samples). Hyperparameter optimization was then performed for KNN, RFR, XG-Boost, and RR within the training set using a grid search combined with 4-fold cross-validation, and the optimal parameters were selected based on the maximum mean R² value. For the cross-validation, the training samples were evenly divided into four mutually exclusive subsets (KNN1–KNN4, RFR1–RFR4, XG-Boost1–XG-Boost4, and RR1–RR4). In each iteration, three of the folds were used as training data, while the remaining fold served as validation data. This process cycled through all four folds to obtain robust internal validation results. After completing the cross-validation, the final generalization performance of each algorithm was assessed using the test set, which had not been involved in any model tuning (KNN Test, RFR Test, XG-Boost Test, and RR Test), to ensure the objectivity and reliability of the model selection process (Figure 5).
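The tuning procedure described above (a 70/30 split, then a grid search scored by mean R² over four folds, then a single evaluation on the held-out test set) can be sketched as follows. The study used scikit-learn; to stay dependency-free, this sketch swaps in a hand-written KNN regressor and a hand-written grid search, so it is a stand-in for the idea and not the authors' implementation. The synthetic data, the candidate grid, and all function names are assumptions.

```python
import random


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k nearest training samples (Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle sample indices and deal them into mutually exclusive folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def grid_search_knn(x, y, k_grid, n_folds=4, seed=0):
    """Return the k with the highest mean cross-validated R2, plus all mean scores."""
    folds = k_fold_indices(len(x), n_folds, seed)
    mean_scores = {}
    for k in k_grid:
        scores = []
        for held_out in folds:
            held = set(held_out)
            tr = [i for i in range(len(x)) if i not in held]
            tr_x, tr_y = [x[i] for i in tr], [y[i] for i in tr]
            preds = [knn_predict(tr_x, tr_y, x[j], k) for j in held_out]
            scores.append(r2_score([y[j] for j in held_out], preds))
        mean_scores[k] = sum(scores) / len(scores)
    best_k = max(mean_scores, key=mean_scores.get)
    return best_k, mean_scores


# Synthetic stand-in data: 48 samples, two features (e.g. two band reflectances).
rng = random.Random(42)
features = [[rng.random(), rng.random()] for _ in range(48)]
target = [0.10 + 0.10 * a - 0.05 * b + rng.gauss(0, 0.005) for a, b in features]

# 70/30 split; the samples are already in random order.
train_x, test_x = features[:34], features[34:]
train_y, test_y = target[:34], target[34:]

best_k, cv_scores = grid_search_knn(train_x, train_y, k_grid=[1, 3, 5, 7])

# The test set is touched only once, after tuning is finished.
test_r2 = r2_score(test_y, [knn_predict(train_x, train_y, q, best_k) for q in test_x])
```

Keeping the test set out of the grid search is what makes the final score an estimate of generalization: the hyperparameter is chosen from the training folds alone.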

2.7. Data Analysis

2.7.1. Pearson Correlation Analysis

Remote sensing images usually contain multiple bands (e.g., visible, near-infrared, short-wave infrared, etc.), and direct use of all bands will lead to data redundancy and an increase in computational complexity, so Pearson correlation analysis was used to screen the sensitive spectral index combinations and sensitive spectral reflectance combinations [40]. Pearson correlation analysis, as a statistical method used to measure the linear relationship between two variables, centers on the Pearson correlation coefficient, which has values ranging from −1 to 1, with 1 indicating a perfect positive correlation, −1 indicating a perfect negative correlation, and 0 indicating no correlation. The higher the absolute value of r, the stronger the linear correlation between the predictor variable and the target variable. According to the evaluation criteria for Pearson’s correlation coefficient (|r|), |r| ≥ 0.8 indicates a high correlation; 0.5 ≤ |r| < 0.8 indicates a moderate correlation; 0.2 ≤ |r| < 0.5 indicates a low correlation; and |r| < 0.2 indicates essentially no correlation [41].
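The screening criterion above can be sketched as a short function pair: one computes Pearson's r, the other maps |r| to the four strength classes quoted in the text. The function names and the sample values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_strength(r):
    """Map |r| to the classes used in the text."""
    a = abs(r)
    if a >= 0.8:
        return "high"
    if a >= 0.5:
        return "moderate"
    if a >= 0.2:
        return "low"
    return "none"


# Hypothetical plot-level values (not data from the study).
nir_reflectance = [0.40, 0.42, 0.45, 0.47, 0.50]
soil_moisture = [0.12, 0.13, 0.15, 0.16, 0.18]

r = pearson_r(nir_reflectance, soil_moisture)
label = correlation_strength(r)
```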

2.7.2. Soil Moisture Content Inversion and Accuracy Evaluation

This study evaluated the accuracy of the soil moisture content (SMC) inversion models using three metrics: the root-mean-square error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R²). RMSE and MAE indicate the closeness of predicted values to actual values, while R² measures the goodness of fit of a model. An R² value closer to 1, an RMSE closer to 0, and a smaller MAE all indicate an inversion model with better accuracy and precision. The spectral reflectance and different spectral indices with and without soil background noise were used to invert the moisture content of alfalfa surface soil. A linear model was established with the spectral reflectance and spectral index as independent variables and the soil moisture content as the dependent variable, and the coefficients of determination (R²) and root-mean-square errors (RMSEs) of the models before and after the removal of soil background noise were compared to analyze the impact of soil background noise on the inversion of the soil moisture content.
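The three metrics can be written out as a minimal sketch. It assumes the standard definitions, with R² computed as 1 minus the ratio of the residual sum of squares to the total sum of squares; the text does not state which R² variant was used. The observed and predicted values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical soil moisture contents (not data from the study).
observed = [0.10, 0.12, 0.15, 0.18]
predicted = [0.11, 0.12, 0.14, 0.19]

scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "R2": r2(observed, predicted),
}
```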

3. Results

3.1. Analysis of Correlations Between Spectral Reflectance and Spectral Index and Soil Moisture Content

Pearson correlation coefficients were used to analyze the correlations between the independent variables (spectral reflectance and the spectral indices) and the dependent variable (the SMC) (Figure 6). Selecting relevant features can improve a model’s generalization ability and robustness, thereby enhancing its applicability and performance across different environments, datasets, and application scenarios. Before removing the soil background noise, the spectral reflectances ranked by strength of correlation with the SMC, from strongest to weakest, were red edge 750, NIR, red, blue, red edge 720, and green; of these, red edge 750 and NIR had absolute correlation coefficients greater than 0.5. After removing the soil background noise, the highest correlation was found for NIR, with an absolute correlation coefficient of 0.7, followed by red edge 750, with an absolute correlation coefficient of 0.69. Both before and after removing the soil background noise, the correlations of red, blue, red edge 720, and green with the SMC were comparatively low.
The Pearson correlations between the spectral indices and the SMC (Figure 6) show that the DVI, NDVI, and RVI had the highest correlations with the SMC both before and after removing the soil background noise, with absolute correlation coefficients greater than 0.5 in both cases. The correlations of the SRPI and GI with the SMC were comparatively low both before and after removing the soil background noise.
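The screening step described above can be sketched as follows: compute the Pearson coefficient between each candidate variable and the SMC, and keep the variables whose absolute coefficient exceeds 0.5. The helper names, the dictionary layout, and the toy values are assumptions made for illustration; the sign of the correlation in the toy data is arbitrary and says nothing about the study's results.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_features(features, target, threshold=0.5):
    """Return {name: r} for the features whose |r| with the target exceeds threshold."""
    scores = {name: pearson_r(values, target) for name, values in features.items()}
    return {name: r for name, r in scores.items() if abs(r) > threshold}

# Toy values only: one band that tracks the target and one that does not.
smc = [0.10, 0.12, 0.15, 0.18, 0.20]
bands = {
    "nir": [0.42, 0.40, 0.37, 0.33, 0.31],
    "green": [0.08, 0.10, 0.07, 0.10, 0.08],
}

selected = select_features(bands, smc)
print(selected)
```

Filtering on the absolute value keeps strongly negative as well as strongly positive correlations, which matches the way the coefficients are reported in the text.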

3.2. Soil Moisture Content Inversion Model Based on Spectral Reflectance

The selected spectral reflectances, red edge 750 and NIR, were used as input variables to establish soil water content inversion models using four machine learning methods, namely RFR, RR, KNN, and XG-Boost (Table 4). Both before and after removing the soil background noise, all models except ridge regression (RR) achieved good inversion results. K-Nearest Neighbors (KNN), Random Forest Regression (RFR), and XG-Boost had coefficients of determination above 0.6 and root-mean-square errors (RMSEs) below 0.1 for both the modeling and validation sets. The XG-Boost model performed best. Before removing the soil background noise, its modeling set had an R2 value of 0.884 and an RMSE of 0.014, while its validation set had an R2 value of 0.812 and an RMSE of 0.017. After removing the soil background noise, its modeling set had an R2 value of 0.867 and an RMSE of 0.015, while its validation set had an R2 value of 0.803 and an RMSE of 0.018. In summary, among the four machine learning models constructed based on spectral reflectance, XG-Boost demonstrated the strongest capability for soil moisture content inversion. Table 4 also shows that the validation results changed little after removing the soil background noise: the validation R2 value decreased by 0.009 for the XG-Boost model, by 0.052 for the RFR model, by 0.078 for the KNN model, and by 0.023 for the RR model.
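To illustrate how a model maps the two selected reflectances to a moisture estimate, the sketch below implements the simplest of the four methods, KNN regression, in plain Python. It is not the study's implementation: the value of `k`, the Euclidean distance, the unweighted average, and all sample values are assumptions.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training samples."""
    # Pair each training sample's Euclidean distance to the query with its target.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Toy samples: (red edge 750 reflectance, NIR reflectance) -> SMC. Values are made up.
train_x = [(0.30, 0.42), (0.28, 0.40), (0.26, 0.37), (0.23, 0.33), (0.21, 0.31)]
train_y = [0.10, 0.12, 0.15, 0.18, 0.20]

estimate = knn_predict(train_x, train_y, query=(0.27, 0.385), k=3)
print(f"Estimated SMC: {estimate:.4f}")
```

In practice the inputs would normally be scaled before computing distances, and `k` would be tuned on the modeling set rather than fixed in advance.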

3.3. Spectral Index-Based Soil Moisture Content Inversion Model

The selected spectral indices, the DVI, NDVI, and RVI, were used as input variables to establish RFR, RR, KNN, and XG-Boost machine learning models to predict the SMC (Table 4). Both before and after removing the soil background noise, all models except the RR model had R2 values above 0.4 and RMSE values below 0.1 for both the modeling and validation sets. The RFR model had the best inversion effect. Before removing the soil background noise, its modeling set had an R2 value of 0.772 and an RMSE of 0.02, while its validation set had an R2 value of 0.632 and an RMSE of 0.02. After removing the soil background noise, its modeling set had an R2 value of 0.737 and an RMSE of 0.02, while its validation set had an R2 value of 0.628 and an RMSE of 0.026. In summary, among the four machine learning models constructed based on the spectral indices, the RFR model demonstrated the strongest ability to invert the soil moisture content.
In Table 4, the validation results for the dataset after removing soil background noise exhibit only slight differences compared to the results obtained from the dataset without noise removal. The R2 value of the validation set of the XG-Boost model decreased by 0.043, the R2 value of the validation set of the RFR model decreased by 0.004, the R2 value of the validation set of the KNN model decreased by 0.072, and the R2 value of the validation set of the RR model decreased by 0.015.
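For reference, the three indices used as inputs in this section can be computed from the red and near-infrared reflectances as sketched below. These are the standard textbook definitions; the function names and the sample reflectances are illustrative assumptions, and the study's own formulation is given in its methods section rather than here.

```python
def dvi(nir, red):
    """Difference vegetation index: NIR - red."""
    return nir - red

def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    return (nir - red) / (nir + red)

def rvi(nir, red):
    """Ratio vegetation index: NIR / red."""
    return nir / red

# Made-up reflectances for a single vegetated pixel.
nir, red = 0.40, 0.10
print(f"DVI={dvi(nir, red):.2f}, NDVI={ndvi(nir, red):.2f}, RVI={rvi(nir, red):.2f}")
```

All three indices are built from the same two bands, so they are strongly related to one another, which is worth keeping in mind when they are used together as model inputs.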

3.4. Comprehensive Evaluation of the Model

For the soil moisture content inversion models based on spectral reflectance, the R2 values of the modeling and validation sets were mostly between 0.7 and 0.8, both before and after removing the soil background noise (Table 4). For the models based on spectral indices, the R2 values were mostly between 0.5 and 0.7 for the modeling set and between 0.4 and 0.6 for the validation set, again both before and after removing the soil background noise (Table 4). A comprehensive analysis of Table 4 therefore showed that the four machine learning models with spectral reflectances as the input variables achieved better soil water content inversion and better stability than those with spectral indices as the input variables. Furthermore, for all models, the prediction performance after removing soil background noise differed only minimally from that obtained without noise removal. Significance testing yielded p-values below 0.01 for all models (p < 0.01), indicating a high level of statistical significance. This result confirms that the established models explain the variation in the dependent variable rather than attributing this variation to random error (Table 5).
When using spectral reflectance as an input variable, the R2 value of the XG-Boost model for the validation set consistently outperformed the three other inversion models both before and after the removal of soil background noise (Figure 7). In contrast, when spectral indices were used as input variables, the R2 value of the Random Forest Regression (RFR) model for the validation set was greater than those of the three other models, regardless of whether soil background noise was removed (Figure 8). Therefore, in this study, the XG-Boost model was utilized with spectral reflectance as the input variable to construct an alfalfa soil moisture inversion map without removing soil background noise (Figure 9).
Residual analysis confirmed the performance of the KNN, RFR, RR, and XG-Boost models using both spectral reflectance and spectral indices as input variables. As illustrated in Figure 10 and Figure 11, the residuals were distributed approximately randomly across the entire range of predicted values. This indicates that the models maintain a consistent error distribution at different levels of prediction, thereby supporting the validity of the models’ assumptions. Furthermore, the residuals were scattered around the zero line and showed no noticeable trend with changes in the predicted values, indicating that the models do not suffer from systematic errors.
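The two properties checked in the residual analysis, residuals centered on zero and no trend against the predicted values, can also be expressed numerically. The sketch below is one possible way to do so and is not the procedure used in the study; the function name and sample values are assumptions.

```python
import math

def residual_summary(y_true, y_pred):
    """Return (mean residual, Pearson r between residuals and predictions)."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_res = sum(residuals) / n
    mean_pred = sum(y_pred) / n
    # Correlation of residuals with predictions; values near 0 suggest no trend.
    cov = sum((r - mean_res) * (p - mean_pred) for r, p in zip(residuals, y_pred))
    s_res = math.sqrt(sum((r - mean_res) ** 2 for r in residuals))
    s_pred = math.sqrt(sum((p - mean_pred) ** 2 for p in y_pred))
    return mean_res, cov / (s_res * s_pred)

# Made-up values, used only to exercise the function.
measured = [0.10, 0.12, 0.15, 0.18, 0.20]
predicted = [0.11, 0.12, 0.14, 0.19, 0.19]

mean_res, trend_r = residual_summary(measured, predicted)
print(f"mean residual={mean_res:.4f}, residual-prediction r={trend_r:.3f}")
```

A mean residual far from zero would point to bias, and a strong correlation with the predictions would point to an error that grows or shrinks with the predicted moisture level. A residual plot remains the more informative check, since it also reveals nonlinear patterns.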

4. Discussion

4.1. The Effects of the Input Variables on the Accuracy of Soil Moisture Content Inversion

The core principle of inverting the soil moisture content based on spectral reflectance and spectral indices lies in the fact that soil moisture significantly alters the dielectric constant and surface characteristics, which in turn affect the absorption and scattering behavior of electromagnetic waves in specific wavelength bands. Notably, the bands differ significantly in their sensitivity to changes in the moisture content. At the same time, a spectral index constructed from multiple bands can further amplify the moisture signal and effectively suppress interference factors such as vegetation and soil background. Existing studies have shown that inversion models with spectral reflectance as the input variable generally have high accuracy. When Zhang et al. [42] used partial least squares regression (PLSR), stepwise regression, and ridge regression models to invert the surface soil moisture content, the R2 values exceeded 0.7. Yang et al. [43] obtained similar results with PLSR and stepwise regression models. Wang et al. [44] employed a Stacking model that integrates PLSR and Support Vector Regression (SVR), achieving an R2 value of 0.963. The accuracy of models with the vegetation index as an input variable is also relatively high. For example, Li et al. [45] utilized a BiLSTM model, which achieved an R2 value of 0.624, while Zang et al. [46] employed an MLP model, achieving an R2 value of 0.638. In this study, it was found that the spectral reflectances with higher correlations with the SMC were red edge 750 and NIR, and the spectral indices with higher correlations were the DVI, NDVI, and RVI. The inversion accuracy with spectral reflectance as an input variable was better than when using the spectral indices.
This may be due to the fact that band reflectance directly characterizes the spectral features of soil moisture, which can provide richer information, especially in bare soil or areas with low vegetation cover, whereas vegetation indices mainly reflect the state of vegetation under water stress, which has limitations such as providing less information, being indirect, and being susceptible to vegetation interference. It is worth noting that comparisons between spectral reflectance and spectral indices are still lacking in existing studies, and most of them focus on the topsoil layer; the applicability at different soil depths needs to be explored further in order to optimize the input variables and model selection for the multispectral inversion technique.

4.2. The Influence of Soil Background on the Accuracy of Soil Moisture Content Inversion

High-resolution multispectral images captured by drones may be significantly affected by soil background interference when inverting soil moisture content. This interference primarily arises from the mixed pixel effect, as well as the combined impact of various environmental factors such as meteorological conditions, the electromagnetic environment, and topography. Studies have shown that soil background interference has a dual nature. On the one hand, Zhang et al. [18] and Da et al. [19] found that removing the soil background significantly improves inversion accuracy, which may be because it stops bare soil spectra from interfering with the vegetation moisture signal; on the other hand, this study is consistent with the results of Wang et al. [20] and Zhang et al. [15], who showed that inversion accuracy is actually reduced after removing the soil background.
The decline in inversion accuracy may be associated with the following factors: (1) In the experimental area, the wind speed was relatively high, which partially flattened the alfalfa. This resulted in the loss of some effective canopy information when removing the soil background. Additionally, the shadow effect at midday further compromised the integrity of the spectral characteristics. (2) The temporal variability of crop moisture stress responses results in differences in the time windows for spectral signal acquisition. (3) At the plot level, the spectral information is a result of interactions between the vegetation indices and various environmental factors. Isolating vegetation information can distort the relationship between the spectral data and the measured values, ultimately leading to a decrease in inversion accuracy after the soil background is removed [47]. (4) The accuracy of soil moisture inversion models is influenced by various factors, including the crop type, environmental conditions, and sensor characteristics. Currently, the applicability of NDVI-based soil background removal methods in different vegetation systems requires systematic validation. When selecting the classification methods, this study utilized only the NDVI method for removing soil background noise. However, during NDVI classification, some areas with indistinct differentiation from plants, such as shadows, may be mistakenly classified as alfalfa plants. This could potentially decrease the inversion accuracy after soil background noise removal. Therefore, alternative methods, such as the RDVI and OSAVI methods, could also be employed to classify the alfalfa canopy and soil in the images. 
Research has demonstrated that the Optimized Soil-Adjusted Vegetation Index (OSAVI) [48] and the ratio of the Transformed Chlorophyll Absorption in Reflectance Index (TCARI) [49] to the OSAVI are effective in mitigating the influence of soil background, outperforming the NDVI method in this regard. Furthermore, Zhang et al. [18] found that both the RDVI and OSAVI methods yielded superior classification results for plants and soil, resulting in improved inversion performance.
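The vegetation index threshold approach discussed in this section can be sketched as follows: each pixel is labeled canopy or soil by comparing its index value with a threshold, and only canopy pixels contribute to the plot-level reflectance. The threshold of 0.3, the function names, and the pixel values are hypothetical and are not the values used in this study. OSAVI is written in its commonly quoted form with a soil adjustment factor of 0.16; some implementations add a (1 + 0.16) multiplier.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index; 0.0 for an all-zero pixel."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def osavi(nir, red, soil_factor=0.16):
    """OSAVI in its commonly quoted form: (NIR - red) / (NIR + red + 0.16)."""
    return (nir - red) / (nir + red + soil_factor)

def canopy_mask(nir_pixels, red_pixels, index=ndvi, threshold=0.3):
    """True where the index exceeds the (hypothetical) threshold, i.e. canopy."""
    return [index(n, r) > threshold for n, r in zip(nir_pixels, red_pixels)]

def masked_mean(values, mask):
    """Mean of the values whose mask entry is True; NaN if none are kept."""
    kept = [v for v, keep in zip(values, mask) if keep]
    return sum(kept) / len(kept) if kept else float("nan")

# Made-up pixels: two vegetated (high NIR, low red) and two bare soil.
nir_px = [0.45, 0.20, 0.50, 0.22]
red_px = [0.05, 0.15, 0.06, 0.18]

mask = canopy_mask(nir_px, red_px)
print("canopy mask:", mask)
print("mean NIR, all pixels:", sum(nir_px) / len(nir_px))
print("mean NIR, canopy only:", masked_mean(nir_px, mask))
```

Passing `index=osavi` to `canopy_mask` swaps the classification index without changing the rest of the workflow, which is one way the alternative indices mentioned above could be compared. Shadowed or wind-flattened canopy pixels whose index falls near the threshold are the cases most likely to be misclassified.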

4.3. Uncertainty Analysis

This study found that using the KNN, RFR, RR, and XG-Boost models to invert the soil moisture content in the alfalfa field achieved good results; however, certain limitations persist. Firstly, limitations in sensor accuracy, atmospheric correction errors, and band calibration deviations can lead to reflectance inaccuracies, which, in turn, affect the precision of vegetation indices calculated from reflectance data. Additionally, environmental conditions can impact predictions of soil moisture content. For example, while drone remote sensing has been widely employed in agricultural monitoring, data acquisition is often restricted by weather conditions. Factors such as cloud cover thickness can affect the stability of spectral data, leading to fluctuations in band reflectance and shifts in image tones. High temperatures and drought conditions can induce changes in chlorophyll stress, subsequently altering canopy reflectance. Strong winds may also modify the canopy structure and affect the distribution of leaf angles, introducing noise into spectral data and ultimately reducing the accuracy of spectral indices. Additionally, rainfall before drone flights can change surface moisture levels, leading to fluctuations in reflectance values. When the observation direction of a drone’s sensor does not align with the direction of direct sunlight, the drone images often contain shadows that weaken canopy spectral information, thus compromising the accuracy of soil moisture inversion [50]. To address these issues, future research should focus on exploring the quantitative relationship between changes in weather conditions and the accuracy of model inversions. Additionally, synergistic application of satellite and drone remote sensing technologies should be investigated to overcome the limitations of a single data source. 
Moreover, this study primarily utilized data from multispectral sensors and did not integrate multimodal information such as thermal infrared and meteorological data. Multimodal information fusion technology, through collaborative inversion and modeling of multi-source remote sensing data, can significantly enhance high-precision dynamic monitoring of the soil moisture content. Zhang et al. [51] found that multimodal data fusion improved the accuracy of soil moisture content estimation. Regardless of the machine learning algorithm used, the optimal input variables were identified as a multimodal fusion of thermal and multispectral information. Zhu et al. [52] effectively fused 10 spectral bands with FVC features and applied ensemble learning to accurately predict the soil moisture contents at different root zone depths. Future research could integrate thermal imaging or synthetic aperture radar (SAR) data with knowledge-based models. By inputting thermal imaging and SAR data into these models, it would be possible to utilize the physical rules and empirical formulas within the models for soil moisture inversion. This approach not only has the potential to improve inversion accuracy but also provides more reasonable explanations and physical significance, thereby enhancing the credibility of the results. These improvements will help establish a more robust soil moisture monitoring system.

5. Conclusions

(1) In the model constructed using spectral reflectance, the XG-Boost model demonstrated strong performance both with and without removal of the soil background, achieving R2 values of 0.812 and 0.803 on the validation set, respectively. In contrast, for the model based on spectral indices, the Random Forest Regression (RFR) model provided the best performance, regardless of whether the soil background was removed, with R2 values of 0.632 and 0.628 on the validation set.
(2) The inversion models using spectral reflectance as input variables demonstrated superior accuracy compared to those using spectral indices. The validation R2 values for the XG-Boost, KNN, RFR, and RR models with spectral reflectance as the input variable generally ranged from 0.7 to 0.8, while the validation R2 values for the models utilizing spectral indices typically ranged from 0.4 to 0.6.
(3) After removing soil background noise, there was no significant improvement in the accuracy of alfalfa surface soil moisture inversion. The accuracy changes for the KNN, RFR, RR, and XG-Boost models were minimal. When using spectral reflectance as the input variable, the R2 value for the highest-performing XG-Boost model only changed by 0.009. Similarly, for the top-performing RFR model using vegetation indices as input variables, the R2 value for the validation set changed by just 0.004.
The findings of this study provide theoretical and technical support for the prediction of alfalfa surface soil moisture. However, further investigation is needed to assess the applicability of these results to other crops and regions. Future research should use a wider variety of vegetation and conduct soil moisture inversion studies in multiple locations in order to enhance the accuracy and applicability of the models.

Author Contributions

Conceptualization, J.C.; methodology, Y.W. (Yanbiao Wang) and Y.J.; software, G.Q. and Y.J.; validation, G.Q., Y.J. and Y.K.; formal analysis, W.Y., M.Y., Y.M. and J.Z.; investigation, W.Y.; resources, W.Y., B.L., Y.W. (Yayu Wang) and Y.W. (Yanbiao Wang); data curation, J.C.; writing—original draft preparation, J.C.; writing—review and editing, Y.K.; supervision, G.Q. and Y.K.; project administration, G.Q. and Y.K.; funding acquisition, G.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (52269009 and 52469007), the fifth batch of the ‘Fuxi Young Talents’ project (Gaufx-05Y11) (Gansu Agricultural University), the Doctoral Research Start-up Fund (GAU-KYQD-2024-31) (Gansu Agricultural University), the Young Tutor support fund project (GAU-QDFC-2023-12) (Gansu Agricultural University), and the ‘Northwest arid region speciality crops soil and water resources efficient use of innovation’ discipline team building special (GAU-XKTD-2022-09) (Gansu Agricultural University).

Data Availability Statement

All data are incorporated into the article.

Acknowledgments

Thanks to the Irrigation Experiment Station of the Jingtaichuan Electric Power Irrigation Water Resource Utilization Center, Gansu Province, for supporting this study. Thanks to the editors and reviewers for their valuable and constructive comments. We would like to express our gratitude to the Gansu Jingtai Wolfberry Science and Technology Academy, the Gansu Wolfberry Harmless Cultivation Engineering Research Centre, the Gansu Agricultural Wisdom Water Saving Technology Innovation Centre, and the Upper and Middle Reaches of the Yellow River Ecological Protection and Agricultural Coordination Development Research Centre for their support for this research.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could appear to have influenced the work reported in this article.

References

  1. Zhu, Q.; Liao, K.H.; Lai, X.M.; Liu, Y.; Lv, L.G. Research progress on multi-scale soil moisture monitoring and simulation in basins. Prog. Geogr. 2019, 38, 1150–1158.
  2. Xue, M.; Wei, B.; Li, J.; Chen, C.H.; Huang, M.H.; Zou, L.X. Research on soil moisture prediction method based on improved BP neural network and support vector machine. Chin. J. Soil Sci. 2021, 52, 793–800.
  3. Xu, C.H. Modelling Research on Soil Water Content Diagnosis Based on Thermal Infrared Remote Sensing from UAV. Master’s Thesis, Northwest Agriculture and Forestry University, Xianyang, China, 2020.
  4. Zhang, Y.; Yang, X.; Tian, F. Study on Soil Moisture Status of Soybean and Corn across the Whole Growth Period Based on UAV Multimodal Remote Sensing. Remote Sens. 2024, 16, 3166.
  5. Li, Z.J.; Chen, G.F.; Zhi, J.W.; Xiang, Y.Z.; Li, D.M.; Zhang, F.C.; Chen, J.Y. Modelling soybean soil water content estimation by integrating UAV spectral information and texture features. J. Agric. Mach. 2024, 55, 347–357.
  6. Shi, H.; Liu, Z.; Li, S.; Jin, M.; Tang, Z.; Sun, T.; Liu, X.; Li, Z.; Zhang, F.; Xiang, Y. Monitoring Soybean Soil Moisture Content Based on UAV Multispectral and Thermal-Infrared Remote-Sensing Information Fusion. Plants 2024, 13, 2417.
  7. Fu, P.F.; Yang, X.J.; Su, Z.C.; Qu, Y.P.; Ma, M.M. Prediction of soil moisture content based on ensemble learning: A case study of western Liaoning Province. Soils 2023, 55, 671–681.
  8. Yang, W.J.; Li, F.H.; Kang, D.K.; Duan, W.C.; Zhao, W.J. Research on multispectral soil water content inversion model based on the screening of sensitive variables. Water Sav. Irrig. 2025, 45–54.
  9. Li, X.; Yan, J.; Huang, C.; Ma, W.; Guo, Z.; Li, J.; Yao, X.; Da, Q.; Cheng, K.; Yang, H. Estimation of Silage Maize Plant Moisture Content Based on UAV Multispectral Data and Ensemble Learning Methods. Agriculture 2025, 15, 746.
  10. Liu, Z.F.; Lei, H.C.; Sheng, H.Y. Remote sensing inversion of soil nutrients in cultivated land of Huangshui River Basin based on XGBoost model. J. Arid Land Geogr. 2023, 46, 1643–1653.
  11. Li, P.X.; Liu, Z.Q.; Yang, J.; Sun, W.D.; Li, M.Y.; Ren, Y.X. Retrieval of soil moisture from polarimetric SAR data using random forest regression. J. Wuhan Univ. (Inf. Sci. Ed.) 2019, 44, 405–412.
  12. Gao, Y.R. Modelling of Soil Water Content in Bare Soil by UAV Remote Sensing Inversion Under Different Loamy Soil Environments. Master’s Thesis, Taiyuan University of Technology, Taiyuan, China, 2023.
  13. Liu, Q. Research on Soil Water Content Monitoring Model of Summer Maize Root Zone Based on UAV Remote Sensing. Master’s Thesis, Northwest Agriculture and Forestry University, Xianyang, China, 2023.
  14. Vahidi, M.; Shaffan, S.; Frame, W.H. Precision Soil Moisture Monitoring Through Drone-Based Hyperspectral Imaging and PCA-Driven Machine Learning. Sensors 2025, 25, 782.
  15. Zhang, Y.; He, J.; Zhang, X.F.; Guo, Y.; Yang, X.Z.; Zhang, H.L.; Liu, T.; Wei, P.P.; Wang, L.G. Inversion of soil water content in the early root zone of maize based on soil background rejection. J. Irrig. Drain. 2025, 1–9.
  16. Almeida-Naunay, A.F.; Tarquis, A.M.; Lopez-Herrera, J.; Pérez-Martín, E.; Pancorbo, J.L.; Raya-Sereno, M.D.; Quemada, M. Optimization of soil background removal to improve the prediction of wheat traits with UAV imagery. Comput. Electron. Agric. 2023, 205, 107559.
  17. Yang, S.; Chen, J.Y.; Zhou, Y.C.; Cui, W.X.; Yang, N. Inversion of soil water content in the root zone of maize by thermal infrared remote sensing from unmanned aerial vehicle. Water Sav. Irrig. 2021, 3, 12–18.
  18. Zhang, Z.T.; Zhou, Y.C.; Yang, S. Remote sensing inversion method of soil water content in the root zone of winter wheat without soil background. J. Agric. Mach. 2021, 52, 197–207.
  19. Da, Q.; Yan, J.; Li, G.; Guo, Z.; Li, H.; Wang, W.; Li, J.; Ma, W.; Li, X.; Cheng, K. Inversion of Soil Moisture Content in Silage Corn Root Zones Based on UAV Remote Sensing. Agriculture 2025, 15, 331.
  20. Wang, J.E.; Xiao, Y.; Wang, Z.H.; Zhang, C.J.; Wang, Y.; Bai, X.Q.; Yu, G.D.; Zhang, Z.T. Study on the effect of removing soil background on soil water content in the root zone of inverted maize. Water Sav. Irrig. 2021, 12, 81–86+93.
  21. Dhanya, V.G.; Subeesh, A.; Kushwaha, N.L.; Vishwakarma, D.K.; Kumar, T.N.; Ritika, G.; Singh, A.N. Deep learning based computer vision approaches for smart agricultural applications. Artif. Intell. Agric. 2022, 6, 211–229.
  22. Cetin, N.; Kavuncuoglu, E.; Buzpinar, M.A.; Gunaydin, S.; Kaplan, S. Performance comparison of deep and transfer learning models for smart soil texture classification. Comput. Electron. Agric. 2025, 237, 110722.
  23. Xu, J.Y.; Shi, X.L.; Li, X.; Zhang, Y. Analysis of research status on mixed sowing of alfalfa in China. Mod. Agric. Res. 2024, 30, 107–111.
  24. He, Y.L.; Yang, Z.Z.; Yang, X.X.; Zhang, C.P.; Wu, Z.F.; Chen, X.J.; Wang, Z.L.; Dong, Q.M. Comprehensive evaluation of production performance and nutritional quality of 12 Medicago sativa varieties in Jiuquan irrigation district of Gansu. Pratacultural Sci. 1–14.
  25. Niu, Y.X.; Zhang, L.Y.; Han, W.T. Winter wheat cover extraction method based on UAV remote sensing and vegetation index. J. Agric. Mach. 2018, 49, 212–221.
  26. Tian, Z.K.; Fu, Y.Y.; Liu, S.H.; Liu, F. A rapid classification method for crops based on low-altitude remote sensing by UAV. Trans. Chin. Soc. Agric. Eng. 2013, 29, 109–116+295.
  27. Houborg, R.; Fisher, J.B.; Skidmore, A.K. Advances in remote sensing of vegetation function and traits. Int. J. Appl. Earth Obs. Geoinf. 2015, 43, 1–6.
  28. Yang, W.P.; Li, C.Z.; Yang, H.; Yang, G.J.; Feng, H.K.; Han, L.; Niu, Q.L.; Han, D. Monitoring of maize canopy temperature based on UAV thermal infrared and digital images. Trans. Chin. Soc. Agric. Eng. 2018, 34, 68–75+301.
  29. Liang, L.; Di, L.P.; Zhang, L.P.; Deng, M.X.; Qin, Z.H.; Zhao, S.H.; Lin, H. Estimation of crop LAI using hyperspectral vegetation indices and a hybrid inversion method. Remote Sens. Environ. Interdiscip. J. 2015, 165, 123–134.
  30. Wang, Y.F. Research on Soil Water Content Inversion Model Based on UAV Multispectral Remote Sensing. Master’s Thesis, Lanzhou University of Science and Technology, Lanzhou, China, 2023.
  31. Birth, G.S.; Mcvey, G.R. Measuring the Color of Growing Turf with a Reflectance Spectrophotometer. Agron. J. 1968, 60, 640–643.
  32. Zhang, C.C.; Wang, R.; Hou, J.T.; Jiang, M.L.; Zhu, X.X. Inversion of soil water content based on multispectral remote sensing by unmanned aerial vehicle with feature variable screening. China Rural Water Hydropower 2024, 5, 147–154.
  33. Zarco-Tejada, P.J.; Berjón, A.; López-Lozano, R.; Miller, J.R.; Martín, P.; Cachorro, V.; González, M.R.; de Fruits, A. Assessing vineyard condition with hyperspectral indices: Leaf and canopy reflectance simulation in a row-structured discontinuous canopy. Remote Sens. Environ. 2005, 99, 271–287.
  34. Zhang, Z.T.; Tan, C.X.; Xu, C.H.; Chen, S.B.; Han, W.T.; Li, Y. Research on soil water content in the root zone of maize based on multi-spectral remote sensing by unmanned aerial vehicle. J. Agric. Mach. 2019, 50, 246–257.
  35. Hao, J.Y. Research on Soil Water Content Detection Based on UAV Multispectral Imagery. Master’s Thesis, Shandong Agricultural University, Shandong, China, 2023.
  36. Ali, M.; Prasad, R.; Xiang, Y.; Yaseen, Z.M. Complete ensemble empirical mode decomposition hybridized with random forest and kernel ridge regression model for monthly rainfall forecasts. J. Hydrol. 2020, 584, 124647.
  37. Dou, J.L.; Li, D.; Song, J.; Wang, Q.W.; Shen, T. A microstrip antenna size optimisation method based on KNN and ANN algorithms. J. Terahertz Sci. Electron. Inf. 2025, 23, 61–65.
  38. Wang, Y.; Wang, J.; Li, J.; Wang, J.; Xu, H.; Liu, T.; Wang, J. Estimating Maize Leaf Water Content Using Machine Learning with Diverse Multispectral Image Features. Plants 2025, 14, 973.
  39. Yang, L.N.; Yao, K.X.; He, Y.; Xi, L.P.; Liu, W.C.; Zhao, J.L. Research on road condition sensing method based on SmoteEnn_XGBoost model. Intell. Comput. Appl. 2021, 11, 137–142+147.
  40. Kasim, N.; Sawut, R.; Shi, Q.; Maihemuti, B. Estimation of Soil Organic Matter Content Based on Optimized Spectral Index. Nongye Jixie Xuebao/Trans. Chin. Soc. Agric. Mach. 2018, 49, 155–163.
  41. Jin, Y.H.; Wu, X.M.; Zhen, W.C.; Cui, X.T.; Chen, L.; Qie, Z.H. Unmanned aerial vehicle multispectral remote sensing inversion of soil water content based on the optimisation of spectral information window scale at sampling points. J. Agric. Mach. 2024, 55, 316–327.
  42. Zhang, Z.T.; Wang, H.F.; Han, W.T.; Bian, J.; Chen, S.B.; Cui, T. Research on soil water content inversion based on multi-spectral remote sensing by UAV. J. Agric. Mach. 2018, 49, 173–181.
  43. Yang, J.B.; Wang, B.; Huang, J.L.; Zhang, Z.T.; Zou, Y.C.; Jiang, W.H. UAV multispectral remote sensing to monitor soil water content in root zone of winter wheat during nodulation. Water Sav. Irrig. 2019, 10, 6–10.
  44. Wang, Z.G.; Huang, Z.Q.; He, C.L. Hyperspectral inversion study of water content in sand ginger black soil based on integrated learning. J. Agric. Resour. Environ. 2023, 40, 1426–1434.
  45. Li, X.S.; Jia, Z.F.; He, J.Y.; Gao, W.; Pan, S.J.; Niu, Z.J.; Zhang, D.Y. A time-series inversion method of soil moisture in the root domain of kiwifruit based on BiLSTM. J. Agric. Eng. 2025, 41, 112–119.
  46. Zang, J.; Deng, J.T.; Ni, G.W.; Niu, Z.J.; Pan, S.J.; Han, W.T. Study on the influencing factors of soil moisture inversion in kiwifruit root zone based on vegetation index. J. Agric. Mach. 2022, 53, 223–230.
  47. Mao, Z.H.; Deng, L.; Sun, J.; Zhang, A.W.; Chen, X.Y.; Zhao, Y. Application of UAV multispectral remote sensing in maize canopy chlorophyll prediction. Spectrosc. Spectr. Anal. 2018, 38, 2923–2931.
  48. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107.
  49. Haboudane, D.; Miller, J.R.; Tremblay, N.; Zarco-Tejada, P.J.; Dextraze, L. Integrated narrow-band vegetation indices for prediction of crop chlorophyll content for application to precision agriculture. Remote Sens. Environ. 2002, 81, 416–426.
  50. Cinat, P.; Di Gennaro, S.F.; Berton, A.; Matese, A. Comparison of unsupervised algorithms for vineyard canopy segmentation from UAV multispectral images. Remote Sens. 2019, 11, 1023–1046.
  51. Zhang, Y.; Han, W.; Zhang, H.; Niu, X.; Shao, G. Evaluating soil moisture content under maize coverage using UAV multimodal data by machine learning algorithms. J. Hydrol. 2023, 617, 129086.
  52. Zhu, S.; Cui, N.; Guo, L.; Jin, H.; Jin, X.; Jiang, S.; Wu, Z.; Lv, M.; Chen, F.; Liu, Q.; et al. Enhancing precision of root-zone soil moisture content prediction in a kiwifruit orchard using UAV multi-spectral image features and ensemble learning. Comput. Electron. Agric. 2024, 221, 108943.
Figure 1. The study area.
Figure 2. Meteorological data for the alfalfa growing season in 2024: (a) daily average precipitation and (b) daily average temperature.
Figure 3. DJI Matrice 300 RTK quadcopter drone.
Figure 4. The MS 600 Pro multispectral camera.
Figure 5. The machine learning process for the soil moisture content.
Figure 6. Matrices of the correlations between the spectral reflectances and spectral indices and the soil water content. (r) indicates removal of soil background noise. * represents p < 0.05, and ** represents p < 0.01. A p value lower than 0.05 indicates that the results are statistically significant, while a p value lower than 0.01 indicates a high level of statistical significance, both of which meet the modeling requirements.
Figure 7. Validation of the effectiveness of the inversion models constructed on the basis of spectral reflectance.
Figure 8. Validation of the effectiveness of the inversion models constructed based on spectral indices.
Figure 9. Map of the inverted soil moisture content.
Figure 10. Residual plots of the models (using spectral reflectances as input features).
Figure 11. Residual plots of the models (using spectral indices as input features).
Table 1. Physical and chemical properties of soil and meteorological data.
| Index | Numeric Value | Unit |
|---|---|---|
| dry bulk density | 1.35 | g cm⁻³ |
| field capacity | 24.1 | % |
| pH | 8.11 | |
| organic matter | 6.15 | g kg⁻¹ |
| total nitrogen | 1.58 | g kg⁻¹ |
| total phosphorus | 1.36 | g kg⁻¹ |
| total potassium | 34.16 | g kg⁻¹ |
| fast-acting nitrogen | 74.22 | mg kg⁻¹ |
| fast-acting phosphorus | 32.99 | mg kg⁻¹ |
| fast-acting potassium | 147.80 | mg kg⁻¹ |
| precipitation | 276 | mm |
| average daily temperature | 18.8 | °C |
Table 2. Center wavelength, bandwidth, and diffuse reflector reflectance of each band.
| Spectral Band | Center Wavelength (nm) | Bandwidth (nm) | Reflectance of Diffuse Reflector (%) |
|---|---|---|---|
| Blue | 450 | 35 | 60 |
| Green | 555 | 25 | 60 |
| Red | 660 | 20 | 60 |
| Red edge 1 | 720 | 10 | 60 |
| Red edge 2 | 750 | 15 | 60 |
| NIR | 840 | 35 | 60 |
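The 60% diffuse reflector reflectance in Table 2 is the kind of reference value typically used to convert raw image digital numbers into surface reflectance. The calibration procedure itself is not described in this part of the article, so the following is only a minimal sketch under stated assumptions: a linear sensor response with no dark offset, and a one-point scaling against the panel. The function name `dn_to_reflectance` and the digital numbers are hypothetical.

```python
# Diffuse reflector reflectance from Table 2 (60% in every band).
PANEL_REFLECTANCE = 0.60


def dn_to_reflectance(pixel_dn, panel_dn_mean, panel_reflectance=PANEL_REFLECTANCE):
    """One-point calibration (assumed, not the authors' documented method):
    scale a raw digital number by the mean reading over the reference panel."""
    if panel_dn_mean <= 0:
        raise ValueError("panel_dn_mean must be positive")
    return pixel_dn / panel_dn_mean * panel_reflectance


# Hypothetical digital numbers for one NIR pixel and the panel (illustrative only).
nir_reflectance = dn_to_reflectance(pixel_dn=21000, panel_dn_mean=30000)
```

The same scaling would be applied band by band, each with its own panel reading taken under the illumination conditions of the flight.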
Table 3. Spectral index formulas.
| Spectral Indices | Acronyms | Formulation | Reference |
|---|---|---|---|
| Normalized difference vegetation index | NDVI | (NIR − R)/(NIR + R) | [30] |
| Ratio vegetation index | RVI | NIR/R | [31] |
| Difference vegetation index | DVI | NIR − R | [32] |
| Green index | GI | G/R | [33] |
| Simple ratio pigment index | SRPI | B/R | [34] |
| Red-to-green ratio index | RGRI | R/G | [35] |
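The Table 3 formulas can be evaluated per pixel from the blue, green, red, and NIR band reflectances. The sketch below computes all six indices and adds a simple vegetation index threshold rule of the kind the study uses to remove soil background. Which index the authors thresholded and at what value is not given here, so the NDVI criterion and the 0.3 cutoff in `is_vegetation` are assumptions, as are the function names and the example reflectances.

```python
def spectral_indices(blue, green, red, nir):
    """Compute the six Table 3 indices from band reflectances (floats in 0-1)."""
    return {
        "NDVI": (nir - red) / (nir + red),
        "RVI": nir / red,
        "DVI": nir - red,
        "GI": green / red,
        "SRPI": blue / red,
        "RGRI": red / green,
    }


def is_vegetation(ndvi, threshold=0.3):
    """Threshold rule for masking soil pixels.
    The NDVI criterion and the 0.3 cutoff are assumed, not reported values."""
    return ndvi > threshold


# Hypothetical reflectances for one vegetated pixel (illustrative values only).
example = spectral_indices(blue=0.04, green=0.08, red=0.05, nir=0.45)
keep_pixel = is_vegetation(example["NDVI"])
```

Pixels for which `is_vegetation` returns False would be excluded before averaging reflectance over a sampling plot, which is how the "removed" variants of the inputs would differ from the "unremoved" ones.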
Table 4. The accuracy of the soil moisture content inversion models based on spectral reflectance and the spectral indices.
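| Input Variable | Model | Soil Background | Modeling Set R2 | Modeling Set RMSE | Modeling Set MAE | Validation Set R2 | Validation Set RMSE | Validation Set MAE |
|---|---|---|---|---|---|---|---|---|
| Spectral reflectance | KNN | unremoved | 0.783 | 0.017 | 0.013 | 0.770 | 0.023 | 0.019 |
| Spectral reflectance | KNN | removed | 0.714 | 0.022 | 0.014 | 0.692 | 0.021 | 0.015 |
| Spectral reflectance | RFR | unremoved | 0.768 | 0.019 | 0.014 | 0.762 | 0.021 | 0.021 |
| Spectral reflectance | RFR | removed | 0.829 | 0.015 | 0.010 | 0.710 | 0.027 | 0.017 |
| Spectral reflectance | RR | unremoved | 0.463 | 0.033 | 0.026 | 0.465 | 0.020 | 0.016 |
| Spectral reflectance | RR | removed | 0.422 | 0.034 | 0.025 | 0.442 | 0.021 | 0.017 |
| Spectral reflectance | XG-Boost | unremoved | 0.884 | 0.014 | 0.014 | 0.812 | 0.017 | 0.013 |
| Spectral reflectance | XG-Boost | removed | 0.867 | 0.015 | 0.015 | 0.803 | 0.018 | 0.018 |
| Spectral index | KNN | unremoved | 0.606 | 0.028 | 0.017 | 0.535 | 0.021 | 0.017 |
| Spectral index | KNN | removed | 0.446 | 0.033 | 0.022 | 0.463 | 0.024 | 0.023 |
| Spectral index | RFR | unremoved | 0.772 | 0.020 | 0.017 | 0.632 | 0.020 | 0.018 |
| Spectral index | RFR | removed | 0.737 | 0.020 | 0.018 | 0.628 | 0.026 | 0.021 |
| Spectral index | RR | unremoved | 0.255 | 0.039 | 0.031 | 0.366 | 0.021 | 0.018 |
| Spectral index | RR | removed | 0.253 | 0.039 | 0.025 | 0.351 | 0.024 | 0.022 |
| Spectral index | XG-Boost | unremoved | 0.613 | 0.026 | 0.013 | 0.503 | 0.027 | 0.016 |
| Spectral index | XG-Boost | removed | 0.568 | 0.023 | 0.017 | 0.460 | 0.037 | 0.025 |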
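For readers who want to reproduce the three accuracy measures reported in Table 4, the sketch below computes R2, RMSE, and MAE from paired observed and predicted soil moisture values. It assumes the common definition of R2 as one minus the ratio of residual to total sum of squares; the article may have used a different variant (for example, the squared correlation coefficient). The function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, RMSE, and MAE for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1.0 - ss_res / ss_tot,  # assumed definition: 1 - SS_res / SS_tot
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
    }


# Hypothetical soil moisture fractions (illustrative only, not study data).
observed = [0.10, 0.15, 0.20, 0.25]
predicted = [0.12, 0.14, 0.21, 0.23]
metrics = regression_metrics(observed, predicted)
```

Computing the metrics separately on the modeling and validation sets, as Table 4 does, makes overfitting visible: a large drop in R2 from modeling to validation suggests the model has memorized the training samples.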
Table 5. Model significance testing.
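| Input Variable | Model | Soil Background | p-Value |
|---|---|---|---|
| Spectral reflectance | KNN | unremoved | 0.00002 |
| Spectral reflectance | KNN | removed | 0.0002 |
| Spectral reflectance | RFR | unremoved | 0.00003 |
| Spectral reflectance | RFR | removed | 0.00002 |
| Spectral reflectance | RR | unremoved | 0.002 |
| Spectral reflectance | RR | removed | 0.00003 |
| Spectral reflectance | XG-Boost | unremoved | 0.000001 |
| Spectral reflectance | XG-Boost | removed | 0.000001 |
| Spectral index | KNN | unremoved | 0.00009 |
| Spectral index | KNN | removed | 0.0005 |
| Spectral index | RFR | unremoved | 0.00002 |
| Spectral index | RFR | removed | 0.000002 |
| Spectral index | RR | unremoved | 0.00002 |
| Spectral index | RR | removed | 0.000008 |
| Spectral index | XG-Boost | unremoved | 0.000003 |
| Spectral index | XG-Boost | removed | 0.000001 |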
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, J.; Jiang, Y.; Yu, W.; Qi, G.; Kang, Y.; Yin, M.; Ma, Y.; Wang, Y.; Zhu, J.; Wang, Y.; et al. Influence of Soil Background Noise on Accuracy of Soil Moisture Content Inversion in Alfalfa Fields Based on UAV Multispectral Data. Soil Syst. 2025, 9, 98. https://doi.org/10.3390/soilsystems9030098
