1. Introduction
Surface soil moisture (SSM) is a pivotal variable in the Earth system, playing a critical role in regulating land–atmosphere energy exchange [1, 2, 3]. Soil moisture is also a cornerstone of the ecosystem: it directly affects the growth and distribution of plants and thus the yield and quality of crops [4, 5]. Hence, accurate and detailed knowledge of the spatial and temporal distribution of soil moisture is critical for contemporary agriculture and for applications such as climate variability analysis [6, 7], drought detection [8], flood warning [9], and crop yield prediction [10].
Over the past few decades, several soil moisture inversion models have been developed, including theoretical models such as the Integral Equation Model (IEM) [11] and Advanced Integral Equation Model (AIEM) [12], semi-empirical models such as the Dubois [13], Water Cloud Model (WCM) [14], and Michigan Microwave Canopy Scattering (MIMICS) [15] models, empirical models including various linear regression approaches [16], and numerical models like the Numerical Maxwell Model in 3D (NMM3D) [17]. However, these models have notable limitations: theoretical and semi-empirical models demand extensive surface parameters that are often difficult to acquire, empirical models suffer from limited transferability due to their site-specific nature, and numerical models face computational constraints that hinder large-scale implementation. These limitations currently restrict the broad application of model-based inversion methods in soil moisture monitoring.
Compared with model-based approaches, soil moisture inversion utilizing time-series data offers distinct advantages by eliminating the requirement for prior surface parameter inputs such as soil roughness, rendering it particularly suitable for long-term and large-scale monitoring applications. Various effective methods have been developed in this domain, including the change detection technique (CD) [
18], Alpha approximation model [
19], and Multitemporal Least Square Moisture Estimator (MULESME) [
20]. These approaches demonstrate notable operational simplicity while maintaining robust performance, leading to their widespread adoption in surface soil moisture retrieval studies.
The change detection method assumes that, during the observation period, the influence of changes in surface roughness and vegetation on the radar backscattering coefficient is smaller than that of changes in soil moisture, so that the change in the radar backscattering coefficient is mainly caused by the change in soil moisture [
21]. In recent years, researchers have conducted extensive studies on soil moisture inversion using change detection methods and have implemented various improvements to this approach [
22,
23]. Wagner et al. [
18] pioneered the change detection method in 1999, successfully retrieving soil moisture from European Remote Sensing satellite (ERS) scatterometer data. This innovative approach eliminates the need for complex parameter inputs and auxiliary data while fully leveraging the advantages of time-series remote sensing observations. Zribi et al. [
24] incorporated the Normalized Difference Vegetation Index (NDVI) to correct variations in radar backscattering coefficients, thereby improving the method’s applicability to vegetation-covered surfaces and providing a new direction for subsequent research. Bhogapurapu et al. [
25] improved the technique by utilizing the Dual-polarization Radar Vegetation Index (DpRVI), demonstrating superior soil moisture retrieval performance compared to NDVI-based corrections. Nativel et al. [
26] combined a vegetation-aware change detection method with a neural network algorithm and used the resulting hybrid algorithm to improve soil moisture estimation in different climatic environments. Current methodological improvements focus not only on vegetation parameter integration but also on radar incidence angle normalization to further enhance change detection accuracy [
27,
28].
However, with the increasing use of remote sensing data, the impact of changes in surface vegetation and soil roughness on retrieved soil moisture cannot be ignored in long time-series analysis [29, 30]. Currently, most studies focus on correcting vegetation effects, which leaves certain limitations. Owing to the dynamic evolution of agricultural practices and natural erosion, changes in soil roughness also significantly affect the radar backscatter signal, but they are usually either not fully handled or roughness is simply assumed to be time-invariant. As a result, the accuracy of soil moisture inversion based on the change detection method depends largely on how well the vegetation influence is corrected. For vegetation correction, NDVI is the most commonly used vegetation index [31, 32]. However, in areas of high-density vegetation coverage, such as dense canopies during the peak growing season, NDVI saturates and cannot accurately describe ground vegetation conditions, leading to residual errors in the subsequent soil moisture retrieval. Moreover, although machine learning has been widely used for direct inversion of soil moisture, its combination with the change detection framework remains deficient [33]. Many attempts remain superficial, for example using machine learning only for post-processing rather than integrating it deeply into the core assumptions of the change detection algorithm. This gap highlights the considerable potential for methodological innovation.
The primary objective of this study is to develop an enhanced change detection methodology for estimating time-series soil moisture. Specifically, to address the confounding interference of vegetation coverage on surface soil moisture estimation and to establish a more robust correspondence between variations in the radar backscattering coefficient and actual soil moisture dynamics, a piecewise function grounded in fractional vegetation coverage (FVC) is introduced. This function is designed to resolve the inherent mismatch problem in traditional change detection techniques, thereby augmenting the accuracy of soil moisture retrieval. Furthermore, acknowledging the limitations associated with relying on a single vegetation index (e.g., NDVI) for vegetation correction, a novel vegetation index is proposed to more comprehensively capture the influence of vegetation on radar backscattering coefficients within the study area. Concurrently, to uphold the fundamental assumption of time-invariant surface roughness that underpins change detection-based soil moisture retrieval and to mitigate spurious soil moisture variations induced by anthropogenic or natural disturbances, this study implements the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. This methodology facilitates the systematic identification of abnormal events related to dynamics in surface roughness and soil moisture across the entire time series, thus ensuring the validity of the constant-roughness prerequisite that is critical for robust soil moisture estimation. A time series characterized by stable surface roughness conditions is subsequently selected to guarantee the effectiveness and precision of the soil moisture inversion algorithm. The organization of this paper is as follows:
Section 2 describes the study area and the dataset used.
Section 3 describes the traditional change detection method and improved change detection method in detail.
Section 4 presents the inversion results derived from different methods.
Section 5 presents a comparative analysis and discussion.
Finally, Section 6 presents the concluding remarks.
3. Methods
3.1. Improved Change Detection Method Based on Vegetation Coverage
The change detection method was first proposed by Wagner et al. [18], who used it to invert soil moisture from ERS scatterometer data. Under this method, soil moisture can be retrieved by calculating the difference in backscattering coefficients, as shown in Equations (1) and (2):

$$\Delta\sigma(t) = \sigma^0(t) - \sigma^0_{\min} \tag{1}$$

$$M_v(t) = \frac{\Delta\sigma(t)}{\sigma^0_{\max} - \sigma^0_{\min}}\,\left(M_{v,\max} - M_{v,\min}\right) + M_{v,\min} \tag{2}$$

where $\Delta\sigma(t)$ is the difference between the radar backscatter coefficient corresponding to observation time $t$ and the minimum value of the radar backscatter coefficient during the observation period, which is called the change in radar backscatter coefficient; $\sigma^0_{\min}$ and $\sigma^0_{\max}$ represent the minimum and maximum values of the radar backscatter coefficient during the observation period, respectively; $M_{v,\min}$ and $M_{v,\max}$ represent the minimum and maximum values of soil moisture during the observation period, respectively; and $M_v(t)$ is the soil moisture inverted by the change detection method.
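The linear scaling described above can be illustrated with a minimal Python sketch. It assumes the backscatter series is in dB and that the soil moisture extremes are known in advance; the function name and the sample values are hypothetical and are not taken from the study.

```python
def change_detection_sm(sigma_db, mv_min, mv_max):
    """Scale backscatter change linearly between known soil moisture extremes.

    sigma_db       : time-ordered backscatter values (assumed to be in dB)
    mv_min, mv_max : soil moisture extremes over the period (assumed known)
    """
    s_min, s_max = min(sigma_db), max(sigma_db)
    span = s_max - s_min
    if span == 0:
        raise ValueError("backscatter series has no dynamic range")
    result = []
    for s in sigma_db:
        delta = s - s_min  # change relative to the driest observation
        result.append(mv_min + delta / span * (mv_max - mv_min))
    return result


# Hypothetical VV backscatter series (dB) and soil moisture bounds (m3/m3).
sigma = [-14.0, -12.5, -11.0, -13.0, -10.0]
sm = change_detection_sm(sigma, mv_min=0.05, mv_max=0.45)
```

By construction, the lowest backscatter maps to `mv_min` and the highest to `mv_max`, which is why roughness or vegetation changes that shift these extremes propagate directly into the retrieval.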
Zribi et al. [24] found that vegetation coverage reduces the sensitivity of the radar backscattering coefficient to soil moisture, with the sensitivity decreasing as NDVI increases, as shown in Figure 2 [23].
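Subsequently, Zribi et al. [24] corrected the variation in the radar backscattering coefficients based on NDVI (as shown in Equations (3) and (4)), which enhanced the applicability of the change detection method in vegetation-covered areas.

$$\Delta\sigma_{\max}(\mathrm{NDVI}) = f(\mathrm{NDVI}) = a \cdot \mathrm{NDVI} + b \tag{3}$$

$$M_v(t) = \frac{\Delta\sigma(t)}{\Delta\sigma_{\max}(\mathrm{NDVI})}\,\left(M_{v,\max} - M_{v,\min}\right) + M_{v,\min} \tag{4}$$

where $\Delta\sigma_{\max}(\mathrm{NDVI})$ is the maximum value of the change in radar backscattering coefficient fitted with NDVI, which decreases with increasing NDVI; $f(\mathrm{NDVI})$ is the function of NDVI; $a$ and $b$ represent the fitting coefficients [23]; and the change in radar backscatter coefficient $\Delta\sigma(t)$ and the soil moisture extremes $M_{v,\min}$ and $M_{v,\max}$ are as defined for Equations (1) and (2).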
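As a rough illustration of the NDVI-based correction, the sketch below assumes that the fitted maximum backscatter change is a linear function of NDVI, consistent with the linear vegetation influence discussed later in the paper. The coefficients `a` and `b`, the clipping step, and the sample values are hypothetical choices made for this example only.

```python
def ndvi_corrected_sm(delta_sigma, ndvi, a, b, mv_min, mv_max):
    """Soil moisture from backscatter change, normalised by an NDVI-dependent maximum.

    Assumes f(NDVI) = a * NDVI + b (a < 0, so sensitivity drops as NDVI rises).
    """
    delta_max = a * ndvi + b  # fitted maximum change at this NDVI
    if delta_max <= 0:
        raise ValueError("fitted maximum change must be positive")
    # Clipping to [0, 1] is an added safeguard, not part of the cited method.
    frac = min(max(delta_sigma / delta_max, 0.0), 1.0)
    return mv_min + frac * (mv_max - mv_min)


# The same 2 dB change read under sparse and under denser vegetation.
sm_sparse = ndvi_corrected_sm(2.0, 0.2, a=-4.0, b=6.0, mv_min=0.05, mv_max=0.45)
sm_dense = ndvi_corrected_sm(2.0, 0.5, a=-4.0, b=6.0, mv_min=0.05, mv_max=0.45)
```

With these placeholder coefficients, the same backscatter change is interpreted as a larger soil moisture value under denser vegetation, because the attainable dynamic range is smaller there.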
Optical remote sensing data, being particularly responsive to vegetation characteristics, have been extensively employed to quantify canopy properties. Among various vegetation indices, the Normalized Difference Vegetation Index (NDVI) [41] has become a standard metric in numerous studies, both for characterizing vegetation status and for accounting for vegetation-induced modifications to soil surface backscattering signals. Statistical models based on NDVI use clear parameterized expressions to associate a limited number of spectral bands with surface vegetation cover [42, 43], and they are currently the most widely used type of model in optical remote sensing estimation. Fractional vegetation coverage (FVC) can be represented by NDVI [44, 45], as shown in Equation (5):
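One widely used way to express FVC through NDVI is the dimidiate pixel model, which scales NDVI linearly between a bare-soil value and a full-vegetation value. Whether this is the exact form used in this study is an assumption; the end-member values `ndvi_soil` and `ndvi_veg` below are hypothetical placeholders.

```python
def fvc_from_ndvi(ndvi, ndvi_soil=0.05, ndvi_veg=0.85):
    """Fractional vegetation coverage via linear scaling between two end members.

    ndvi_soil and ndvi_veg are illustrative values; in practice they are usually
    derived from the NDVI distribution of the scene (an assumption here).
    """
    frac = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return min(max(frac, 0.0), 1.0)  # keep the result in the physical range [0, 1]


fvc_mid = fvc_from_ndvi(0.45)
```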
Although many studies have sought to improve the change detection method using different vegetation indices, they considered only the effect of vegetation cover on the radar backscattering coefficient, not its effect on soil moisture. As vegetation coverage increases, its influence on radar backscattering coefficients manifests through multiple interconnected mechanisms. First, the vegetation canopy introduces nonlinear modifications to radar backscattering characteristics, with the effect magnitude growing disproportionately with increasing coverage density. Second, enhanced vegetation coverage alters the hydrological regime through improved water retention capacity and plant transpiration, complex interactions that cannot be treated as mutually canceling effects [46]. Therefore, the influence of vegetation coverage on surface soil moisture must be considered when inverting soil moisture in vegetated areas with the change detection method [47, 48].
According to the FVC, the surface conditions are divided into low (0 < FVC ≤ 0.3), medium (0.3 < FVC ≤ 0.6), and high (0.6 < FVC ≤ 1.0) vegetation coverage areas. These FVC thresholds follow the study of Zhang et al. [49]. Analysis of the relationship between the vegetation index and the radar backscattering coefficient, and of the relationship between vegetation coverage and soil moisture, shows that the minimum and maximum values of the radar backscattering coefficient increase with FVC, and that the minimum and maximum values of soil moisture are also affected by increasing FVC. Therefore, when inverting soil moisture in vegetated areas with the change detection method, both the radar backscattering coefficient and soil moisture need to be corrected based on FVC.
Based on FVC, the minimum and maximum values of radar backscattering coefficient and soil moisture under different surface coverage conditions are corrected, and an improved change detection method is obtained. The equations are as follows:
where $\sigma^0_{\min}(\mathrm{NDVI})$ and $\sigma^0_{\max}(\mathrm{NDVI})$ represent the minimum and maximum values of radar backscattering coefficients under the surface coverage conditions corresponding to NDVI, respectively, and $M_{v,\min}(\mathrm{NDVI})$ and $M_{v,\max}(\mathrm{NDVI})$ represent the minimum and maximum values of soil moisture under the surface coverage conditions corresponding to NDVI, respectively.
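The piecewise idea can be sketched as follows. The FVC thresholds are those stated in the text; everything else, including the per-class extremes in `EXTREMES`, the clipping step, and the treatment of FVC = 0 as low coverage, is a hypothetical placeholder, and the linear scaling within each class is an assumption rather than the paper's exact formulation.

```python
def fvc_class(fvc):
    """Assign a coverage class using the thresholds given in the text."""
    if fvc <= 0.3:
        return "low"  # FVC = 0 is treated as low coverage here (assumption)
    if fvc <= 0.6:
        return "medium"
    return "high"


# Hypothetical per-class extremes:
# (sigma_min dB, sigma_max dB, mv_min m3/m3, mv_max m3/m3)
EXTREMES = {
    "low": (-16.0, -9.0, 0.05, 0.40),
    "medium": (-15.0, -8.5, 0.06, 0.42),
    "high": (-14.0, -8.0, 0.08, 0.45),
}


def piecewise_change_detection(sigma_db, fvc, extremes=EXTREMES):
    """Change detection with class-specific backscatter and soil moisture extremes."""
    s_min, s_max, mv_min, mv_max = extremes[fvc_class(fvc)]
    frac = (sigma_db - s_min) / (s_max - s_min)
    frac = min(max(frac, 0.0), 1.0)  # added safeguard against out-of-range inputs
    return mv_min + frac * (mv_max - mv_min)


sm_low = piecewise_change_detection(-12.5, fvc=0.2)
sm_high = piecewise_change_detection(-12.5, fvc=0.8)
```

The same backscatter value yields different soil moisture estimates in different coverage classes, which is the mismatch the segmented correction is meant to address.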
3.2. Composite Vegetation Index
The current change detection algorithms mainly rely on the statistical relationship between the change in backscattering coefficient and the spatial distribution of NDVI features for vegetation effect correction. However, vegetation correction using a single vegetation index introduces substantial uncertainty and proves inadequate for complete vegetation influence removal. While NDVI demonstrates reasonable sensitivity in low-to-medium vegetation coverage areas, its effectiveness significantly diminishes in dense vegetation conditions due to index saturation, resulting in compromised characterization of underlying vegetation dynamics [
50].
Compared to NDVI, EVI incorporates both atmospheric and soil background correction factors, demonstrating greater sensitivity to green vegetation and a wider dynamic range [
51]. This enhanced capability enables more accurate characterization across varying surface vegetation densities, effectively addressing NDVI’s saturation limitations in dense canopies while better capturing seasonal vegetation dynamics. More recently, the kNDVI (8) has been introduced as an advanced solution to mitigate the nonlinearity and saturation issues inherent in traditional NDVI-vegetation relationships [
52,
53].
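For reference, the three single indices discussed above can be computed from surface reflectance as sketched below. The formulas are the commonly published ones (EVI with the standard MODIS coefficients, kNDVI in its simplified tanh(NDVI²) form); whether the study used exactly these variants is an assumption, and the reflectance values are made up.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard MODIS coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def kndvi(nir, red):
    """Kernel NDVI in its simplified form, tanh(NDVI^2)."""
    return math.tanh(ndvi(nir, red) ** 2)


# Hypothetical reflectances for a moderately vegetated pixel.
nir, red, blue = 0.5, 0.1, 0.05
indices = {"NDVI": ndvi(nir, red), "EVI": evi(nir, red, blue), "kNDVI": kndvi(nir, red)}
```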
Considering the significant role of vegetation canopy interception in radar backscattering mechanisms, it is well established that the relationship between vegetation coverage and its influence on backscattering coefficients is fundamentally nonlinear. This complexity arises from the distinctive characteristics of existing vegetation indices: NDVI demonstrates reasonable sensitivity in sparse vegetation conditions but suffers from saturation effects in densely vegetated areas [
50]; EVI maintains better sensitivity to vegetation dynamics through its soil-adjusted formulation [
54]; meanwhile, kNDVI effectively captures nonlinear vegetation-backscattering relationships through kernel-based transformations [
55]. To harness the complementary advantages of these indices while addressing their respective limitations, this study develops a novel composite vegetation index (NDEVI) that synergistically integrates the optimal characteristics of NDVI, EVI, and kNDVI. The calculation equations of NDEVI are presented as follows:
The core physical significance of NDEVI lies in its design as an adaptive hybrid vegetation index. Its primary objective is to achieve superior monitoring across the entire vegetation gradient, from sparse to dense, through a dynamic weighting mechanism. The physical rationale of the index rests on two key components. One term mathematically represents the uncertainty or information content of NDVI itself; its physical role is to automatically identify and assign greater weight to the medium vegetation coverage range, where NDVI exhibits its highest sensitivity and discriminative power. The other term functions as a conditional enhancer. Here, the NDVI value acts as a switch: it suppresses the contribution of kEVI in sparse vegetation, while enabling a smooth transition to kEVI dominance when high NDVI values indicate dense foliage. This mechanism effectively resolves the prevalent issue of spectral saturation encountered by the original NDVI in high-coverage regions. Ultimately, through this intrinsic fusion logic, NDEVI achieves a more robust and continuous retrieval capability for canopy biophysical parameters, such as Leaf Area Index (LAI).
The NDEVI developed in this study combines the advantages of the three vegetation indices. This composite vegetation index can achieve accurate vegetation monitoring under all coverage conditions and can more sensitively capture the nonlinear relationship between vegetation index and surface vegetation coverage, while overcoming the saturation limit of traditional indices in dense vegetation.
3.3. Dividing Invariant Time Series
The change detection method assumes that, within a given time series, the surface roughness is constant and the change in the radar backscattering coefficient is related only to the change in soil moisture; the relative water content of the soil is then estimated from the change in the backscattering coefficient.
However, the study area represents a typical farming-pastoral transition zone dominated by three land cover types as follows: farmland, grassland, and forest. This specific ecosystem context introduces important challenges to the fundamental assumptions of change detection methods. Agricultural operations (e.g., plowing, harvesting) and pastoral activities (particularly grazing) frequently induce abrupt changes in surface roughness. These anthropogenic disturbances often cause corresponding sudden shifts in SAR backscatter characteristics, thereby violating the key assumption of constant surface conditions during the monitoring period. Moreover, both anthropogenic activities (e.g., irrigation) and natural events (e.g., rainfall) can trigger abrupt increases in soil moisture within the study area. Under consistent vegetation coverage conditions, such abrupt hydrological changes induce significant fluctuations in soil moisture levels and corresponding sudden shifts in radar backscattering coefficients. These anomalous signal variations violate the fundamental assumptions of change detection methods, consequently compromising the accuracy and reliability of soil moisture retrieval.
In dual-polarization SAR data, these abrupt changes can be regarded as time nodes in the time series of backscattering coefficients, and a sharp change in soil roughness or soil moisture is an abnormal event that can be detected in the SAR feature space [
22]. In the semi-arid Mediterranean agricultural region, Graldi et al. [
28] used the DBSCAN algorithm based on
,
, and
to exclude changes in the total backscatter signal due to dramatic changes in soil roughness and soil moisture from the dataset. While this methodology has been validated in conventional agricultural settings, its reliability and effectiveness remain unverified for farming-pastoral ecotones like our study area.
DBSCAN is a density-based clustering algorithm that can find clusters of arbitrary shape and effectively handle noise; it is controlled by two parameters, the neighborhood radius ε and the minimum number of points MinPts. The DBSCAN algorithm has been widely applied for outlier detection in various domains, such as identifying stop/move points in GPS trajectories [56] and detecting anomalous subsequences/outliers in aquaculture environments [57], demonstrating its broad application prospects. In this study, ε and MinPts were determined following the approach of Graldi et al. [28]. The ε parameter was determined through k-dist analysis by calculating the distance from each point to its fourth nearest neighbor in the 3D space, with the optimal value identified at the intersection of the two trend lines representing the data distribution. The MinPts parameter was set to four, one more than the dimensionality of our 3D dataset. For dual-polarization C-band SAR images,
and
can be used to detect soil roughness and soil moisture changes caused by emergencies such as agricultural or animal husbandry events after optimizing feature space selection.
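The k-dist analysis and the clustering step can be sketched in plain Python as follows. This is a minimal textbook DBSCAN, not the implementation used in the study; the 3D feature vectors are invented, and the choice of ε from the k-dist curve (the intersection of the two trend lines) is left to inspection rather than automated.

```python
import math


def k_dist(points, k=4):
    """Sorted distances from each point to its k-th nearest neighbour."""
    dists = []
    for i, p in enumerate(points):
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        dists.append(others[k - 1])
    return sorted(dists)


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id >= 0, or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nb)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reclaimed from noise
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbours(j)
            if len(nbj) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(nbj)
    return labels


# Hypothetical 3D feature vectors: six similar acquisitions and one abrupt change.
features = [
    (-12.0, -19.0, 7.0), (-12.2, -19.1, 6.9), (-11.9, -18.8, 6.9),
    (-12.1, -19.2, 7.1), (-11.8, -19.0, 7.2), (-12.3, -18.9, 6.6),
    (-8.0, -14.0, 6.0),
]
curve = k_dist(features, k=4)
labels = dbscan(features, eps=1.0, min_pts=4)
```

In this toy example the last acquisition is flagged as noise (label -1), which is how an abrupt roughness or moisture event would appear in the feature space.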
Following vegetation correction using vegetation indices, this study applied the DBSCAN algorithm to identify transition points of soil roughness variation within the study area. The algorithm segments the long-term time series into multiple discrete intervals characterized by stable surface roughness conditions, based on the identified cluster structures and noise points. These partitioned time periods represent hydrological conditions unaffected by irrigation, rainfall, and other disturbance events, thereby satisfying the constant-roughness assumption essential to change detection methodology while minimizing anthropogenic impacts on the relationship between soil roughness and backscattering coefficients.
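Given time-ordered cluster labels, the partition into invariant intervals can be sketched as below. Treating every noise-labelled acquisition as a breakpoint and dropping it from the segments is an assumption made for illustration; the study may handle breakpoints differently.

```python
def invariant_segments(labels):
    """Split time-ordered acquisition indices into runs separated by noise points.

    labels: DBSCAN-style labels in acquisition order, with -1 marking an
    abnormal event (assumed to act as a breakpoint and to be excluded).
    """
    segments, current = [], []
    for idx, label in enumerate(labels):
        if label == -1:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(idx)
    if current:
        segments.append(current)
    return segments


# Hypothetical labels for nine acquisitions with three abnormal events.
segments = invariant_segments([0, 0, -1, 0, 0, 0, -1, -1, 0])
```

Each returned segment can then be passed separately to the change detection step, so that the backscatter extremes are computed only within a period of stable surface conditions.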
Figure 3 presents the workflow of the enhanced change detection methodology developed in this study.
4. Results
4.1. Analysis of the Influence of Surface Vegetation Coverage
To systematically evaluate the impact of vegetation coverage on soil moisture estimation in vegetated areas, this study utilized 58-scene Sentinel-1 VV-polarization time series data (2019–2020) acquired over the Shandian river basin. Comparative analysis was conducted using three distinct approaches: (a) the change detection method without vegetation correction developed by Wagner, (b) the unified vegetation correction approach introduced by Zribi, and (c) the proposed FVC-segmented correction method presented in this paper.
Figure 4 illustrates the performance differences among these three methodologies, highlighting the progressive improvements in handling vegetation effects across varying coverage conditions.
To quantitatively evaluate the accuracy of soil moisture estimation using different change detection methods, this study calculated the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) between estimated and measured soil moisture values, together with the RMSE/σ index. The results demonstrated that Wagner’s change detection method, which does not account for the influence of vegetation coverage on backscattering coefficients, significantly underestimates soil moisture content compared to field measurements and is therefore unsuitable for areas with high vegetation density. Zribi’s improved change detection method incorporates the linear influence of vegetation on backscattering coefficients and measurably improves on Wagner’s baseline: it addresses the systematic underestimation of the original technique, raising R² to 0.545 while substantially reducing both RMSE and MAE. However, this linear correction introduced a new systematic bias, a significant overestimation of soil moisture content, particularly in densely vegetated areas. This limitation stems from two factors: the inherently nonlinear relationship between vegetation density and backscattering response, and NDVI’s characteristic saturation under high vegetation coverage, which prevents accurate characterization of vegetation-scattering dynamics in dense canopies. To address the nonlinear relationship between vegetation coverage and backscattering behavior, this study proposed an FVC-segmentation change detection approach that implements distinct vegetation correction schemes for low (0 < FVC ≤ 0.3), medium (0.3 < FVC ≤ 0.6), and high (0.6 < FVC ≤ 1.0) vegetation coverage conditions.
This methodology effectively mitigated the systematic overestimation observed in Zribi’s unified correction approach, yielding substantial metric improvements: R² increased from 0.265 to 0.624, RMSE decreased from 0.066 to 0.048 m³/m³, MAE decreased from 0.047 to 0.037 m³/m³, and RMSE/σ decreased from 0.857 to 0.613. Nevertheless, residual overestimation persists due to inherent limitations of the NDVI, particularly its saturation tendency in dense vegetation scenarios, which constrains complete characterization of vegetation-scattering dynamics.
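The four accuracy measures can be computed as in the following sketch. Two conventions are assumed here because the text does not state them: R² is taken as the squared Pearson correlation, and σ in RMSE/σ is taken as the population standard deviation of the measured values. The sample data are invented.

```python
import math


def accuracy_metrics(measured, estimated):
    """R2 (squared Pearson r), RMSE, MAE and RMSE/sigma for paired samples."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    rmse = math.sqrt(sum((e - m) ** 2 for m, e in zip(measured, estimated)) / n)
    mae = sum(abs(e - m) for m, e in zip(measured, estimated)) / n
    sigma_m = math.sqrt(sum((m - mean_m) ** 2 for m in measured) / n)
    sigma_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated) / n)
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated)) / n
    r = cov / (sigma_m * sigma_e)
    return {"R2": r * r, "RMSE": rmse, "MAE": mae, "RMSE/sigma": rmse / sigma_m}


# Hypothetical measurements and a retrieval with a constant +0.02 m3/m3 bias.
measured = [0.10, 0.20, 0.30, 0.40]
estimated = [m + 0.02 for m in measured]
scores = accuracy_metrics(measured, estimated)
```

The example shows why several metrics are reported together: a constant bias leaves R² at 1 while RMSE, MAE, and RMSE/σ all register the error.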
4.2. Comparison of Different Vegetation Index
Building upon the FVC-segmentation correction, this study further implemented vegetation correction using six different vegetation indices: three single indices (NDVI, kNDVI, EVI) and three composite indices (NDVI + kNDVI, NDVI + EVI, NDEVI). Taking site M9 as an example,
Figure 5 demonstrates the temporal profiles of the six vegetation indices across the study area.
The performance of each vegetation index configuration was rigorously validated against synchronous in situ soil moisture measurements, enabling comprehensive comparison between single and composite vegetation indices for vegetation effect compensation.
Figure 6 presents scatter plot comparisons of the soil moisture retrieval results obtained from these six vegetation correction approaches, illustrating their relative effectiveness in mitigating vegetation influences.
As shown in
Figure 6, the soil moisture content in the Shandian river basin primarily ranges between 5% and 45%, exhibiting significant seasonal variation throughout the year. Following vegetation correction, soil moisture retrieval using the single indices NDVI and EVI yielded comparable results, with EVI demonstrating marginally superior performance to NDVI. However, both indices exhibited a consistent tendency toward slight overestimation. In contrast, the kNDVI approach—which accounts for nonlinear relationships between vegetation indices and surface vegetation characteristics—effectively mitigated the overestimation issue observed with NDVI and EVI. Nevertheless, kNDVI introduced a new systematic bias characterized by pronounced underestimation, while achieving overall accuracy levels similar to those of NDVI and EVI. The implementation of composite vegetation indices—specifically NDVI + kNDVI, NDVI + EVI, and the proposed NDEVI—yielded measurable improvements in estimation accuracy. While the NDVI + EVI combination offered only marginal enhancement over its individual components and failed to resolve the fundamental limitations associated with either index alone, the NDVI + kNDVI configuration demonstrated more substantial progress. This improvement underscored the value of incorporating nonlinear relationships between surface vegetation and index responses for enhanced vegetation correction. However, this approach remained constrained by NDVI’s inherent limitations, particularly its saturation characteristics in dense vegetation conditions. The composite vegetation index NDEVI proposed in this study, developed to address the limitations of single indices and to better represent the nonlinear relationships between surface vegetation and spectral responses, demonstrated the most substantial improvement in estimation accuracy. 
Comparative analysis of all six vegetation indices reveals that NDEVI achieved superior performance metrics: compared to traditional NDVI-based vegetation correction, the determination coefficient R² increased from 0.624 to 0.725, RMSE decreased from 0.048 to 0.041 m³/m³, MAE decreased from 0.037 to 0.033 m³/m³, and RMSE/σ decreased from 0.613 to 0.524. These quantitative results confirmed that the improved change detection model incorporating NDEVI significantly outperformed the original approach.
4.3. Analysis of Vegetation Impact Correction Under Different Vegetation Coverage
Taylor diagrams were generated for six vegetation indices across different vegetation coverage levels (
Figure 7). The analysis revealed that following FVC-segmentation correction, most vegetation indices demonstrated optimal performance in medium vegetation coverage conditions, while maintaining satisfactory results in both low and high coverage scenarios. The proposed NDEVI consistently achieved strong correlation coefficients between estimated and measured soil moisture values across all vegetation density conditions. Furthermore, NDEVI exhibited the lowest root mean square error among all evaluated indices, confirming its superior accuracy in soil moisture retrieval. Under high vegetation coverage conditions (d), where retrieval challenges were most pronounced, the robustness of NDEVI became decisively evident. It was the only index capable of simultaneously maintaining high correlation (>0.8) and a realistic standard deviation, while all other indices, including the alternative composite indices, exhibited significant performance degradation. This consistent superiority across vegetation gradients confirms NDEVI’s unique ability to mitigate vegetation effects while preserving soil moisture distribution characteristics.
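A Taylor diagram summarizes three statistics per method: the correlation with the measurements, the standard deviation of the estimates, and the centered root mean square difference. The sketch below computes them and relies on the standard identity that ties them together, E′² = σ_e² + σ_m² − 2σ_eσ_mR; population statistics are assumed, and the data are invented.

```python
import math


def taylor_statistics(measured, estimated):
    """Return (sigma_measured, sigma_estimated, correlation, centered RMS difference)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    sigma_m = math.sqrt(sum((m - mean_m) ** 2 for m in measured) / n)
    sigma_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated) / n)
    corr = sum((m - mean_m) * (e - mean_e)
               for m, e in zip(measured, estimated)) / (n * sigma_m * sigma_e)
    # Centered difference: the mean bias is removed before squaring.
    crmsd = math.sqrt(sum(((e - mean_e) - (m - mean_m)) ** 2
                          for m, e in zip(measured, estimated)) / n)
    return sigma_m, sigma_e, corr, crmsd


# Hypothetical in situ and retrieved soil moisture (m3/m3).
measured = [0.12, 0.18, 0.25, 0.31, 0.22]
estimated = [0.15, 0.17, 0.28, 0.29, 0.26]
sigma_m, sigma_e, corr, crmsd = taylor_statistics(measured, estimated)
```

Because the centered difference ignores mean bias, a method can sit close to the reference point on the diagram and still over- or underestimate on average, which is why the bias-sensitive metrics are reported alongside it.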
To further validate the effectiveness of the proposed NDEVI vegetation index, we conducted a stratified evaluation of vegetation correction performance across low, medium, and high vegetation coverage conditions using all six vegetation indices. As summarized in
Figure 8, among the single vegetation indices, both NDVI and EVI demonstrated relatively better accuracy for soil moisture estimation in medium vegetation coverage areas, with EVI showing clearly superior performance compared to both NDVI and kNDVI in these conditions; the performance of NDVI and EVI degraded significantly in high vegetation coverage conditions due to the well-documented saturation effect in dense vegetation canopies [58, 59]. While kNDVI produced underestimated soil moisture values in low and medium vegetation coverage areas, it achieved substantially improved performance in dense vegetation conditions (R² = 0.789, RMSE/σ = 0.428), indicating particular suitability for high vegetation coverage scenarios despite its limited operational range. In comparison with the three single indices (NDVI, kNDVI, EVI), all composite vegetation indices yielded superior results in sparsely vegetated areas. The integration of kNDVI and EVI successfully mitigated NDVI’s saturation problem. Among the composite vegetation indices evaluated, NDEVI demonstrated superior performance for soil moisture retrieval, achieving the best results across all vegetation coverage levels (low, medium, and high) when compared to the other five vegetation indices. This comprehensive superiority stems from NDEVI’s capacity to integrate the respective advantages of different indices under varying vegetation conditions, enabling more effective vegetation correction throughout the entire coverage spectrum. The index exhibited particularly pronounced advantages in medium-to-high vegetation density areas, while still delivering measurable accuracy improvements in sparsely vegetated regions where vegetation effects are minimal. Consequently, NDEVI proved especially suitable for vegetation impact correction in environments characterized by significant spatial or temporal variations in vegetation coverage.
Overall, the NDEVI achieved optimal accuracy and stability in soil moisture retrieval across diverse vegetation coverage scenarios, demonstrating particular effectiveness in challenging medium-to-high vegetation density environments where conventional indices exhibit significant performance degradation. The method’s robust statistical characteristics—maintaining strong correlation while minimizing error metrics across all coverage conditions—confirmed its superior capability for reliable soil moisture monitoring in vegetated regions.
4.4. Detect Abnormal Events in the Time Series
Following vegetation correction using vegetation indices, this study applied the DBSCAN algorithm to the selected dual-polarization backscatter features to detect soil roughness transition points. This process excluded backscatter signal variations caused by abrupt soil roughness changes and partitioned the time series into segments with stable surface roughness, thereby satisfying the constant-roughness assumption essential for change detection algorithms. We tested different parameter combinations and found that an excessively small ε value (ε = 1) resulted in over-detection of anomalies, fragmenting the time series into too many segments and substantially increasing computational burden. Using the k-dist-derived value produced reasonable anomaly detection and appropriate segmentation into invariant time series, ultimately enhancing the accuracy of change detection. Conversely, an overly large ε value (ε = 4) failed to identify any anomalies, rendering the algorithm ineffective for time series segmentation. When MinPts was set too low (MinPts = 1), erroneous anomaly detection occurred with the emergence of multiple spurious clusters, while values of MinPts ≥ 4 yielded stable and consistent results without significant variation.
As shown in Figure 9, we used the DBSCAN algorithm to detect the time nodes of soil roughness changes, monitored the abnormal events in the 58 long-term series of the study area from 2019 to 2020, and divided the long-term series into three invariant time series (T1, T2, and T3). Soil moisture was then retrieved based on these three invariant time series.
As shown in Figure 9, the algorithm successfully identified three prominent abnormal events occurring on 9 August 2019, 12 March 2020, and 11 May 2020, respectively. Of particular note was the closely spaced occurrence of the two spring 2020 abnormal events, exhibiting marked temporal discontinuity that reveals substantial instability in the underlying surface conditions during this period. From the perspective of radar remote sensing physics, these abrupt changes in backscatter coefficient are directly attributable to alterations in surface roughness and dielectric properties. Intensive spring agricultural practices such as tillage significantly modify the surface micro-scale roughness, directly disrupting the Bragg resonance conditions for radar scattering and consequently enhancing backscatter. Meanwhile, management practices including irrigation substantially increase soil moisture content, elevating the dielectric constant and thereby intensifying both volume and surface scattering mechanisms. Furthermore, characteristic summer precipitation events not only saturate the soil but may also generate surface ponding that induces specular reflection, dramatically reducing backscatter intensity, while the hydrological loading effect on vegetation canopies alters their geometric configuration to further modulate scattering characteristics. When contextualized within regional climate and land use patterns, these physical mechanisms collectively account for the observed anomalous signatures. Accordingly, we treated the three time series unaffected by these abnormal events (T1, T2, and T3) as invariant series characterized by relatively stable surface roughness and dielectric properties.
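The segmentation step can be sketched as follows, using the three event dates reported above. The 12-day acquisition schedule, the function name `split_at_events`, and the rule that an acquisition on an event date opens the next segment are assumptions for illustration. The sketch does not reproduce the study's actual image list, nor does it decide which segments correspond to T1, T2, T3, or the variant series.

```python
from bisect import bisect_left
from datetime import date, timedelta


def split_at_events(acq_dates, event_dates):
    """Split sorted acquisition dates into consecutive segments at each event date.

    Assumption: an acquisition falling on an event date starts the next segment.
    """
    cuts = sorted(bisect_left(acq_dates, e) for e in event_dates)
    bounds = [0] + cuts + [len(acq_dates)]
    # Skip empty segments (e.g. two events between the same pair of acquisitions).
    return [acq_dates[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


# Abnormal events reported in the text.
events = [date(2019, 8, 9), date(2020, 3, 12), date(2020, 5, 11)]

# Hypothetical 12-day revisit starting 1 January 2019 (not the study's acquisition list).
acquisitions = [date(2019, 1, 1) + timedelta(days=12 * i) for i in range(61)]

segments = split_at_events(acquisitions, events)
for seg in segments:
    print(seg[0], "to", seg[-1], "-", len(seg), "acquisitions")
```

Each returned segment can then be processed independently by the change detection algorithm, so that the dry and wet references are estimated only from acquisitions sharing comparable surface conditions.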
Soil moisture was retrieved based on the three invariant time series and one variant time series; the results are shown in Figure 10. Compared with the long time series without eliminating abnormal events, the inversion of soil moisture based on the three invariant time series and the variant time series achieved better results. Among them, the inversion based on the variant time series performed worst (R² = 0.735, RMSE/σ = 0.486), while the inversion based on invariant time series T3 performed best (R² = 0.892, RMSE/σ = 0.316). This discrepancy likely originates from the instability of surface parameters during the variant time series. Dividing the long time series into invariant time series is therefore of great significance. Significant variations also existed among the soil moisture inversion results derived from the different invariant time series. To better visualize the distribution of soil moisture inversion across these time series, a grouped box plot (Figure 11) was constructed, illustrating the comparative trends and variability in soil moisture estimates.
The soil moisture inversions derived from the invariant time series T1 demonstrated significant peak displacement when compared with in situ measurements. However, this temporal misalignment phenomenon was notably absent in the inversion results obtained from invariant time series T2 and T3, which showed excellent agreement with the observed soil moisture dynamics. This discrepancy may arise because the anomaly detection between the initial images of the time series and those outside the study period was not accounted for when constructing invariant time series T1. Given that the study area is situated within the permafrost region of the Inner Mongolia Plateau, where seasonal frozen ground persists for several months annually, this persistent cryospheric condition may explain the reduced inversion accuracy observed in invariant time series T1.
In contrast, invariant time series T2 and T3 incorporated this critical consideration, leading to improved temporal alignment in soil moisture inversion. Although no evident peak displacement occurred in the variant time series, the retrieved soil moisture values tended to cluster within a narrow median range, deviating substantially from the measured extremes.
To elucidate the critical role of temporal consistency in long time-series soil moisture retrieval, Figure 12 compares retrieval results obtained with and without DBSCAN-based anomaly detection and time series segmentation. Implementing the DBSCAN algorithm to maintain land surface temporal consistency yielded substantial improvements in validation metrics: R² increased from 0.725 to 0.844, RMSE decreased from 0.041 to 0.030 m³/m³, and MAE was reduced from 0.033 to 0.023 m³/m³. These findings demonstrate that anomalous events in extended time series significantly compromise retrieval accuracy, whereas DBSCAN-driven temporal consistency preservation substantially enhances the reliability of soil moisture estimates within the study area.
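For readers who wish to reproduce this kind of comparison, the validation metrics can be computed as in the sketch below. Two choices here are assumptions rather than definitions taken from this paper: R² is computed as the squared Pearson correlation coefficient, and σ in RMSE/σ is taken as the population standard deviation of the in situ observations. The function name `validation_metrics` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def validation_metrics(observed, retrieved):
    """Return R2, RMSE, MAE and RMSE/sigma for paired soil moisture values (m3/m3)."""
    if len(observed) != len(retrieved) or len(observed) < 2:
        raise ValueError("need two equally long series with at least two values")
    errors = [r - o for o, r in zip(observed, retrieved)]
    rmse = sqrt(mean([e * e for e in errors]))
    mae = mean([abs(e) for e in errors])
    mo, mr = mean(observed), mean(retrieved)
    cov = sum((o - mo) * (r - mr) for o, r in zip(observed, retrieved))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_r = sum((r - mr) ** 2 for r in retrieved)
    r2 = cov * cov / (var_o * var_r)  # assumption: squared Pearson correlation
    sigma = pstdev(observed)          # assumption: spread of the in situ data
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "RMSE/sigma": rmse / sigma}


# Hypothetical in situ and retrieved values, for illustration only.
in_situ = [0.10, 0.20, 0.30, 0.40]
estimate = [0.12, 0.18, 0.33, 0.39]
metrics = validation_metrics(in_situ, estimate)
print(metrics)
```

Applying the same function to retrievals made with and without segmentation gives a like-for-like comparison, provided both are evaluated against the same in situ samples.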
4.5. Time-Series In Situ Data Analysis
To assess the impact of anomalous events in the time series, we conducted a comprehensive analysis using in situ soil moisture measurements as ground truth. This evaluation compared the temporal patterns of observed soil moisture at five monitoring stations against the corresponding inversion estimates from January 2019 to December 2020. To ensure the reliability of temporal validation, this study selected five representative sites covering the three primary land use types within the study area: S2, S6, and M9 (grassland), M3 (farmland), and L12 (bare soil). This stratified selection strategy accounts for the distinctive hydrological characteristics and surface properties associated with each land cover category, thereby enhancing the robustness and comprehensiveness of the time-series verification process.
In Figure 13, Zribi’s NDVI-based change detection approach with vegetation correction is referred to as the traditional change detection method, while the enhanced methodology proposed in this study is designated as the improved change detection method. The red solid line with circles shows the soil moisture measured at each site, the blue dotted line with triangles shows the soil moisture retrieved by the traditional change detection method, and the orange dotted line with squares shows the soil moisture retrieved by the improved change detection method.
Figure 13 demonstrates a significant correlation between temporal variations in site-specific soil moisture and corresponding NDVI dynamics. The traditional change detection method tended to overestimate soil moisture, could not accurately capture soil moisture dynamics, and was strongly affected by NDVI changes. In comparison, the improved change detection method performed significantly better at all five validation sites (S2, S6, M9, M3, and L12). The soil moisture retrieved by the improved method showed excellent agreement with in situ measurements in terms of both magnitude and temporal variation, correlating better with site measurements and yielding more reliable inversion results than the traditional method. These findings validate the effectiveness of our methodological improvements in enhancing the reliability of soil moisture retrieval while maintaining computational efficiency. The consistent performance across different sites suggests good applicability for regional-scale monitoring in farming-pastoral ecosystems.
Further analysis of the temporal anomaly events reveals that at most monitoring stations, the first anomalous event corresponded to an abrupt increase in soil moisture content, while the second anomalous event was characterized by subsequent fluctuations following this initial moisture surge. This consistent pattern was observed across multiple stations.
4.6. Soil Moisture Mapping
Using the improved change detection method developed in this study, we performed soil moisture inversion through the following procedure. First, we calibrated the minimum and maximum values of both radar backscattering coefficients and soil moisture for different land cover types using the FVC piecewise function. Subsequently, vegetation effects were corrected through NDEVI normalization, after which soil moisture was retrieved and spatially mapped by incorporating the fitted dry and wet reference values. Figure 14 presents the resulting soil moisture inversion for the study area.
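The final mapping step can be sketched with the generic linear scaling commonly used in change detection, in which a backscatter value is placed between its dry and wet references and rescaled to the calibrated soil moisture range. This is an assumed, simplified form and not necessarily the exact formulation of this study. The function name, the clipping to the calibrated range, and all numerical reference values below are hypothetical, and the input is presumed to be already vegetation-corrected.

```python
def change_detection_ssm(sigma_db, sigma_dry_db, sigma_wet_db, ssm_min, ssm_max):
    """Scale a vegetation-corrected backscatter value (dB) to soil moisture (m3/m3).

    Assumption: soil moisture varies linearly with backscatter between the dry
    and wet references of the pixel's land cover class.
    """
    if sigma_wet_db <= sigma_dry_db:
        raise ValueError("wet reference must exceed dry reference")
    index = (sigma_db - sigma_dry_db) / (sigma_wet_db - sigma_dry_db)
    index = min(1.0, max(0.0, index))  # clip to the calibrated range
    return ssm_min + index * (ssm_max - ssm_min)


# Hypothetical per-land-cover references: (dry dB, wet dB, SSM min, SSM max).
REFERENCES = {
    "grassland": (-18.0, -10.0, 0.05, 0.35),
    "farmland": (-17.0, -9.0, 0.08, 0.40),
    "bare soil": (-19.0, -11.0, 0.03, 0.30),
}

# One hypothetical pixel: corrected backscatter of -14 dB over grassland.
pixel_ssm = change_detection_ssm(-14.0, *REFERENCES["grassland"])
print(round(pixel_ssm, 3))
```

Applying the function pixel by pixel, with references selected by land cover class, yields one soil moisture map per acquisition date.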
In Figure 14, the first row shows the spatial distribution of soil moisture at different times in the study area retrieved by the traditional change detection method, the second row shows the distribution retrieved by the improved change detection method, and the third row shows the distribution from the soil moisture dataset.
Figure 14 revealed systematic variations in soil moisture estimates across the three sampling dates. The retrieval results for 10 July 2020 showed generally elevated values, while estimates for 22 July and 3 August 2020 were slightly lower, all maintaining strong consistency with field measurements. This temporal pattern corresponds with regional agricultural practices: primary crops (including potatoes and corn, predominantly harvested in August) required substantial irrigation in early July but less water as harvest approached. Consequently, soil moisture levels on 10 July 2020 naturally exceeded those of subsequent dates. Furthermore, heterogeneous irrigation scheduling, methods, and intensity across farmlands generated pronounced spatial variability in soil moisture distribution, particularly evident on this date.
Compared with the soil moisture dataset, the soil moisture retrieved by the traditional change detection method fluctuated strongly, with values that were either too high or too low. These limitations, primarily caused by uncompensated vegetation effects and surface roughness variations, significantly impair the method’s ability to characterize the spatial distribution of soil moisture across the study area. The observed discrepancies not only affect the absolute moisture values but also distort the temporal dynamics and spatial gradients crucial for hydrological applications and agricultural management. Such inaccuracies limit the utility of traditional change detection for operational soil moisture monitoring and related applications requiring precise quantification of spatial variability. In contrast, the improved change detection method developed in this study substantially enhanced soil moisture monitoring accuracy. It effectively captured characteristic seasonal variations while maintaining excellent consistency with the SSM dataset throughout the time series. The improved approach also accurately reproduced the observed spatial distribution patterns, particularly the distinct moisture gradient across the study area, with consistently higher values in the lower left and upper right regions and lower levels in the central zone.
5. Discussions
The change detection method, as a soil moisture retrieval technique utilizing time-series remote sensing data, derives its primary advantage from leveraging temporal signal variations to bypass the inherent difficulties associated with precise surface parameter modeling in physical inversion approaches, while maintaining operational convenience and computational efficiency. However, extended temporal domains introduce non-negligible variations in surface roughness and vegetation dynamics, which ultimately compromise the fundamental assumptions underlying the change detection framework. To uphold the core assumption underlying the change detection method, this study implements a dual-strategy approach targeting vegetation and surface roughness to resolve the critical challenge of maintaining temporal consistency in land surface parameters over extended time series.
The typical farming-pastoral ecotone in the Shandian River Basin selected for this study, characterized by highly heterogeneous vegetation cover and frequent human activities (e.g., irrigation) combined with natural events (e.g., rainfall), serves as an ideal testing ground for validating the proposed framework. The results demonstrate that our integrated strategy—comprising the FVC segmentation correction method, the NDEVI vegetation impact correction model, and the DBSCAN-based roughness anomaly detection—effectively extends the applicability of change detection techniques to such complex environments.
The innovation of this framework lies in its synergistic approach to handling interference from multiple surface factors. First, the core principle of the FVC segmentation correction method recognizes that the backscattering response to soil moisture exhibits nonlinear behavior with varying vegetation coverage. By employing segmented processing, we achieve accurate interpretation across different vegetation backgrounds, effectively mitigating systematic biases caused by vegetation heterogeneity, as opposed to using a single global model. Second, the introduction of the NDEVI vegetation impact correction model offers greater sensitivity to vegetation water content and biomass compared to the widely used NDVI. This enables our method to more precisely quantify and separate the volume scattering contribution from the vegetation layer, thereby reducing the dominant influence of vegetation phenological cycles on the signal. Finally, the DBSCAN anomaly detection plays a crucial role by intelligently identifying and excluding transient perturbations in surface roughness or soil moisture caused by events such as heavy rainfall, irrigation, and farming/harvesting activities, in an unsupervised manner. This process does not alter the roughness itself but ensures that the data foundation used for constructing the change detection sequence satisfies the fundamental assumption of “relatively consistent roughness” through data filtering.
Compared to vegetation indices derived from dual-polarization SAR data (DRVIs) [60], the fundamental advantage of NDEVI lies in its data accessibility and computational efficiency. It relies solely on widely available optical remote sensing data, thereby avoiding the inherent limitations of SAR data regarding acquisition difficulty, complex preprocessing, and high computational costs. This makes our method more suitable for large-scale, high-frequency operational monitoring applications. Furthermore, in contrast to the approach proposed by Piniewski et al. [61], which depends on high-precision crop classification maps, the innovation of our framework is reflected in its data-driven design philosophy. The DBSCAN-based pathway we employ is essentially an unsupervised learning method that does not rely on specific prior knowledge (such as crop type maps) that is difficult to obtain in real time. This approach can automatically identify and process surface anomaly events, significantly enhancing the method’s adaptability and robustness in vast areas lacking detailed ground monitoring data.
This study also has certain limitations that point to promising directions for future research. First, the current formulation of the NDEVI still has room for optimization. Future work could involve systematic empirical analysis to determine a more universally optimal exponent for its integration with kEVI. Second, the parameters in the DBSCAN algorithm (such as the neighborhood radius ε and the minimum points MinPts) currently rely on empirical settings. Their adaptability and generalizability across diverse geographical environments still require validation and standardization over broader study areas. Finally, the framework’s strategy for handling vegetation and roughness is essentially one of “correction” and “exclusion,” rather than simultaneous retrieval. Therefore, a highly promising future direction would be to couple this framework with advanced physical models like AIEM, aiming to develop a novel “physical-statistical” hybrid retrieval paradigm that combines the rigor of physical mechanisms with the efficiency of temporal change detection.