Article

Precision Identification of Irrigated Areas in Semi-Arid Regions Using Optical-Radar Time-Series Features and Ensemble Machine Learning

1 Key Laboratory of Groundwater Resources and Environment, Ministry of Education, Jilin University, Changchun 130021, China
2 College of New Energy and Environment, Jilin University, Changchun 130021, China
3 Jilin Provincial Key Laboratory of Water Resources and Environment, Jilin University, Changchun 130021, China
4 Gansu Provincial Bureau of Geological Mineral Exploration and Development, Institute of Hydrogeological and Engineering Geological Survey, Zhangye 734000, China
* Authors to whom correspondence should be addressed.
Hydrology 2025, 12(8), 214; https://doi.org/10.3390/hydrology12080214
Submission received: 3 July 2025 / Revised: 28 July 2025 / Accepted: 11 August 2025 / Published: 14 August 2025

Abstract

Addressing limitations in remote sensing irrigation monitoring (insufficient resolution, single-source constraints, poor terrain adaptability), this study developed a high-precision identification framework for Jianping County, China, a semi-arid region. We integrated Sentinel-1 SAR (VV/VH), Sentinel-2 multispectral, and MOD11A1 land surface temperature data. Savitzky–Golay (S-G) filtering reconstructed time-series datasets for NDVI, SAVI, TVDI, and VV/VH backscatter coefficients. Irrigation mapping employed random forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) algorithms. Key results demonstrate the following. (1) RF achieved superior performance with overall accuracies of 91.00% (2022), 88.33% (2023), and 87.78% (2024), and Kappa coefficients of 86.37%, 80.96%, and 80.40%, showing minimal deviation (0.66–3.44%) from statistical data; (2) SAVI and VH exhibited high irrigation sensitivity, with peak differences between irrigated/non-irrigated areas reaching 0.48 units (SAVI, July–August) and 2.78 dB (VH); (3) cropland extraction accuracy showed <3% discrepancy versus governmental statistics. The “Multi-temporal Feature Fusion + S-G Filtering + RF Optimization” framework provides an effective solution for precision irrigation monitoring in complex semi-arid environments.

1. Introduction

Agricultural production exerts profound socioeconomic impacts across developing nations, yet remains highly vulnerable to climate change threats [1,2]. Consequently, irrigation has emerged as a critical strategy for stabilizing agricultural output, mitigating drought-induced losses, and safeguarding food security [3]. Under intensifying climate variability, irrigation’s role in meeting global food demand grows increasingly pivotal, while population expansion necessitates finer-scale irrigation planning [4,5]. Notably, agriculture is the largest consumer of freshwater resources: global agricultural water use has surged by 124.35% over the past 40 years, constituting more than 70% of global freshwater withdrawals and approaching the planetary boundary for agricultural water use [6,7], highlighting the urgent need for efficient water governance. Although optimized water management enhances socioeconomic returns [8], limited water availability fundamentally constrains crop productivity in semi-arid regions [9,10]. This compels immediate improvements in irrigation efficiency through high spatiotemporal resolution mapping of irrigated areas, enabling regional water balance assessments and advancing sustainable water resource stewardship.
The continuous development of remote sensing technology is profoundly changing agricultural monitoring, providing core data support for precision irrigation through high-resolution sensing of vegetation dynamics, water stress, and land use patterns. Machine learning technologies are likewise reshaping global irrigation monitoring paradigms: they not only decipher complex agro-hydrological signals but also provide methodological foundations for assessing water-use efficiency under the UN Sustainable Development Goals [11,12,13].
At present, researchers can use diverse machine learning models such as random forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) to extract richer information from remote sensing data, including Landsat, Moderate Resolution Imaging Spectroradiometer (MODIS), Sentinel, and Gaofen imagery, for land cover classification and irrigated area identification, providing strong support for accurate and efficient mapping of irrigated areas [14,15]. Many studies have combined MODIS, Landsat, and Sentinel data with time-series vegetation indices. Ghimire et al. [16] used MODIS NDVI data to classify and map the spatial and temporal dynamics of irrigated and rainfed agricultural zones in Nepal over the last two decades. Mamun et al. [17] analyzed agricultural drought variability in Bangladesh using the VCI and VHI indices derived from MODIS data, and Iilonga and Ajayi [18] applied deep learning models to MODIS and other data to map agricultural drought in the Omusati region of Namibia. Although MODIS offers high temporal resolution, its 250 m spatial resolution cannot finely depict irrigation in complex areas; higher-resolution Landsat or Sentinel data should therefore be considered for fine identification of irrigated areas. Amin et al. [19] used dense Sentinel-1 synthetic aperture radar (SAR) time series to train a random forest algorithm for mapping irrigated areas in Centre-Val de Loire, France. Yimer et al. [20] produced maps of irrigated areas in the Bilate and Gumara watersheds using time-series Sentinel-1 data. Murugesan et al. [21] produced spatio-temporal drought risk maps using normalized difference vegetation indices from Landsat 8 OLI/TIRS and Landsat 7 ETM+. Landsat and Sentinel data offer high spatial resolution, and although agricultural irrigation and drought monitoring can be accomplished with a single data source, the fusion of multi-source data has become a cutting-edge direction: synthetic aperture radar (e.g., Sentinel-1), with its all-weather monitoring capability, can effectively compensate for the limitations of optical sensors, and its sensitivity to soil moisture adds a new dimension to agricultural irrigation monitoring [22,23,24].
Previous research predominantly focused on single remote sensing data sources or isolated temporal feature metrics, typically targeting large-scale irrigation districts in plain areas. These studies largely overlooked the impact of multi-temporal feature sets on machine learning model performance. In contrast, our study area features irrigation systems distributed across fragmented valleys with non-concentrated irrigation timing and poor spatial continuity. This research analyzes the differential impacts of multiple temporal feature sets on irrigation mapping. We integrate multi-source remote sensing data to construct five temporal feature sets and evaluate their performance for irrigation identification in semi-arid regions using machine learning models. Specifically, Sentinel-1, Sentinel-2, and MODIS images were used as remote sensing data sources for direct monitoring of irrigated areas. Based on these data and the Savitzky–Golay filtering technique, three time-series vegetation index datasets (Normalized Difference Vegetation Index, Soil Adjusted Vegetation Index, and Temperature Vegetation Dryness Index) and two time-series backscatter coefficient datasets (Vertical Transmit–Vertical Receive and Vertical Transmit–Horizontal Receive polarization) were constructed. Together with spectral, shape, texture, and topographic features, these form the irrigation identification dataset. In addition, three machine learning algorithms (RF, SVM, and KNN) were used for irrigation identification in the study area. This study not only provides a theoretical basis for multi-algorithm optimization in remote sensing monitoring of irrigation in semi-arid areas; its identification results also provide valuable data support for agricultural structure adjustment and water resource optimization in the study area, which is of practical significance for realizing economic development and ecological sustainability.

2. Case Study

2.1. Overview of the Study Area

The administrative region of Jianping is located in the western part of Liaoning Province, between 119°10′~120°02′ E and 40°17′~42°21′ N, covering an area of 4838 km2. The region is situated at the intersection of Liaoning, Hebei, and Inner Mongolia. The topography is diverse, and the county comprises 4 urban districts, 17 township-level administrative divisions, and 7 rural districts. The region has a temperate continental climate, more strongly continental in the north and semi-arid in the south. Influenced by both the marine climate of the Pacific Ocean and the continental climate of the Mongolian Plateau, it exhibits a transitional semi-arid to semi-humid regime with high precipitation variability. The average annual temperature is 7.6 °C, and the average altitude is approximately 780 m. Annual precipitation is approximately 400 mm, with the main rainy season occurring from June to September. The region’s water resources depend primarily on surface runoff and groundwater. The principal crops cultivated in the study area are maize, sorghum, and millet. The primary irrigation methods are drip and flood irrigation, with irrigation occurring mainly in April, July, and August [25]. The geographical location of the study area is illustrated in Figure 1.

2.2. Data Base

2.2.1. Data Source and Pre-Processing of Multi-Source Remote Sensing Images

Sentinel-1 is the first radar satellite constellation launched under the Copernicus Programme of the European Space Agency (ESA). It is equipped with a C-band synthetic aperture radar (SAR) sensor capable of acquiring all-weather, day-and-night imagery. The constellation currently includes Sentinel-1A and Sentinel-1C, with a revisit period of 12 days for a single satellite, which is reduced to 6 days when both satellites operate jointly [26]. In this study, Sentinel-1 Ground Range Detected (GRD) products acquired in Interferometric Wide (IW) swath mode from 2022 to 2024 were used. A total of 20 time-series images were selected, each covering the entire study area. Preprocessing steps applied to the data included thermal noise removal, orbit file correction, radiometric calibration, speckle filtering, and Range-Doppler terrain correction. All Sentinel-1 datasets were downloaded from the ESA Copernicus Data Space Ecosystem (https://browser.dataspace.copernicus.eu/, accessed on 20 April 2025).
Sentinel-2 is an Earth observation satellite constellation developed under the Copernicus Programme of the European Space Agency (ESA), equipped with a Multispectral Imager (MSI) that provides high-resolution multispectral imagery. The system comprises Sentinel-2A and Sentinel-2B, which together offer a revisit frequency of 5 days [27]. The MSI sensor captures data across 13 spectral bands ranging from the visible to the near-infrared and short-wave infrared regions, with spatial resolutions of 10 m, 20 m, and 60 m, depending on the band. In this study, Sentinel-2 Level-2A products acquired between 2022 and 2024 were utilized. A total of 25 acquisition dates were selected, with each date requiring four image tiles to fully cover the study area, resulting in 100 time-series images. The Level-2A products had already undergone geometric, radiometric, and atmospheric corrections, and therefore no additional preprocessing was required. To ensure data quality, only images with cloud cover below 10% were selected. Further processing included resampling all bands to 10 m resolution, followed by band fusion, cropping, and mosaicking to produce a consistent multi-temporal image dataset for Jianping County. The Sentinel-2 data used in this study were downloaded from the ESA Copernicus Data Space Ecosystem (https://browser.dataspace.copernicus.eu/, accessed on 20 April 2025).
MOD11A1 is a land surface temperature (LST) product derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard NASA’s Terra satellite. It provides daily surface temperature observations at a spatial resolution of 1 km [28]. In this study, MOD11A1 data for Jianping County in 2024 were obtained using the Google Earth Engine (GEE) platform (https://developers.google.cn/earth-engine?hl=zh-cn, accessed on 20 April 2025). These data were used as the surface temperature input for calculating TVDI. A summary of all datasets used in this study is presented in Table 1.
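For readers reproducing the LST retrieval, the following is a minimal sketch of how MOD11A1 daytime LST can be pulled from GEE with its Python API. The bounding box for Jianping County is an assumed, approximate extent, and the band choice and scaling shown are illustrative rather than the exact script used in this study.

```python
import ee

ee.Initialize()  # assumes the GEE Python API is installed and authenticated

# Approximate bounding box for Jianping County (assumed extent, for illustration only)
roi = ee.Geometry.Rectangle([119.17, 40.28, 120.03, 42.35])

# Daily MOD11A1 daytime LST for 2024, restricted to the study area
lst = (ee.ImageCollection('MODIS/061/MOD11A1')
       .filterDate('2024-01-01', '2024-12-31')
       .filterBounds(roi)
       .select('LST_Day_1km'))

def to_celsius(img):
    # MOD11A1 stores LST as scaled integers; 0.02 converts to Kelvin, then to Celsius
    return (img.multiply(0.02).subtract(273.15)
               .set('system:time_start', img.get('system:time_start')))

lst_celsius = lst.map(to_celsius)
print(lst_celsius.size().getInfo())
```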

2.2.2. Sample Data

In this study, field sampling was conducted across the study area during the crop growing seasons from 2022 to 2024. A Global Positioning System (GPS) device was used to precisely geolocate each sampling point. The collected samples were categorized into five major land cover types: cropland, construction land, non-construction land, water bodies, and other vegetation. Cropland samples were further classified into irrigated and non-irrigated subcategories. To enhance the spatial representativeness of the sample, additional visually interpreted points were randomly selected using high-resolution Google Earth imagery [29]. This approach ensured a more uniform distribution of sampling points across the study area. The spatial distribution and total number of sample points are illustrated in Figure 2 and Figure 3. The classification of land use types was based on the national land use classification standards in China, with detailed interpretation criteria provided in Table 2.

2.2.3. Validation Data

Statistical data on the cultivated and irrigated areas in Jianping County during the 2022–2024 survey period were obtained from the Government Work Report and the Statistical Yearbook published by the People’s Government of Jianping County (https://www.lnjp.gov.cn/).

3. Methodology

3.1. The Study Framework

The study workflow (Figure 4) comprised the following steps:
(1)
Acquisition of multi-source remote sensing imagery (2022–2024), with preprocessing conducted in SNAP 10.0 and ENVI 6.0, followed by object-based image classification implemented in eCognition software.
(2)
Optimal scale parameter determination was performed using the Estimation of Scale Parameters 2 (ESP2) tool, where peak values in the rate of change in local variance (LV-ROC) curve identified optimal segmentation levels. A hierarchical segmentation framework was subsequently established for the study area. Classification samples (annually collected) were then partitioned into training and validation sets at a 7:3 ratio.
(3)
Vegetation indices and spectral, textural, shape, and spatial structure features were fed to a random forest classifier. Classification results were evaluated using confusion matrices. When accuracy was unsatisfactory, training samples were re-optimized iteratively until acceptable accuracy was achieved (a minimal sketch of this split-train-evaluate loop is given after this list).
(4)
Once acceptable accuracy was achieved, land use/land cover maps with a spatial resolution of 10 m were produced for 2022–2024. The cropland extent was subsequently extracted, enabling identification of groundwater-irrigated areas within this cropland layer.
(5)
Groundwater-irrigated areas were similarly delineated using an object-oriented approach. Leveraging the land use classification map, five multi-temporal datasets (NDVI, SAVI, TVDI, VV, VH) (Table 3) were constructed from multi-source remote sensing data applying Savitzky–Golay filtering. High-accuracy feature extraction across the study area was performed using the RF, SVM, and KNN algorithms. Algorithmic performance was evaluated via confusion matrices. Groundwater-irrigated area maps were validated and refined against ground survey data.
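The sketch below, referenced in step (3), illustrates steps (2)–(4) with scikit-learn. The feature and label files are hypothetical placeholders for the exported object-level samples, and the hyperparameters are illustrative; this is a schematic of the split-train-evaluate loop, not the exact implementation used here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical exports of the segmented objects: feature table and land cover labels
X = np.load('object_features.npy')
y = np.load('object_labels.npy')

# Step (2): partition the annual samples into training and validation sets at 7:3
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)

# Step (3): train the random forest on the multi-feature dataset
rf = RandomForestClassifier(n_estimators=500, random_state=42)
rf.fit(X_tr, y_tr)

# Steps (3)-(4): evaluate with a confusion matrix; if accuracy is unsatisfactory,
# the training samples are revised and the loop repeated
pred = rf.predict(X_va)
print(confusion_matrix(y_va, pred))
print('OA:', accuracy_score(y_va, pred), 'Kappa:', cohen_kappa_score(y_va, pred))
```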

3.2. Optimal Segmentation Scale Method

Determining the optimal segmentation scale is essential for object-oriented multiscale segmentation. This requires repeated testing and simulation to optimize results [30,31,32]. In this study, the ESP2 tool was applied to identify the optimal segmentation scale for the image to be classified. The formula is
LV = \frac{1}{m} \sum_{1}^{m} \left( C_L - \bar{C}_L \right)^2
ROC = \frac{L_i - L_{i-1}}{L_{i-1}} \times 100\%
where C_L is the luminance value of a single image object in the Lth band, \bar{C}_L is the average luminance of all objects in the Lth band of the image, and m is the total number of objects in the image; ROC is the rate of change in the LV, L_i is the average standard deviation of the objects at the ith segmentation level, and L_{i-1} is the average standard deviation of the objects at the (i−1)th level.
When the segmentation scale is much larger than the features in the scene, most of the segmented objects are highly correlated with each other and the local variance is very low. As the segmentation scale approaches the size of the target objects, the variability among objects increases and the local variance rises, continuing to increase until the objects match the real features. When the local variance reaches its maximum, homogeneity within each object and the difference between objects are both greatest. The scale with the maximum local variance is therefore defined as the optimal segmentation scale; that is, when the LV-ROC curve shows a peak, the segmentation scale corresponding to that peak is a locally optimal segmentation scale.
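As a minimal sketch of this logic (the mean object standard deviations below are invented placeholders, and ESP2 itself runs inside eCognition), the rate-of-change curve and its peaks can be computed as follows:

```python
import numpy as np

def lv_roc(mean_object_sd):
    """Local variance (mean object standard deviation per segmentation scale)
    and its rate of change (ROC), following the two equations above."""
    lv = np.asarray(mean_object_sd, dtype=float)
    roc = (lv[1:] - lv[:-1]) / lv[:-1] * 100.0
    return lv, roc

# Placeholder LV values for a sequence of increasing segmentation scales
lv, roc = lv_roc([11.8, 12.4, 13.5, 13.7, 14.6, 14.8, 15.9, 16.0])

# Positions where the ROC curve peaks locally, i.e., candidate optimal scales
peaks = [i + 1 for i in range(1, len(roc) - 1)
         if roc[i] > roc[i - 1] and roc[i] > roc[i + 1]]
print(np.round(roc, 2), peaks)
```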

3.3. Savitzky–Golay Filter

The Savitzky–Golay (SG) filter, initially proposed by Savitzky and Golay [33], is a robust signal-smoothing technique. It operates through least-squares polynomial fitting to reconstruct noise-reduced remote sensing time series, effectively preserving vegetation index details [34,35,36,37]. In this study, the NDVI, SAVI, TVDI, VV, and VH time-series data were processed with the SG filter to minimize the impact of scattering noise on classification accuracy. The filter was configured with four points on each side of the central point (a window length of nine) and a second-order polynomial.
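A minimal scipy sketch of this smoothing step, using invented NDVI values and the window and polynomial settings reported above:

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical raw NDVI time series for one cropland object (noise and cloud dips included)
ndvi_raw = np.array([0.21, 0.25, 0.19, 0.33, 0.41, 0.38, 0.55,
                     0.61, 0.48, 0.63, 0.52, 0.40, 0.28])

# Four points on each side of the centre -> window length 9; second-order polynomial
ndvi_smooth = savgol_filter(ndvi_raw, window_length=9, polyorder=2)
print(np.round(ndvi_smooth, 2))
```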

3.4. Machine Learning

Supervised classification algorithms are widely used for this task: training samples are associated with the imagery provided, and the final class of each object is decided by the classifier. In this study, three algorithms (RF, SVM, and KNN) are used for irrigation recognition. Many studies have shown that RF, SVM, and KNN are supervised classifiers with good accuracy [38,39,40].

3.4.1. RF

The RF algorithm extends decision tree classifiers by integrating multiple decision trees whose outputs are collectively combined, operating on three core principles: Bagging (bootstrap aggregating), feature randomness, and voting/averaging [41,42]. Its working mechanism initiates with random selection of multiple data subsets from the training data, each subset training a specialized decision tree. To preserve feature randomness, each tree is restricted to randomly selected feature subsets during splitting, thereby preventing correlated errors across trees. When processing validation data, final classifications are determined through majority voting, with the complete workflow illustrated in Figure 5.
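A minimal scikit-learn sketch of these three principles; the synthetic data stand in for the real feature table and the hyperparameters are illustrative, not those used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the multi-temporal feature table
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8, random_state=0)

# bootstrap=True draws a random data subset for each tree (bagging);
# max_features='sqrt' restricts each split to a random feature subset;
# predictions are aggregated by majority vote across the 500 trees
rf = RandomForestClassifier(n_estimators=500, bootstrap=True, max_features='sqrt',
                            oob_score=True, random_state=0).fit(X, y)
print(rf.oob_score_)  # out-of-bag estimate of classification accuracy
```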

3.4.2. SVM

The SVM is a machine learning technique extensively applied to classification, regression, and anomaly detection tasks. Its fundamental principle involves identifying an optimal hyperplane (or decision surface) that segregates distinct categorical samples while maximizing the classification margin, thereby enhancing model generalization capability [43,44,45]. The mathematical formulation is expressed as follows.
(i)
Hyperplane equation:
w⋅x + b = 0
where w refers to the normal vector of the hyperplane, which determines its orientation; x refers to the feature vector of the input samples; and b denotes the bias term, which controls the offset of the hyperplane.
(ii)
Decision function:
f(x) = sign(w⋅x + b)
where f(x) denotes the predicted category of sample x and w⋅x + b denotes the functional margin from sample x to the hyperplane.
(iii)
The objective function of the dual problem:
\max_{\alpha} \ \sum_{i} \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
where \alpha_i denotes the Lagrange multiplier corresponding to the weight of sample x_i; y_i and y_j denote the category labels of samples x_i and x_j; and K(x_i, x_j) denotes the kernel function, used to compute the inner product in the high-dimensional space (e.g., linear kernel, RBF kernel).
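A minimal scikit-learn sketch of an SVM classifier with an RBF kernel; the synthetic data and the value of C are illustrative assumptions, and the kernel plays the role of K(x_i, x_j) in the dual objective above.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the irrigation feature table
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Features are standardized before fitting; C trades margin width against misclassification
svm = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale')).fit(X, y)
print(svm.score(X, y))
```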

3.4.3. KNN

The KNN algorithm, a widely adopted nonparametric method for classification and regression, categorizes data points through a similarity assessment with their nearest neighbors and is particularly valuable in remote sensing classification [41,43]. Because the KNN model is sensitive to feature scale, the data must be standardized or normalized, and the K value and distance metric (with adjustable p) must be chosen before the classification or regression task is performed. Its main formulas are as follows.
(i)
Minkowski distance:
d(x, y) = \left( \sum_{i=1}^{n} \left| x_i - y_i \right|^{p} \right)^{1/p}
where d(x, y) denotes the Minkowski distance between samples x and y; x_i and y_i denote the value of the ith feature of samples x and y; and n denotes the total number of feature dimensions. When p = 1, the formula is the Manhattan distance; when p = 2, it is the Euclidean distance.
(ii)
Classification decision:
\text{Prediction category} = \arg\max_{c} \sum_{i=1}^{k} I(y_i = c)
where k denotes the number of nearest neighbors selected (hyperparameter K); y_i denotes the category label of the ith neighbor; c denotes the candidate category; and I(y_i = c) denotes the indicator function, which is 1 when y_i = c and 0 otherwise. argmax_c denotes selection of the category c that maximizes the sum.
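A minimal scikit-learn sketch combining the standardization step with the Minkowski distance setting; the synthetic data, K = 5, and p = 2 are illustrative choices, not the settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Features are standardized first because KNN is scale-sensitive;
# p=2 makes the Minkowski distance the Euclidean distance
knn = make_pipeline(StandardScaler(),
                    KNeighborsClassifier(n_neighbors=5, p=2)).fit(X_tr, y_tr)
print(knn.score(X_te, y_te))
```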

3.5. Model Performance Indicators

The confusion matrix, a fundamental machine learning evaluation tool, quantifies classification model performance by tabulating predicted results against true labels [46,47,48]. This study employs four evaluation metrics: overall accuracy (OA), producer’s accuracy (PA), user’s accuracy (UA), and Kappa coefficient (Kappa) for quantitative assessment.
Overall accuracy (OA) quantifies the agreement between classified and actual land cover types, defined numerically as the ratio of correctly classified pixels (sum of diagonal elements in the confusion matrix) to the total number of validation pixels. The calculation formula is
P_c = \frac{\sum_{k=1}^{n} P_{kk}}{P}
where P_c is the overall classification accuracy, P_kk is the number of correctly classified pixels on the diagonal of the confusion matrix, and P is the total number of validation pixels.
Producer’s accuracy (PA), also termed cartographic accuracy, measures the probability that a ground-truth class is correctly represented in the classified map. It is numerically defined as the ratio of correctly classified pixels for a given class to the total reference pixels of that class. The calculation formula is
P_{ui} = \frac{P_{ii}}{P_{+i}}
where P_ui is the producer’s accuracy, P_ii is the number of correctly classified pixels of a given class, and P_{+i} is the number of pixels of that class in the reference data.
User’s accuracy (UA), also termed user precision, quantifies the probability that a classified pixel truly represents its assigned category. Computed as the ratio of correctly classified pixels in a given class to the total pixels assigned to that class in the classified map, its mathematical formulation is
P_{ai} = \frac{P_{jj}}{P_{j+}}
where P_ai is the user’s accuracy, P_jj is the number of correctly classified pixels in a class, and P_{j+} is the number of pixels assigned to that class in the classified data.
The Kappa coefficient is a statistical metric assessing agreement between classification results and ground-truth labels. The Kappa coefficient can quantify the extent to which the model performance exceeds the random categorization, and is suitable for the evaluation of imbalanced class distributions. Ranging from −1 to 1, its calculation formula is
Kappa = \frac{N \sum_{i=1}^{r} X_{ii} - \sum_{i=1}^{r} X_{i+} X_{+i}}{N^{2} - \sum_{i=1}^{r} X_{i+} X_{+i}}
where N is the total number of reference pixels, X_ii is the number of reference pixels of class i that are correctly classified, X_{i+} is the total number of reference pixels of class i, and X_{+i} is the total number of pixels classified as class i.
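A minimal numpy sketch that evaluates these four metrics from a confusion matrix; the 2 × 2 matrix below is invented for illustration, and rows are assumed to be reference classes with columns as predicted classes.

```python
import numpy as np

def accuracy_metrics(cm):
    """OA, per-class producer's/user's accuracy, and Kappa from a confusion
    matrix whose rows are reference classes and columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    diag = np.diag(cm)
    oa = diag.sum() / n
    pa = diag / cm.sum(axis=1)   # producer's accuracy (per reference class)
    ua = diag / cm.sum(axis=0)   # user's accuracy (per predicted class)
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, pa, ua, kappa

# Hypothetical two-class confusion matrix (irrigated vs. non-irrigated)
oa, pa, ua, kappa = accuracy_metrics([[420, 35], [48, 397]])
print(oa, pa, ua, kappa)
```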

4. Results and Analysis

4.1. Construction of Feature Dataset

Feature dataset construction is critical for land use classification and irrigation area identification. The study initially selected 32 classification features for screening. To address data volume and redundancy challenges, a refined dataset of 20 features—spanning spectral, index, shape, texture, and topographic attributes—was ultimately selected based on importance ranking (Figure 6, Table 3). This optimized dataset served as the foundation for land use classification. For irrigated area identification, three vegetation index time series (NDVI, SAVI, TVDI) and two backscatter coefficient time series (VV, VH) were employed. The TVDI dataset utilized MODIS-derived land surface temperature imagery to ensure temporal consistency.
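A minimal scikit-learn sketch of this screening step, using random forest permutation importance on synthetic stand-ins for the 32 candidate features; the feature names, sample sizes, and settings are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 32 candidate classification features
X, y = make_classification(n_samples=2000, n_features=32, n_informative=12, random_state=0)
feature_names = [f'feat_{i}' for i in range(32)]  # hypothetical names

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# Rank features by permutation importance on the validation split and keep the top 20
result = permutation_importance(rf, X_va, y_va, n_repeats=10, random_state=0)
order = np.argsort(result.importances_mean)[::-1]
top20 = [feature_names[i] for i in order[:20]]
print(top20)
```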

4.2. Analysis of Optimal Scale Segmentation

Multi-scale segmentation integrates critical parameters including band weight, shape factor, and segmentation scale, whose determination fundamentally governs segmentation quality. To optimize parameters for the study area’s land use types, Sentinel-2 covariance and correlation matrices first evaluated band-specific segmentation contributions to establish band weights. Subsequently, the ESP2 (Estimation of Scale Parameters 2) tool analyzed local variance variations, identifying six candidate optimal scales: 115, 155, 180, 230, 290, and 310 (Figure 7). Comparative analysis selected scale 180 to mitigate feature mixing from oversegmentation and fragmentation from undersegmentation, with compactness and shape factors set to 0.5 and 0.2, respectively. This parameter configuration ensured feature integrity while capturing accurate multi-scale characteristics.

4.3. Time Series Vegetation Index Features and Backward Scattering Coefficient Analysis

Feature time-series reconstruction results were compared by first applying cropland patch masks to isolate agricultural pixels, then restricting samples to maize-dominated areas to minimize interference from crop type variability. The Savitzky–Golay filter was subsequently employed for time-series reconstruction within the cropland extent, with representative sample point selections detailed in Table 4.
Figure 8 displays vegetation index time-series curves where the horizontal axis denotes acquisition dates of 2024 remote sensing imagery and the vertical axis represents value ranges of different indices. These processed curves effectively capture distinct temporal characteristics between irrigated and non-irrigated conditions, exhibiting pronounced phenological responses. The study area follows a typical dual-peak irrigation pattern (April and July–August). Results demonstrate that NDVI and SAVI exhibit similar trends, with significantly higher values during the critical growth period (July–August) at irrigated sites compared to non-irrigated sites. Non-irrigated areas showed slow value increases from April to July driven solely by precipitation. This divergence arises from enhanced vegetation coverage under irrigation. TVDI exhibited no significant fluctuation due to combined rainfall and irrigation influences, though generally maintained stability or decreased during irrigation months (July–August). This contrasts with non-irrigated site 1, where TVDI rose continuously from April to August before declining rapidly post-September. All indices demonstrated stronger positive responses during irrigation periods than non-irrigation months, confirming their utility for irrigation timing identification. Among the three indices, SAVI demonstrated superior discriminative capacity for irrigation characteristics.
Figure 9 reveals distinct VV and VH backscatter time-series patterns between irrigated and non-irrigated areas. In irrigated zones, VV exhibited a steep positive trajectory from May, peaking at higher magnitudes (−7.5 to −9.45 dB) during July–August before gradual post-September decline. Non-irrigated areas showed attenuated VV responses with lower peaks (−9.49 to −10.92 dB) and accelerated post-September decay. This VV behavior reflects crop structural dynamics and moisture sensitivity, where summer irrigation promotes denser canopy development, enhancing double-bounce scattering from stems and leaves. Similarly, VH in irrigated areas displayed pronounced June–August increases (−13.29 to −15.57 dB), contrasting with muted non-irrigated fluctuations (−16.07 to −17.55 dB). VH’s sensitivity to volumetric water content and canopy architecture yielded more dominant scattering shifts under irrigation. Critically, both polarizations demonstrated rapid post-irrigation increases and elevated peaks, confirming that radar scattering enhancement correlates with irrigation-induced vegetation structural and hydrologic changes, establishing these signatures as robust irrigation indicators.
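As a toy illustration of how such peak-season separations can be quantified (the VH values below are invented placeholders, not measurements from this study):

```python
import numpy as np

# Hypothetical S-G filtered VH backscatter (dB) for one irrigated and one
# non-irrigated maize sample across the March-October acquisitions
vh_irrigated = np.array([-18.2, -17.6, -16.9, -15.8, -14.6, -13.7, -13.3, -14.8, -16.1])
vh_rainfed = np.array([-18.4, -18.0, -17.6, -17.2, -16.6, -16.2, -16.1, -16.8, -17.4])

# Peak-season separation used as an irrigation indicator
peak_diff = vh_irrigated.max() - vh_rainfed.max()
print(f'VH peak difference: {peak_diff:.2f} dB')
```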

4.4. Land Use Classification and Cropland Change Analysis

Land use classification in the study area was performed using the random forest algorithm. Analysis of the classification results (Figure 10) revealed broadly consistent patterns across all three years, with discernible variations particularly in the non-constructed land and other vegetation categories. Accuracy assessment (Figure 11) demonstrated strong model performance, yielding Kappa coefficients of 82.76% (2022), 80.13% (2023), and 86.09% (2024), with corresponding overall accuracies of 89.97%, 88.31%, and 90.44%. All Kappa values exceeded 80% and all overall accuracies exceeded 88%, confirming effective classification of the majority of samples.
Figure 12 delineates the spatiotemporal distribution of cultivated land extent across the study area from 2022 to 2024. Comparative analysis with statistical data (Table 5) reveals minimal discrepancies in the cropland area estimates, with a 2023 deviation of only 19.98 km2. From 2022 to 2024, statistical records indicate a net cropland increase of 48.02 km2, while computed results show a 46.16 km2 expansion. Crucially, cultivated land extent remained spatially consistent over the three-year period, with computational errors consistently below 3% relative to statistical benchmarks. This sub-3% error tolerance substantiates the methodology’s viability for land classification and establishes its utility as a robust basis for irrigation identification.

4.5. Analysis of Irrigation Area Identification Accuracy and Change Analysis

Irrigation area distributions from 2022 to 2024 derived from KNN, SVM, and RF algorithms are mapped in Figure 13. The RF algorithm demonstrated superior performance, achieving consistently higher overall accuracy (OA) and Kappa coefficients than both counterparts across all three years (Figure 14). Specifically, RF attained OAs of 91.00% (2022), 88.32% (2023), and 87.78% (2024) with corresponding Kappa values of 86.37%, 80.96%, and 80.40%. SVM closely followed, yielding OAs ranging from 86.36% to 90.01% and Kappa coefficients between 76.57% and 83.14%. In contrast, KNN demonstrated relatively lower performance, with OAs of 82.36% to 83.93% and Kappa values of 73.23% to 79.80%. These results confirm RF’s enhanced capability for irrigation identification compared to SVM and KNN approaches.
Over the three-year monitoring period, the study area demonstrated a 16.02 km2 expansion in identified irrigated areas, exceeding the 10.45 km2 increase documented in official statistical records. This growth trajectory exhibits significant correlation with regional water conservation policies. As a key implementation zone for efficient irrigation technologies, the area has prioritized drip irrigation systems, low-pressure pipeline water conveyance, and “electricity-for-water” metering techniques. These innovations have substantially enhanced water-saving efficiency compared to traditional flood irrigation practices prevalent a decade ago. Consequently, such policy-driven interventions have simultaneously facilitated irrigated land expansion, stimulated agricultural economic development, and induced multifaceted ecological consequences.
Comparative analysis of irrigated area estimates from 2022 to 2024 (Table 6) reveals discernible accuracy variations among the three classification algorithms across years. The RF algorithm demonstrated minimal deviation (0.66% to 3.44%), attesting to its superior robustness and estimation reliability. Conversely, SVM and KNN exhibited larger errors, with KNN particularly notable at 7.85% in 2024. Collectively, all algorithms achieved strong concordance with statistical data, though RF consistently outperformed alternatives, confirming the methodology’s practical utility for irrigated area delineation in the study region.

5. Discussion

5.1. Comparison and Analysis of Classification Algorithms

This study utilized multi-source remote sensing data (2022–2024) for irrigated area identification. Comparative analysis of three classification algorithms demonstrates the superior capability of RF in handling complex datasets (Section 4.5). Under limited 2024 sampling conditions (n = 2674), RF achieved higher overall accuracy (87.77%) than SVM (86.36%) and KNN (82.36%), and a Kappa coefficient (80.40%) exceeding that of SVM (76.57%) and KNN (75.53%). RF also surpassed both algorithms in producer’s and user’s accuracy for irrigated parcels. With the larger 2023 sample (n = 5065), RF likewise maintained superior performance (OA = 88.33%; Kappa = 80.96%) compared to SVM (86.98%; 77.16%) and KNN (81.63%; 73.23%). These results indicate KNN’s limited suitability for large-scale datasets, aligning with Kassouk et al. [49], who documented RF’s advantages in stability, usability, and processing efficiency for Tunisian illicit irrigation well detection.
Analysis of sample size–accuracy relationships (Figure 3 and Figure 13) reveals that 2024 (smallest sample: n = 2674) yielded lower RF accuracy (87.77%) than 2022 (91%), despite 2023’s maximum sampling (n = 5065) producing intermediate accuracy (88.33%). This demonstrates that while sample quantity influences irrigation mapping accuracy, sample quality exerts greater impact. Consequently, optimal classifier selection requires balancing multiple factors including sample size, dataset complexity, and computational efficiency. For complex multi-source remote sensing data, random forest represents a robust choice for irrigation identification cartography.

5.2. Comparison and Analysis of Irrigation Identification Parameters

By integrating Sentinel-1, Sentinel-2, and MODIS imagery, we established comprehensive time-series datasets including NDVI, SAVI, and TVDI vegetation indices, alongside VV and VH backscatter coefficients. Comparative analysis demonstrates that NDVI and SAVI time-series curves exhibit a unimodal phenological pattern—gradually ascending to peak values in August before declining. Crucially, during the critical growth phase (July–August), irrigated areas displayed substantially enhanced peak values relative to non-irrigated zones; NDVI peaks reached 0.56–0.63 (irrigated) versus 0.30–0.56 (non-irrigated), while SAVI peaks attained 0.83–0.94 (irrigated) compared to 0.46–0.84 (non-irrigated). This pronounced peak disparity serves as a pivotal irrigation indicator, with SAVI exhibiting superior discriminative capacity. TVDI trajectories proved less distinctive, though irrigated areas maintained stable or decreasing trends during July–August due to supplemental watering, whereas non-irrigated areas displayed rainfall-dependent fluctuations (increasing/stable/decreasing). Collectively, all vegetation indices demonstrated diagnostic utility for irrigation detection, with SAVI delivering optimal performance.
Sentinel-1-derived VV and VH backscatter coefficients offer distinct advantages through cloud-penetrating capability and moisture sensitivity. Integrating these attributes with Sentinel-2’s rich spectral information and high spatial resolution effectively compensates for spatiotemporal coverage limitations, enhancing irrigation identification accuracy and reliability. Both polarizations exhibited significantly elevated post-summer-irrigation peaks compared to non-irrigated areas; VV demonstrated 1.47–1.91 dB higher peak differentials, while VH showed enhanced discrimination with 1.98–2.78 dB differentials. VH’s superior performance originates from heightened sensitivity to volumetric water content within cluttered canopies, yielding more pronounced scattering variations in irrigated zones. These quantifiable polarimetric divergences establish VV and VH as robust indicators for precision irrigation monitoring.

5.3. Uncertainty Analysis and Outlook

This study employed multi-source remote sensing data spanning 2022–2024, though the quantity of ground-surveyed sample points in 2022 and 2023 was comparatively limited relative to 2024. While supplemental analyses integrated Google Earth imagery and high-resolution satellite data to enhance coverage, and despite achieving robust final identification accuracy, the sparser sampling in earlier years may introduce geolocational uncertainties in historical datasets.
Several methodological uncertainties warrant consideration. Primarily, the feature selection process did not fully leverage the high temporal resolution of TVDI, resulting in attenuated time-series signatures that weakened irrigation identification robustness. Secondly, the backscatter analysis employed only VV and VH polarizations; future studies should incorporate hybrid configurations (e.g., dual-polarized and fully polarimetric modes) to enhance temporal resolution and optimize feature sets for irrigation detection.
Advancing beyond conventional machine learning, emerging deep learning architectures show significant promise. The U-Net framework [50] enhances feature extraction, ConvLSTM networks [51] improve spatiotemporal segmentation, and Transformer-based models [52] enable directional object detection. These approaches demonstrate superior capability in processing complex geospatial data, offering pathways to enhance classification accuracy and computational efficiency in subsequent research.

6. Conclusions

Water resources are increasingly scarce in semi-arid regions. This study aims to determine irrigated areas within the study region, and provide a basis for calculating total irrigation water consumption and managing irrigation water use with high spatial resolution. Using Sentinel-2 imagery, we constructed feature information including vegetation indices, spectral characteristics, textures, shapes, and spatial structures. The RF algorithm generated three-year land use type maps. Building on this, multi-source remote sensing data established time-series vegetation index and backscatter coefficient datasets. Irrigation identification maps were then produced using RF, SVM, and KNN algorithms.
Analysis reveals RF’s superior capability in handling complex irrigation data with higher accuracy. Compared to SVM, RF does not require kernel function selection, simplifying processing and shortening runtime. Relative to KNN, RF mitigates the dimensionality disaster effect caused by extensive time-series feature datasets, which reduces KNN’s classification accuracy.
Regarding time-series features, SAVI demonstrates greater advantages than NDVI and TVDI for irrigation identification in semi-arid areas, while VH outperforms VV among the backscatter coefficients. TVDI, VV, and VH show limited differentiation under insufficient temporal resolution, emphasizing the need for adequate temporal resolution in irrigation identification. Although Sentinel-1 and Sentinel-2 data show significant potential for irrigation mapping, their revisit periods of roughly 5–6 days may miss irrigation or rainfall events in areas with complex rainfall patterns, diverse crop types, and multiple fertilizer applications for vegetable crops. Future work should therefore focus on constructing higher spatiotemporal resolution datasets to detect irrigation using fewer training samples.

Author Contributions

W.L.: investigation, conceptualization, methodology, writing—original draft, writing—review and editing. X.L.: project administration, supervision, funding acquisition. C.X.: writing—original draft, investigation, project administration. W.Y.: writing—original draft, Investigation. J.Z.: investigation, software. R.D.: investigation, data curation. Y.L.: data curation. L.K.: data curation. D.Z.: data curation. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (41572216) and the Liaoning Provincial River and Reservoir Management Service Centre (LNZZYS202409FW).

Data Availability Statement

The Sentinel-1 and Sentinel-2 data used in the present study were downloaded from https://browser.dataspace.copernicus.eu/. The MODIS data used in the present study were downloaded from https://developers.google.cn/earth-engine?hl=zh-cn.

Conflicts of Interest

The authors declare no potential conflicts of interest.

References

  1. Shepherd, J. The Ipcc Has Indicated the Global Water Cycle May Accelerate Due to Current Climate Change and Have an Impact on the Frequency and Intensity of Tropical Cyclones (Ipcc 2007, Shepherd and Knutson 2007). Studies Have Shown that Soil Moisture Conditions Can. 2011. Available online: https://vpn.jlu.edu.cn/https/44696469646131313237446964696461bd7dfe6705dba81ec955e469996699f3a2c9e8c2ae15ee71b2/wos/alldb/full-record/GRANTS:13371781 (accessed on 26 June 2025).
  2. Bokusheva, R.; Kogan, F.; Vitkovskaya, I.; Conradt, S.; Batyrbayeva, M. Satellite-based vegetation health indices as a criteria for insuring against drought-related yield losses. Agric. For. Meteorol. 2016, 220, 200–206. [Google Scholar] [CrossRef]
  3. Salmon, J.; Friedl, M.A.; Frolking, S.; Wisser, D.; Douglas, E.M. Global rain-fed, irrigated, and paddy croplands: A new high resolution map derived from remote sensing, crop inventories and climate data. Int. J. Appl. Earth Obs. Geoinf. 2015, 38, 321–334. [Google Scholar] [CrossRef]
  4. Godfray, H.C.J.; Beddington, J.R.; Crute, I.R.; Haddad, L.; Lawrence, D.; Muir, J.F.; Pretty, J.; Robinson, S.; Thomas, S.M.; Toulmin, C. Food security: The challenge of feeding 9 billion people. Science 2010, 327, 812–818. [Google Scholar] [CrossRef]
  5. Tilman, D.; Clark, M. Food, Agriculture & the Environment: Can We Feed the World & Save the Earth? Daedalus 2015, 144, 8–23. [Google Scholar] [CrossRef]
  6. Huang, Z.; Hejazi, M.; Tang, Q.; Vernon, C.R.; Liu, Y.; Chen, M.; Calvin, K. Global agricultural green and blue water consumption under future climate and land use changes. J. Hydrol. 2019, 574, 242–256. [Google Scholar] [CrossRef]
  7. Wang, M.; Shi, W. Research progress in assessment and strategies for sustainable food system within planetary boundaries. Sci. China Earth Sci. 2024, 67, 375–386. [Google Scholar] [CrossRef]
  8. Miletto, M. Water and Energy nexus: Findings of the World Water Development Report 2014. In Proceedings of the 11th Kovacs Colloquium on Hydrological Sciences and Water Security: Past, Present and Future, Paris, France, 16–17 June 2014. [Google Scholar]
  9. Ozdogan, M.; Gutman, G. A new methodology to map irrigated areas using multi-temporal MODIS and ancillary data: An application example in the continental US. Remote Sens. Environ. 2008, 112, 3520–3537. [Google Scholar] [CrossRef]
  10. Ambika, A.K.; Wardlow, B.; Mishra, V. Remotely sensed high resolution irrigated area mapping in India for 2000 to 2015. Sci. Data 2016, 3, 160118. [Google Scholar] [CrossRef]
  11. Ahmed, A.A.; Sayed, S.; Abdoulhalik, A.; Moutari, S.; Oyedele, L. Applications of machine learning to water resources management: A review of present status and future opportunities. J. Clean. Prod. 2024, 441, 140715. [Google Scholar] [CrossRef]
  12. Gao, H.; Zhangzhong, L.; Zheng, W.; Chen, G. How can agricultural water production be promoted? A review on machine learning for irrigation. J. Clean. Prod. 2023, 414, 137687. [Google Scholar] [CrossRef]
  13. Nagaraj, D.; Proust, E.; Todeschini, A.; Rulli, M.C.; D’ODorico, P. A new dataset of global irrigation areas from 2001 to 2015. Adv. Water Resour. 2021, 152, 103910. [Google Scholar] [CrossRef]
  14. Noi, P.T.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2018, 18, 18. [Google Scholar]
  15. Talukdar, S.; Singha, P.; Mahato, S.; Shahfahad; Pal, S.; Liou, Y.-A.; Rahman, A. Land-Use Land-Cover Classification by Machine Learning Classifiers for Satellite Observations—A Review. Remote Sens. 2020, 12, 1135. [Google Scholar]
  16. Ghimire, P.; Karki, S.; Pandey, V.P.; Pradhan, A.M.S. Mapping Spatio-Temporal dynamics of irrigated agriculture in Nepal using MODIS NDVI and statistical data with Google Earth Engine: A step towards improved irrigation planning. Int. J. Appl. Earth Obs. Geoinf. 2025, 136, 104345. [Google Scholar] [CrossRef]
  17. Mamun, M.A.A.; Alauddin, M.; Meraj, G.; Almazroui, M.; Ehsan, M.A. Evaluating the Spatiotemporal Variation of Agricultural Droughts in Bangladesh Using MODIS-based Vegetation Indices. Earth Syst. Environ. 2024, 8, 997–1010. [Google Scholar] [CrossRef]
  18. Iilonga, S.N.; Ajayi, O.G. Implementation of deep learning algorithms to model agricultural drought towards sustainable land management in Namibia’s Omusati region. Land Use Policy 2025, 156, 107593. [Google Scholar] [CrossRef]
  19. Amin, G.; Sfaksi, N.; Thierion, V.; Gilleron, J.; Ferrero, T.; Demarez, V. Sentinel-1 Synthetic Aperture Radar Time Series for Irrigation Mapping. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Athens, Greece, 7–12 July 2024. [Google Scholar]
  20. Yimer, A.K.; Haile, A.T.; Hatiye, S.D.; Ragettli, S.; Taye, M.T. Comparative evaluation of the accuracy of mapping irrigated areas using sentinel 1 images in the Bilate and Gumara watersheds, Ethiopia. Cogent Eng. 2024, 11, 1–19. [Google Scholar] [CrossRef]
  21. Murugesan, E.; Shanmugamoorthy, S.; Veerasamy, S.; Sivakumar, V. Drought assessment in Coimbatore South region, Tamil Nadu, India, using remote sensing and meteorological data. J. Earth Syst. Sci. 2025, 134, 40. [Google Scholar] [CrossRef]
  22. Tian, F.; Wu, B.; Zeng, H.; Zhang, X.; Xu, J. Efficient Identification of Corn Cultivation Area with Multitemporal Synthetic Aperture Radar and Optical Images in the Google Earth Engine Cloud Platform. Remote Sens. 2019, 11, 629. [Google Scholar] [CrossRef]
  23. Bazzi, H.; Baghdadi, N.; Zribi, M. Operative Mapping of Irrigated Areas Using Sentinel-1 and Sentinel-2 Time Series. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Kuala Lumpur, Malaysia, 17–22 July 2022. [Google Scholar]
  24. Bousbih, S.; Zribi, M.; El Hajj, M.; Baghdadi, N.; Chabaane, Z.L.; Fanise, P.; Boulet, G. Sentinel-1 and Sentinel-2 Data for Soil Moisture and Irrigation Mapping over Semi-Arid Region. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Yokohama, Japan, 28 July–2 August 2019. [Google Scholar]
  25. Bai, Y. Assessment and Rational Development and Utilization of Groundwater Resources in the Laohe River Basin, Liaoning Province. 2024. Available online: https://vpn.jlu.edu.cn/https/44696469646131313237446964696461a176ef2600c6a01e8255e278/kns8s/defaultresult/index?crossids=YSTT4HG0%2CLSTPFY1C%2CJUP3MUPD%2CMPMFIG1A%2CWQ0UVIAA%2CBLZOG7CK%2CPWFIRAGL%2CEMRPGLPA%2CNLBO1Z6R%2CNN3FJMUV&korder=AU&kw=% (accessed on 22 July 2025).
  26. Attema, E.; Davidson, M.; Snoeij, P.; Rommen, B.; Floury, N. Sentinel-1 Mission Overview. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Cape Town, South Africa, 12–17 July 2009. [Google Scholar]
  27. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  28. Justice, C.O.; Vermote, E.; Townshend, J.R.G.; Defries, R.; Roy, D.P.; Hall, D.K.; Salomonson, V.V.; Privette, J.L.; Riggs, G.; Strahler, A.; et al. The Moderate Resolution Imaging Spectroradiometer (MODIS): Land remote sensing for global change research. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1228–1249. [Google Scholar] [CrossRef]
  29. GB/T 21010-2017; Current Land Use Classification. General Administration of Quality Supervision, Inspection and Quarantine of the People’s Republic of China: Beijing, China, 2017.
  30. Li, Y.; Wu, H.; Li, Y.; Ye, L.; Cheng, Z.; Xu, C. A Comparision of High Resolution Satellite Imagery Classification between Object-oriented and Pixel-based Method. In Proceedings of the 4th IEEE International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 23–25 May 2013. [Google Scholar]
  31. Platt, R.V.; Rapoza, L. An evaluation of an object-oriented paradigm for land use/land cover classification. Prof. Geogr. 2008, 60, 87–100. [Google Scholar] [CrossRef]
  32. Yu, L.; Xie, H.; Xu, Y.; Li, Q.; Jiang, Y.; Tao, H.; Aihemaiti, M. Identification and Monitoring of Irrigated Areas in Arid Areas Based on Sentinel-2 Time-Series Data and a Machine Learning Algorithm. Agriculture 2024, 14, 1693. [Google Scholar] [CrossRef]
  33. Steinier, J.; Termonia, Y.; Deltour, J. Smoothing and differentiation of data by simplified least square procedure. Anal. Chem. 1972, 44, 1906–1909. [Google Scholar] [CrossRef] [PubMed]
  34. Liu, J.; Zhan, P. The Impacts of Smoothing Methods for Time-Series Remote Sensing Data on Crop Phenology Extraction. In Proceedings of the 36th IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016. [Google Scholar]
  35. Luo, J.; Ying, K.; Bai, J. Savitzky-Golay smoothing and differentiation filter for even number data. Signal Process. 2005, 85, 1429–1434. [Google Scholar] [CrossRef]
  36. Chen, J.; Jönsson, P.; Tamura, M.; Gu, Z.; Matsushita, B.; Eklundh, L. A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky-Golay filter. Remote Sens. Environ. 2004, 91, 332–344. [Google Scholar] [CrossRef]
  37. Shen, W.; Chen, Y.; Cao, W.; Yu, R.; Rong, P.; Cheng, J. Spatial pattern and its influencing factors of national-level cultural heritage in China. Herit. Sci. 2024, 12, 384. [Google Scholar] [CrossRef]
  38. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [Google Scholar] [CrossRef]
  39. Shen, W.; Chen, Y.; Rong, P.; Li, J.; Yan, W.; Cheng, J. The spatial coupling and its influencing mechanism between rural human-habitat heritage and key rural tourism villages in China. Herit. Sci. 2025, 13, 79. [Google Scholar] [CrossRef]
  40. Dai, R.; Xiao, C.; Liang, X.; Yang, W.; Chen, J.; Zhang, L.; Zhang, J.; Yao, J.; Jiang, Y.; Wang, W. Spatial-temporal evolution law analysis of resource and environment carrying capacity based on game theory combination weighting and GMD-GRA-TOPSIS model. Evidence from 18 cities in Henan Province, China. J. Clean. Prod. 2024, 439, 140820. [Google Scholar]
  41. Baram, Y. A geometric approach to consistent classification. Pattern Recognit. 2000, 33, 177–184. [Google Scholar] [CrossRef]
  42. Shen, W.; Chen, Y.; Rong, P.; Cao, W.; Yu, R.; Wang, P.; Cheng, J. Ecotourism suitability at county scale in China: Spatial pattern, obstacle factors, and driving factors. Ecol. Indic. 2025, 178, 113911. [Google Scholar] [CrossRef]
  43. Ketchum, D.; Jencso, K.; Maneta, M.P.; Melton, F.; Jones, M.O.; Huntington, J. IrrMapper: A Machine Learning Approach for High Resolution Mapping of Irrigated Agriculture Across the Western US. Remote Sens. 2020, 12, 2328. [Google Scholar] [CrossRef]
  44. Cao, R.; Fang, L.; Lu, T.; He, N. Self-Attention-Based Deep Feature Fusion for Remote Sensing Scene Classification. IEEE Geosci. Remote Sens. Lett. 2020, 18, 43–47. [Google Scholar] [CrossRef]
  45. Sheykhmousa, M.; Mahdianpari, M.; Ghanbari, H.; Mohammadimanesh, F.; Ghamisi, P.; Homayouni, S. Support Vector Machine Versus Random Forest for Remote Sensing Image Classification: A Meta-Analysis and Systematic Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6308–6325. [Google Scholar] [CrossRef]
  46. García-Balboa, J.L.; Alba-Fernández, M.V.; Ariza-López, F.J.; Rodríguez-Avi, J. Analysis of Thematic Similarity Using Confusion Matrices. ISPRS Int. J. Geo-Inf. 2018, 7, 233. [Google Scholar] [CrossRef]
  47. Radhika, K.; Varadarajan, S. Multi class classification of satellite images. In Proceedings of the IEEE International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), Bengaluru, India, 21–23 February 2017. [Google Scholar]
  48. Shen, W.; Yang, L.; Qin, Y. Research on the influencing factors and multi-scale regulatory pathway of ecosystem health: A case study in the Middle Reaches of the Yellow River, China. J. Clean. Prod. 2023, 406, 137038. [Google Scholar] [CrossRef]
  49. Ge, G.; Shi, Z.; Zhu, Y.; Yang, X.; Hao, Y. Land use/cover classification in an arid desert-oasis mosaic landscape of China using remote sensed imagery: Performance assessment of four machine learning algorithms. Glob. Ecol. Conserv. 2020, 22, e00971. [Google Scholar] [CrossRef]
  50. Li, G.; Cui, J.; Han, W.; Zhang, H.; Huang, S.; Chen, H.; Ao, J. Crop type mapping using time-series Sentinel-2 imagery and U-Net in early growth periods in the Hetao irrigation district in China. Comput. Electron. Agric. 2022, 203, 107478. [Google Scholar] [CrossRef]
  51. Pang, S.; Gao, L. Multihead attention mechanism guided ConvLSTM for pixel-level segmentation of ocean remote sensing images. Multimed. Tools Appl. 2022, 81, 24627–24643. [Google Scholar] [CrossRef]
  52. Zhang, C.; Su, J.; Ju, Y.; Lam, K.-M.; Wang, Q. Efficient Inductive Vision Transformer for Oriented Object Detection in Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–20. [Google Scholar] [CrossRef]
Figure 1. Location map of the study area: Jianping County, Liaoning Province, China.
Figure 1. Location map of the study area: Jianping County, Liaoning Province, China.
Hydrology 12 00214 g001
Figure 2. Spatial distribution of ground truth samples for model training and validation across the study area (2022–2024).
Figure 2. Spatial distribution of ground truth samples for model training and validation across the study area (2022–2024).
Hydrology 12 00214 g002
Figure 3. Number of sample sites in the study area in 2022–2024 (Left: training sample points from 2022 to 2024; Right: validation sample points from 2022 to 2024).
Figure 3. Number of sample sites in the study area in 2022–2024 (Left: training sample points from 2022 to 2024; Right: validation sample points from 2022 to 2024).
Hydrology 12 00214 g003
Figure 4. Integrated workflow of data processing, classification modeling, and irrigation mapping (2022–2024).
Figure 4. Integrated workflow of data processing, classification modeling, and irrigation mapping (2022–2024).
Hydrology 12 00214 g004
Figure 5. Random forest classification: ensemble learning workflow from bootstrap sampling to majority voting.
Figure 5. Random forest classification: ensemble learning workflow from bootstrap sampling to majority voting.
Hydrology 12 00214 g005
Figure 6. Feature importance ranking for land use classification based on random forest permutation importance.
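The ranking in Figure 6 is based on permutation importance, which scores each feature by how much classification accuracy drops when that feature's values are randomly shuffled. The sketch below illustrates the computation with scikit-learn's permutation_importance; the feature names and synthetic data are illustrative only.

```python
# Permutation importance sketch: shuffle one feature at a time and measure the
# accuracy drop of an already-fitted random forest (feature names are placeholders).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
feature_names = ["NDVI", "SAVI", "TVDI", "VV", "VH", "Slope"]  # illustrative only
X = rng.random((400, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 4] > 0.8).astype(int)  # synthetic labels

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
result = permutation_importance(rf, X, y, n_repeats=20, random_state=0)

# Print features from most to least important (mean accuracy drop over repeats).
for idx in result.importances_mean.argsort()[::-1]:
    print(f"{feature_names[idx]:>6s}: {result.importances_mean[idx]:.3f}")
```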
Figure 7. Evaluation results of optimal segmentation scale determination using the ESP2 algorithm for object-based image analysis in the study area.
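The ESP2 evaluation in Figure 7 identifies candidate segmentation scales from peaks in the rate of change of local variance (ROC-LV) across increasing scale parameters. The sketch below illustrates the ROC-LV criterion only; the local-variance values are placeholders standing in for object-level variance produced by segmentation at each scale, not output of the ESP2 tool itself.

```python
# Sketch of the ROC-LV criterion used in ESP2-style scale evaluation:
# ROC-LV(i) = (LV_i - LV_(i-1)) / LV_(i-1) * 100; local peaks suggest candidate scales.
# The local-variance values below are placeholders, not real segmentation output.
scales = list(range(10, 210, 10))
local_variance = [5.1, 6.0, 6.4, 7.5, 7.6, 7.7, 8.9, 9.0, 9.1, 9.2,
                  10.8, 10.9, 11.0, 11.1, 12.9, 13.0, 13.1, 13.2, 13.3, 13.4]

roc_lv = [None]  # undefined for the first scale
for i in range(1, len(local_variance)):
    roc_lv.append((local_variance[i] - local_variance[i - 1]) / local_variance[i - 1] * 100)

# A simple local-peak test flags candidate segmentation scales.
candidates = [scales[i] for i in range(2, len(roc_lv) - 1)
              if roc_lv[i] > roc_lv[i - 1] and roc_lv[i] > roc_lv[i + 1]]
print("Candidate scales:", candidates)
```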
Figure 8. Time series vegetation index curves (NDVI, SAVI, TVDI) before and after S-G filtering from March to October 2024.
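Figures 8 and 9 compare the time series before and after Savitzky–Golay filtering. A minimal smoothing sketch using scipy.signal.savgol_filter is given below; the synthetic NDVI-like series, window length, and polynomial order are illustrative rather than the exact settings used in this study.

```python
# Savitzky-Golay smoothing sketch for a noisy NDVI-like time series.
# Window length and polynomial order are illustrative, not the study's tuned values.
import numpy as np
from scipy.signal import savgol_filter

t = np.linspace(0, 1, 30)                           # ~March-October acquisition dates
ndvi_raw = 0.2 + 0.5 * np.sin(np.pi * t) \
           + np.random.default_rng(1).normal(0, 0.05, t.size)

# Window length must be odd and larger than the polynomial order.
ndvi_smooth = savgol_filter(ndvi_raw, window_length=7, polyorder=3)

print(np.round(ndvi_smooth[:5], 3))
```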
Figure 9. Time series backscattering coefficient curves (VV, VH) before and after S-G filtering from March to October 2024.
Figure 10. Multi-year land use classification maps for 2022, 2023, and 2024 based on object-oriented classification of multi-source remote sensing imagery.
Figure 11. Overall and class-wise accuracy assessment of cropland identification results from 2022 to 2024 based on ground validation samples.
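The accuracy values reported in Figure 11 (and later in Figure 14) follow directly from confusion matrices of the validation samples. The sketch below shows how overall accuracy, producer's and user's accuracy, and the Kappa coefficient are derived from such a matrix; the matrix entries are placeholders, not the study's validation counts.

```python
# Accuracy metrics from a confusion matrix (rows = reference, columns = predicted).
# The matrix values are placeholders, not the study's validation counts.
import numpy as np

cm = np.array([[120, 10],
               [ 15, 155]])

total = cm.sum()
overall_accuracy = np.trace(cm) / total
producers_accuracy = np.diag(cm) / cm.sum(axis=1)   # per reference class (omission error)
users_accuracy = np.diag(cm) / cm.sum(axis=0)       # per predicted class (commission error)

# Kappa = (p_o - p_e) / (1 - p_e), where p_e is the chance agreement.
p_o = overall_accuracy
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
kappa = (p_o - p_e) / (1 - p_e)

print(overall_accuracy, producers_accuracy, users_accuracy, kappa)
```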
Figure 12. Spatial distribution maps of cropland extent derived from remote sensing classification in the study area from 2022 to 2024.
Figure 13. Annual distribution maps of irrigated areas in the study region from 2022 to 2024 based on classification of time-series remote sensing data.
Figure 14. Sunburst chart showing class-level accuracy metrics for irrigated-area identification from 2022 to 2024, including producer's and user's accuracy.
Table 1. Detailed inventory of multi-source remote sensing imagery used for land use and irrigation mapping in the study area from 2022 to 2024.
| Number | Time | Type | Number | Time | Type | Number | Time | Type |
|---|---|---|---|---|---|---|---|---|
| 1 | 2 April 2022 | Sentinel-1A | 25 | 25 July 2022 | Sentinel-2A | 49 | 13 July 2022 | MOD11A1 |
| 2 | 26 April 2022 | Sentinel-1A | 26 | 27 August 2022 | Sentinel-2A | 50 | 27 August 2022 | MOD11A1 |
| 3 | 8 May 2022 | Sentinel-1A | 27 | 26 September 2022 | Sentinel-2A | 51 | 26 September 2022 | MOD11A1 |
| 4 | 20 May 2022 | Sentinel-1A | 28 | 21 October 2022 | Sentinel-2A | 52 | 21 October 2022 | MOD11A1 |
| 5 | 1 June 2022 | Sentinel-1A | 29 | 24 April 2023 | Sentinel-2A | 53 | 24 April 2023 | MOD11A1 |
| 6 | 19 July 2022 | Sentinel-1A | 30 | 14 May 2023 | Sentinel-2A | 54 | 14 May 2023 | MOD11A1 |
| 7 | 31 July 2022 | Sentinel-1A | 31 | 3 June 2023 | Sentinel-2A | 55 | 3 June 2023 | MOD11A1 |
| 8 | 9 April 2023 | Sentinel-1A | 32 | 3 July 2023 | Sentinel-2A | 56 | 3 July 2023 | MOD11A1 |
| 9 | 21 April 2023 | Sentinel-1A | 33 | 22 August 2023 | Sentinel-2A | 57 | 22 August 2023 | MOD11A1 |
| 10 | 15 May 2023 | Sentinel-1A | 34 | 21 September 2023 | Sentinel-2A | 58 | 21 September 2023 | MOD11A1 |
| 11 | 27 May 2023 | Sentinel-1A | 35 | 21 October 2023 | Sentinel-2A | 59 | 21 October 2023 | MOD11A1 |
| 12 | 7 August 2023 | Sentinel-1A | 36 | 9 March 2024 | Sentinel-2A | 60 | 9 March 2024 | MOD11A1 |
| 13 | 31 August 2023 | Sentinel-1A | 37 | 29 March 2024 | Sentinel-2A | 61 | 29 March 2024 | MOD11A1 |
| 14 | 24 September 2023 | Sentinel-1A | 38 | 18 April 2024 | Sentinel-2A | 62 | 18 April 2024 | MOD11A1 |
| 15 | 18 October 2023 | Sentinel-1A | 39 | 8 May 2024 | Sentinel-2A | 63 | 17 June 2024 | MOD11A1 |
| 16 | 21 May 2024 | Sentinel-1A | 40 | 17 June 2024 | Sentinel-2A | 64 | 17 July 2024 | MOD11A1 |
| 17 | 2 June 2024 | Sentinel-1A | 41 | 17 July 2024 | Sentinel-2A | 65 | 6 August 2024 | MOD11A1 |
| 18 | 13 August 2024 | Sentinel-1A | 42 | 6 August 2024 | Sentinel-2A | 66 | 5 September 2024 | MOD11A1 |
| 19 | 6 September 2024 | Sentinel-1A | 43 | 5 September 2024 | Sentinel-2A | 67 | 5 October 2024 | MOD11A1 |
| 20 | 12 October 2024 | Sentinel-1A | 44 | 5 October 2024 | Sentinel-2A | 68 | 15 October 2024 | MOD11A1 |
| 21 | 19 April 2022 | Sentinel-2A | 45 | 15 October 2024 | Sentinel-2A | | | |
| 22 | 9 May 2022 | Sentinel-2A | 46 | 19 April 2022 | MOD11A1 | | | |
| 23 | 23 June 2022 | Sentinel-2A | 47 | 9 May 2022 | MOD11A1 | | | |
| 24 | 13 July 2022 | Sentinel-2A | 48 | 23 June 2022 | MOD11A1 | | | |
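As a rough illustration of how such a multi-sensor inventory can be assembled, the sketch below filters the public Sentinel-1 GRD, Sentinel-2 surface-reflectance, and MOD11A1 catalogs in the Google Earth Engine Python API. The bounding box, date range, and cloud threshold are placeholders, and the collection IDs reflect the current public catalog names rather than the exact assets used by the authors.

```python
# Sketch of assembling a multi-sensor image inventory in Google Earth Engine (Python API).
# Region geometry, date range, and cloud threshold are placeholders.
import ee

ee.Initialize()

region = ee.Geometry.Rectangle([119.1, 41.2, 120.1, 42.3])  # rough bounding box, illustrative
start, end = "2022-03-01", "2022-10-31"

s1 = (ee.ImageCollection("COPERNICUS/S1_GRD")
      .filterBounds(region).filterDate(start, end)
      .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VV"))
      .filter(ee.Filter.listContains("transmitterReceiverPolarisation", "VH")))

s2 = (ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
      .filterBounds(region).filterDate(start, end)
      .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 10)))

lst = ee.ImageCollection("MODIS/061/MOD11A1").filterBounds(region).filterDate(start, end)

print("Sentinel-1 scenes:", s1.size().getInfo())
print("Sentinel-2 scenes:", s2.size().getInfo())
print("MOD11A1 images:", lst.size().getInfo())
```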
Table 2. Classification criteria and land cover category definitions used in the land use mapping process.
| Type | Definition | Interpretation Keys |
|---|---|---|
| Cultivated land | Land used for growing crops, including paddy fields, dry fields, vegetable gardens, etc. | Green during the growing season and soil brown in the non-growing season; texture of regular fields, with irregular dry-land boundaries; contiguous plots with obvious geometric shapes, mostly rectangular or strip-shaped; mainly distributed in flat areas, with strip-shaped plots along mountain edges. |
| Other vegetation | Forest land with trees, shrubs, bamboo, etc., and grassland dominated by herbaceous plants. | Forest is mostly dark green, with the shadows of tall trees clearly visible, and has a rough, grainy texture of intertwined tree crowns; grassland is light green or yellow-green, with a uniform, fine texture and no obvious boundaries; mainly distributed in mountainous, sloping, and hilly areas. |
| Construction land | Urban and rural residential areas and the industrial, mining, transportation, and other built-up land outside them, including urban land, rural residential land, and industrial, mining, and transportation construction land. | Spectrum mostly grayish white or light gray; dense, regular texture; obvious geometric features, with roads appearing as lines and residential areas as blocks. |
| Waters | Naturally formed water bodies and water conservancy facilities, including rivers, lakes, reservoirs, canals, etc. | Spectrum mostly dark blue or black; rivers are naturally curved, while reservoirs and ponds have regular shapes. |
| Unused land | Land that is currently difficult to use, including bare land, sandy land, bare rock, and other unused land types. | Sandy land appears bright white and bare soil grayish brown; uniform texture, with clear boundaries against other land types. |
Table 3. Selected features and optimization results for improving supervised classification accuracy.
| Feature Group | Features Before Optimization | Number | Features After Optimization | Number | Data Source | Purpose |
|---|---|---|---|---|---|---|
| Spectral features | B2, B3, B4, B5, B8, B11, B12, Brightness | 8 | B8, B4, B2, B12, B3, B5, B11, Brightness | 8 | Sentinel-2 | Land classification, irrigation identification |
| Vegetation index features | NDVI, Enhanced Vegetation Index (EVI), SAVI, Normalized Difference Water Index (NDWI), Modified Normalized Difference Water Index (MNDWI), Normalized Difference Built-up Index (NDBI), Normalized Difference Snow Index (NDSI), Land Surface Water Index (LSWI) | 8 | NDWI, NDVI, SAVI, EVI | 4 | Sentinel-2 | Land classification |
| Shape features | Area, Border length, Length, Width, Length/Width, Shape index, Density | 7 | Area, Density | 2 | Sentinel-2 | Land classification, irrigation identification |
| Texture features | GLCM Mean, GLCM Std Dev, GLCM Homogeneity, GLCM Dissimilarity, GLCM Contrast, GLCM Correlation | 6 | GLCM Homogeneity, GLCM Dissimilarity, GLCM Contrast, GLCM Correlation | 4 | Sentinel-2 | Land classification, irrigation identification |
| Terrain features | DEM, Slope, Aspect | 3 | Slope, DEM | 2 | GEE | Land classification, irrigation identification |
| Time-series vegetation indices | – | – | NDVI, SAVI, TVDI | 3 | Sentinel-2, MODIS | Irrigation identification |
| Time-series backscattering coefficients | – | – | Vertical–Vertical (VV) polarization, Vertical–Horizontal (VH) polarization | 2 | Sentinel-1 | Irrigation identification |
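Several of the vegetation indices in Table 3 are simple band combinations of Sentinel-2 reflectance. The sketch below computes NDVI and SAVI from the red (B4) and near-infrared (B8) bands; the reflectance arrays are placeholders, and L = 0.5 is the commonly used soil-adjustment factor rather than a value specified in this study.

```python
# NDVI and SAVI from Sentinel-2 red (B4) and near-infrared (B8) reflectance.
# Reflectance arrays are placeholders; L = 0.5 is the usual soil-adjustment factor.
import numpy as np

red = np.array([0.08, 0.10, 0.12])   # B4 surface reflectance (placeholder)
nir = np.array([0.35, 0.30, 0.20])   # B8 surface reflectance (placeholder)

ndvi = (nir - red) / (nir + red)

L = 0.5
savi = (1 + L) * (nir - red) / (nir + red + L)

print("NDVI:", np.round(ndvi, 3))
print("SAVI:", np.round(savi, 3))
```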
Table 4. Geographic coordinates and category labels of typical sampling and validation points used for model training and accuracy assessment.
| Typical Sample Point Name | Coordinates | Typical Sample Point Name | Coordinates |
|---|---|---|---|
| Irrigation point 1 | 42°0′2.430″ N, 119°24′18.954″ E | Non-irrigation point 1 | 41°57′25.268″ N, 119°41′57.170″ E |
| Irrigation point 2 | 41°59′41.636″ N, 119°33′28.386″ E | Non-irrigation point 2 | 41°39′4.499″ N, 119°27′15.124″ E |
| Irrigation point 3 | 41°53′54.258″ N, 119°31′0.653″ E | Non-irrigation point 3 | 41°40′0.214″ N, 119°19′45.660″ E |
| Irrigation point 4 | 41°37′19.873″ N, 119°59′26.005″ E | Non-irrigation point 4 | 41°29′7.471″ N, 119°28′48.427″ E |
| Irrigation point 5 | 41°34′11.834″ N, 119°47′29.987″ E | Non-irrigation point 5 | 42°12′1.019″ N, 119°22′37.798″ E |
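The coordinates in Table 4 are given in degrees, minutes, and seconds. The small helper below converts them to decimal degrees, for example for plotting or distance calculations; the function name and the example values are illustrative, not part of the study's processing chain.

```python
# Convert a degrees-minutes-seconds coordinate to decimal degrees.
# Hypothetical helper for working with the Table 4 sample points.
def dms_to_decimal(degrees: float, minutes: float, seconds: float) -> float:
    return degrees + minutes / 60.0 + seconds / 3600.0

# Example: irrigation point 1 at 42°0'2.430" N, 119°24'18.954" E.
lat = dms_to_decimal(42, 0, 2.430)
lon = dms_to_decimal(119, 24, 18.954)
print(round(lat, 6), round(lon, 6))   # approx. 42.000675, 119.405265
```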
Table 5. Comparison between the cropland area calculated from classification results and official statistical records from 2022 to 2024.
| Year | Calculated Cropland Area (km²) | Statistics (km²) | Difference (km²) | Relative Error (%) |
|---|---|---|---|---|
| 2022 | 1672.88 | 1714.77 | −41.89 | 2.44 |
| 2023 | 1730.65 | 1750.63 | −19.98 | 1.14 |
| 2024 | 1719.04 | 1762.79 | −43.75 | 2.48 |
Table 6. Comparison of irrigated area estimates obtained from remote sensing classification and official statistics from 2022 to 2024.
| Year | Algorithm | Calculated Irrigated Area (km²) | Statistics (km²) | Difference (km²) | Relative Error (%) |
|---|---|---|---|---|---|
| 2022 | RF | 579.09 | 575.32 | 3.77 | 0.66 |
| 2022 | SVM | 596.63 | 575.32 | 21.31 | 3.70 |
| 2022 | KNN | 606.05 | 575.32 | 30.73 | 5.34 |
| 2023 | RF | 580.61 | 576.07 | 5.29 | 0.92 |
| 2023 | SVM | 594.02 | 576.07 | 18.70 | 3.25 |
| 2023 | KNN | 599.15 | 576.07 | 23.83 | 4.14 |
| 2024 | RF | 595.11 | 585.77 | 19.79 | 3.44 |
| 2024 | SVM | 614.49 | 585.77 | 39.17 | 6.81 |
| 2024 | KNN | 620.47 | 585.77 | 45.15 | 7.85 |
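The error percentages in Tables 5 and 6 are relative differences between the remote-sensing estimate and the statistical record. A one-line worked check using the 2022 RF row (579.09 km² vs. 575.32 km²) is shown below.

```python
# Relative error check for Table 6, 2022 RF row (values taken from the table).
calculated, statistic = 579.09, 575.32
difference = calculated - statistic                      # 3.77 km²
relative_error = abs(difference) / statistic * 100       # approx. 0.66 %
print(round(difference, 2), round(relative_error, 2))
```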
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
