Article

A Method for Paddy Field Extraction Based on NDVI Time-Series Characteristics: A Case Study of Bishan District

1
Chongqing Jinfo Mountain Karst Ecosystem National Observation and Research Station, School of Geographical Sciences, Southwest University, Chongqing 400715, China
2
Chongqing Engineering Research Center for Remote Sensing Big Data Application, School of Geographical Sciences, Southwest University, Chongqing 400715, China
3
Daotian Science and Technology Limited Company, Chongqing 400715, China
4
Chongqing Huadi Resource and Environment Technology Co., Ltd., Chongqing 401120, China
5
Chongqing Institute of Surveying and Mapping Science and Technology, Chongqing 401121, China
6
Engineering Research Center for Intelligent Urban Spatiotemporal Information and Equipment, Ministry of Natural Resources, Chongqing 401120, China
*
Author to whom correspondence should be addressed.
Agriculture 2025, 15(22), 2321; https://doi.org/10.3390/agriculture15222321
Submission received: 19 September 2025 / Revised: 4 November 2025 / Accepted: 5 November 2025 / Published: 7 November 2025
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

Abstract

Rice, as one of the world’s three major staple crops, provides a food source for nearly half of the global population. Timely and accurate acquisition of rice cultivation information is crucial for optimizing spatial distribution, guiding production practices, and safeguarding food security. Taking Bishan District of Chongqing as the study area, NDVI values were derived from Sentinel-2 satellite imagery to construct standard NDVI time-series curves for typical land-cover types, including paddy fields, dryland, water bodies, construction land, and forest and grassland. These curves were then used in the NDVI time-series characteristics method to identify paddy fields. First, the Euclidean distance between the standard NDVI time series of paddy fields and those of other land-cover types was calculated. The sum of these element-wise differences was used to determine the upper threshold for paddy field extraction. Second, the mean absolute deviation between elements of the rice sample dataset and the standard NDVI time series was calculated for each time step. The sum of these average deviations was used as the lower threshold to extract the initial paddy field data. On this basis, an extreme-value constraint was introduced to reduce the interference of mixed pixels from forest and grassland and construction land, effectively eliminating anomalous pixels and improving the accuracy of paddy field identification. Finally, the results were validated and compared with those from other extraction methods. The results indicate that: (1) Paddy fields exhibit distinct NDVI time-series characteristics throughout the entire growing season, which can serve as a reference standard. By calculating the Euclidean distance between the NDVI curves of other land-cover types and those of paddy fields, similarity can be quantified, enabling rice identification. 
(2) The extraction method based on NDVI time-series characteristics successfully identified paddy fields through the appropriate setting of thresholds. The overall accuracy and Kappa coefficient remained high, while the F1-score consistently exceeded 0.8, indicating a good balance between precision and recall. Furthermore, the bootstrap uncertainty analysis revealed narrow 95% confidence intervals across all metrics, confirming the robustness and statistical reliability of the results. Overall, the proposed method demonstrated excellent performance in paddy field classification and significantly outperformed traditional machine learning methods implemented on the GEE platform. (3) Mixed pixels considerably affected the accuracy of rice classification; however, the introduction of the extreme-value constraint effectively mitigated this influence and further improved classification results.

1. Introduction

As one of the three major staple crops globally, rice constitutes the primary food source for nearly half of the world’s population [1,2]. Timely and accurate acquisition of rice cultivation information is essential for optimizing spatial distribution, guiding production practices, and ensuring food security. Traditionally, rice cultivation has been monitored through field-based surveys involving tools such as tape measures, on-site calculations using PDAs, and hierarchical reporting from village-level authorities. These approaches are typically labor-intensive, time-consuming, and prone to human bias and subjectivity [3,4]. As remote sensing technology has advanced, it has become a vital tool for monitoring rice and other crops owing to its wide spatial coverage, high timeliness, and low cost [5]. Currently, remote sensing-based methods for paddy field extraction can be broadly classified into two main categories [6]. The first category involves machine learning-based classification methods, which model spectral, textural, and other image features to achieve paddy field identification [7,8,9,10,11]. Traditional machine learning approaches include both supervised and unsupervised classification. Unsupervised classifiers, such as K-means, Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA), Hierarchical Cluster Analysis (HCA), and Time-Weighted Dynamic Time Warping (TWDTW), are often applied when high-quality training samples are unavailable. For instance, Fatchurrachman et al. (2022) used Sentinel-1/2 time-series data and K-means clustering, integrating backscattering characteristics during transplanting and NDVI variations during the growing season, to successfully extract paddy field distribution and planting calendars [12]. However, the lack of supervision makes unsupervised results sensitive to inter-class spectral similarities and parameter settings, often leading to confusion between different crops.
Therefore, supervised classifiers are used more extensively than unsupervised classifiers in rice mapping. For example, Liu (2023) combined the Google Earth Engine (GEE) platform with field investigations; after feature selection, multiple machine learning algorithms were used to build classification models for cereal crop mapping in Heilongjiang Province [13]. In recent years, deep learning methods (e.g., CNN, RNN, LSTM) have developed rapidly in paddy field identification, enabling full utilization of spatiotemporal features and showing greater robustness in complex environments. For instance, Thorp et al. (2021) combined synthetic aperture radar (SAR) and optical data from the Sentinel-1 and Sentinel-2 satellites with field survey data. By comparing multiple machine learning models, they found that a recurrent neural network (RNN) based on LSTM achieved the best performance in extracting paddy fields [14]. The second category identifies paddy fields based on phenological characteristics, focusing on the unique vegetation index changes observed during rice growth stages [15,16,17]. Shao (2023) extracted paddy fields in the Jianghan Plain using multiple vegetation indices derived from MODIS time-series data, combined with phenological index variations, and applied manually set thresholds [18]. Fan et al. (2023) reconstructed complete rice growth-stage image sequences on the GEE platform in Hunan Province, and classified different rice types by identifying optimal thresholds through sensitivity analysis of NDVI time-series differences among early, middle, and late rice [19].
Differing from the above approaches, Tian et al. (2024) developed a harvest-phase model that utilized the significant NDVI difference before and after the harvest period to establish thresholds for paddy field extraction [20]. This method, which calculates vegetation indices only from remote sensing images before and after specific phenological periods, overlooks dryland crops with similar temporal patterns, potentially causing spectral confusion among land-cover types and thus reducing classification accuracy.
In summary, in the first category, supervised classification methods are widely used in paddy field extraction but have certain limitations. For example, their accuracy depends on the availability of ground samples [21]. In fragmented landscapes with complex surface types, the difficulty of sample collection and insufficient training data can reduce the accuracy of classification models [22]. Moreover, although ensemble algorithms such as Random Forest are more robust to noise and outliers, their performance may still be affected by factors such as label bias and class imbalance [23]. For deep learning methods, acquiring a sufficient number of labeled samples is a significant challenge, especially at large scales. The scarcity of labeled samples can lead to model overfitting, which in turn diminishes the model’s spatio-temporal generalization capabilities. Furthermore, the complexity of hyperparameter tuning and weak interpretability are also challenges that need to be overcome. In the second category, the basic idea is to distinguish paddy fields from non-paddy areas by setting thresholds based on NDVI or other vegetation index variations during key growth stages. However, these methods are susceptible to adverse factors such as rainfall, clouds, or pseudo-flooding signals caused by spring snowmelt. In the absence of effective observations at critical phenological phases, the performance of phenology-based methods can be significantly degraded. At the same time, these methods often overlook the fact that other land-cover types (e.g., some dryland crops) may show similar vegetation index variations during these stages, leading to confusion among classes [24].
To address these issues, this study takes Bishan District, Chongqing, as a case study and proposes a paddy field extraction method that integrates NDVI time-series characteristics with an extreme-value constraint. Building on NDVI time-series similarity, the method sets a threshold interval and then applies iterative validation within that interval to determine a unique optimal threshold. Combined with the extreme-value constraint, it effectively removes mixed pixels and improves classification accuracy. Compared with DTW and similar methods, this study uses Euclidean distance on equal-length time series, making the approach simpler and more efficient. Unlike machine learning methods, it does not rely on large training datasets, making it more suitable for fragmented landscapes with complex land-cover types and limited sample availability. In comparative experiments with commonly used machine learning methods on the GEE platform, the proposed method performed better, confirming its advantages in complex environments. Finally, the method was validated using multi-year imagery (2020–2024), further demonstrating its stability and applicability.

2. Materials and Methodology

2.1. Study Area

The study area is Bishan District, Chongqing (Figure 1), situated in the southeastern arc-shaped tectonic zone of the Sichuan Basin, between the Wentangxia and Libixia anticlines of the Huaying Mountain compound structure. It extends from 106°02′ to 106°21′ E and from 29°17′ to 29°53′ N, covering a total area of 915 km². Bishan occupies a favorable geographical location: bounded by Shapingba District to the east, Jiangjin District to the south, and Yongchuan District to the west, and adjoining Beibei, Tongliang, and Hechuan districts to the north, it is often referred to as “the western gateway of Chongqing.” The eastern part of the district is bounded by Jinyun Mountain, while Bayue Mountain and Xishan form natural barriers to the west, creating a distinctive landform pattern of “two mountains enclosing one trough.” Based on elevation and topographic features, the area can be divided into low mountains, hills, and wide valleys. The wide valley zones are mainly distributed along the Bibei, Binan, and Meijiang river basins, accounting for about 38.8% of the total area. The region has a subtropical humid monsoon climate, characterized by distinct seasons and mild conditions, with a mean annual temperature of 18 °C, abundant precipitation, and sufficient sunshine [25]. The dominant soil types include yellow soil, purple soil, meadow soil, paddy soil, and red soil, all of which are relatively fertile and suitable for cultivating a wide variety of crops.
Rice is the dominant crop in Bishan’s agricultural system, with a cultivated area far exceeding that of dryland crops such as maize and legumes. The local system is dominated by single-cropping rice: seedlings are raised in early April, transplanted in mid-to-late April, enter vigorous growth in July–August, and are harvested by the end of October. This complete growth cycle of single-season rice corresponds precisely to April–October. Therefore, Sentinel-2 imagery from this period was selected for NDVI time-series analysis to capture the full phenological characteristics of paddy fields. In contrast, dryland crops such as maize are usually transplanted in late March to early April and harvested in late July; spring soybean is sown in early May and harvested in mid-to-late July; while summer soybean is sown from late May to early June and harvested in late September. Compared with rice, these dryland crops show differences in growth periods and NDVI peak characteristics, providing a basis for distinguishing paddy from non-paddy fields. Thus, the cropping structure and rice cultivation practices in Bishan District provide favorable experimental conditions for this study.
Sentinel-2 imagery of Bishan District in western Chongqing offers complete coverage, with less cloud and fog interference than the cloudier, rainier southwestern regions, ensuring the continuity and reliability of multi-temporal NDVI curves. Together with its typical “two mountains enclosing one trough” landform and the central role of rice in local agriculture, Bishan was therefore selected as the study area. In addition, the fragmented farmland and complex cropping patterns in Bishan make it possible to test the applicability and robustness of the proposed method under complex conditions.

2.2. Data Source and Pre-Processing

2.2.1. Sample Points Data

This study used historical imagery from Google Earth Pro and Sentinel-2 data as reference sources, combined with the spectral characteristics of paddy fields in Sentinel-2 imagery, to construct a sample dataset for the period 2020–2024. According to the Current Land Use Classification (GB/T 21010-2017) [26], the separability of remote sensing imagery, and the actual land distribution in the study area, land-cover types in Bishan were classified into five categories: paddy fields, dryland, water bodies, construction land, and forest and grassland. During sample selection, although only 28 paddy field samples were collected, they were distributed across the main rice-growing areas of the district, covering representative paddy plots at different elevations and slopes. This ensured that the samples adequately captured the spatial heterogeneity of paddy fields within the study area. To improve sample representativeness, multi-temporal imagery over a five-year period was comprehensively considered, with priority given to paddy pixels that exhibited unimodal phenological variations in their NDVI time-series curves. Correspondingly, 26 dryland, 32 water body, 24 construction land, and 34 forest and grassland samples were selected, all evenly distributed across the study area.
The sample design emphasized spatial balance and phenological representativeness to ensure high spatiotemporal representativeness despite the limited number of samples. Validation with multi-temporal imagery showed that these samples effectively captured the spectral and temporal characteristics of different land-cover types, providing a solid foundation for subsequent NDVI time-series analysis.

2.2.2. Sentinel-2 Data

The remote sensing data used in this study were Sentinel-2 Level-2A surface reflectance products, obtained via the GEE platform. Sentinel-2 is a key component of the Copernicus Programme implemented by the European Space Agency (ESA), consisting of two satellites: Sentinel-2A and Sentinel-2B. The two satellites are phased 180° apart in orbit, providing a 10-day revisit cycle individually and a combined revisit frequency of 5 days, which enhances the efficiency and continuity of data acquisition. Sentinel-2 is designed to provide high-resolution optical remote sensing imagery. Its onboard multispectral instrument covers 13 spectral bands, with the blue, green, red, and near-infrared bands offering high spatial resolution of 10 m, which facilitates the monitoring of vegetation growth and land-cover classification [27]. The remaining bands have spatial resolutions of 20 m or 60 m. This study focuses on extracting paddy fields in Bishan District based on the NDVI time-series characteristics of paddy fields, which are more effectively captured using data with short and continuous temporal intervals. Moreover, due to the fragmented land parcels and complex surface types in Bishan District, both spatial and temporal resolution were carefully considered in selecting the data source. Sentinel-2 provides globally available, high-resolution, and frequently updated remote sensing data free of charge, making it a suitable choice for this study.

2.2.3. Data Preprocessing

As the study area is a typical single-cropping rice region, where rice seedlings are generally raised from early April and the crop is harvested by the end of October, Sentinel-2 imagery from April to October for each year from 2020 to 2024 was selected on the GEE platform. The imagery was processed at 10-day intervals for cloud removal, mean compositing, and clipping to obtain datasets covering the full growth cycle of rice and to ensure the continuity of NDVI time-series curves.
Due to frequent cloud and rainfall conditions in the study area, the Scene Classification Layer (SCL)-based cloud masking method provided by GEE was employed to maintain the integrity of the NDVI time series for monitoring crop growth. The SCL band, unique to Sentinel-2 Level-2A products and generated through Sen2Cor preprocessing, assigns semantic labels to each pixel, classifying them into 12 land-cover types such as cloud, shadow, water, vegetation, and bare soil, with values ranging from 0 to 11. Low-quality pixels were excluded before analysis. Compared with traditional QA60-based cloud masking methods, which rely solely on binary cloud/non-cloud flags [28,29], the SCL-based approach offers more detailed semantic classification, thereby enhancing the accuracy of land-cover classification and vegetation index calculations. In this study, cloud masking was performed based on the Sentinel-2 SCL band by excluding pixels classified as 3 (Cloud shadows), 9 (Cloud high probability), and 10 (Thin cirrus) [30,31], while retaining all other classes to achieve cloud removal. The masking function is mathematically expressed as:
$$VP = SCL \notin \{3, 9, 10\} \tag{1}$$
In Equation (1), $VP$ represents the valid pixels retained for analysis. This approach maximizes the retention of surface information while effectively eliminating invalid pixels affected by clouds and atmospheric disturbances.
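As an illustration only, the masking rule in Equation (1) can be sketched with NumPy on small in-memory arrays. The function names and the toy arrays are hypothetical; the study itself applies the mask on the GEE platform, and this sketch does not reproduce that workflow.

```python
import numpy as np

# SCL classes excluded in the text: 3 (cloud shadows),
# 9 (cloud high probability), and 10 (thin cirrus).
EXCLUDED_SCL = (3, 9, 10)

def valid_pixel_mask(scl):
    """Return True where a pixel is retained (its SCL class is not excluded)."""
    return ~np.isin(scl, EXCLUDED_SCL)

def apply_mask(band, scl):
    """Replace excluded pixels with NaN so later compositing can skip them."""
    return np.where(valid_pixel_mask(scl), band.astype(float), np.nan)

# Toy 2 x 2 example (values are made up for illustration).
scl = np.array([[4, 3], [9, 5]])
red = np.array([[0.05, 0.20], [0.30, 0.06]])
masked = apply_mask(red, scl)
```

Marking excluded pixels as NaN, rather than dropping them, keeps the array shape intact so that the later gap-filling step can locate the holes.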
After cloud removal, large gaps appeared in the imagery, which reduced the accuracy of paddy field extraction. To address this issue, discarded cloud-contaminated pixels were first replaced with cloud-free pixels from the same period in the previous five years [32]. If no valid substitutes were available within this five-year window, a 3 × 3 neighborhood of valid pixels from the same year and time phase was further used for gap-filling.
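The two-step gap-filling described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: earlier years are searched from most recent to oldest, and the 3 × 3 neighborhood fill uses the mean of the valid neighbors. The text does not specify either detail, and the function names are hypothetical.

```python
import numpy as np

def fill_from_previous_years(current, previous):
    """Fill NaN pixels with the first valid value found in earlier years.

    current:  2-D float array for one 10-day period, NaN where masked.
    previous: list of 2-D arrays for the same period in earlier years,
              ordered from most recent to oldest (an assumption).
    """
    filled = current.copy()
    for prev in previous:
        gaps = np.isnan(filled)
        filled[gaps] = prev[gaps]  # stays NaN if that year is also masked
    return filled

def fill_from_neighbourhood(image):
    """Fill remaining NaN pixels with the mean of valid pixels in their 3 x 3 window."""
    padded = np.pad(image, 1, constant_values=np.nan)
    filled = image.copy()
    for r, c in zip(*np.where(np.isnan(image))):
        window = padded[r:r + 3, c:c + 3]
        if np.any(~np.isnan(window)):
            filled[r, c] = np.nanmean(window)
    return filled

# Toy example: two gaps, only one of which has a valid earlier-year value.
current = np.full((3, 3), 0.6)
current[0, 0] = np.nan
current[1, 1] = np.nan
earlier = np.full((3, 3), np.nan)
earlier[0, 0] = 0.4

step1 = fill_from_previous_years(current, [earlier])
step2 = fill_from_neighbourhood(step1)
```

In this toy case the corner pixel is filled from the earlier year, and the center pixel, which has no temporal substitute, falls back to its spatial neighbors.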

2.3. Methodology

Figure 2 illustrates the workflow of the proposed method in this study.

2.3.1. Construction of Standard NDVI Time Series for Major Land-Cover Types

Owing to the red-edge effect of vegetation [33,34], vegetation indices derived from the red and near-infrared (NIR) spectral bands provide an effective means of characterizing vegetation growth conditions. Among these, the Normalized Difference Vegetation Index (NDVI) is the most widely applied in remote sensing classification [35], and is regarded as a key indicator of vegetation vigor and canopy density. Constructing an NDVI time series allows for the effective monitoring of vegetation dynamics across temporal and spatial scales, thereby enabling the extraction of phenological characteristics and related information. The NDVI is calculated using the following formula:
$$NDVI = \frac{NIR - R}{NIR + R} \tag{2}$$
where NIR and R denote the reflectance values of the near-infrared and red bands, respectively.
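A minimal NumPy sketch of the NDVI formula is given below. The function name is hypothetical, and returning NaN where the denominator is zero is an assumption made here to avoid division errors. For Sentinel-2, the 10 m bands B8 (near-infrared) and B4 (red) would be the natural inputs.

```python
import numpy as np

def ndvi(nir, red):
    """Compute (NIR - R) / (NIR + R); pixels with a zero denominator become NaN."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Toy reflectance values (made up): dense vegetation, bare surface, no signal.
values = ndvi([0.5, 0.1, 0.0], [0.1, 0.1, 0.0])
```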
The construction of representative NDVI time series relies on the spectral purity of the selected pixels. In this study, spectrally pure pixels of paddy fields and other land-cover types were identified using field survey data combined with high-resolution Google Earth Pro imagery. These pixels were then used to construct the standard NDVI time series for rice [36].
To derive the spatial distribution of rice cultivation in the study area, the similarity among NDVI time series was calculated, and an appropriate threshold was subsequently applied. To quantify the similarity of NDVI time series, a standard rice NDVI curve was first generated by averaging the corresponding elements of the sample dataset [37].
$$S_{st}^{i} = \mathrm{mean}\left(S_{set}^{i}\right), \quad i \in \{1, 2, 3, \ldots, n\} \tag{3}$$
In the above equation, $S_{st}^{i}$ denotes the $i$th value of the standard NDVI time series for rice, while $S_{set}^{i}$ represents the collection of $i$th values from the sample time series. The variable $n$ is the length of the time series, and mean indicates the averaging function. To minimize the influence of spatial outliers, an outlier removal step was conducted before constructing the standard NDVI time series. This was achieved through a threshold-based method, in which the upper and lower bounds for each time step were defined as shown in Equations (4)–(6).
$$\mathrm{upthd}_{i} = \mathrm{pri\_75}(i) + 1.5 \times \mathrm{pri\_75\_25}(i) \tag{4}$$
$$\mathrm{lowthd}_{i} = \mathrm{pri\_25}(i) - 1.5 \times \mathrm{pri\_75\_25}(i) \tag{5}$$
$$\text{s.t.}\ \ i \in \{1, 2, 3, \ldots, n\} \tag{6}$$
In these equations, $\mathrm{upthd}_{i}$ and $\mathrm{lowthd}_{i}$ represent the upper and lower threshold limits for the $i$th value set in the sample dataset, respectively. $\mathrm{pri\_25}(i)$, $\mathrm{pri\_75}(i)$, and $\mathrm{pri\_75\_25}(i)$ denote the 25th percentile, 75th percentile, and the interquartile range (IQR) of the $i$th value set, respectively. The final corrected standard time series was obtained using the following filtering function:
$$S_{st}^{i} = \mathrm{mean}\left(S_{set}^{i}\right) \tag{7}$$
$$\text{s.t.}\ \ S_{set}^{i} < \mathrm{upthd}_{i}\ \ \text{and}\ \ S_{set}^{i} > \mathrm{lowthd}_{i}, \quad i \in \{1, 2, 3, \ldots, n\} \tag{8}$$
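The outlier filtering and averaging in Equations (3)–(8) can be sketched as below. This is an illustrative NumPy version, not the authors' implementation: the function name is hypothetical, NumPy's default linear percentile interpolation is assumed, and the bounds are treated as inclusive so that a time step with zero IQR keeps its samples (the equations in the text use strict inequalities).

```python
import numpy as np

def standard_curve(samples):
    """Average sample NDVI values per time step after IQR-based outlier removal.

    samples: array of shape (n_samples, n_steps).
    Returns an array of shape (n_steps,).
    """
    samples = np.asarray(samples, dtype=float)
    q25 = np.percentile(samples, 25, axis=0)
    q75 = np.percentile(samples, 75, axis=0)
    iqr = q75 - q25
    upper = q75 + 1.5 * iqr
    lower = q25 - 1.5 * iqr
    # Inclusive bounds are an assumption (see the note above).
    keep = (samples >= lower) & (samples <= upper)
    return np.nanmean(np.where(keep, samples, np.nan), axis=0)

# Toy data: five samples, two time steps; 0.9 is an obvious outlier.
samples = np.array([
    [0.20, 0.5],
    [0.21, 0.5],
    [0.19, 0.5],
    [0.20, 0.5],
    [0.90, 0.5],
])
curve = standard_curve(samples)
```

In the toy data the value 0.9 falls above the upper bound of the first time step and is excluded from the mean, while the second time step is unaffected.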

2.3.2. Time-Series Curve Similarity Measurement Based on Euclidean Distance

Euclidean distance does not account for temporal shifts when measuring the similarity of time-series curves, which may hinder the comparison of rice NDVI sequences across different periods within the study area [38]. In this study, however, remote sensing images were processed into ten-day mean composites, which effectively reduced the influence of temporal shifts and ensured that the constructed time series were of consistent length. Under the conditions of equal-length sequences, the simple Euclidean distance method showed a clear advantage.
The Minkowski distance is a commonly used metric for measuring the distance between vectors, and its general formula is:
$$D_{Mink} = \left( \sum_{t=1}^{N} \left| f_{t}^{p} - f_{t}^{q} \right|^{r} \right)^{\frac{1}{r}} \tag{9}$$
In this formula, $f_{t}^{p}$ and $f_{t}^{q}$ represent the values of the two time series at time $t$; $N$ is the number of time steps; and $r$ is a parameter that determines the distance metric. When $r = 1$, the Minkowski distance reduces to the Manhattan distance; when $r = 2$, it becomes the Euclidean distance, which is more sensitive to variations between time series. In this study, the Euclidean distance was employed to quantify the similarity between NDVI time series. For a given unknown pixel’s NDVI time series $S_{1} = (k_{1}, k_{2}, k_{3}, \ldots, k_{n})$ and the standard NDVI time series $S_{2} = (l_{1}, l_{2}, l_{3}, \ldots, l_{n})$, this study considers the influence of curve shape on similarity and thus avoids computing a single Euclidean distance for the entire sequence. Instead, it calculates the element-wise Euclidean distance $D_{euclid}(i)$ between corresponding time steps, which for a single pair of values is the absolute difference:
$$D_{euclid}(i) = \left| k_{i} - l_{i} \right| \tag{10}$$
$$\text{s.t.}\ \ i = 1, 2, 3, \ldots, n \tag{11}$$
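To make the relationship between the general Minkowski distance and the element-wise distance concrete, a small NumPy sketch follows. The function names and the example curves are hypothetical. Note that summing the element-wise distances gives the same value as the Minkowski distance with r = 1, whereas r = 2 gives the whole-sequence Euclidean distance.

```python
import numpy as np

def minkowski(p, q, r):
    """Minkowski distance between two equal-length series."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.abs(p - q) ** r) ** (1.0 / r))

def elementwise_distance(pixel, standard):
    """Per-time-step distance |k_i - l_i| between a pixel and the standard curve."""
    return np.abs(np.asarray(pixel, dtype=float) - np.asarray(standard, dtype=float))

# Made-up three-step curves for illustration.
pixel = [0.2, 0.5, 0.8]
standard = [0.3, 0.5, 0.6]

d = elementwise_distance(pixel, standard)
total = float(d.sum())
```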

2.3.3. Selection of Curve Similarity Threshold and Determination of Extreme-Value Constraint

For similarity-based land-cover classification methods, the selection of thresholds is critical to ensuring classification accuracy. Different land-cover types exhibit distinct NDVI values at different time periods. To address this, an innovative threshold-interval strategy was designed in this study to improve the reliability of paddy field extraction. Specifically, Euclidean distances were calculated between the corresponding elements of the standard NDVI time series of other land-cover types and that of paddy fields. The sums of these element-wise differences were then obtained, and the minimum sum was taken as the upper threshold, ensuring that paddy fields could be effectively distinguished from other land-cover types in terms of spectral–temporal characteristics.
$$Y_{up} = \min\left( D_{euclid}(1) + D_{euclid}(2) + D_{euclid}(3) + \cdots + D_{euclid}(n) \right) \tag{12}$$
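The upper threshold can be read as the smallest summed element-wise distance between the paddy curve and any other land-cover curve. The sketch below illustrates this with NumPy; the function name and all curve values are made up for illustration and are not the study's data.

```python
import numpy as np

def upper_threshold(paddy_curve, other_curves):
    """Return the smallest summed distance to the paddy curve and the type that produces it.

    other_curves: dict mapping a land-cover name to its standard NDVI curve.
    """
    paddy = np.asarray(paddy_curve, dtype=float)
    sums = {
        name: float(np.sum(np.abs(np.asarray(curve, dtype=float) - paddy)))
        for name, curve in other_curves.items()
    }
    closest = min(sums, key=sums.get)
    return sums[closest], closest

# Hypothetical four-step standard curves.
paddy = [0.2, 0.6, 0.8, 0.4]
others = {
    "dryland": [0.3, 0.5, 0.6, 0.5],
    "water": [0.0, 0.0, 0.0, 0.0],
    "forest_grassland": [0.8, 0.8, 0.8, 0.8],
}
y_up, closest_type = upper_threshold(paddy, others)
```

Returning the name of the closest type alongside the threshold is a convenience, since the extreme-value constraint later needs to know which land-cover type produced the upper threshold.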
To quantify the variability of an unknown pixel’s time series relative to the standard time series at each time point, the average absolute distance $D_{mean}(i)$ was calculated between the sample dataset and the standard time series, referred to as the mean deviation [39], as shown in Equation (13):
$$D_{mean}(i) = \mathrm{mean}\left( \mathrm{abs}\left( S_{set}^{i} - S_{st}^{i} \right) \right), \quad \text{s.t.}\ \ i \in \{1, 2, 3, \ldots, n\} \tag{13}$$
Here, abs denotes the absolute value operation. Considering the spatial heterogeneity and phenological differences in rice in terms of transplanting periods and regional environments, the sum of the average deviations of all elements was calculated as the lower threshold. By setting this lower threshold, the influence of spatiotemporal phenological variations can be effectively reduced, thereby mitigating the adverse impact of temporal shifts on classification accuracy. The calculation formula is as follows:
$$Y_{down} = D_{mean}(1) + D_{mean}(2) + D_{mean}(3) + \cdots + D_{mean}(n) \tag{14}$$
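The lower threshold can be sketched in the same way: the mean absolute deviation of the rice samples from the standard curve is computed per time step and then summed. The function name and the sample values below are hypothetical.

```python
import numpy as np

def lower_threshold(samples, standard):
    """Sum of per-time-step mean absolute deviations from the standard curve.

    samples:  array of shape (n_samples, n_steps) of rice sample NDVI values.
    standard: array of shape (n_steps,), the standard rice NDVI curve.
    Returns (lower threshold, per-step mean deviations).
    """
    samples = np.asarray(samples, dtype=float)
    standard = np.asarray(standard, dtype=float)
    d_mean = np.mean(np.abs(samples - standard), axis=0)
    return float(d_mean.sum()), d_mean

# Two made-up samples over two time steps; the standard curve is their mean.
rice_samples = np.array([[0.2, 0.6], [0.4, 0.8]])
rice_standard = rice_samples.mean(axis=0)
y_down, d_mean = lower_threshold(rice_samples, rice_standard)
```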
Thus, the threshold range for paddy field extraction can be determined.
During the threshold determination process, simply summing the multi-period NDVI differences is prone to interference from mixed pixels. For instance, NDVI curves of certain mixed pixels, such as those comprising construction land and forest and grassland, may resemble the standard NDVI curve of paddy fields, thereby causing misclassification. To address this issue, an NDVI annual extreme-value constraint was incorporated into the threshold setting to ensure that the extracted pixels display typical seasonal fluctuations, thereby effectively eliminating mixed pixels and enhancing the accuracy of paddy field extraction. Specifically, the NDVI range of paddy fields during the study period and the average NDVI range of the land-cover types contributing to the $Y_{up}$ value were used to define the extreme-value constraint threshold $T$. The formula is as follows:
$$T = \frac{S_{r} + S_{o}}{2} \tag{15}$$
In the equation, $S_{r}$ denotes the NDVI range of the standard paddy fields during the study period, while $S_{o}$ represents the NDVI range of the land-cover types contributing to the $Y_{up}$ value.
The NDVI time-series characteristics method for paddy field extraction requires two conditions to be met: (1) the sum of absolute differences between each pixel’s NDVI values and the corresponding elements of the standard NDVI time series across all periods must be less than the threshold $Y$, which is selected within the interval bounded by the lower and upper thresholds; and (2) the pixel’s NDVI range during the study period must be greater than $T$. Pixels meeting both conditions are classified as paddy fields.
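The two conditions can be combined into a single per-pixel decision rule, sketched below with NumPy. The function names, the curves, and the threshold value 0.4 are hypothetical and chosen only to show one pixel passing both tests, one failing the distance test, and one passing the distance test but being rejected by the extreme-value constraint.

```python
import numpy as np

def extreme_value_threshold(paddy_curve, competitor_curves):
    """T = (S_r + S_o) / 2, with S_o averaged over the competing land-cover curves."""
    s_r = float(np.ptp(np.asarray(paddy_curve, dtype=float)))
    s_o = float(np.mean([np.ptp(np.asarray(c, dtype=float)) for c in competitor_curves]))
    return (s_r + s_o) / 2.0

def classify_paddy(stack, standard, y, t):
    """Flag pixels whose summed distance is below y and whose NDVI range exceeds t.

    stack:    array of shape (n_steps, rows, cols).
    standard: array of shape (n_steps,).
    """
    standard = np.asarray(standard, dtype=float)
    distance = np.sum(np.abs(stack - standard[:, None, None]), axis=0)
    ndvi_range = stack.max(axis=0) - stack.min(axis=0)
    return (distance < y) & (ndvi_range > t)

paddy = np.array([0.2, 0.6, 0.8, 0.4])
dryland = np.array([0.3, 0.5, 0.6, 0.5])
t = extreme_value_threshold(paddy, [dryland])

pixel_a = [0.25, 0.60, 0.75, 0.40]  # close to the paddy curve, strong seasonality
pixel_b = [0.45, 0.55, 0.60, 0.50]  # too far from the paddy curve
pixel_c = [0.40, 0.60, 0.70, 0.45]  # close enough, but NDVI range too small
stack = np.stack([pixel_a, pixel_b, pixel_c], axis=1)[:, None, :]  # (4, 1, 3)

is_paddy = classify_paddy(stack, paddy, y=0.4, t=t)
```

The third pixel shows the purpose of the constraint: its summed distance alone would admit it, but its weak seasonal amplitude marks it as a likely mixed pixel.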

2.3.4. Accuracy Assessment Method

This study assesses classification accuracy using a confusion matrix. The metrics used are Producer’s Accuracy (PA), User’s Accuracy (UA), Overall Accuracy (OA), the Kappa coefficient, and the F1-score.
Producer’s Accuracy (PA), also referred to as recall, is the ratio of correctly classified pixels of a given class to the total number of reference pixels in that class. It reflects the probability that a land-cover type is correctly identified—that is, the proportion of actual samples in that class that are correctly identified by the classifier.
User’s Accuracy (UA), also known as precision, is the ratio of correctly classified pixels in a given class to the total number of pixels assigned to that class by the classifier. It reflects the reliability of the classification results.
Overall Accuracy (OA) indicates the proportion of correctly classified pixels across all land-cover types, calculated as the ratio of the sum of the diagonal elements in the confusion matrix to the total number of pixels.
The Kappa coefficient quantifies the agreement between the classification results and the reference data after correcting for the agreement expected by chance. It typically ranges from 0 to 1, with values closer to 1 indicating greater classification accuracy.
The F1-score is the harmonic mean of precision (UA) and recall (PA). It provides a comprehensive evaluation of model performance across classes and better reflects the classification capability of the model.
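For a two-class case (paddy versus non-paddy), the five metrics above can be computed from the confusion matrix as sketched below. The function name and the toy labels are hypothetical, and the sketch assumes both classes appear in the reference and in the prediction so that no denominator is zero.

```python
import numpy as np

def accuracy_metrics(reference, predicted):
    """Compute OA, PA, UA, F1, and Kappa for a binary paddy / non-paddy map.

    reference, predicted: boolean arrays, True meaning paddy.
    """
    reference = np.asarray(reference, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = int(np.sum(reference & predicted))
    fn = int(np.sum(reference & ~predicted))
    fp = int(np.sum(~reference & predicted))
    tn = int(np.sum(~reference & ~predicted))
    n = tp + fn + fp + tn

    oa = (tp + tn) / n
    pa = tp / (tp + fn)              # producer's accuracy (recall)
    ua = tp / (tp + fp)              # user's accuracy (precision)
    f1 = 2 * pa * ua / (pa + ua)
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - pe) / (1 - pe)
    return {"OA": oa, "PA": pa, "UA": ua, "F1": f1, "Kappa": kappa}

# Ten made-up validation samples.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = accuracy_metrics(reference, predicted)
```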
To quantify the uncertainty of the accuracy metrics, a non-parametric bootstrap resampling method was employed. A total of 1000 resamples with replacement were drawn from the interpreted validation samples. For each resample, a confusion matrix was reconstructed, and OA, Kappa, PA, UA, and F1-score were calculated. The 2.5th and 97.5th percentiles of the resulting distributions were then taken as the 95% confidence intervals [40].
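The percentile bootstrap described above can be sketched as follows. The function names, the random seed, and the toy labels are assumptions for illustration. Overall accuracy is used as the example metric because it is always defined on a resample; other metrics could be passed in the same way, provided undefined cases (for example, a resample with no paddy pixels) are handled.

```python
import numpy as np

def overall_accuracy(reference, predicted):
    """Proportion of samples whose predicted label matches the reference label."""
    return float(np.mean(reference == predicted))

def bootstrap_ci(reference, predicted, metric, n_resamples=1000, seed=0):
    """Percentile bootstrap 95% confidence interval for an accuracy metric."""
    reference = np.asarray(reference)
    predicted = np.asarray(predicted)
    rng = np.random.default_rng(seed)
    n = len(reference)
    values = np.empty(n_resamples)
    for b in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        values[b] = metric(reference[idx], predicted[idx])
    low, high = np.percentile(values, [2.5, 97.5])
    return float(low), float(high)

# Ten made-up validation samples (1 = paddy, 0 = non-paddy).
reference = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
predicted = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])
ci_low, ci_high = bootstrap_ci(reference, predicted, overall_accuracy)
```

Resampling the paired reference and predicted labels together, rather than each independently, preserves the per-sample agreement that the confusion matrix is built from.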

3. Results Analysis

3.1. Image Reconstruction After Cloud Removal

Because of the high cloud cover in the study area, the direct use of the original imagery was severely affected by both clouds and their shadows. As shown in Figure 3a, the pre-masked image was largely obscured by clouds, making it difficult to discern underlying land features. After applying the cloud mask (Figure 3b), cloud interference was effectively removed; however, extensive vacant areas emerged in the imagery, disrupting spatial continuity. Consequently, the calculated NDVI image became incomplete (Figure 3c), directly reducing the accuracy of paddy field extraction.
To address this issue, missing pixels were gap-filled through a two-step procedure. First, cloud-removed pixels were replaced with cloud-free pixels from the same period in the previous five years (Figure 3d). If no valid data were available within that time frame, pixels from a 3 × 3 neighborhood in the same year and time phase were further used for spatial interpolation. The comparison results demonstrate that the gap-filling process effectively restored extensive vacant areas, substantially enhanced spatial continuity, and preserved the complete temporal characteristics of paddy fields, thereby providing a more reliable data foundation for subsequent extraction.

3.2. Characteristics of Standard NDVI Time-Series Curves for Typical Land-Cover Types

First, cloud-free reconstructed images from April to October 2024 were obtained, and NDVI images of the study area were generated using the NDVI calculation formula. NDVI values of the sample points were then extracted, with outliers removed. Finally, the mean NDVI of all remaining sample points within each land-cover type was calculated, yielding the corrected standard NDVI time-series curves (Figure 4) to represent the typical temporal characteristics of each land-cover type.
Figure 4 reveals the following observations: (1) Paddy fields exhibit distinct NDVI variation patterns throughout the entire growing season, with time-series curves that differ significantly from other land-cover types. From April to early May, the NDVI values were relatively low, corresponding to the rice seedling and plowing stages. Subsequently, the NDVI values increased rapidly, reaching their peak in late July. From late August to late September, the NDVI values declined again, indicating that the paddy fields gradually entered the harvesting stage. In October, the NDVI values rose slowly due to weed growth and the recovery of the field environment. (2) Forest and grassland maintain consistently high NDVI values during the study period, while construction land and water bodies show low and stable NDVI values, suggesting limited human disturbance for these three land-cover types. (3) Dryland exhibits periodic fluctuations in NDVI values, strongly influenced by agricultural activities such as tillage, sowing, crop growth, and harvesting. These variations reflect the dynamic nature of crop growth cycles. Therefore, the NDVI time-series curve of paddy fields can serve as a reference standard in subsequent analyses. By calculating the Euclidean distance between the NDVI curves of other land-cover types and that of paddy fields, the similarity can be assessed, enabling paddy field identification. The most critical step in this method is determining the threshold for the sum of absolute differences in NDVI values between paddy fields and other land-cover types across multiple time periods.

3.3. Establishment of Extraction Threshold and Extreme-Value Constraint

The absolute differences in NDVI values between each land-cover type and paddy fields were calculated for each time period, and these differences were then summed across all periods (Table 1). The smallest total difference, which belonged to dryland, served as the upper limit of the extraction threshold. Accordingly, the upper threshold was set at 1.679.
For each time period, the mean absolute deviation between the NDVI values of the paddy field samples and the corresponding element of the standard paddy field NDVI time series was calculated, and the sum of these deviations defines the lower limit of the extraction threshold. As shown in Table 2, the lower extraction threshold was 0.691.
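The arithmetic behind the two limits can be reproduced directly from the values reported in Tables 1 and 2. The sketch below is only a check of those sums; the variable names are illustrative.

```python
# Per-period absolute NDVI differences from paddy fields (Table 1).
table1 = {
    "forest_and_grassland": [0.232, 0.273, 0.190, 0.234, 0.135, 0.008,
                             0.292, 0.219, 0.278, 0.266, 0.230, 0.142],
    "dryland": [0.234, 0.253, 0.180, 0.166, 0.098, 0.084,
                0.135, 0.094, 0.066, 0.141, 0.146, 0.082],
    "construction_land": [0.047, 0.022, 0.058, 0.144, 0.241, 0.413,
                          0.114, 0.110, 0.148, 0.133, 0.183, 0.163],
    "water_bodies": [0.109, 0.091, 0.107, 0.256, 0.349, 0.445,
                     0.223, 0.187, 0.262, 0.191, 0.225, 0.221],
}

# Per-period average deviation of paddy samples from the standard curve (Table 2).
table2 = [0.058, 0.016, 0.013, 0.075, 0.06, 0.073,
          0.041, 0.042, 0.047, 0.085, 0.089, 0.092]

totals = {name: round(sum(values), 3) for name, values in table1.items()}
closest_class = min(totals, key=totals.get)   # class most similar to paddy fields
upper_threshold = totals[closest_class]       # smallest total difference
lower_threshold = round(sum(table2), 3)

print(closest_class, upper_threshold, lower_threshold)
# prints: dryland 1.679 0.691
```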
The threshold selection in this study was carried out in two stages. First, an initial threshold range (0.691–1.679) was determined based on the proposed method, and a preliminary search was conducted within this range at an interval of 0.05 to identify the subrange yielding better accuracy. Subsequently, the search was narrowed to 1.12–1.27, the interval in which the preliminary experiments yielded the best accuracy metrics. Within this refined range, different thresholds were iteratively tested using validation samples, and the trends of Overall Accuracy (OA), the Kappa coefficient, and the F1-score were comprehensively compared. The final threshold was selected as the value at which OA, the Kappa coefficient, and the F1-score simultaneously reached their peaks, while User’s Accuracy (UA) and Producer’s Accuracy (PA) remained relatively balanced. This value (1.22) was taken as the optimal threshold for paddy field extraction in the study area for 2024 [41].
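A coarse-to-fine search of this kind might look like the following sketch. Several details are assumptions rather than the procedure used in the study: the fine step of 0.01, the refinement window of one coarse step on either side of the best coarse value, the lexicographic ranking of (OA, Kappa, F1), and the sample values themselves. The balance between UA and PA is not checked here.

```python
def scores(samples, threshold):
    """Return (OA, Kappa, F1) when pixels with distance < threshold are paddy.

    samples: list of (sum of absolute NDVI differences, reference is paddy).
    """
    tp = fp = fn = tn = 0
    for distance, is_paddy in samples:
        predicted = distance < threshold
        if predicted and is_paddy:
            tp += 1
        elif predicted:
            fp += 1
        elif is_paddy:
            fn += 1
        else:
            tn += 1
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - expected) / (1 - expected) if expected < 1 else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return oa, kappa, f1


def candidates(low, high, step):
    """Evenly spaced thresholds from low up to (at most) high."""
    count = int((high - low) / step + 1e-9)
    return [round(low + i * step, 3) for i in range(count + 1)]


def best_threshold(samples, low, high, step):
    """Candidate with the highest (OA, Kappa, F1) tuple; first one wins ties."""
    return max(candidates(low, high, step), key=lambda t: scores(samples, t))


def coarse_to_fine(samples, low=0.691, high=1.679, coarse_step=0.05, fine_step=0.01):
    coarse = best_threshold(samples, low, high, coarse_step)
    return best_threshold(
        samples,
        max(low, coarse - coarse_step),
        min(high, coarse + coarse_step),
        fine_step,
    )


# Hypothetical validation samples: (distance to the standard curve, is paddy).
validation = [(0.8, True), (1.0, True), (1.18, True),
              (1.26, False), (1.4, False), (2.0, False)]
selected = coarse_to_fine(validation)
```

Because the same validation samples drive both stages, a threshold chosen this way may be optimistic; this is the limitation noted in Section 5.1.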
Simply summing the multi-temporal NDVI differences is highly susceptible to the influence of mixed pixels. Certain mixed pixels exhibit NDVI curves similar to those of paddy fields and may therefore be incorrectly extracted as paddy fields (Figure 5b). Because the NDVI curve of paddy fields fluctuates markedly throughout the study period, an extreme-value constraint was introduced to exclude mixed pixels that have relatively small sums of absolute differences but lack this fluctuation (Figure 5c). According to Equation (15), the extreme-value constraint for paddy field extraction in Bishan District in 2024 was determined to be 0.265.
The NDVI time-series characteristics method for extracting paddy fields in Bishan District in 2024 required two conditions to be satisfied. A pixel within the study area was classified as paddy field if: (1) the sum of absolute differences between each element of its NDVI time series and the corresponding element of the standard paddy field NDVI time series was less than 1.22; and (2) the NDVI range during the study period exceeded 0.265.
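The two conditions can be expressed as a per-pixel decision rule. In the sketch below, the thresholds 1.22 and 0.265 are the 2024 values reported above, whereas the standard curve and the pixel series are hypothetical, and the NDVI range is assumed to be the maximum minus the minimum of the series.

```python
def is_paddy(series, standard, distance_threshold=1.22, range_threshold=0.265):
    """Apply the two extraction conditions to one pixel's NDVI time series."""
    if len(series) != len(standard):
        raise ValueError("series and standard curve must have the same length")
    # Condition 1: sum of absolute differences from the standard paddy curve.
    distance = sum(abs(a - b) for a, b in zip(series, standard))
    # Condition 2: NDVI range over the study period (assumed to be max - min).
    ndvi_range = max(series) - min(series)
    return distance < distance_threshold and ndvi_range > range_threshold


# Hypothetical standard paddy curve for the twelve time periods.
standard = [0.30, 0.30, 0.35, 0.45, 0.60, 0.80, 0.60, 0.50, 0.40, 0.45, 0.50, 0.55]

paddy_like = [v + 0.02 for v in standard]           # close to the curve, large range
low_amplitude = [0.40, 0.40, 0.42, 0.45, 0.50, 0.60,
                 0.55, 0.50, 0.45, 0.45, 0.48, 0.50]  # close to the curve, small range
forest_like = [0.80] * 12                           # far from the curve

print(is_paddy(paddy_like, standard),
      is_paddy(low_amplitude, standard),
      is_paddy(forest_like, standard))
# prints: True False False
```

The second pixel shows the role of the extreme-value constraint: its summed difference is below 1.22, but its NDVI range is too small, so it is rejected.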

4. Comparison and Analysis of Methods

4.1. Alternative Extraction Methods for Comparison

The same set of sample points was used, with 70% allocated for training and 30% reserved for testing. On the GEE platform, the Random Forest (RF) algorithm was implemented using the function ee.Classifier.smileRandomForest(), and the CART method was applied via ee.Classifier.smileCart() to extract the rice cultivation area in Bishan District for 2024. The extraction results are shown in Figure 6. In this study, the default parameter settings of GEE were adopted (e.g., the number of trees for RF and the maximum depth for CART) without additional parameter tuning. The purpose of this approach was to use RF and CART as benchmark methods to validate the advantages of the proposed approach under identical sample conditions, rather than to optimize the performance of the machine learning models themselves.

4.2. Accuracy Assessment

In this study, the accuracy of paddy field extraction obtained from the three methods was evaluated using overall accuracy and the Kappa coefficient (Table 3). In addition, differences in mixed-pixel identification were compared across the methods (Figure 7), providing a comprehensive evaluation of the reliability and precision of paddy field identification.
As shown in Table 3, both machine learning methods were less accurate than the NDVI time-series characteristics method: their OA was 0.9 versus 0.92, and their Kappa coefficient was 0.69 versus 0.81. Moreover, Figure 7 illustrates that misclassifications occurred more frequently in typical mixed-pixel regions with the RF and CART methods, whereas the NDVI time-series characteristics method more effectively distinguished paddy fields from other land-cover types, thereby reducing the adverse impact of mixed pixels on classification accuracy.

4.3. Multi-Annual Paddy Field Extraction Based on NDVI Time-Series Characteristics (2020–2023)

Owing to the combined effects of climatic conditions such as temperature, precipitation, and solar radiation, together with factors such as variety selection, farming practices, and natural disaster disturbances, the standard NDVI curves of paddy fields in the same region may vary across years [42,43]. According to the research methodology, the extraction thresholds and extreme-value constraints for paddy fields in Bishan District during 2020–2023 were determined (Table 4). The extracted paddy field data are presented in Figure 8. The distribution maps indicate that the total paddy field area in the study region remained generally stable across years, reflecting a relatively consistent scale of rice cultivation. Moreover, the paddy fields were primarily distributed in relatively flat terrain, corresponding to the ecological characteristics of rice growth and favoring crop development as well as soil and water conservation [44]. Overall, rice cultivation areas exhibited predominantly contiguous distribution patterns, although fragmented plots resulted in scattered distribution of certain paddy fields.
The paddy field distribution data from 2020 to 2023 were imported into ArcGIS Pro 3.0.2, and stratified random sampling was used to generate 900 validation points, comprising 300 paddy field points and 600 non-paddy field points. Based on historical Google Earth imagery, visual interpretation of the validation points was conducted to generate confusion matrices. From these, the PA, UA, OA, Kappa coefficient, and F1-score of paddy field extraction over the five years were obtained, as summarized in Table 5.
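For reference, the five metrics can be derived from a two-class confusion matrix as in the sketch below. The counts are hypothetical and chosen only so that they total 900 points; they are not the values behind Table 5.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """PA, UA, OA, Kappa, and F1 for the paddy class of a 2 x 2 confusion matrix.

    tp: paddy correctly extracted      fp: non-paddy extracted as paddy
    fn: paddy missed                   tn: non-paddy correctly rejected
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    pa = tp / (tp + fn) if (tp + fn) else 0.0   # producer's accuracy (recall)
    ua = tp / (tp + fp) if (tp + fp) else 0.0   # user's accuracy (precision)
    f1 = 2 * pa * ua / (pa + ua) if (pa + ua) else 0.0
    # Chance agreement expected from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - expected) / (1 - expected) if expected < 1 else 0.0
    return {"PA": pa, "UA": ua, "OA": oa, "Kappa": kappa, "F1": f1}


# Hypothetical counts for 900 validation points.
metrics = accuracy_metrics(tp=290, fp=50, fn=10, tn=550)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts PA exceeds UA, which corresponds to the pattern of few omission errors and more commission errors.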
To quantify the uncertainty of the accuracy metrics, this study employed a non-parametric bootstrap resampling approach. The 2.5th and 97.5th percentiles of the metric distributions were calculated to derive the 95% confidence intervals, as presented in Table 6.
As shown in Table 5, the OA of paddy field classification from 2020 to 2024 consistently exceeded 0.9, while the Kappa coefficient remained above 0.8, indicating a generally high and stable classification performance. The PA consistently remained above 0.95, suggesting that omission errors were minimal and most paddy fields were successfully identified. In contrast, the UA was relatively lower, indicating the presence of a certain proportion of commission errors, where non-rice areas were misclassified as paddy fields. The F1-score also exceeded 0.8 across all years, demonstrating that the classification results achieved a reliable balance between precision and recall.
Furthermore, the 95% confidence intervals derived from bootstrap resampling (Table 6) were relatively narrow, indicating that the accuracy assessment results were both robust and statistically reliable. In particular, the confidence intervals of PA were generally above 0.9, further confirming the strong performance of paddy field extraction in minimizing omission errors.

5. Discussion and Conclusions

5.1. Discussion

(1) Influence of External Factors
This study employed the NDVI time-series similarity method to extract paddy fields. However, external factors such as extreme weather events and human activities can interfere with NDVI variations. The manifestation, timing, and magnitude of these effects are difficult to quantify, indicating that although the method performs well under normal climatic conditions, it may underestimate or misclassify paddy fields under extreme scenarios, thereby reducing the reliability of the results. In addition, the study area experiences heavy rainfall and persistent cloud cover during summer, making it difficult to obtain sufficient high-quality multispectral imagery. Although temporal and spatial gap-filling was applied, low-quality images still introduced uncertainties in constructing the NDVI time series. In theory, Landsat data could serve as a supplementary source. However, its relatively low spatial resolution poses challenges in Bishan District, where farmland is highly fragmented and paddy fields are relatively small, increasing the likelihood of mixed-pixel issues. Moreover, differences in the red and near-infrared band definitions between Landsat and Sentinel-2 lead to systematic NDVI offsets, making it difficult to align their time series. Therefore, this study did not incorporate Landsat data for gap-filling. Instead, historical multi-year Sentinel-2 imagery and spatial neighboring pixels were used. Nevertheless, this approach could not fully eliminate the uncertainties introduced by low-quality data. In future research, we plan to further explore the integration of Sentinel-2 with SAR and other multi-source remote sensing data to mitigate the uncertainties caused by cloud cover and missing pixels.
(2) Influence of Data Sources
The primary influence of data sources stems from their spatial resolution. In southwestern China, land parcels are highly fragmented. Because of the limited spatial resolution of remote sensing imagery, many pixels encompass multiple land-use types (commonly referred to as mixed pixels) rather than representing a single homogeneous cover type. Because NDVI values vary across land-cover types, mixed pixels exhibit NDVI values that differ from those of pure pixels. These discrepancies introduce deviations in temporal curve similarity measurements, ultimately reducing the accuracy of paddy field extraction. Although an extreme-value constraint was applied to reduce the impact of mixed pixels, it did not completely eliminate their influence. Future research should further validate the performance of the Euclidean distance-based temporal similarity method in regions characterized by large, homogeneous rice cultivation areas.
(3) Insufficient Accuracy Assessment
The accuracy assessment revealed that although the OA and Kappa coefficient consistently remained at high levels, the UA was relatively lower. This indicates the presence of commission errors in paddy field extraction, suggesting that further optimization of threshold settings and feature selection is necessary to reduce misclassification in future studies. Moreover, since each new validation sample required manual visual interpretation, collecting additional samples was time-consuming and labor-intensive. To address this, a non-parametric bootstrap approach that resamples the existing validation samples was adopted for uncertainty analysis. Although this method effectively quantified the statistical uncertainty of the samples, its results remained subject to certain limitations. Future research should incorporate additional uncertainty analysis strategies to further enhance the robustness of classification accuracy assessments.
(4) Analysis of the Differences Between Remote Sensing Estimates and Statistical Data
In this study, paddy fields extracted from remote sensing imagery were not directly compared with those recorded in the Third National Land Survey (TNLS) data, because field observations revealed that some parcels classified as “paddy fields” in the TNLS had already been abandoned or converted to other land uses; using the TNLS as a validation benchmark would therefore introduce inconsistencies. Future studies could integrate field-level agricultural survey data or cropland monitoring datasets to reduce discrepancies caused by statistical inconsistencies and enhance both the interpretability and applicability of the results.
(5) Spatial and Temporal Transferability of the Method
The method developed in this study relies on the phenological characteristics of rice NDVI time series, showing potential for application in other major rice-growing regions. However, variations in climate conditions, planting regimes, and irrigation practices across regions may result in significant differences in NDVI curves, necessitating the reconstruction of standard curves and recalibration of extraction thresholds. This highlights the constraints of methodological transferability across regions. For larger study areas, stratified extraction based on factors such as elevation and latitude may be required to mitigate the effects of spatial heterogeneity. Using data from 2020 to 2024, this study validated the stability of the method, demonstrating its robustness over multiple years. However, for dryland crops such as wheat and maize, while the methodological framework can be referenced, differences in phenological cycles and land-use patterns require the reconstruction of crop-specific NDVI standard curves. As these crops all belong to dryland categories, further subdivision of the dryland category would also be necessary when delineating land-cover types in the study area. Therefore, the broader application of this method across different crops and regions should be approached with caution. Future studies should incorporate validation across multiple regions and crop types to enhance its generalizability.
(6) Limitations of the Threshold Determination Approach
Although the optimal threshold was determined through stepwise narrowing of intervals and refined validation, this process still entails certain limitations. First, threshold optimization relied on a stepwise search strategy. While its precision met practical requirements, it did not guarantee a truly global optimum, leaving residual uncertainty. Second, the determination of thresholds depended on the distribution and quantity of available samples. When sample sizes were limited or spatially imbalanced, the resulting optimal threshold might have been biased. To minimize potential biases from repeated validation on the same dataset, future studies should incorporate cross-validation or parameter optimization using independent validation samples, thereby enhancing the robustness of threshold selection.

5.2. Conclusions

This study employed Sentinel-2 data and applied the NDVI time-series characteristics method to extract paddy fields in Bishan District, resulting in the following conclusions:
(1) Paddy fields exhibit distinct NDVI time-series patterns that differentiate them from other land-cover types. These characteristic temporal dynamics, captured through Euclidean distance analysis, allow for effective identification of paddy fields.
(2) A systematic evaluation of paddy field classification results from 2020 to 2024 was conducted using multiple accuracy metrics. The results showed that the OA and Kappa coefficient consistently remained at high levels, while the F1-score was stable above 0.8, indicating that the classification results achieved a reliable balance between precision and recall. Further bootstrap-based uncertainty analysis revealed that the confidence intervals of all metrics were relatively narrow, confirming the robustness and statistical reliability of the results. Overall, the proposed method demonstrated excellent classification performance for paddy field extraction and significantly outperformed traditional machine learning methods implemented on the GEE platform.
(3) This study proposed a strategy of “threshold interval setting combined with iterative validation within the interval,” further integrated with an extreme-value constraint, to address both spatiotemporal variability and mixed-pixel issues in threshold determination. Specifically, the upper threshold ensured effective discrimination between paddy fields and other land-cover types, the lower threshold mitigated the influence of phenological variations and temporal shifts, and the extreme-value constraint further eliminated anomalous pixels with small absolute differences. This combined approach effectively enhanced the robustness and accuracy of paddy field extraction.
Finally, this method is more suitable for paddy field extraction in relatively small areas. In larger study regions, rice growth stages are influenced by factors such as elevation, latitude, extreme weather, and human activities, making it difficult for a single standard NDVI time-series curve to capture the NDVI variations in all paddy fields across the region.

Author Contributions

Idea development, C.Y. and Y.T.; data collection, C.Y., Y.T. and Y.H.; data processing, C.Y., Y.T. and J.T.; article writing, C.Y. and Y.T.; graphic design, C.Y., Y.T. and W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Sentinel-2 data are available via Google Earth Engine (https://code.earthengine.google.com/ (accessed on 1 January 2025)).

Acknowledgments

We express our gratitude to the anonymous reviewers for their valuable insights.

Conflicts of Interest

Authors Chenxi Yuan, Yongzhong Tian and Ye Huang were employed by the company Daotian Science and Technology Limited Company. Author Jinglian Tian was employed by the company Chongqing Huadi Resource and Environment Technology Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Xu, S.; Zhu, X.; Chen, J.; Zhu, X.; Duan, M.; Qiu, B.; Wan, L.; Tan, X.; Xu, Y.N.; Cao, R. A robust index to extract paddy fields in cloudy regions from SAR time series. Remote Sens. Environ. 2023, 285, 113374.
  2. FAO. FAOSTAT Food Balance Sheets; FAO: Rome, Italy, 2024.
  3. Pan, B.; Zheng, Y.; Shen, R.; Ye, T.; Zhao, W.; Dong, J.; Ma, H.; Yuan, W. High-resolution distribution dataset of double-season paddy rice in China. Remote Sens. 2021, 13, 4609.
  4. Majnoun Hosseini, M.; Valadan Zoej, M.J.; Taheri Dehkordi, A.; Ghaderpour, E. Cropping intensity mapping using temporal transfer of a stacked ensemble machine learning model within Google Earth Engine. Geocarto Int. 2024, 39, 1.
  5. Abdali, E.; Valadan Zoej, M.J.; Taheri Dehkordi, A.; Ghaderpour, E. A parallel-cascaded ensemble of machine learning models for crop type classification in Google Earth Engine using multi-temporal Sentinel-1/2 and Landsat-8/9 data. Remote Sens. 2023, 16, 127.
  6. Meng, L.; Li, Y.; Shen, R.; Zheng, Y.; Pan, B.; Yuan, W.; Li, J.; Zhuo, L. Large-scale and high-resolution paddy rice intensity mapping using downscaling and phenology-based algorithms on Google Earth Engine. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103725.
  7. He, Y.; Dong, J.; Liao, X.; Sun, L.; Wang, Z.; You, N.; Li, Z.; Fu, P. Examining rice distribution and cropping intensity in a mixed single- and double-cropping region in South China using all available Sentinel-1/2 images. Int. J. Appl. Earth Obs. Geoinf. 2021, 101, 102351.
  8. You, N.; Dong, J.; Huang, J.; Du, G.; Zhang, G.; He, Y.; Yang, T.; Di, Y.; Xiao, X. The 10-m crop type maps in Northeast China during 2017–2019. Sci. Data 2021, 8, 41.
  9. Ni, R.; Tian, J.; Li, X.; Yin, D.; Li, J.; Gong, H.; Zhang, J.; Zhu, L.; Wu, D. An enhanced pixel-based phenological feature for accurate paddy rice mapping with Sentinel-2 imagery in Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2021, 178, 282–296.
  10. Huang, C.; You, S.; Liu, A.; Li, P.; Zhang, J.; Deng, J. High-resolution national-scale mapping of paddy rice based on Sentinel-1/2 data. Remote Sens. 2023, 15, 4055.
  11. Chen, B. Remote Sensing Image Identification and Monitoring of Rice Planting Areas in the Yunnan–Guizhou Plateau Based on the GEE Platform. Master’s Thesis, Yunnan Normal University, Kunming, China, 2021.
  12. Fatchurrachman; Rudiyanto; Che Soh, N.; Mohd Shah, R.; Goh Eng Giap, S.; Indra Setiawan, B.; Minasny, B. High-resolution mapping of paddy rice extent and growth stages across Peninsular Malaysia using a fusion of Sentinel-1 and Sentinel-2 time series data in Google Earth Engine. Remote Sens. 2022, 14, 1875.
  13. Liu, Y. Remote Sensing Classification Mapping of Major Grain Crops in Heilongjiang Province. Master’s Thesis, Heilongjiang University, Harbin, China, 2023.
  14. Thorp, K.R.; Drajat, D. Deep machine learning with Sentinel satellite data to map paddy rice production stages across West Java, Indonesia. Remote Sens. Environ. 2021, 265, 112679.
  15. Zhan, P.; Zhu, W.; Li, N. An automated rice mapping method based on flooding signals in synthetic aperture radar time series. Remote Sens. Environ. 2021, 252, 112112.
  16. Carrasco, L.; Fujita, G.; Kito, K.; Miyashita, T. Historical mapping of rice fields in Japan using phenology and temporally aggregated Landsat images in Google Earth Engine. ISPRS J. Photogramm. Remote Sens. 2022, 191, 277–289.
  17. Sun, L.; Lou, Y.; Shi, Q.; Zhang, L. Spatial domain transfer: Cross-regional paddy rice mapping with a few samples based on Sentinel-1 and Sentinel-2 data on GEE. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103762.
  18. Shao, Q.; Li, R.; Qiu, J.; Han, Y.; Han, D.; Chen, M.; Chi, H. Large-scale mapping of new mixed rice cropping patterns in southern China with a phenology-based algorithm and MODIS dataset. Paddy Water Environ. 2023, 21, 243–261.
  19. Fan, X.; Wang, Z.; Zhang, H.; Liu, H.; Jiang, Z.; Liu, X. Large-scale rice mapping based on Google Earth Engine and multi-source remote sensing images. J. Indian Soc. Remote Sens. 2023, 51, 93–102.
  20. Tian, J.; Tian, Y.; Wan, W.; Yuan, C.; Liu, K.; Wang, Y. Research on the temporal and spatial changes and driving forces of rice fields based on the NDVI difference method. Agriculture 2024, 14, 1165.
  21. Fang, H.; Liang, S.; Chen, Y.; Ma, H.; Li, W.; He, T.; Tian, F.; Zhang, F. A comprehensive review of rice mapping from satellite data: Algorithms, product characteristics, and consistency assessment. Sci. Remote Sens. 2024, 10, 100172.
  22. Blickensdörfer, L.; Schwieder, M.; Pflugmacher, D.; Nendel, C.; Erasmi, S.; Hostert, P. Mapping of crop types and crop sequences with combined time series of Sentinel-1, Sentinel-2, and Landsat 8 data for Germany. Remote Sens. Environ. 2022, 269, 112831.
  23. Pelletier, C.; Valero, S.; Inglada, J.; Champion, N.; Marais Sicre, C.; Dedieu, G. Effect of training class label noise on classification performances for land cover mapping with satellite image time series. Remote Sens. 2017, 9, 173.
  24. Tian, J.; Tian, Y.; Cao, Y.; Wan, W.; Liu, K. Research on rice fields extraction by NDVI difference method based on Sentinel data. Sensors 2023, 23, 5876.
  25. Shen, S. Study on Pesticide Application Behavior of Vegetable Farmers in Bishan District, Chongqing. Master’s Thesis, Shihezi University, Shihezi, China, 2024.
  26. GB/T 21010-2017; Current Land Use Classification. Drafting organizations: China Land Surveying and Planning Institute and Department of Surveying and Monitoring, Ministry of Natural Resources: Beijing, China, 2017.
  27. Song, M.; Xu, L.; Ge, J.; Zhang, H.; Zuo, L.; Jiang, J.; Ding, Y.; Xie, Y.; Wu, F. EARice10: A 10 m resolution annual rice distribution map of East Asia for 2023. Earth Syst. Sci. Data 2025, 17, 661–683.
  28. Li, J.; Wang, L.; Liu, S.; Peng, B.; Ye, H. An automatic cloud detection model for Sentinel-2 imagery based on Google Earth Engine. Remote Sens. Lett. 2022, 13, 13.
  29. Gao, X.; Chi, H.; Huang, J.; Han, Y.; Li, Y.; Ling, F. Comparison of cloud-mask algorithms and machine-learning methods using Sentinel-2 imagery for mapping paddy rice in Jianghan Plain. Remote Sens. 2024, 16, 1305.
  30. Roßberg, T.; Schmitt, M. Comparing the relationship between NDVI and SAR backscatter across different frequency bands in agricultural areas. Remote Sens. Environ. 2025, 319, 114612.
  31. ESA. Sentinel-2 Product Specification: Level-2A Input/Output Data Definition Document (S2-PDGS-MPC-L2A-IODD-2.9). Eur. Space Agency 2020.
  32. Guan, X.; Huang, C.; Liu, G.; Xu, Z.; Liu, Q. Extraction of rice remote sensing information using a time-series similarity method based on DTW distance: A case study of Thailand. Resour. Sci. 2014, 36, 267–272.
  33. Zhang, C.; Zhang, H.; Tian, S. Phenology-assisted supervised paddy rice mapping with Landsat imagery on Google Earth Engine: Experiments in Heilongjiang Province of China from 1990 to 2020. Comput. Electron. Agric. 2023, 212, 108105.
  34. Griffiths, P.; Nendel, C.; Hostert, P. Intra-annual reflectance composites from Sentinel-2 and Landsat for national-scale crop and land cover mapping. Remote Sens. Environ. 2019, 220, 135–151.
  35. Fan, X.; Gao, P.; Tian, B.; Wu, C.; Mu, X. Spatio-temporal patterns of NDVI and its influencing factors based on the ESTARFM in the Loess Plateau of China. Remote Sens. 2023, 15, 2553.
  36. Guan, X.; Huang, C.; Liu, G.; Meng, X.; Liu, Q. Mapping rice cropping systems in Vietnam using an NDVI-based time-series similarity measurement based on DTW distance. Remote Sens. 2016, 8, 19.
  37. Gao, Y.; Wang, L.; Chen, J.; Li, J. An algorithm for rice information extraction using morphological similarity. Remote Sens. Inf. 2020, 35, 11.
  38. Huang, C.; Xu, Z.; Zhang, C.; Li, H.; Liu, Q.; Yang, Z.; Liu, G. Extraction method of rice planting structure in tropical regions based on Sentinel-1 time-series characteristics. Trans. Chin. Soc. Agric. Eng. 2020, 36, 177–184.
  39. Zhang, Z. Study on Extraction of Rice–Crayfish Fields in Cloudy Southern China Based on the GEE Platform. Master’s Thesis, East China University of Technology, Nanchang, China, 2023.
  40. Kalkhan, M.A.; Reich, R.M.; Czaplewski, R.L. Variance estimates and confidence intervals for the Kappa measure of classification accuracy. Can. J. Remote Sens. 1997, 23, 210–216.
  41. Li, S.; Li, F.; Gao, M.; Li, Z.; Leng, P.; Duan, S.; Ren, J. A new method for winter wheat mapping based on spectral reconstruction technology. Remote Sens. 2021, 13, 1810.
  42. Zhao, X.; Nishina, K.; Kawaguchi Akitsu, T.; Jiang, L.; Masutomi, Y.; Nishida Nasahara, K. Feature-based algorithm for large-scale rice phenology detection based on satellite images. Agric. For. Meteorol. 2023, 329, 109283.
  43. Liu, Y.; Liu, W.; Li, Y.; Ye, T.; Chen, S.; Li, Z.; Sun, R. Concurrent precipitation extremes modulate the response of rice transplanting date to preseason temperature extremes in China. Earth’s Future 2023, 11, e2022EF002888.
  44. Li, J.; Ding, W.; Ran, W.; Yang, H.; Liang, Z.; Tong, X.; Sun, B. Effects of natural rainfall characteristics and crop cover on runoff and sediment yield of purple soil sloping farmland in the Three Gorges Reservoir Area. Trans. Chin. Soc. Agric. Eng. 2025, 41, 137–146.
Figure 1. The location and topography of the study area.
Figure 2. Workflow of paddy field extraction based on NDVI time-series characteristics.
Figure 3. NDVI image reconstruction. (a) Original Sentinel-2 imagery; (b) Cloud-masked image; (c) NDVI composite; (d) Result after temporal gap-filling.
Figure 4. NDVI time-series curves of major land-cover types (calculated as the mean NDVI of sample points after outlier removal).
Figure 5. Impact of mixed pixels before and after applying extreme-value constraint (example from a local area). Green squares denote pixels classified as paddy field. (a) Original image; (b) Classification result before applying the extreme-value constraint; (c) Classification result after applying the extreme-value constraint.
Figure 6. Comparison of Paddy Field Extraction Results in Bishan District in 2024 Using Different Classification Algorithms. (a) NDVI time-series characteristics. (b) Random Forest. (c) Classification and Regression Trees.
Figure 7. Typical Misclassification of Mixed Pixels by the Three Methods. Green squares denote pixels classified as paddy field.
Figure 8. Paddy Field Extraction Results in Bishan District from 2020 to 2023.
Table 1. NDVI Differences Between Various Land-Cover Types and Paddy Fields at Corresponding Time Points.
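| Time Period | Forest and Grassland | Dryland | Construction Land | Water Bodies |
|---|---|---|---|---|
| Apr-M | 0.232 | 0.234 | 0.047 | 0.109 |
| Apr-L | 0.273 | 0.253 | 0.022 | 0.091 |
| May-E | 0.190 | 0.180 | 0.058 | 0.107 |
| May-M | 0.234 | 0.166 | 0.144 | 0.256 |
| May-L | 0.135 | 0.098 | 0.241 | 0.349 |
| Jul-L | 0.008 | 0.084 | 0.413 | 0.445 |
| Aug-L | 0.292 | 0.135 | 0.114 | 0.223 |
| Sept-M | 0.219 | 0.094 | 0.110 | 0.187 |
| Sept-L | 0.278 | 0.066 | 0.148 | 0.262 |
| Oct-E | 0.266 | 0.141 | 0.133 | 0.191 |
| Oct-M | 0.230 | 0.146 | 0.183 | 0.225 |
| Oct-L | 0.142 | 0.082 | 0.163 | 0.221 |
| The sum of the differences | 2.499 | 1.679 | 1.776 | 2.666 |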
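The bottom row of Table 1 can be reproduced by summing each column. The following is a minimal Python sketch, not the authors' code: the names `TABLE1`, `curve_distance`, and `column_sums` are illustrative assumptions, and only the numbers are taken from Table 1. The helper `curve_distance` shows how such a sum of element-wise absolute differences could be obtained from two NDVI series of equal length.

```python
# Per-period NDVI differences from paddy fields, copied from Table 1.
# All identifiers here are illustrative and do not come from the paper.
TABLE1 = {
    "Forest and Grassland": [0.232, 0.273, 0.190, 0.234, 0.135, 0.008,
                             0.292, 0.219, 0.278, 0.266, 0.230, 0.142],
    "Dryland": [0.234, 0.253, 0.180, 0.166, 0.098, 0.084,
                0.135, 0.094, 0.066, 0.141, 0.146, 0.082],
    "Construction Land": [0.047, 0.022, 0.058, 0.144, 0.241, 0.413,
                          0.114, 0.110, 0.148, 0.133, 0.183, 0.163],
    "Water Bodies": [0.109, 0.091, 0.107, 0.256, 0.349, 0.445,
                     0.223, 0.187, 0.262, 0.191, 0.225, 0.221],
}


def curve_distance(series, standard):
    """Sum of element-wise absolute differences between two NDVI series."""
    if len(series) != len(standard):
        raise ValueError("both series must cover the same time steps")
    return sum(abs(a - b) for a, b in zip(series, standard))


def column_sums(table):
    """Sum each land-cover column, rounded to the table's three decimals."""
    return {name: round(sum(values), 3) for name, values in table.items()}


sums = column_sums(TABLE1)
for name, total in sums.items():
    print(f"{name}: {total:.3f}")
```

Run as written, the sketch prints the four totals in the last row of Table 1, with dryland giving the smallest sum and therefore the land-cover curve closest to that of paddy fields.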
Table 2. Average Deviations of Corresponding Elements at Each Time Period.
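| Time Period | Average Deviation |
|---|---|
| Apr-M | 0.058 |
| Apr-L | 0.016 |
| May-E | 0.013 |
| May-M | 0.075 |
| May-L | 0.06 |
| Jul-L | 0.073 |
| Aug-L | 0.041 |
| Sept-M | 0.042 |
| Sept-L | 0.047 |
| Oct-E | 0.085 |
| Oct-M | 0.089 |
| Oct-L | 0.092 |
| The sum of the differences | 0.691 |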
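The total in Table 2, which the abstract describes as the lower threshold, is the sum of the per-period mean absolute deviations between the rice samples and the standard paddy curve. The sketch below is a hedged illustration rather than the authors' implementation: `mean_abs_deviation_per_step` and its inputs are hypothetical, and only the values in `TABLE2` come from the paper.

```python
# Per-period average deviations copied from Table 2.
TABLE2 = [0.058, 0.016, 0.013, 0.075, 0.06, 0.073,
          0.041, 0.042, 0.047, 0.085, 0.089, 0.092]


def mean_abs_deviation_per_step(samples, standard):
    """Mean absolute deviation of sample NDVI from the standard curve.

    samples:  list of NDVI series, one per sample point (hypothetical input)
    standard: reference NDVI series with the same number of time steps
    """
    if not samples:
        raise ValueError("at least one sample series is required")
    if any(len(s) != len(standard) for s in samples):
        raise ValueError("every sample must match the standard's length")
    n = len(samples)
    return [sum(abs(s[t] - standard[t]) for s in samples) / n
            for t in range(len(standard))]


# Summing the tabulated deviations reproduces the last row of Table 2.
lower_threshold = round(sum(TABLE2), 3)
print(lower_threshold)  # 0.691
```

With real data, the output of `mean_abs_deviation_per_step` would take the place of `TABLE2` before summing.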
Table 3. Accuracy comparison of paddy field extraction results from the three methods.
| Method | Overall Accuracy | Kappa Coefficient |
|---|---|---|
| NDVI time-series characteristics | 0.92 | 0.81 |
| RF | 0.9 | 0.69 |
| CART | 0.9 | 0.69 |
Table 4. Paddy Field Extraction Thresholds and Extreme-Value Constraints (2020–2023).
| Year | 2020 | 2021 | 2022 | 2023 |
|---|---|---|---|---|
| Curve Similarity Threshold | 1.5 | 1.24 | 1.55 | 1.03 |
| Extreme-Value Constraint | 0.438 | 0.291 | 0.261 | 0.264 |
Table 5. Accuracy Evaluation of Paddy Field Classification Results (2020–2023).
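| Year | 2020 | 2021 | 2022 | 2023 |
|---|---|---|---|---|
| Producer’s Accuracy | 0.97 | 0.95 | 0.97 | 0.98 |
| User’s Accuracy | 0.86 | 0.79 | 0.8 | 0.81 |
| Overall Accuracy | 0.94 | 0.92 | 0.93 | 0.93 |
| Kappa Coefficient | 0.87 | 0.81 | 0.83 | 0.84 |
| F1-score | 0.91 | 0.86 | 0.88 | 0.89 |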
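The F1-scores in Table 5 are consistent with the harmonic mean of producer's accuracy (recall) and user's accuracy (precision) for the paddy class. The short Python check below assumes that definition, which the table itself does not state; the dictionary and function names are illustrative, and the values are copied from Table 5.

```python
# Accuracy values copied from Table 5; identifiers are illustrative only.
PRODUCERS_ACCURACY = {2020: 0.97, 2021: 0.95, 2022: 0.97, 2023: 0.98}
USERS_ACCURACY = {2020: 0.86, 2021: 0.79, 2022: 0.80, 2023: 0.81}
REPORTED_F1 = {2020: 0.91, 2021: 0.86, 2022: 0.88, 2023: 0.89}


def f1_score(producers_accuracy, users_accuracy):
    """Harmonic mean of producer's accuracy (recall) and user's accuracy (precision)."""
    return (2 * producers_accuracy * users_accuracy
            / (producers_accuracy + users_accuracy))


# Recompute F1 for each year and round to the table's two decimals.
computed_f1 = {
    year: round(f1_score(PRODUCERS_ACCURACY[year], USERS_ACCURACY[year]), 2)
    for year in PRODUCERS_ACCURACY
}

for year in sorted(computed_f1):
    print(year, computed_f1[year], REPORTED_F1[year])
```

Under this assumption the recomputed values match the reported F1-scores for all four years after rounding to two decimals.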
Table 6. Confidence intervals of accuracy assessment metrics for paddy field classification results from 2020 to 2024.
| Year | 2020 | 2021 | 2022 | 2023 | 2024 |
|---|---|---|---|---|---|
| Producer’s Accuracy | 0.94–0.99 | 0.92–0.98 | 0.95–0.99 | 0.97–1.00 | 0.96–1.00 |
| User’s Accuracy | 0.82–0.90 | 0.75–0.84 | 0.76–0.85 | 0.77–0.86 | 0.73–0.82 |
| Overall Accuracy | 0.93–0.96 | 0.90–0.94 | 0.91–0.94 | 0.92–0.95 | 0.90–0.94 |
| Kappa Coefficient | 0.83–0.91 | 0.76–0.85 | 0.79–0.87 | 0.80–0.88 | 0.77–0.85 |
| F1-score | 0.89–0.94 | 0.83–0.90 | 0.85–0.91 | 0.86–0.92 | 0.83–0.89 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yuan, C.; Tian, Y.; Huang, Y.; Tian, J.; Wan, W. A Method for Paddy Field Extraction Based on NDVI Time-Series Characteristics: A Case Study of Bishan District. Agriculture 2025, 15, 2321. https://doi.org/10.3390/agriculture15222321
