Mapping National-Scale Croplands in Pakistan by Combining Dynamic Time Warping Algorithm and Density-Based Spatial Clustering of Applications with Noise

: Croplands are commonly mapped using time series of remotely sensed images. The dynamic time warping (DTW) algorithm is an e ﬀ ective method for realizing this. However, DTW algorithm faces the challenge of capturing complete and accurate representative cropland time series on a national scale, especially in Asian countries where climatic and topographic conditions, cropland types, and crop growth patterns vary signiﬁcantly. This study proposes an automatic cropland extraction method based on the DTW algorithm and density-based spatial clustering of applications with noise (DBSCAN), hereinafter referred to as ACE-DTW, to map croplands in Pakistan in 2015. First, 422 frames of multispectral Landsat-8 satellite images were selected from the Google Earth Engine to construct monthly normalized di ﬀ erence vegetation index (NDVI) time series. Next, a total of 2409 training samples of six land cover types were generated randomly and explained visually using high-resolution remotely sensed images. Then, a multi-layer DBSCAN was used to classify NDVI time series of training samples into di ﬀ erent categories automatically based on their pairwise DTW distances, and the mean NDVI time series of each category was used as the standard time series to represent the characteristics of that category. These standard time series attempted to represent cropland information and maximally distinguished croplands from other possible interference land cover types. Finally, image pixels were classiﬁed as cropland or non-cropland based on their DTW distances to the standard time series of the six land cover types. The overall cropland extraction accuracy of ACE-DTW was 89.7%, which exceeded those of other supervised classiﬁers (classiﬁcation and regression trees: 78.2%; support vector machines: 78.8%) and existing global cropland datasets (Finer Resolution Observation and Monitoring of Global Land Cover: 87.1%; Global Food Security Support Analysis Data: 83.1%). Further, ACE-DTW could produce relatively complete time series of variable cropland types, and thereby provide a signiﬁcant advantage in mountain regions with small, fragmented croplands and plain regions with large, high-density patches of croplands.


Introduction
The distribution and quality of croplands are important factors in global food security and water resource security [1][2][3]. Cropland maps based on remotely sensed images represent a key approach for obtaining the spatial distribution of croplands, establishing the reasonable use of cropland resources, and ensuring food security [4]. Such nationwide maps provide basic and important geographic information that can help countries obtain a clear understanding of the quantity and quality of their croplands, which is vital for sustaining their national economies [5][6][7]. This is particularly true in Asian countries where agriculture is a pillar industry [8][9][10].
Cropland mapping methods based on remotely sensed images have been widely used. Early methods used supervised classification directly [5,11,12]. These methods did not consider the significant differences in croplands at different growth stages; hence, an increasing number of cropland mapping studies have distinguished croplands through time series analysis [13][14][15][16][17]. Cropland mapping based on time series of satellite images can be classified into three types. The first type is based on a priori knowledge of the phenology according to the growth characteristics of different crops [18][19][20][21][22]. The second type uses supervised classification of time series, which is currently the most widely used approach [23][24][25][26][27]. In recent years, with the development of algorithms, this type has evolved into using machine learning instead of supervised classification [28]. The third type uses the dynamic time warping (DTW) algorithm to map croplands. This type uses the patterns of phenological crop development and obtains more accurate results [4,29,30].
Compared with the first two types of cropland mapping methods based on time series, the DTW algorithm-based methods fully consider the time series of crops with staggered peaks under different climatic and topographic conditions. In previous studies, the DTW algorithm has been proved to be an effective approach for cropland mapping. For example, Mondal et al. [31] proved that DTW-based cropland mapping performs better than Euclidean-distance-based mapping. Belgiu et al. [30] used Sentinel-2 time series data and the DTW algorithm to map croplands in three study representative areas. The extraction accuracy was higher than that obtained using the random forest classifier.
However, existing DTW algorithm-based studies have mainly focused on small-scale study areas (<20,000 km 2 ) [29][30][31][32] or a specific crop (e.g., rice) [4]. One of the main reasons is that these studies relied on a priori knowledge on phenology [32] or artificial summaries of typical characteristics of the study area [33] to generate standard cropland time series. For large areas with various cropland types, conventional DTW algorithms may fail to capture complete and accurate cropland information. As a result, mapping national-scale croplands with complex cropland status using the DTW algorithm is very challenging. Particularly, in Asian countries, to adapt to climate change, farmers often use intercropping and crop rotation, which has led to significant regional differences in crops and intra-class variations within crop types [33][34][35][36][37][38].
Therefore, a method to fully capture the representative time series of croplands using DTW in large-scale cropland mapping is necessary. We selected Pakistan, a typical Asian country with complex cropland status, as the study area, and used NDVI time series from Landsat-8 images. An automated cropland extraction method based on the DTW algorithm and density-based spatial clustering of applications with noise (DBSCAN) (hereinafter referred to as ACE-DTW) was proposed to map croplands at a national scale. A multi-layer DBSCAN was used to generate standard time series of various land cover types in Pakistan automatically, and a cropland map of Pakistan during 2015 was constructed according to these standard time series. The results showed that ACE-DTW can produce a relatively complete description of cropland types and produce a high-precision cropland map of Pakistan.

Study Area and Data Sources
Pakistan is located in a temperate zone within Southern Asia and has a total area of approximately 790,000 km 2 . The country has both arid and semi-arid climates, which vary in different regions ( Figure 1a). In general, Pakistan has more croplands in the south and east than in the north and west. The growth of the same type of crop exhibits spatial differences in different areas [33]. The territory Remote Sens. 2020, 12, 3644 3 of 17 contains diverse terrain, including mountains (e.g., the Himalayas and Karakoram), plateaus, bare-rock areas, and deserts. As mountains, bare-rock areas, and deserts are not suitable for cultivation, cropland fields in the north and southwest are generally small in size. The southeast, which is dominated by plains, is covered by high-density cropland areas distributed along the Indus River [35].
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 17 territory contains diverse terrain, including mountains (e.g., the Himalayas and Karakoram), plateaus, bare-rock areas, and deserts. As mountains, bare-rock areas, and deserts are not suitable for cultivation, cropland fields in the north and southwest are generally small in size. The southeast, which is dominated by plains, is covered by high-density cropland areas distributed along the Indus River [35].
. A long history of intensive and meticulous farming in Pakistan has led to complex farming methods, a wide variety of crops, and different types of croplands adapted to different climatic and topographical conditions [39]. These factors present challenges for cropland extraction. Croplands in the southeastern plains are regularly shaped, closely arranged, and distributed over large areas. Croplands in the southwestern plateaus are scattered around cities. In addition, there are terraces in the northern mountain regions, and croplands in this region are distributed along valleys or rivers ( Figure 1a). As a country that typifies Asian agriculture, Pakistan exhibits rich crop varieties, significant spatial differences in crops, intra-class variations of the same crop in different regions, and complex small-scale farming methods, including crop rotation on the same plot in different seasons and intercropping of different crops.
Based on the Google Earth Engine (GEE), 422 frames of remotely sensed Landsat-8 images were selected; these images were captured between December 2014 and January 2016. A total of 1782 training sample points and 858 test points were generated randomly using ArcGIS, and highresolution remotely sensed images from the Google Earth were used for visual interpretations of land cover types (Table 1, Figures 1c,d and 2). During the interpretation process of the remotely sensed images, vegetable gardens could not be separated from cultivated croplands; thus, the croplands extracted in this study included vegetable gardens. A long history of intensive and meticulous farming in Pakistan has led to complex farming methods, a wide variety of crops, and different types of croplands adapted to different climatic and topographical conditions [39]. These factors present challenges for cropland extraction. Croplands in the southeastern plains are regularly shaped, closely arranged, and distributed over large areas. Croplands in the southwestern plateaus are scattered around cities. In addition, there are terraces in the northern mountain regions, and croplands in this region are distributed along valleys or rivers ( Figure 1a). As a country that typifies Asian agriculture, Pakistan exhibits rich crop varieties, significant spatial differences in crops, intra-class variations of the same crop in different regions, and complex small-scale farming methods, including crop rotation on the same plot in different seasons and intercropping of different crops.
Based on the Google Earth Engine (GEE), 422 frames of remotely sensed Landsat-8 images were selected; these images were captured between December 2014 and January 2016. A total of 1782 training sample points and 858 test points were generated randomly using ArcGIS, and high-resolution remotely sensed images from the Google Earth were used for visual interpretations of land cover types (Table 1, Figure 1c,d and Figure 2). During the interpretation process of the remotely sensed images, vegetable gardens could not be separated from cultivated croplands; thus, the croplands extracted in this study included vegetable gardens.   In addition to remotely sensed images, this study used two additional datasets for analysis: ALOS World 3D-30, with a spatial resolution of 30 m, and 2015 WATER VAPOR-AQUA/MODIS, with a spatial resolution of 0.1° and a temporal resolution of 1 month. ALOS World 3D-30 was used to remove high mountain areas (with elevation > 3600 m and slope > 30°) that are not suitable for planting crops. WATER VAPOR-AQUA/MODIS was used to divide Pakistan into three regions according to humidity (Figure 1b) as follows: the northern mountain region (Region I, water vapor < 461 cm/year), southwestern plateau region (Region II, 461 < water vapor < 977 cm/year), and southeastern plain region (Region III, water vapor > 977 cm/year). The division of the three regions reflects the significant spatial characteristics of climate and topography in Pakistan.
Four sets of reference data were used in this study, including cropland mapping results from two different supervised classifiers (i.e., classification and regression trees (CART) and support vector machines (SVM)) and two existing global cropland datasets (i.e., Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) and Global Food Security Support Analysis Data (GFSAD)). The two supervised classifiers used the same training samples and remotely sensed images as ACE-DTW. The two existing global cropland datasets contained data from 2015 and had a spatial resolution of 30 m. FROM-GLC extracted croplands through multi-season remotely sensed image classifiers, and GFSAD extracted croplands through supervised time series classifiers [40,41]. In addition to remotely sensed images, this study used two additional datasets for analysis: ALOS World 3D-30, with a spatial resolution of 30 m, and 2015 WATER VAPOR-AQUA/MODIS, with a spatial resolution of 0.1 • and a temporal resolution of 1 month. ALOS World 3D-30 was used to remove high mountain areas (with elevation > 3600 m and slope > 30 • ) that are not suitable for planting crops. WATER VAPOR-AQUA/MODIS was used to divide Pakistan into three regions according to humidity (Figure 1b) as follows: the northern mountain region (Region I, water vapor < 461 cm/year), southwestern plateau region (Region II, 461 < water vapor < 977 cm/year), and southeastern plain region (Region III, water vapor > 977 cm/year). The division of the three regions reflects the significant spatial characteristics of climate and topography in Pakistan.
Four sets of reference data were used in this study, including cropland mapping results from two different supervised classifiers (i.e., classification and regression trees (CART) and support vector machines (SVM)) and two existing global cropland datasets (i.e., Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) and Global Food Security Support Analysis Data (GFSAD)). The two supervised classifiers used the same training samples and remotely sensed images as ACE-DTW. The two existing global cropland datasets contained data from 2015 and had a spatial resolution of 30 m. FROM-GLC extracted croplands through multi-season remotely sensed image classifiers, and GFSAD extracted croplands through supervised time series classifiers [40,41].

NDVI Time Series
The study used 422 frames of multispectral remotely sensed Landsat-8 images, which were captured between December 2014 and January 2016 (one additional month beyond estimations for improving the accuracy of time series reconstruction) and constructed a monthly NDVI time series of Pakistan for 2015. NDVI is one of the most-used vegetation indices for studying vegetation phenology [42]. The NDVI values of all the pixels were calculated, and each pixel was assigned its monthly maximum NDVI value. As a result, 14 monthly NDVI maps of Pakistan were generated. To eliminate outliers in the time series, a Savitzky-Golay (S-G) filter was used to smooth out the 14 monthly NDVI time series. The S-G filter is widely used for time series reconstruction [43,44]. It used simplified least-squares-fit convolution as its theoretical basis. After the time series were smoothed out by the S-G filter, 12 monthly NDVI maps with reconstructed time series were generated to construct a monthly time series of Pakistan croplands for 2015 ( Figure 3).

NDVI Time Series
The study used 422 frames of multispectral remotely sensed Landsat-8 images, which were captured between December 2014 and January 2016 (one additional month beyond estimations for improving the accuracy of time series reconstruction) and constructed a monthly NDVI time series of Pakistan for 2015. NDVI is one of the most-used vegetation indices for studying vegetation phenology [42]. The NDVI values of all the pixels were calculated, and each pixel was assigned its monthly maximum NDVI value. As a result, 14 monthly NDVI maps of Pakistan were generated. To eliminate outliers in the time series, a Savitzky-Golay (S-G) filter was used to smooth out the 14 monthly NDVI time series. The S-G filter is widely used for time series reconstruction [43,44]. It used simplified least-squares-fit convolution as its theoretical basis. After the time series were smoothed out by the S-G filter, 12 monthly NDVI maps with reconstructed time series were generated to construct a monthly time series of Pakistan croplands for 2015 ( Figure 3).

DTW Distance
We calculated the DTW distance among the reconstructed NDVI time series of training samples. The DTW algorithm calculates the distance between two time series based on the minimum distance with the least amount of bending. The sum of the shortest path distances is called the DTW distance [45]. In this study, 12-month NDVI values of each pixel were inserted into an array as the NDVI time series of the pixel, and the DTW distance between two time series was used to characterize the similarity of two pixels. The DTW distance between two time series and the similarity between them were inversely proportional. The DTW distance is calculated as follows: DTW(Q, C) is the DTW distance between Q and C, which represents the sum of the minimum distances between each value of Qndvi and Cndvi. The cumulative distance is defined as γ(i,j), and the sequences are matched up from point (0, 0). The distance between the current point (i,j) and the initial

DTW Distance
We calculated the DTW distance among the reconstructed NDVI time series of training samples. The DTW algorithm calculates the distance between two time series based on the minimum distance with the least amount of bending. The sum of the shortest path distances is called the DTW distance [45]. In this study, 12-month NDVI values of each pixel were inserted into an array as the NDVI time series of the pixel, and the DTW distance between two time series was used to characterize the similarity of two pixels. The DTW distance between two time series and the similarity between them were inversely proportional. The DTW distance is calculated as follows: For Pixel Q: Q ndvi = [q 1 , q 2 , . . . , q i , . . . , q 12 ] For Pixel C: C ndvi = [c 1 , c 2 , . . . , c j , . . . , c 12] DTW(Q, C) is the DTW distance between Q and C, which represents the sum of the minimum distances between each value of Q ndvi and C ndvi . The cumulative distance is defined as γ(i,j), and the Remote Sens. 2020, 12, 3644 6 of 17 sequences are matched up from point (0, 0). The distance between the current point (i,j) and the initial point is d (i,j), and the minimum distance between point i on Q and point j on C is |q i -c j |. The cumulative distance γ (i,j) can be expressed as follows: · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · q12 − c1 · · · · · · q12 − c12 + min γ (11,11), γ (11,12), γ(12, 11) The dynamic results obtained from the above equation are presented in matrix γ. The result of γ (12,12) is the DTW distance between Q and C [46].

Multi-Layer DBSCAN
We clustered time series of training samples into different categories using a multi-layer DBSCAN based on DTW distance. Our method can ensure that more than 95% of the information obtained from NDVI describing characteristics of different land cover types is automatically captured. The DTW algorithm enables our method to avoid the subjectivity of previous methods effectively based on a priori knowledge and, thus, reduce omissions of cropland types in large areas ( Figure 4).
Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 17 point is d (i,j), and the minimum distance between point i on Q and point j on C is |qi-cj|. The cumulative distance γ (i,j) can be expressed as follows: (11,11), (11,12), (12,11) The dynamic results obtained from the above equation are presented in matrix γ. The result of γ (12,12) is the DTW distance between Q and C [46].

Multi-Layer DBSCAN
We clustered time series of training samples into different categories using a multi-layer DBSCAN based on DTW distance. Our method can ensure that more than 95% of the information obtained from NDVI describing characteristics of different land cover types is automatically captured. The DTW algorithm enables our method to avoid the subjectivity of previous methods effectively based on a priori knowledge and, thus, reduce omissions of cropland types in large areas ( Figure 4). DBSCAN is a density-based clusterer that uses the Manhattan distance as the distance between two points [47,48]. In the set of training points N, the DTW distance between the NDVI time series of two pixels (i.e., Ni and Nj) was calculated to characterize the similarity between two time series instead of using the Manhattan distance between two points. DBSCAN requires two parameters, namely, Eps and Minpts. Minpts is the minimum number of points in each category, which was set to 3 (Minpts = 3) in this study. Eps is the radius of the cluster and was obtained based on the characteristics of the set of DTW distances between training points. Specifically, the distances between all the points in the set of training points N were calculated to generate an array. The DBSCAN is a density-based clusterer that uses the Manhattan distance as the distance between two points [47,48]. In the set of training points N, the DTW distance between the NDVI time series of two pixels (i.e., N i and N j ) was calculated to characterize the similarity between two time series instead of using the Manhattan distance between two points. DBSCAN requires two parameters, namely, Eps and Minpts. Minpts is the minimum number of points in each category, which was set to 3 (Minpts = 3) Remote Sens. 2020, 12, 3644 7 of 17 in this study. Eps is the radius of the cluster and was obtained based on the characteristics of the set of DTW distances between training points. Specifically, the distances between all the points in the set of training points N were calculated to generate an array. The obtained array was used to produce an Eps list based on the k-means algorithm and mathematical expectations [49,50]. The first value of the list was taken as Eps for the clusterer.
The training points used in the study cover different areas in Pakistan; thus, the pairwise DTW distances between training points have a relatively large range. Consequently, a single time clusterer may produce a large number of noise points that cannot be clustered, resulting in the omission of significant amounts of useful information. To resolve this issue, a multi-layer DBSCAN was built to obtain a relatively complete time series of each type of land cover. The noise points generated by the first clusterer were used as a new point set to be reclustered. In the new round of clustering, Minpts = 3, and Eps was determined according to the new point set. If the number of noise points generated by the second clusterer was still >5% of the number of points in the original set of training points N, a new clusterer was performed until the number of noise points was <5% of the number of points in N. After the data were clustered multiple times, the typical time series of each land cover type captured >95% of the information in the training samples. Cropland categories covering relatively small areas or influenced by varying cropping methods were unlikely to be omitted. Finally, the mean value of the time series for each category was taken as the standard time series of the category.

Cropland Mapping
In this study, a group of standard time series were generated for six land cover types: cropland, bare land, vegetation, water body, artificial surface, and ice and snow. A group of standard time series were generated for each land cover type. The minimum DTW distance between one pixel and each standard time series in a group representing one land cover type was taken as the DTW distance between the pixel and that land cover type. The DTW distances between all pixels and all land cover types were calculated. Pixels whose DTW distance to the cropland type was smaller than that to any other land cover types were classified as cropland pixels, thereby completing the cropland mapping [32] ( Figure 2). The entire mapping process was completed on the GEE, which eased the task of obtaining remotely sensed images and performing calculations for large areas (The algorithm used in this study is available on GEE at: https://code.earthengine.google.com/492035f66bf79da7a6093bf008aaa838)

Data Analysis
The study used 819 test samples ( Figure 1, Table 1) to calculate the overall accuracies (OAs) of the cropland map produced through ACE-DTW and four referent datasets. OA (i.e., the sum of the true positives plus true negatives divided by the total number of individuals tested) is a popular measures of classification accuracy derived from a confusion matrix [51][52][53]. The accuracy calculation processes were completed on GEE (Algorithm of is available on GEE at https://code.earthengine. google.com/1cbe39a363e0cf25cf493965079d107c). ArcGIS was used for visualization of overall and detailed cropland mapping results.  The large number of standard cropland time series reflects the fact that Pakistan has a complicated mix of cropland types resulting from its large land area and significant regional differences in climate, topography, and farming methods [39]. This is particularly true in Region III where a total of 76 standard cropland time series were generated. To avoid omissions of cropland types, we used 89 standard cropland time series to extract croplands in three regions and map national-scale croplands in Pakistan.

Cropland Mapping Results for Pakistan
The cropland maps obtained through ACE-DTW using 89 standard cropland time series and four reference datasets (FROM-GLC, GFSAD, CART, and SVM) were relatively consistent in terms of cropland coverage in Pakistan. Cropland covered an area of approximately 2 × 10 5 km 2 , of which approximately 90% was distributed in the southeastern plain region. This region has the highest cropland density in Pakistan, and the croplands in this region are mainly distributed along the Indus River. Croplands in the southwest plateau region and the northern mountain region were mainly distributed along valleys and cities, and in these two regions, croplands are small, fragmented, and discretely distributed ( Figure 6). These results support previous findings regarding cropland distribution in Pakistan and similar areas [33]. The large number of standard cropland time series reflects the fact that Pakistan has a complicated mix of cropland types resulting from its large land area and significant regional differences in climate, topography, and farming methods [39]. This is particularly true in Region III where a total of 76 standard cropland time series were generated. To avoid omissions of cropland types, we used 89 standard cropland time series to extract croplands in three regions and map national-scale croplands in Pakistan.

Cropland Mapping Results for Pakistan
The cropland maps obtained through ACE-DTW using 89 standard cropland time series and four reference datasets (FROM-GLC, GFSAD, CART, and SVM) were relatively consistent in terms of cropland coverage in Pakistan. Cropland covered an area of approximately 2 × 10 5 km 2 , of which approximately 90% was distributed in the southeastern plain region. This region has the highest cropland density in Pakistan, and the croplands in this region are mainly distributed along the Indus River. Croplands in the southwest plateau region and the northern mountain region were mainly distributed along valleys and cities, and in these two regions, croplands are small, fragmented, and discretely distributed ( Figure 6). These results support previous findings regarding cropland distribution in Pakistan and similar areas [33]. Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 17 To evaluate the accuracy of national-scale cropland maps, a total of 819 test samples were used to calculate OAs of five cropland maps (Figure 7). The calculations showed that ACE-DTW achieved the highest accuracy in nationwide maps compared to the reference maps. The OA of ACE-DTW was 89.7%. The commission and omission errors were 6.2% and 15.9%, respectively. The OA of ACE-DTW was higher than that of supervised classifiers for both nationwide (89.7% vs. 78.2%, 78.8%) and regional extractions (north: 93.5% vs. 89.0%, 79.8%; southeast: 88.1% vs. 68.0%, 78.8%; southwest: 93.4% vs. 81.0%, 80.1%); the OA of the nationwide extraction performed by ACE-DTW was 10% higher than that of the supervised classifiers (Figure 7). The OA of ACE-DTW was higher than the accuracies of the two existing cropland datasets in nationwide extractions (89.7% vs. 87.1%, 83.1%). At a regional level, the OA of ACE-DTW was higher than that of the two datasets in the northern mountain and southeastern plain regions (north: 93.5% vs. 87.0%, 88.5%; southeast: 88.1% vs. 83.1%, 80.8%); its accuracy was slightly lower than that of FROM-GLC but was still higher than that of GFSAD in the southwest plateau region (93.4% vs. 95.6%, 88.5%) (Figure 7). The above comparison indicates that ACE-DTW outperformed the supervised classifiers and existing global cropland maps in the 2015 national-scale cropland mapping of Pakistan. To evaluate the accuracy of national-scale cropland maps, a total of 819 test samples were used to calculate OAs of five cropland maps (Figure 7). The calculations showed that ACE-DTW achieved the highest accuracy in nationwide maps compared to the reference maps. The OA of ACE-DTW was 89.7%. The commission and omission errors were 6.2% and 15.9%, respectively. The OA of ACE-DTW was higher than that of supervised classifiers for both nationwide (89.7% vs. 78.2%, 78.8%) and regional extractions (north: 93.5% vs. 89.0%, 79.8%; southeast: 88.1% vs. 68.0%, 78.8%; southwest: 93.4% vs. 81.0%, 80.1%); the OA of the nationwide extraction performed by ACE-DTW was 10% higher than that of the supervised classifiers (Figure 7). The OA of ACE-DTW was higher than the accuracies of the two existing cropland datasets in nationwide extractions (89.7% vs. 87.1%, 83.1%). At a regional level, the OA of ACE-DTW was higher than that of the two datasets in the northern mountain and southeastern plain regions (north: 93.5% vs. 87.0%, 88.5%; southeast: 88.1% vs. 83.1%, 80.8%); its accuracy was slightly lower than that of FROM-GLC but was still higher than that of GFSAD in the southwest plateau region (93.4% vs. 95.6%, 88.5%) (Figure 7). The above comparison indicates that ACE-DTW outperformed the supervised classifiers and existing global cropland maps in the 2015 national-scale cropland mapping of Pakistan.

Figure 7.
Cropland mapping accuracy in Pakistan (as a whole) and the three humidity regions. The overall accuracy of ACE-DTW is higher than that of reference maps in Pakistan, the northern mountain region (Region I), and southeastern plain regions (Region II), but is slightly lower than that of FROM-GLC in the southwestern plateau region (Region III).

Cropland Mapping Performance in Typical Sites
Based on the overall evaluation, the performance of five cropland maps in greater detail was investigated. Six typical sites with different spatial distributions of cropland were selected for comparative analysis (Figure 8). The actual cropland coverage in these areas was plotted using Google Earth (Figure 8, Truth Data). According to the truth data, sites a and b are in the mountain region, in which the croplands are strip-shaped and distributed along the mountain valley, extending to the mountain ridges on both sides. Sites c and d are in the plateau region; the croplands in site c account for approximately half of the area, whereas croplands in site d appear to be scattered within a small area. Sites e and f are in the plain region. Site e is located at the junction of mountain and plain, which has typical croplands with proper divisions in the plains and typical narrow terraces in the mountains. Site f has the highest cropland density in Pakistan, with a cropland coverage of over 80%.
.  The overall accuracy of ACE-DTW is higher than that of reference maps in Pakistan, the northern mountain region (Region I), and southeastern plain regions (Region II), but is slightly lower than that of FROM-GLC in the southwestern plateau region (Region III).

Cropland Mapping Performance in Typical Sites
Based on the overall evaluation, the performance of five cropland maps in greater detail was investigated. Six typical sites with different spatial distributions of cropland were selected for comparative analysis (Figure 8). The actual cropland coverage in these areas was plotted using Google Earth (Figure 8, Truth Data). According to the truth data, sites a and b are in the mountain region, in which the croplands are strip-shaped and distributed along the mountain valley, extending to the mountain ridges on both sides. Sites c and d are in the plateau region; the croplands in site c account for approximately half of the area, whereas croplands in site d appear to be scattered within a small area.
Sites e and f are in the plain region. Site e is located at the junction of mountain and plain, which has typical croplands with proper divisions in the plains and typical narrow terraces in the mountains. Site f has the highest cropland density in Pakistan, with a cropland coverage of over 80%.
The OA results for the three regions showed that ACE-DTW performed well in mapping croplands in the mountain and plain regions ( Figure 7); additionally, results exhibited in five maps of six sites showed similar high performance ( Figure 8). Results in six sites showed that the supervised classifiers exhibited the worst performance among the five maps of all six sites (Figure 8, CART, SVM); thus, we do not discuss them in detail. Among the other two cropland datasets, ACE-DTW achieved higher accuracy in sites a, b, e, and f, whereas the accuracy of FROM-GLC was higher in sites c and d (Figure 8). In sites a and b, the croplands extracted by ACE-DTW were more complete, whereas the other two maps had clear omissions. Sites c and d in the southeastern plateau region mainly contain bare land and small cropland areas, and all maps for these sites exhibit errors of commissions and omissions; however, FROM-GLC best reflected the actual situation. In site f, the croplands extracted by ACE-DTW excluded the most farm roads and small cities, followed by FROM-GLC and GFSAD. For site e at the mountain-plain junction, ACE-DTW managed to remove farm roads and extract terraces more completely at the same time.
to the mountain ridges on both sides. Sites c and d are in the plateau region; the croplands in site c account for approximately half of the area, whereas croplands in site d appear to be scattered within a small area. Sites e and f are in the plain region. Site e is located at the junction of mountain and plain, which has typical croplands with proper divisions in the plains and typical narrow terraces in the mountains. Site f has the highest cropland density in Pakistan, with a cropland coverage of over 80%.
.  Results from six sites showed that ACE-DTW exhibited a good performance in the mountain (sites a and b) and plain regions (sites e and f). To confirm the completeness and accuracy of the standard time series of ACE-DTW in these two regions, we selected sites a and f for analysis. At mountain site a, ACE-DTW captured two types of cropland that the other two cropland maps failed to capture (Figure 9, Curves 3-4); as a result, the other two maps underestimated the area of croplands. At plain site f, where cropland density is high, ACE-DTW managed to filter out more small cities and roads because their time series are similar to those of large cities (Figure 9, Curves 12 and 9); the other two cropland maps could not differentiate these small cities and roads and croplands. The OA results for the three regions showed that ACE-DTW performed well in mapping croplands in the mountain and plain regions ( Figure 7); additionally, results exhibited in five maps of six sites showed similar high performance (Figure 8). Results in six sites showed that the supervised classifiers exhibited the worst performance among the five maps of all six sites (Figure 8, CART, SVM); thus, we do not discuss them in detail. Among the other two cropland datasets, ACE-DTW achieved higher accuracy in sites a, b, e, and f, whereas the accuracy of FROM-GLC was higher in sites c and d (Figure 8). In sites a and b, the croplands extracted by ACE-DTW were more complete, whereas the other two maps had clear omissions. Sites c and d in the southeastern plateau region mainly contain bare land and small cropland areas, and all maps for these sites exhibit errors of commissions and omissions; however, FROM-GLC best reflected the actual situation. In site f, the croplands extracted by ACE-DTW excluded the most farm roads and small cities, followed by FROM-GLC and GFSAD. For site e at the mountain-plain junction, ACE-DTW managed to remove farm roads and extract terraces more completely at the same time.
Results from six sites showed that ACE-DTW exhibited a good performance in the mountain (sites a and b) and plain regions (sites e and f). To confirm the completeness and accuracy of the standard time series of ACE-DTW in these two regions, we selected sites a and f for analysis. At mountain site a, ACE-DTW captured two types of cropland that the other two cropland maps failed to capture (Figure 9, Curves 3-4); as a result, the other two maps underestimated the area of croplands. At plain site f, where cropland density is high, ACE-DTW managed to filter out more small cities and roads because their time series are similar to those of large cities (Figure 9, Curves 12 and 9); the other two cropland maps could not differentiate these small cities and roads and croplands. .

Discussion
ACE-DTW introduced a multi-layer DBSCAN to generate standard time series of various cropland types automatically, and, thereby, can capture cropland information at a large national scale (~790,000 km 2 ). Although various researchers discovered the advantages of DTW in mapping croplands [4,30,31,40], existing studies primarily focused on small-scale study areas (~200-20,000 km 2 ) [29][30][31][32] or a specific crop (e.g., rice) [4], ignoring the approaches to obtain complex standard time series at a large spatial scale. For example, Csillik et al. [32] chose two test areas: the first one is in Southern California with an area of 575 km 2 and the second one is in Northwestern Texas with an area of 594 km 2 ; Guan and Huang [4] mapped rice cropping systems in Vietnam. We used a total of 89 standard time series automatically acquired by DBSCAN for our cropland mapping in Pakistan, whereas in previous studies, only 4-16 standard time series were generally used to map croplands [4,10,31,32]. For example, Belgiu et al. [30] only used 4-5 standard time series in three study areas, and these standard time series were based on typical crops. Mondal et al. [31] used 15 standard time series in Himachal Pradesh, and these standard time series are the average of time series with similar shapes and magnitudes in 62 cropland grids. The judgment of which time series are similar was made by the researcher; hence, it is difficult to apply this method to a large area. Furthermore, by controlling the number of noise points, ACE-DTW can obtain more than 95% of the information regarding cropland samples nationwide ( Figure 4). When the number of training points is increased, ACE-DTW has the potential to obtain more accurate and complete standard time series to describe the complex cropland status automatically, whereas in previous studies, to improve the ability to obtain cropland information, a greater effort was required.
ACE-DTW can extract more complete croplands in mountain region and more accurate croplands in plain region compared to the reference cropland datasets [40,41] in Pakistan (Figures 8 and 9). Generally, the area of croplands in the mountain region is easily underestimated and the area of croplands in the plain region is easily overestimated [54,55]. Pflugmacher et al. [54] determined that maps tend to slightly under-predict certain small classes (croplands in mountain region in our study) and over-predict certain large classes (croplands in plain region in our study). Fritz et al. [55] found that three land cover products based on remotely sensed images missed a high proportion of cropland in the northern Sudan where the desert and the Nile Valley are located. In contrast, ACE-DTW obtained complete and accurate standard time series through multi-layer DBSCAN, to fully capture relatively complete cropland characteristics in mountain region and accurately distinguish croplands from other land cover types such as artificial surfaces in plain region. Particularly, croplands play increasingly important roles in mountain regions [56][57][58][59], and ACE-DTW can be used as an efficient method to map croplands there and might be able to help solve various problems related to ecosystem management [59], water retention and storage [60], and SDG (Sustainable Development Goal) 2 in mountain region [61].
However, ACE-DTW's ability to distinguish croplands from sparse vegetation is limited, particularly in the plateau region of Pakistan (Figures 6-8). Such misidentifications often occur in remote sensing-based cropland mapping studies [33,59,[62][63][64]. For example, Matton et al. [63] found that croplands are easily mixed with neighboring vegetation in fields lacking cropland management. Nabil et al. [64] found that spectral similarity with grasslands and other herbaceous vegetation limits accurate mapping of African cropland, especially in arid and semi-arid areas. In these study areas that are similar to the plateau region of Pakistan, the rainfall regime causes synchronized phenology between crops and natural vegetation; croplands which are fallow in the dry season and cultivated again in the rainy season [65] are spectrally similar to natural vegetation that withers in the dry season and grows in the rainy season [66] in NDVI time series; hence, it is difficult to separate crops from natural vegetation. Addressing this problem is a challenge and requires further investigation. In future studies, additional remotely sensed images-based indexes (e.g., the enhanced vegetation index of multispectral remotely sensed images [67] and the C-band like-polarized ratio of synthetic aperture radar images [12]) and the fusion of multi-source data (e.g., Landsat-7, Landsat-8, and MODIS [68])