An Unsupervised Crop Classification Method Based on Principal Components Isometric Binning

The accurate and timely access to the spatial distribution information of crops is of great importance for agricultural production management. Although widely used, supervised classification mapping requires a large number of field samples, and is consequently costly in terms of time and money. In order to reduce the need for sample size, this paper proposes an unsupervised classification method based on principal components isometric binning (PCIB). In particular, principal component analysis (PCA) dimensionality reduction is applied to the classification features, followed by the division of the top k principal components into equidistant bins. Bins of the same category are subsequently merged as a class label. Multitemporal Gaofen 1 satellite (GF-1) remote sensing images were collected over the southwest of Hulin City and Luobei County of Hegang City, Heilongjiang Province, China in order to map crop types in 2016 and 2017. Our proposed method was compared with commonly used classifiers (random forest, K-means and Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA)). Results demonstrate PCIB and random forest to have the highest classification accuracies, reaching 82% in 2016 in the southwest of Hulin City. In Luobei County in 2016, the accuracies of PCIB and random forest were determined as 81% and 82%, respectively. It can be concluded that the overall accuracy of our proposed method meets the basic requirements of classification accuracy. Despite exhibiting a lower accuracy than that of random forest, PCIB does not require a large field sample size, thus making it more suitable for large-scale crop mapping.


Introduction
Food security is crucial for the livelihood and economic development of the global population. The timely and accurate acquisition of crop planting area and its spatial distribution is of great significance for the accurate estimation of crop yield, the management of crop planting structure adjustments and the formulation of relevant regulatory policies [1][2][3][4]. The majority of current remote sensing-based crop classification methods fall into the category of supervised classification, including machine learning [5][6][7][8][9][10][11][12] and deep learning [13,14]. Such methods require the collection of ground sample data in order to train the classification model. The accuracy and spatial distribution of the sample data collected in the field also affects the classification accuracy. The recent increase in demand of large-scale crop classifications has resulted in the dramatic rise of ground sampling [8,15]. Consequently, annual field sampling consumes an extensive amount of manpower, material and financial resources [16]. Furthermore, the majority of the current crop classification techniques require current-year samples, while the sample collection, data processing and additional steps are highly time consuming, leading to a lag in classification data. This prevents the effective application of the classification results for the current agricultural season.
Current research focuses on reducing the dependence on large sample datasets while ensuring the accurate and timely classification of crops for the current season. At present, there exist three methods to directly classify crops using limited or no sample data from the current season: (i) the supervised classification of crops based on historical samples; (ii) semi-supervised classification; and (iii) unsupervised classification.
Historical-based classification is often established using auxiliary information such as the stability of the crop planting structure, or the use of past unchanged crop pixels as the "training samples" for the crops in the present season. For example, Hao [17] used historical year cropland data layer (CDL) data collected in Kansas, the United States to extract hypothetical samples. These samples were then screened via the Artificial Antibody Network (ABNet) method to obtain "training samples" for the classification of crops in the current season at 90% accuracy. Zhang [18] integrated historical information with current crop planting structure and spectral data to investigate new sample production and screening methods based on cluster analysis. The proposed seasonal crop classification model was able to attain a classification accuracy of 80.44% for samples located in Hulin City, China. Despite the high accuracy of crop classification based on historical data, such methods are limited in areas lacking historical samples.
Semi-supervised learning methods are able to leverage large amounts of relatively inexpensive unlabeled data with a limited amount of labeled data [19][20][21][22]. For example, Liu [20] proposed a semi-supervised support vector machine approach using a self-training algorithm for land cover classification in Yanji City, Jilin province, China, with the highest accuracy achieved for the marked and unmarked sample ratio of 1:3. In Hu [23], a cooperative training method based on the random forest and matrix completion collaborative representation (MCCR) classifiers was applied to map land cover using Landsat time-series (LTS) images of Wuhan City, China, between 2000 and 2015. Results revealed that the MCCR classifier exhibited the same working efficiency as that of the random forest classifier in the presence of data noise, yet the cooperative training method was able to improve the classification through the iterative learning of unlabeled data. However, these semi-supervised learning-based models are not suitable for the land cover classification of remote sensing images with a very small number of labeled and large unlabeled samples because of high computational complexities associated with the use of a greater number of kernels [24]. Ratle [25] proposed a semi-supervised remote sensing image classification framework based on neural networks, improving the classification accuracy and scalability of previous techniques. Solano [26] combined hierarchical correlation clustering (HCC) and an artificial neural network (ANN) for a semi-supervised classification method (HHC-ANN) based on the Normalized Difference Vegetation Index (NDVI) and phenological characteristics. The classification accuracy of HHC-ANN (84%) surpassed that of both K-means and ANN. Neural networks are associated with multiple benefits, including adaptivity, speed, fault tolerance and optimality [27]. Thus, we can choose robust semi-supervised classification models for efficient crop classification.
Unsupervised classification methods have been applied in order to efficiently process a large number of unlabeled samples in remote sensing images. Such methods do not require sample data and only rely on spectrum or texture information to extract and divide image features based on their statistical characteristics. For example, Gumma [28] and Xiong [29] used K-means and ISODATA classification for the clustering of Moderate-resolution Imaging Spectroradiometer (MODIS) 250 m 16-day-derived NDVI time series to determine a rice distribution map of South Asia and crop distribution across Africa. Hao [30] and Cai [31] combined MODIS and Landsat images with field data to determine NDVI time series. This was then used to extract the spatial distribution information of the key crops in northeast China and Hubei province via the ISODATA algorithm and spectral coupling technology. Sherrie [32] used K-means and Gaussian mixture model (GMM) to perform unsupervised classification of crops in the midwestern United States, with CDL data employed for ground truth. Classification accuracies were observed to vary with vegetation density. Iounousse [33] proposed an unsupervised approach based on a Probabilistic Neural Network (PNN) with the implementation of the cluster validity technique. The approach was applied for a sequence of seven time series of NDVI remote sensing images acquired by Landsat and Systeme Probatoire d'Observation de la Terre (SPOT) to build a land use map. The obtained results are compared with real land use and exhibited an overall accuracy of 96.56%. Venkata [34] combined K-means and a probabilistic neural network to propose a new classification method with improved classification accuracy.
Unsupervised classification mapping does not require a large number of ground samples. This is particularly true for the traditional K-means and ISODATA methods which are widely used in land cover and crop classification [28,32,35]. However, these algorithms are highly sensitive to outliers, high dimensionality, and noise. Thus, in order to improve classification accuracy, the dimensionality of remote sensing images must be reduced prior to classification. Principal component analysis (PCA) is able to maintain the maximum amount of original information for the elimination of noise and can be applied to classify land cover via satellite imagery time series [36][37][38][39][40]. For example, Abedini [39] performed the land classification of the Ulu Kinta Catchment in the state of Perak and across Peninsular Malaysia based on Landsat Thematic Mapper (TM) images. PCA was used to determine the optimum band combination for classification, followed by ISODATA clustering with an overall accuracy of 80%. Dharani [41] performed PCA dimensionality reduction and morphological operations using Landsat-8 images and retained the first three principal components for K-means clustering to determine the changes in land use and cover (LULC), with promising classification results. Therefore, in order to obtain high-precision crop distributions mapping, we can also use PCA to obtain low-dimensional and high-quality feature data prior to classification.
Clustering algorithms such as K-means and ISODATA need to randomly select K 0 samples from the dataset as the initial clustering center. However, the classification results are sensitive to the initial center. Thus, in order to avoid local optimality, multiple experiments are needed [42][43][44]. Data binning, commonly used in statistics, is often used to merge similar pixels in image processing. We attempted to integrate this method with PCA for the classification procedure. Although PCA and data binning have been widely applied in different fields, we combined and applied them to crop classification in remote sensing, which is a new idea. The principal components, regarded as a series of continuous variables, were divided into a number of bins. Unlike K-means and ISODATA, the process of binning does not require multiple iterations per pixel, and is suitable for application of high-resolution remote sensing images with a large number of pixels in crop classification.
Therefore, in the current paper, we propose a crop classification method based on principal components isometric binning (PCIB), taking the southwest of Hulin City and Luobei County of Hegang City, Heilongjiang Province, China as the study areas. We explore the feasibility of our proposed method via comparisons with the traditional random forest, K-means and ISODATA approaches, and demonstrate PCIB as a promising approach for crop classification mapping without the requirement of seasonal sample data.

Study Area
The southwest of Hulin City and Luobei County, with a total area of 3000 km 2 in Heilongjiang Province, were selected as the research area ( Figure 1). The planting calendar is relatively stable, with small inter-annual differences. The growing season of the key crops generally lies from March to October. For example, maize is sown in late April to early May, reaches maturity in September and is harvested around October. Rice is planted in April and matures in September. The two research areas are located in the main grain-producing areas of Heilongjiang Province and are less affected by other field objects. At the same time, these areas can effectively compare the regional differences and verify the feasibility of the method proposed in this paper.

Multitemporal GF-1 Data
Since the revisit cycle of GF-1 image is 4 days and the spatial resolution is 16 m, it has the characteristics of large coverage area and high spatial and temporal resolution. We selected GF-1 wide field view (WFV) images during the main growing seasons of maize and rice in 2016 and 2017 (April-September) with cloud cover <10% ( Table 1). The data were sourced from the China Centre for Resource Satellite Data and Application (CRESDA) and were the level-one product (L1) [18]. Field samples were used to verify the accuracy of our proposed classification approach. The sampling of land features included recording latitude and longitude coordinates with a handheld The two research areas are located in the main grain-producing areas of Heilongjiang Province and are less affected by other field objects. At the same time, these areas can effectively compare the regional differences and verify the feasibility of the method proposed in this paper.

Multitemporal GF-1 Data
Since the revisit cycle of GF-1 image is 4 days and the spatial resolution is 16 m, it has the characteristics of large coverage area and high spatial and temporal resolution. We selected GF-1 wide field view (WFV) images during the main growing seasons of maize and rice in 2016 and 2017 (April-September) with cloud cover <10% ( Table 1). The data were sourced from the China Centre for Resource Satellite Data and Application (CRESDA) and were the level-one product (L1) [18]. Field samples were used to verify the accuracy of our proposed classification approach. The sampling of land features included recording latitude and longitude coordinates with a handheld global positioning system (GPS), manually noting vegetation type and collecting photos. Because of the ISPRS Int. J. Geo-Inf. 2020, 9, 648 5 of 24 large plots of Heilongjiang, most of the sampling was taken along the roadside of the field. In addition to maize and rice, we also collected land cover samples of wheat plots, greenhouses, woodland, water bodies and other land types. However, these land cover types were not the focus of the study and thus their sampling was limited. These land cover types were combined into the "others" category during classification.
A total of 322 and 455 samples were taken of the southwest of Hulin City in 2016 and 2017, respectively. In Luobei County, Hegang City, a total of 383 samples were collected in 2016 and 406 in 2017 ( Table 2). All samples within each year were used to build the supervised classifier, whereby the samples were randomly divided into training samples (2/3 of the total sample size) and verification samples (1/3 of the total sample size). Unsupervised classification methods require the same verification samples as supervised classification to determine classification accuracy. One-third of the training samples were taken randomly to determine the category label for the unsupervised classification results.

Methods
The workflow of the proposed method can be divided into the following five components: data preprocessing, feature value selection, comparison of classification methods, category label determination and accuracy evaluation ( Figure 2). In brief, the GF-1 WFV automatic processing and sharing platform developed by our team was used to preprocess the image and sample data. Following this, the Near-Infrared Reflectivity (NIR) reflectance and vegetation indices Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) were selected as the characteristic values for the classification. The images were then classified using the random forest, K-means, ISODATA and PCIB techniques. Category labels were determined for the classification results of the K-means, ISODATA and PCIB methods. Finally, the results were verified and compared between methods.

Data Preprocessing
The GF-1 images and field sample data were stored using the Raster Dataset Clean and Reconstitution Multi-Grid (RDCRMG) grid system developed by China Agricultural University. Based on the C # and the Geospatial Data Abstraction Library (GDAL), procedures such as radiometric calibration, ortho-rectification and the image registration were performed for all data [45,46]. Radiometric calibration was used to eliminate the errors generated by the sensor and convert the dimensionless digital number (DN) value recorded by the sensor into the radiation brightness or reflectivity of the atmospheric top layer. Based on the official radiometric calibration coefficient updated by the China Resources Satellite Application Center, we radiometrically calibrated the GF-1 data as follows: where L e (λ e ) is the radiation brightness after conversion, DN is the observed satellite load value, Gain is the calibration slope, and Offset refers to the offset of absolute calibration coefficient.

Data Preprocessing
The GF-1 images and field sample data were stored using the Raster Dataset Clean and Reconstitution Multi-Grid (RDCRMG) grid system developed by China Agricultural University. Based on the C # and the Geospatial Data Abstraction Library (GDAL), procedures such as radiometric calibration, ortho-rectification and the image registration were performed for all data [45,46]. Radiometric calibration was used to eliminate the errors generated by the sensor and convert the dimensionless digital number (DN) value recorded by the sensor into the radiation brightness or reflectivity of the atmospheric top layer. Based on the official radiometric calibration coefficient updated by the China Resources Satellite Application Center, we radiometrically calibrated the GF-1 data as follows: where ( ) is the radiation brightness after conversion, DN is the observed satellite load value, Gain is the calibration slope, and Offset refers to the offset of absolute calibration coefficient.

Feature Selection for Classification
The selection of features for remote sensing imagery classification is a function of the classification objectives and accuracy, as well as the spatial and temporal scales. Potential features for crop classification include spectral, spatial, temporal and polarization characteristics, digital elevation model (DEM) and additional auxiliary information [5,11]. In order to identify the maize and rice crops within the study area, we selected the NIR spectral signal, the vegetation indices NDVI and NDWI, and temporal characteristics including categorical feature variables (Table 3).

Feature Selection for Classification
The selection of features for remote sensing imagery classification is a function of the classification objectives and accuracy, as well as the spatial and temporal scales. Potential features for crop classification include spectral, spatial, temporal and polarization characteristics, digital elevation model (DEM) and additional auxiliary information [5,11]. In order to identify the maize and rice crops within the study area, we selected the NIR spectral signal, the vegetation indices NDVI and NDWI, and temporal characteristics including categorical feature variables (Table 3). Table 3. Feature value selection. R, G and NIR are the reflectance of red, green and near-infrared bands, respectively.

Feature Value Formula Application
Near-Infrared Reflectivity NIR Canopy structure Normalized Difference Vegetation Canopy structure, water content Figure 3 presents the time-series curves of NDVI and NDWI during the growth periods of rice and maize in Luobei County, Hegang City in 2016. The growth period of rice and maize was very close, the NDVI and NDWI curves had a high similarity. The NDVI and the NDWI differed greatly in the early growth stage and differed slightly in the middle and late stages.  Figure 3 presents the time-series curves of NDVI and NDWI during the growth periods of rice and maize in Luobei County, Hegang City in 2016. The growth period of rice and maize was very close, the NDVI and NDWI curves had a high similarity. The NDVI and the NDWI differed greatly in the early growth stage and differed slightly in the middle and late stages.

Random Forest Classification
We employ the supervised classification of the target crops using the random forest training model on the feature values extracted from the GF-1 time series. Random forest classification (RF) is a multi-decision tree classification method that was proposed by Breiman [47] in 2001. In particular, multiple classification and regression trees (CART) were constructed through random resampling of the data and feature variables. The classification of the data was then executed via multi-decision tree voting [48].
We used the Python tool scikit-learn [49] to execute the random forest classifier, with a total of 100 trees generated. The number of features of each tree is equal to the square root of the number of original features, and the number of samples selected for each tree is consistent with the number of training sets.

Unsupervised Classification
We performed K-means and ISODATA unsupervised learning for the clustering of feature data into crop types using ENVI (ver. 5.1, ESRI). K-means and ISODATA are the most basic and commonly used unsupervised classification algorithms. They are simple in principle and easy to implement, and are widely employed in the remote sensing field [28][29][30]35,39].

K-Means
K-means partitions m samples into k clusters by alternately assigning samples to the nearest cluster centroid, as measured by Euclidean distance. The cluster centroids are updated using the mean of the samples assigned to the cluster.

ISODATA
ISODATA initially calculates the average of evenly distributed classes within the data space, and subsequently employs the rule of minimum distance to iteratively aggregate the remaining pixels. The mean value is recalculated in each iteration, and the pixel is reclassified according to the new

Random Forest Classification
We employ the supervised classification of the target crops using the random forest training model on the feature values extracted from the GF-1 time series. Random forest classification (RF) is a multi-decision tree classification method that was proposed by Breiman [47] in 2001. In particular, multiple classification and regression trees (CART) were constructed through random resampling of the data and feature variables. The classification of the data was then executed via multi-decision tree voting [48].
We used the Python tool scikit-learn [49] to execute the random forest classifier, with a total of 100 trees generated. The number of features of each tree is equal to the square root of the number of original features, and the number of samples selected for each tree is consistent with the number of training sets.

Unsupervised Classification
We performed K-means and ISODATA unsupervised learning for the clustering of feature data into crop types using ENVI (ver. 5.1, ESRI). K-means and ISODATA are the most basic and commonly used unsupervised classification algorithms. They are simple in principle and easy to implement, and are widely employed in the remote sensing field [28][29][30]35,39].

K-Means
K-means partitions m samples into k clusters by alternately assigning samples to the nearest cluster centroid, as measured by Euclidean distance. The cluster centroids are updated using the mean of the samples assigned to the cluster.

ISODATA
ISODATA initially calculates the average of evenly distributed classes within the data space, and subsequently employs the rule of minimum distance to iteratively aggregate the remaining pixels. The mean value is recalculated in each iteration, and the pixel is reclassified according to the new mean value. This process continues until the number of pixels in each class changes less than the selected threshold or reaches the maximum number of iterations. The parameters required for the ISODATA classification include the initial clustering center and the number of categories [50].

Principal Components Isometric Binning Classification
The PCIB proposed in this paper is based on PCA dimensionality reduction and principal components isometric binning. Figure 4 presents the classification process. mean value. This process continues until the number of pixels in each class changes less than the selected threshold or reaches the maximum number of iterations. The parameters required for the ISODATA classification include the initial clustering center and the number of categories [50].

Principal Components Isometric Binning Classification
The PCIB proposed in this paper is based on PCA dimensionality reduction and principal components isometric binning. Figure 4 presents the classification process.  Let m denote the number of pixels in the target area, and assume that n is the feature variable, then matrix X can be used to represent the m × n data as follows: where x mn represents the value of the mth pixel and the nth feature vector. The correlation coefficient matrix R is then calculated as: where r ij (i = 1,2, . . . ,m, j = 1,2, . . . ,n) is the correlation coefficient of x i and x j . The eigenequation |λE − R| = 0 is solved for eigenvalue λ j (j = 1, 2, . . . , n) and λ 1 ≥ λ 2 ≥ . . . ≥ λ n ≥ 0. Eigenvector e j (j = 1, 2, . . . , n) corresponds to eigenvalue λ j . When the cumulative contribution rate of the principal components following the eigenvector transformation reaches a high percentage, the required dimensionality reduction of the data is attained.
In this study, we apply the cumulative contribution rate to determine k. When the cumulative contribution rate of the current k principal components reaches more than 70%, the former k principal components are retained, that is, the k-dimensional data is reduced. The cumulative contribution rate of the first k principal components is defined as , namely, the number of original variables information extracted from the first k principal components.
PCA can be used for dimensionality reduction, classification and feature extraction [51,52].

Principal Components Isometric Binning
We reduce the original dataset to k-dimensions to obtain matrix Y, described as follows: where k is the number of principal components following the dimensionality reduction and m is the number of pixels in the study area. Matrix Y is divided into bins of equal distance, which are then further divided into k1 bins. The detailed steps are as follows: (i) The first principal component is divided into k 11 intervals. All pixels falling into each interval are gathered together, with bin distance h1 = r j1 −r i1 k11 , where r j1 is the maximum value of the column and r i1 is the minimum value.
(ii) The second principal component is divided into k 11 intervals corresponding to the binning result of the first principal component. Each interval is then divided into k 12 sub-intervals. All pixels in each sub-interval are divided into a bin, and bin distance h = is divided into k 11 * k 12 * . . . * k 1k bins, that is, k1 = k 11 * k 12 * . . . * k 1k , until the end. A frequency distribution histogram is drawn to show the binning situation intuitively.
By exploring the results of the first binning, we found that there will be confusion if a large number of pixels fall into a bin. In order to improve the classification accuracy, these bins are regathered. More specifically, each bin is divided into k2 (k2 = k 21 *k 22 * . . . *k 2k ) bins, with the division steps equal to those of the first binning.

Determination of Category Labels
The binning results do not represent crop type, and crop labels need to be assigned to them. Visual recognition was combined with minimal ground auxiliary data (auxiliary identification data mentioned in Section 2.2.2.) to perform visual discrimination via ArcMap (ver. 10.2, ERSI) using a 432 band false color synthesis and Google Earth high-resolution images.
The first binning results in the discrimination of water bodies, grasslands, woodlands, bare land and towns (denoted as "other classes"), while the naming of rice and maize is based on the auxiliary ground samples. If a bin contains a single sample type, it is denoted as a single class; if multiple sample types are present, it is denoted as a confusion class. As the accuracy of the first binning results must be evaluated to determine the value of k1, a label for each confusion bin is determined. This allows for the selection of the ground objects compromising the largest proportion in this bin as the bin category.
We labelled the second binning result using both the aforementioned method and the auxiliary ground samples.

Accuracy Assessment
One-third of the total sample size collected in the field was taken as the verification sample, with the confusion matrix employed to evaluate the accuracy of the classification results. Evaluation indicators include the total, producer and user accuracies and Kappa coefficient: where TP and

Effect of Parameter Selection
The selection of parameters is a critical step in our proposed method as it has a direct impact on the binning results. In the following, we detail the selection of two key user-defined parameters.
(1) Number of the first bin, k1 The determination of parameter k1 is key. If the value of k1 is too small, the classification results will be highly inaccurate; whereas, if k1 is too high, visual recognition proves to be difficult. Thus, we set the initial range to 5 ≤ k1 ≤ 50, with the various k (number of principal components) used to further determine k1. Here, based on the dataset after dimensionality reduction, only two cases are discussed (k = 1, 2). Taking the southwest of Hulin City as an example, in the following, we determine parameter k1.
For k = 1, an interval of 5 is used to divide the k-dimensional data into 5, 10, 15, . . . , 50 bins and calculate the classification accuracy of each bin ( Figure 5, left). The classification accuracy increases with k1 until the maximum accuracy is reached for k1 = 35. Thus, we set 30 ≤ k1 ≤ 40. cases are discussed ( = 1, 2). Taking the southwest of Hulin City as an example, in the following, we determine parameter 1. For = 1, an interval of 5 is used to divide the -dimensional data into 5, 10, 15, ... , 50 bins and calculate the classification accuracy of each bin ( Figure 5, left). The classification accuracy increases with 1 until the maximum accuracy is reached for 1 = 35. Thus, we set 30 ≤ 1 ≤ 40.  (Table 4). According to a certain interval, 15 values of k1 are selected relatively uniformly. Figure 5 (right) demonstrates the classification accuracy corresponding to each value. The peak classification accuracy is observed for = 3 and = 4. Thus, k1 is set as follows: The value of k1 that corresponds to the highest classification accuracy is selected as the final value.   When k = 2, the k-dimensional data are divided into k1 = k 11 * k 12 bins. The contribution rate of the first principal component is greater than that of the second principal component, k 11 > k 12 , and 5 ≤ k 11 * k 12 ≤ 50, thus, there are 51 combinations of k 11 and k 12 that satisfy the above conditions (Table 4). According to a certain interval, 15 values of k1 are selected relatively uniformly. Figure 5 (right) demonstrates the classification accuracy corresponding to each value. The peak classification accuracy is observed for k 12 = 3 and k 12 = 4. Thus, k1 is set as follows: 12 * 3, . . . , 16 * 3, 12 ≤ k 11 ≤ 16, k 12 = 3 5 * 4, . . . , 12 * 4, 5 ≤ k 11 ≤ 12, k 12 = 4 (10) The value of k1 that corresponds to the highest classification accuracy is selected as the final value.
(2) Number of the second bin, k2 Taking the southwest of Hulin City as an example, the value range of k2 is derived in a similar way. We initially set 3 ≤ k2 ≤ 20. According to the different k(number of principal components) values, k2 is then determined as follows: (i) When k = 1: 11 ≤ k2 ≤ 16 (11) (ii) When k = 2: k2 = The confusion bin for k1 = X (the classification accuracy is highest when k1 = X) is divided into k2 bins again, and the accuracy of the binning results is evaluated. The value of k2 is selected as that with the highest classification accuracy.

Comparison of PCIB Results
Taking the southwest of Hulin City for the example, we compared the results of the two sets of data binning in 2016.
The contribution rate of the first and second principal components reached 70% following the principal component dimension reduction of the multidimensional characteristic data, and thus k = 2. The two-dimensional data were divided into k1 bins, with the corresponding frequency distribution histograms depicted in Figure 6.
Taking k1 = 12 * 4 as an example, Table 5 reported the number of pixels falling into the bins.   Taking k1 = 12 * 4 as an example, Table 5 reported the number of pixels falling into the bins. We assign a crop type label to each bin according to the method described in Section 3.6. The bins with the same label are merged to obtain the classification result map of the region corresponding to the k1 values. The overall accuracy is reported in Table 6. The maximum overall accuracy is observed for k1 = 12 * 4 at 75.23%. The k1 = 12 * 4 confusion bin is divided for the second time. Table 7 reports the overall classification accuracy of the k2 values. The overall accuracy peaks at k2 = 5 * 4, reaching 81.65%.  Figure 7 presents the corresponding 2016 crop distribution. Figure 7a,b are able to distinguish between water and forests. However, in Figure 7a, the road was misclassified into maize, and rice was misclassified into maize. In Figure 7b, this misclassification was greatly reduced and the rice plot outline is clear. This indicates the improvement of the crop classification accuracy via the second binning.

Comparison of Classification Methods
We applied the random forest (RF), K-means, ISODATA and PCIB classifiers to the preprocessed images and based on the verification samples, we used the confusion matrix to evaluate the accuracy of the classification results. Evaluation indicators include the total accuracy, user accuracy, producer accuracy and Kappa coefficient. The classification accuracy of the PCIB method in the southwest of Hulin city and Luobei County of Hegang City was slightly lower than that of random forest, yet it was consistently more accurate than K-means and ISODATA (Table 8). In 2016, the PCIB method exhibited the most accurate classification in the southwest of Hulin City, reaching 82%, which is equal to that of random forest and higher than that of K-means (76%) and ISODATA (78%). The PCIB classification accuracy of Luobei County, Hegang City was reduced in 2017 (76%), and the accuracy of the other three methods in this area was also low, at 79%, 74% and 75% for random forest, K-means and ISODATA, respectively. This is attributed to the limited availability of the images for the region in 2017 (early April to late June). However, images in July, when rice is in the jointing stage with vigorous growth, are key to distinguishing maize from rice. Thus, the absence of images from July to September affected the final classification accuracy.  Figure 7a,b are able to distinguish between water and forests. However, in Figure 7a, the road was misclassified into maize, and rice was misclassified into maize. In Figure 7b, this misclassification was greatly reduced and the rice plot outline is clear. This indicates the improvement of the crop classification accuracy via the second binning.

Comparison of Classification Methods
We applied the random forest (RF), K-means, ISODATA and PCIB classifiers to the preprocessed images and based on the verification samples, we used the confusion matrix to evaluate the accuracy of the classification results. Evaluation indicators include the total accuracy, user accuracy, producer accuracy and Kappa coefficient. The classification accuracy of the PCIB method in the southwest of Hulin city and Luobei County of Hegang City was slightly lower than that of random forest, yet it was consistently more accurate than K-means and ISODATA (Table 8). In 2016, the PCIB method exhibited the most accurate classification in the southwest of Hulin City, reaching 82%, which is equal to that of random forest and higher than that of K-means (76%) and ISODATA (78%). The PCIB classification accuracy of Luobei County, Hegang City was reduced in 2017 (76%), and the accuracy  The classification accuracy of the PCIB method is generally slightly lower than that of random forest, but it is consistently higher than K-means and ISODATA. For years where ground and historic samples are insufficient, the classification accuracy of PCIB can potentially meet the classification requirements.
Confusion matrices are commonly used to evaluate the accuracies of algorithms. However, due to the small sample size of our data, applying just the confusion matrix data size to evaluate the classification accuracy is insufficient. Thus, we integrate the crop distribution map to assess the quality of the classification results. Figure 8 depicts the crop distribution map of 2016 for Luobei County. The maps are superimposed with false color images in the area. Figure 8a-d correspond to the random forest, K-means, ISODATA and PCIB results, respectively. All four methods are able to distinguish water and forests from maize and rice, but the magnified area demonstrates that the misclassification of roads as maize was more serious for the K-means, ISODATA and random forest classification methods. Moreover, the rice plot outline is not clear for K-means and ISODATA. The PCIB method exhibits improved classification results, with the clear identification of the rice plot contour, roads and buildings. This can be attributed to the initial PCA dimensionality reduction performed by the PCIB method, eliminating the influence of noise on the classification results by maintaining as much of the original information as possible, thus achieving better classification results on the crop distribution map compared to the other methods.

Advantanges, Deficiencies and Improvenments of PCIB
We propose a classification method based on principal components equidistant binning, denoted as PCIB, for the extraction of crop planting structure. PCIB essentially performs PCA dimensionality reduction on the classification features and subsequently divides the top k principal components into equidistant bins. The category of each bin is visually recognized based on a small number of field samples, and the same category bins are merged as a class label. PCIB includes PCA and data binning, PCA has low computational costs and is not limited by parameters. Furthermore, the dimensionality reduction technique can effectively remove the redundancy and noise of data by keeping the original information as far as possible, and thus it is employed widely in the remote sensing field [53][54][55]. Data binning is a common method in statistics, for the merging similar pixels in image processing. In statistics, nonparametric estimation [56] is mainly used for binning in datasets with unknown probability density, including histogram, kernel density estimation and nonparametric regression. In this study, we chose the simplest histogram, which has the advantages of constant bin distance, rapid binning and high efficiency, and is suitable for large data volumes.
In recent years, soft clustering methods such as GMM and PNN have also been gradually applied in the field of remote sensing for land cover and crop type, and promising classification results have been obtained. The GMM method has a long run time because it includes the time used for searching optimal components [32,57]. PNN is a special form of radial basis function neural networks (RBFNN), and it provides satisfactory results if the initial target classes are defined correctly. So, finding the basis function centers with their appropriate number is an important step

Advantanges, Deficiencies and Improvenments of PCIB
We propose a classification method based on principal components equidistant binning, denoted as PCIB, for the extraction of crop planting structure. PCIB essentially performs PCA dimensionality reduction on the classification features and subsequently divides the top k principal components into equidistant bins. The category of each bin is visually recognized based on a small number of field samples, and the same category bins are merged as a class label. PCIB includes PCA and data binning, PCA has low computational costs and is not limited by parameters. Furthermore, the dimensionality reduction technique can effectively remove the redundancy and noise of data by keeping the original information as far as possible, and thus it is employed widely in the remote sensing field [53][54][55]. Data binning is a common method in statistics, for the merging similar pixels in image processing. In statistics, nonparametric estimation [56] is mainly used for binning in datasets with unknown probability density, including histogram, kernel density estimation and non-parametric regression. In this study, we chose the simplest histogram, which has the advantages of constant bin distance, rapid binning and high efficiency, and is suitable for large data volumes.
In recent years, soft clustering methods such as GMM and PNN have also been gradually applied in the field of remote sensing for land cover and crop type, and promising classification results have been obtained. The GMM method has a long run time because it includes the time used for searching optimal components [32,57]. PNN is a special form of radial basis function neural networks (RBFNN), and it provides satisfactory results if the initial target classes are defined correctly. So, finding the basis function centers with their appropriate number is an important step to achieve suitable classification. However, this process has a high requirement for space-time complexity, so computational optimization is essential [33,34,[58][59][60]. Compared with these clustering methods, PCIB has high computational efficiency, but has some deficiencies in classification accuracy. In order to solve this problem, we can explore such dynamic methods as kernel density estimation and non-parametric regression to divide the dataset and automatically determine bins in our future research.

Analysis of Sources of the Errors
In this study, the classification accuracy of Luobei County, Hegang City in 2017 was low, which we believe resulted from two aspects. On the one hand, the available images for the region in 2017 only cover early April to late June. In July, rice is in the jointing stage with vigorous growth. Images from this period may be beneficial to distinguish maize from rice. However, the absence of images from July to September affects the classification accuracy. On the other hand, the planting structure of Luobei County was adjusted in 2017. The planting area of maize was reduced, and instead small amounts of soybeans, grains, fruits and vegetables were planted, which increased the crop diversity in this region and had a certain impact on the classification results [32].
Therefore, we consider the fusion of GF-1, Sentinel-2 and Landsat-8 medium-resolution images. Multiple data sources can alleviate the problem of missing images. In addition, both Sentinel-2 and Landsat-8 contain short-wave infrared bands, which perform well in crop classification [9].

Comparison of Computational Complexity of Three Clustering Algorithms
Computational complexity is an important evaluation indicator of an algorithm. Thus, we compared the clustering time between the PCIB, K-means and ISODATA clustering algorithms. To make the comparison more complete, we selected a commonly used clustering benchmark dataset (iris dataset). Our model was implemented in the Windows 10 Professional operation system with 1.60 GHz Intel Core i5-8250U. The times of the three clustering methods were shown in Table 9. K-means exhibits a faster clustering time than ISODATA, as the latter automatically merges and splits classes in the clustering process, thus making the algorithm more complex. The PCIB clustering time is far shorter than that of K-means and ISODATA, it does not require multiple iterations per pixel and can realize rapid classification.

Additional Experiments
We chose Luobei County and the southwest of Hulin City in Heilongjiang Province as the research areas. Both regions have a temperate monsoon climate and are cultivated once a year. The planting structure is relatively simple, and the main crops are maize and rice. However, considering that crop characteristics vary with different planting regions, we also selected the northwest of Wuwei City, Gansu Province (Figure 9) as the research area for additional experiments. The region has a temperate continental arid climate, which is characterized by a large temperature difference between day and night, sufficient sunshine, drought and little rain. With its extremely advantageous geographical, soil, water and climatic resources, the region provides a good planting environment for crops such as maize, pear, wheat and grape. ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 19 of 23 We also selected GF-1 WFV images during the main growing seasons of main crops in 2018 (April-September) with cloud cover <10% as remote sensing data sources. The image phase is shown in Table 10. The samples for this region were also from field research. Sampling points are regularly distributed within the area. The collected data include crop types, growth situations, geographical coordinates and field photographs. The number of sample points are shown in Table 11. Others include buildings, water and small amounts of pepper, soybeans and greenhouses. The PCIB method described above was used to classify the region. The classification results are shown in Figure 10. Additionally, the classification accuracies are shown in Table 12, with the overall accuracy reaching 84%. The region is rich in crop types, including corn, spring wheat, grapes and pears. As crop planting is concentrated, good classification results are obtained. This also verifies the feasibility of PCIB method in areas with abundant crop types. We also selected GF-1 WFV images during the main growing seasons of main crops in 2018 (April-September) with cloud cover <10% as remote sensing data sources. The image phase is shown in Table 10. The samples for this region were also from field research. Sampling points are regularly distributed within the area. The collected data include crop types, growth situations, geographical coordinates and field photographs. The number of sample points are shown in Table 11. Others include buildings, water and small amounts of pepper, soybeans and greenhouses. The PCIB method described above was used to classify the region. The classification results are shown in Figure 10. Additionally, the classification accuracies are shown in Table 12, with the overall accuracy reaching 84%. The region is rich in crop types, including corn, spring wheat, grapes and pears. As crop planting is concentrated, good classification results are obtained. This also verifies the feasibility of PCIB method in areas with abundant crop types. ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 20 of 23

Conclusions
In the current paper, we propose a classification method (PCIB) with GF-1 remote sensing images acquired during the growing season as the data source to extract the planting information of the key crops in the southwest of Hulin City and Luobei County, Heilongjiang province, China. The underlying motivation of this work is to reduce the dependence of crop distribution mapping on a large number of field samples and to determine the spatial distribution information of crops in a timely manner. We compare PCIB with the traditional random forest, K-means and ISODATA classifiers. The key results can be summarized as follows: (1) The overall accuracy of the PCIB method in the southwest of Hulin City in 2016 reaches 82%, exceeding the K-means and ISODATA algorithms. In 2017, Luobei County of Hegang City exhibited the lowest PCIB classification accuracy, with the other three methods also exhibiting low accuracies, (79%, 74% and 75% for random forest, K-means and ISODATA, respectively). Although the overall accuracy of PCIB is slightly lower than that of the random forest classifier,

Conclusions
In the current paper, we propose a classification method (PCIB) with GF-1 remote sensing images acquired during the growing season as the data source to extract the planting information of the key crops in the southwest of Hulin City and Luobei County, Heilongjiang province, China. The underlying motivation of this work is to reduce the dependence of crop distribution mapping on a large number of field samples and to determine the spatial distribution information of crops in a timely manner. We compare PCIB with the traditional random forest, K-means and ISODATA classifiers. The key results can be summarized as follows: (1) The overall accuracy of the PCIB method in the southwest of Hulin City in 2016 reaches 82%, exceeding the K-means and ISODATA algorithms. In 2017, Luobei County of Hegang City exhibited the lowest PCIB classification accuracy, with the other three methods also exhibiting low accuracies, (79%, 74% and 75% for random forest, K-means and ISODATA, respectively). Although the overall accuracy of PCIB is slightly lower than that of the random forest classifier, it meets the mapping accuracy requirements for years where large amounts of field samples are absent. (2) PCIB conducts the isometric binning of k principal components directly after the PCA dimensionality reduction. Multiple iterations per each pixel are not required, and the time complexity is linear. This consequently improves the computational efficiency compared with the Euclidean distance-based K-means and ISODATA classifiers. (3) The dependence on a large number of field samples for classification is reduced. In addition, the spatial distribution information of crops is determined in a timely and accurate manner. Our proposed method can potentially be applied to the mapping of crop classification.
The PCIB method proposed in the paper can obtain the spatial distribution information of crops accurately and timely, which is of great significance for guiding agricultural production. The classification results can be used to calculate the evapotranspiration of crops and calculate the planting area of crops, and so on.