The Effect of NDVI Time Series Density Derived from Spatiotemporal Fusion of Multisource Remote Sensing Data on Crop Classification Accuracy

Remote sensing data with high spatial and temporal resolutions can help to improve the accuracy of the estimation of crop planting acreage, and contribute to the formulation and management of agricultural policies. Therefore, it is important to determine whether multisource sensors can obtain high spatial and temporal resolution remote sensing data for the target sensor with the help of the spatiotemporal fusion method. In this study, we employed three different sensor datasets to obtain one normalized difference vegetation index (NDVI) time series dataset with a 5.8-m spatial resolution using a spatial and temporal adaptive reflectance fusion model (STARFM). We studied the effectiveness of using multisource remote sensing data to extract crop classifications and analyzed whether the increase in the NDVI time series density could significantly improve the accuracy of the crop classification. The results indicated that multisource sensor data could be used for crop classification after spatiotemporal fusion and that the data source was not limited by the sensor platform. With the increase in the number of NDVI phases, the classification accuracy of the support vector machine (SVM) and the random forest (RF) classifier gradually improved. If the added NDVI phases were not in the optimal time period for wheat recognition, the classification accuracy was not greatly improved. Under the same conditions, the classification accuracy of the RF classifier was higher than that of the SVM. In addition, this study can serve as a good reference for the selection of the optimal time range for base image pairs in the spatiotemporal fusion method for high accuracy mapping of crops, and help avoid excessive data collection and processing.


Introduction
The spatial pattern of crops is the manifestation of agricultural production activities on land use and the efficient usage of natural resources and scientific field management [1,2].With the growth of populations and the reduction of arable land, timely and accurate mapping of the spatial distribution of crops is an important foundation for growth monitoring, yield estimation, disaster assessment, grain production macro-control, and agricultural trade regulation [3,4].The spatial patterns are also the evidence that allows relevant departments to promptly formulate agricultural policies to guarantee food security [5].
For a long time, the traditional approach for obtaining crop planting area was based on sampling surveys and step-by-step summaries [6].However, this process is labor intensive and uses a large quantity of material resources, and the accuracy is greatly affected by a variety of subjective and objective factors [7,8].
Remote sensing data play an important role in many fields, such as air temperature estimation [9], land cover monitoring [10,11], fire incidence assessment [12], drought prediction [13], and monitoring of crop distribution [14,15].However, due to technological and budget constraints, space-borne sensors must compromise among spatial resolution, temporal resolution, and repeat periods [16], and data from any single sensor cannot fully reflect the spatiotemporal crop patterns [17,18], which generally presents a complex spatiotemporal heterogeneity [19].For example, a single medium or high spatial resolution remote sensing sensor usually has a long repeat cycle and has a limited potential for capturing the entire growth period of the crop [2,20]; the probability of acquiring high quality remote sensing data (cloud cover <10%) is less than 10% because of the influence of cloud contamination [21] on Landsat series with a 16-day repeat period.On the other hand, a single sensor with high temporal resolution can provide temporally dense data for the monitoring of large-scale vegetation [22,23], but its coarse spatial resolution data generally have a poor ability to describe the spatial heterogeneity [24][25][26]; one example is the moderate resolution imaging spectroradiometer (MODIS).Hence, few sensors currently provide high spatial and temporal resolution remote sensing data for mapping the complexity and diversity of crop planting structures and farmland fragmentation.
Remote sensing spatiotemporal fusion models have been developed to merge high spatial resolution data with high temporal resolution data to produce high spatiotemporal resolution data, which can be used to identify crop distributions with high accuracy [27,28].Researchers have used the original normalized difference vegetation index (NDVI) or time series derived from the spatial and temporal adaptive reflectance fusion model (STARFM) [29], which is the most widely used model, to study land cover classification or crop classification extraction through the supervised classification method [30][31][32][33][34][35][36].However, researchers often pair Landsat and MODIS in their studies [37].Here, we were interested in whether more kinds of sensor data can be used to provide more data for the target with the STARFM and accurately extract crop classifications.
In this study, we attempted to extend the NDVI time series from the ZY-3 satellite data, which are very suitable for crop monitoring [38], by combining the Landsat 8 Operational Land Imager (OLI) and HJ-1A/B charge coupled device (CCD) data, which have 30-m spatial resolution, through the STARFM.The support vector machine (SVM) and random forest (RF) algorithms are two widely used classifiers in the remote sensing community [33,36] and are used for crop classification in this study.The additional purposes of this study are to investigate the effect of the NDVI time series density on the classification accuracy by the two classifiers and to explore the feasibility of obtaining the optimal time period of wheat recognition for the NDVI fusion and classification by using the RF classifier.

Study Area
The Yucheng district in Shandong Province of China was selected as the study area and is located in the central eastern portion of the North China Plain, which is one of the most important commodity grain bases in China (Figure 1).The study area covers nearly 990 km 2 between 36 • 41 36"N-37

ZY-3 Data
The ZY-3 data with a 5.8-m spatial resolution were downloaded from the China Centre for Resources Satellite Data and Application (CCRSDA).These data cover the whole growth stage of winter wheat from December to June in the study area (Table 1).The surface reflectance data were obtained from the raw downloaded ZY-3 data through orthorectification, radiometric calibration, atmospheric correction, and geometric correction, and were then used to calculate the NDVI.For the preprocessing, the rational polynomial coefficient model was used to orthorectify the ZY-3 data.Atmospheric correction was conducted using Fast Line-of-sight-Atmospheric Analysis of Spectral Hypercubes (FLAASH), which was provided by ENVI version 5.0.We chose a quadratic polynomial calibration algorithm for the geometric correction referencing Google Earth.The co-registration error was less than 0.3 pixels.The UTM-WGS84 Zone 50N was selected as the projection coordinate system.

Landsat 8 OLI Data
Landsat 8 OLI is the newest satellite in the Landsat series.The Landsat 8 data (cloud cover <5%) were downloaded from the United States Geological Survey website (Table 1).The preprocessing of the Landsat 8 data mainly included radiation calibration and atmospheric correction to obtain the surface reflectance data to calculate the NDVI.The atmospheric correction was conducted using FLAASH.Since the downloaded Landsat 8 data had been geometrically corrected based on the terrain data and the identifiable ground feature points coincided with those of ZY-3 according to visual inspection, the bilinear interpolation method was used to directly resample the Landsat 8 data to be 5.8 m to meet the requirements of STARFM, and the projection coordinate system was the same as that of ZY-3.

HJ-1A/B CCD Data
The HJ-1A/B CCD data (cloud cover <5%) were provided by CCRSDA (Table 1) and were processed through radiometric calibration, atmospheric correction and geometric correction to obtain the surface reflectance data and then to obtain the NDVI.The radiometric calibration was completed using the 2015 HJ-1A/B absolute radiometric calibration coefficients provided by CCRSDA.The ZY-3 data were used as the reference to geometrically register the HJ-1A/B CCD data, and the co-

ZY-3 Data
The ZY-3 data with a 5.8-m spatial resolution were downloaded from the China Centre for Resources Satellite Data and Application (CCRSDA).These data cover the whole growth stage of winter wheat from December to June in the study area (Table 1).The surface reflectance data were obtained from the raw downloaded ZY-3 data through orthorectification, radiometric calibration, atmospheric correction, and geometric correction, and were then used to calculate the NDVI.For the preprocessing, the rational polynomial coefficient model was used to orthorectify the ZY-3 data.Atmospheric correction was conducted using Fast Line-of-sight-Atmospheric Analysis of Spectral Hypercubes (FLAASH), which was provided by ENVI version 5.0.We chose a quadratic polynomial calibration algorithm for the geometric correction referencing Google Earth.The co-registration error was less than 0.3 pixels.The UTM-WGS84 Zone 50N was selected as the projection coordinate system.

Landsat 8 OLI Data
Landsat 8 OLI is the newest satellite in the Landsat series.The Landsat 8 data (cloud cover <5%) were downloaded from the United States Geological Survey website (Table 1).The preprocessing of the Landsat 8 data mainly included radiation calibration and atmospheric correction to obtain the surface reflectance data to calculate the NDVI.The atmospheric correction was conducted using FLAASH.Since the downloaded Landsat 8 data had been geometrically corrected based on the terrain data and the identifiable ground feature points coincided with those of ZY-3 according to visual inspection, the bilinear interpolation method was used to directly resample the Landsat 8 data to be 5.8 m to meet the requirements of STARFM, and the projection coordinate system was the same as that of ZY-3.

HJ-1A/B CCD Data
The HJ-1A/B CCD data (cloud cover <5%) were provided by CCRSDA (Table 1) and were processed through radiometric calibration, atmospheric correction and geometric correction to obtain the surface reflectance data and then to obtain the NDVI.The radiometric calibration was completed using the 2015 HJ-1A/B absolute radiometric calibration coefficients provided by CCRSDA.The ZY-3 data were used as the reference to geometrically register the HJ-1A/B CCD data, and the co-registration error was less than 0.3 pixels.The other preprocesses were the same as those for Landsat 8.

Method
Figure 2 presents the flowchart of this study.First, all remote sensing data were clipped to the same region.The NDVI was calculated using red and near infrared (NIR) bands.Then, the time series of the Landsat 8 and HJ-1A/B NDVIs were fused with the ZY-3 NDVI using STARFM [29].To reduce the effect of noise on the vegetation data, the fused NDVI time series was filtered by Time-series Satellite data Analysis Tool (TIMESAT).Finally, the NDVI time series and its corresponding temporal features were used to classify crops using the SVM and RF, respectively, and to evaluate the accuracy of the classification results.
registration error was less than 0.3 pixels.The other preprocesses were the same as those for Landsat 8.

Method
Figure 2 presents the flowchart of this study.First, all remote sensing data were clipped to the same region.The NDVI was calculated using red and near infrared (NIR) bands.Then, the time series of the Landsat 8 and HJ-1A/B NDVIs were fused with the ZY-3 NDVI using STARFM [29].To reduce the effect of noise on the vegetation data, the fused NDVI time series was filtered by Time-series Satellite data Analysis Tool (TIMESAT).Finally, the NDVI time series and its corresponding temporal features were used to classify crops using the SVM and RF, respectively, and to evaluate the accuracy of the classification results.

STARFM
The STARFM is based on one pair of images comprising one medium-resolution reflectance image (M) and one high-resolution reflectance image (L) as the reference date (tk), and then blends the differences (L-M) with another medium spatial resolution reflectance image (M) from the prediction date (t0) to predict the corresponding high spatial resolution reflectance image (L).Under the assumption that the surface reflectance difference does not change over time, a weighted averaging method is used with a moving window centered at the target pixel to reduce the boundary

STARFM
The STARFM is based on one pair of images comprising one medium-resolution reflectance image (M) and one high-resolution reflectance image (L) as the reference date (tk), and then blends the differences (L-M) with another medium spatial resolution reflectance image (M) from the prediction date (t0) to predict the corresponding high spatial resolution reflectance image (L).Under the assumption that the surface reflectance difference does not change over time, a weighted averaging method is used with a moving window centered at the target pixel to reduce the boundary influence of the adjacent pixels.The surface reflectance of the target pixel is predicted by selecting the spectrally similar and cloudless pixels in the moving window as follows: where w is the size of the moving window, (x w/2 , y w/2 , t 0 ) is the center pixel of the moving window at date t 0 , L(x w/2 , y w/2 , t 0 ) is the predicted reflectance value of the high spatial resolution value, M (x i ,y j ,t 0 ) is the medium spatial resolution reflectance value at date t 0 , and L(x i ,y j ,t k ) and M (x i ,y j ,t k ) represent the high and medium spatial resolution reflectance values, respectively, at the base date t k .
The weight W ijk, , which determines how much each similar pixel contributes to the surface reflectance of the target pixel, is decided by the spectral difference, temporal difference, and location difference after the normalization processing [16,29].

NDVI Time Series and A-G Filter
The NDVI time series is usually continuous and smooth, and can reflect the growth process of the vegetation, but remote sensing data are influenced by various factors in the acquisition process [36].They will cause the NDVI time series to fluctuate irregularly and may influence the trend analysis and the temporal feature extraction of the NDVI time series [39].Therefore, it is necessary to filter the noise from the NDVI time series to improve the data quality.The TIMESAT software includes three filtering methods, i.e., asymmetric Gaussian functions (A-G), the double logistic function method (D-L), and the Savitzky-Golay method (S-G).The A-G filtering method has the highest fidelity to the original data [40,41]; therefore, it is more suitable for processing the NDVI time series of the grassland, farmland, and other vegetation types [41][42][43].Accordingly, this study adopted A-G to remove the noise from the NDVI time series.

Extracting Temporal Features from NDVI Time Series
The NDVI time series can appropriately reflect the growth of vegetation and contains the information used to distinguish vegetation types [44,45], such as the seasonal and phenological changes.The NDVI time series can represent the seasonal and annual variations in crop growth (i.e., periodic variations from planting, seedling emergence, maturity at harvest) [36].For winter wheat, there were two peaks in the NDVI curve during the growth period (Figure 3).The first is in early December.After the sowing period, the NDVI of winter wheat increased gradually and reached the first peak, while the NDVI of other vegetation decreased gradually with decreasing temperature.At this stage, the time-NDVI slope of wheat was greater than zero, while that of other vegetation was less than zero.The second occurred between late April and early May.The biomass of winter wheat increased rapidly after rising and jointing, and the NDVI also increased.By early May, the NDVI value reached its maximum, forming a second peak.Then, the leaves gradually withered, and the NDVI gradually decreased, and the curve slope of NDVI was less than zero.However, the curve slope of the NDVI of other vegetation was greater than zero, and the NDVI was still increasing.The change characteristics of the NDVI curve are an important basis for identifying winter wheat.
The three NDVI time series were constructed from 4 ZY-3 NDVIs, 14 ZY-3+Landsat 8 fused NDVIs, and 25 ZY-3+Landsat 8+HJ-1A/B fused NDVIs.The original or fused NDVI time series were not a good choice for land cover classification and could reduce the accuracy of the classification model due to their large amount of redundant information [46].Therefore, to eliminate the redundant information of the high-dimensional data [30], six temporal features were extracted from the denoised 4, 14, and 25 NDVI time series for classification, including the maximum, minimum, mean, standard deviation, time-accumulated NDVI, and the time of maximum value.

SVM and RF
The SVM has a unique advantage for analyzing nonlinear local minima with small sample sizes and high dimensional pattern recognition problems [47].It is widely used in land cover classification [48], identification of forest types [49,50], and crop monitoring [51], and has achieved higher classification accuracy compared with traditional supervised classification methods [32].The LIBSVM package (provided by http://www.csie.ntu.edu.tw/~cjlin/libsvm)provides multiple SVM classifiers to support multilevel classification and the function of cross-validated models [52].
In this study, we used the default SVM classifier (C-SVC) [52] for crop classification.The kernel functions used in the SVM are generally linear, polynomial, radial basis (RBF) and sigmoidal functions, which directly affect the performance of the classification model.Generally, the number of training data points is much higher than the number of features, and an optimal result can be obtained using the nonlinear kernel function.Therefore, we use the RBF kernel function.At the same time, with the grid search and k-fold cross validation methods, the variable pair (such as γ and the penalty parameter C) with the highest cross-validation accuracy were selected as the optimal global parameters [52].
The random forest (RF) classifier is a flexible and easy-to-use machine learning algorithm.Its essence is the ensemble learning method, which is widely used in land cover classification [33], biomass estimation [53], cloud detection [54], and species classification [55].Please refer to Breiman [56] for the detailed introduction to the RF.Crop mapping was carried out using RF in the EnMAP-Box tools provided by the German Environmental Mapping and Analysis Program [57].
The quantitative evaluation indexes, including the confusion matrix, kappa coefficient, overall accuracy, producer accuracy, user accuracy [60], and the true skill statistic (TSS) [61], were calculated using randomly selected sample pixels to evaluate the classification accuracies of NDVIS/R-

SVM and RF
The SVM has a unique advantage for analyzing nonlinear local minima with small sample sizes and high dimensional pattern recognition problems [47].It is widely used in land cover classification [48], identification of forest types [49,50], and crop monitoring [51], and has achieved higher classification accuracy compared with traditional supervised classification methods [32].The LIBSVM package (provided by http://www.csie.ntu.edu.tw/~{}cjlin/libsvm)provides multiple SVM classifiers to support multilevel classification and the function of cross-validated models [52].
In this study, we used the default SVM classifier (C-SVC) [52] for crop classification.The kernel functions used in the SVM are generally linear, polynomial, radial basis (RBF) and sigmoidal functions, which directly affect the performance of the classification model.Generally, the number of training data points is much higher than the number of features, and an optimal result can be obtained using the nonlinear kernel function.Therefore, we use the RBF kernel function.At the same time, with the grid search and k-fold cross validation methods, the variable pair (such as γ and the penalty parameter C) with the highest cross-validation accuracy were selected as the optimal global parameters [52].
The random forest (RF) classifier is a flexible and easy-to-use machine learning algorithm.Its essence is the ensemble learning method, which is widely used in land cover classification [33], biomass estimation [53], cloud detection [54], and species classification [55].Please refer to Breiman [56] for the detailed introduction to the RF.Crop mapping was carried out using RF in the EnMAP-Box tools provided by the German Environmental Mapping and Analysis Program [57].
The quantitative evaluation indexes, including the confusion matrix, kappa coefficient, overall accuracy, producer accuracy, user accuracy [60], and the true skill statistic (TSS) [61], were calculated using randomly selected sample pixels to evaluate the classification accuracies of NDVIS/R-TSCM4/14/25 and NDVIS/R-TFCM4/14/25.In addition, the statistical areas of the different classification results were also compared.
Combining ground surveys with land-cover type information in the region, we selected 8 kinds of land cover.This process was finished with the help of Google Earth tools and ground survey points.The numbers of wheat and non-wheat sample pixels were 882 and 1333, respectively (Table 2).These samples were selected using a random stratified sampling method to make them representative of the study area.Sixty-five percent of the samples were randomly selected for training, and the other 35% were selected for validation.

Quantitative Evaluation of Fusion Results
To validate STARFM, a small rectangular region of approximately 1390 × 865 pixels was randomly selected from the observed and predicted NDVI images from 2 December 2014 and 4 February 2015 (Figures 4 and 5).The results illustrated that the NDVI spatial distribution was consistent between the observed and predicted NDVIs, and the difference values between Figure 4a,b were mainly distributed around zero (Figure 4c).According to the correlation analysis, the distribution of the scatter points was concentrated along the line of x=y.The R 2 values were 0.969 and 0.950 at the 0.01 significance level, respectively (Figure 5); the RMSEs were 0.040 and 0.039, respectively, which meant that the estimation error of the predicted NDVI image was only approximately 4% (Figure 5).This result indicated that the deviation between the predicted and observed NDVIs was very small.
The AAD, AD, and SD values of Figure 4c were small, more than 97.55% of the pixel difference absolute values were less than 0.1, and almost all of the pixel difference absolute values were less than 0.2, which meant that the predicted results had rather high accuracies (Table 3).Accordingly, STARFM could effectively predict the missing high spatial resolution NDVI values by fusing the NDVIs of different spatial and temporal resolutions.
(3) By comparing Tables 4-6 for any case of NDVI-TSCM or NDVI-TFCM, with the increase in the NDVI phases from 4 to 14 and then to 25 using the SVM and RF classifiers, the overall accuracy, and the kappa and TSS coefficients, had different degrees of improvement.NDVI-TFCM was better than NDVI-TSCM for the SVM classifier, regardless of whether the NDVI phase was 4,14, or 25, but the performance of the RF classifier was the opposite when NDVI-TFCM was used.The NDVIS-TFCM had the highest impact on the classifiers with increasing NDVI time series.Therefore, we concluded that the increase in the NDVI phases could be helpful in improving the classification accuracy for the SVM and RF classifiers.

Verification of Wheat Area
The accuracy of the classification results was further evaluated by verifying the derived acreage.According to the data published by the Yucheng City Bureau of statistics, the wheat acreage of Yucheng was approximately 48,759.33 hectares in 2015.The statistics of wheat acreages from 12 classification results are shown in Table 7.The relative errors for NDVI-TSCM25 and NDVI-TFCM25 were smaller than those for NDVI-TSCM14/4 and NDVI-TFCM14/4, which also showed that with the increase of the NDVI phases from 4 to 14 and then to 25, the accuracies for NDVI-TSCM and NDVI-TFCM were both improved and were consistent with the results of the ground surveys (Tables 4-6).The classification accuracy for NDVI-TSCM was higher than that for NDVI-TFCM under the RF classifier, while it was the opposite under the SVM.Two small ranges in the classification results for NDVIS/R-TSCM4/14/25 and NDVIS/R-TFCM4/14/25 are shown in Figures 6 and 7. Using the ZY-3 data from 2 December 2014, we can obtain a standard false color composite image, which is beneficial for the identification of the vegetation distribution (Figures 6M and 7M). Figure 6N shows the classification result of the whole study area based on NDVIR-TSCM25.If the whole study area was used to compare the classification results, it would not show the detailed differences among the classification results.Therefore, the classification results for the first small-scale region are displayed in Figure 6A-L.By comparing Figure 6A-L, we found that the spatial distributions of the 12 classification results were basically the same, and the main differences were reflected in the linear and small patch features of the classification results.With the increase in the NDVI phases from 4 to 14 and then to 25, the fragmentation degree of the classification results gradually decreased.
A smaller area can better show the detailed differences in the classification results, so the second smaller area is shown in Figure 7. Figure 7N displays a small area that was obtained from Google Earth on 8 February 2015, to discern the wheat region.The classification accuracy with NDVIS-TFCM14/25 was higher than that with NDVIS-TSCM14/25 over the small areas, roads, or other linear features (Figure 7A,B,E,F).The classification accuracy with NDVIS-TSCM4 on the patch features was higher than that with NDVIS-TFCM4, but the identification accuracy of the linear features was weaker than that with NDVIS-TFCM4 (Figure 7I,J).With the increase in the NDVI phases from 14 to 25, the misclassification probability for the roads was reduced, and the ability to distinguish between wheat and non-wheat was stronger (Figure 7A,E).The recognition accuracy with NDVIS-TFCM4 on wheat was significantly weaker than those using NDVIS-TFCM25 and NDVIS-TFCM14 (Figure 7A,E,I).With the increase in the NDVI phases, the recognition accuracy of using NDVIS-TSCM on non-wheat was also improved (Figure 7B,F,J).At the same time, for the SVM classifier, the increase in the NDVI phases from 4 to 14 and then to 25 was helpful for improving the accuracy of classification results, which was consistent with the conclusion from the ground survey.

Effect of Base Images on Classification Accuracy
Extending the NDVI time series through spatiotemporal fusion and combining the machine learning model for crop mapping and land cover classification has been shown to be a valuable strategy.However, the effect of data selection on the performance of the fusion model has seemed to attract little attention.The time density of the NDVI time series directly affects the accuracy of crop classification.Generally, with the increase in the density of the NDVI time series, the classification accuracy also improves, which means that the workload also increases.A relevant question, therefore, is whether fusing all data for use in classification is necessary to obtain higher classification accuracy (e.g., overall accuracy > 95%), bearing in mind that data collection is costly.With the increase in the NDVI temporal phases, the importance of the image dates was judged individually with the RF classifier in this study to determine the selection of image date.The statistical results are shown in Table 8.According to the importance of the variables, the image on 14 April 2015 was the most important, followed by 2 December 2014, in the four-phase NDVI (Table 8).With the increase in the NDVI temporal phases, the optimal time range of the image pair was from 24 March to 30 April 2015, and the other time range was before 3 January 2015 (Table 8), which was basically consistent with the phenological calendar date.With the increase in the NDVI temporal phases, the classification accuracy gradually improved.If the added NDVI phases were not in the optimal time period of wheat recognition, the classification accuracy would not be greatly improved.At the same time, it also proves that the date of the base image pair can be selected by the phenological calendar and the crop From visual assessment, NDVIR-TFCM14/25 and NDVIR-TSCM14/25 could have similar classification accuracy, except for in some small areas (Figure 7C,D,G,H).Similarly, the recognition accuracy using NDVIR-TFCM4 and NDVIR-TSCM4 on wheat was significantly weaker than when using of NDVIR-TFCM25/14 and NDVIR-TSCM25/14 (Figure 7C,D,G,H,K,L).With the increase in the NDVI phases, the recognition accuracy of using NDVIR-TSCM on wheat was also improved.The classification accuracy with NDVIR-TSCM was significantly higher than that with NDVIS-TSCM, regardless of whether the NDVI phase was 4, 14, or 25 (Figure 7B,D,F,H,J,L).The increase in the NDVI phase could improve the classification accuracies of the SVM and RF classifiers, and the RF classifier performed slightly better than the SVM classifier.The reason for the high classification accuracy might be that the high spatiotemporal resolution of the NDVI time series contained information about the growth period of the vegetation and provided valuable information for recognizing the vegetation type.

Effect of Base Images on Classification Accuracy
Extending the NDVI time series through spatiotemporal fusion and combining the machine learning model for crop mapping and land cover classification has been shown to be a valuable strategy.However, the effect of data selection on the performance of the fusion model has seemed to attract little attention.The time density of the NDVI time series directly affects the accuracy of crop classification.
Generally, with the increase in the density of the NDVI time series, the classification accuracy also improves, which means that the workload also increases.A relevant question, therefore, is whether fusing all data for use in classification is necessary to obtain higher classification accuracy (e.g., overall accuracy > 95%), bearing in mind that data collection is costly.With the increase in the NDVI temporal phases, the importance of the image dates was judged individually with the RF classifier in this study to determine the selection of image date.The statistical results are shown in Table 8.According to the importance of the variables, the image on 14 April 2015 was the most important, followed by 2 December 2014, in the four-phase NDVI (Table 8).With the increase in the NDVI temporal phases, the optimal time range of the image pair was from 24 March to 30 April 2015, and the other time range was before 3 January 2015 (Table 8), which was basically consistent with the phenological calendar date.With the increase in the NDVI temporal phases, the classification accuracy gradually improved.If the added NDVI phases were not in the optimal time period of wheat recognition, the classification accuracy would not be greatly improved.At the same time, it also proves that the date of the base image pair can be selected by the phenological calendar and the crop NDVI curve [62].With the assistance of phenological calendar, better classification results can usually be obtained through the selection of remote sensing images near the nodes where the stage of crop growth changes.In addition, when the phenological calendar is not available, with the help of coarse spatial resolution remote sensing data (e.g., MODIS) and the RF classifier, the optimal time range of the image can be obtained, which can avoid excessive data collection and processing, and provides one method for selecting the input data for spatiotemporal fusion.

Discussion
According to the classification results, increasing the NDVI temporal phase of NDVI-TFCM and NDVI-TSCM can help to improve the classification accuracy.The reason may be that the NDVI temporal phases increased from 4 to 14 and then to 25, which meant an increase in the image acquisition frequency and a decrease in the time gap between adjacent NDVIs in the study period.This helped to improve the accuracy of crop type identification.Moreover, the temporal information can help determine the growth characteristics of vegetation more accurately, which is also helpful for crop classification, especially for the identification of vegetation types [31,63].
Although MODIS has the advantages of a short revisit period and open access data, compared with the ZY-3 data, its highest spatial resolution is only 250 m, and the mixed pixel phenomenon is very severe.Thus, it is unsuitable to combine MODIS with ZY-3 data to meet the refinement requirements of regional monitoring for condensed areas and small acreage [36].However, the Landsat 8 and HJ-1A/B data with 30-m spatial resolution can better reflect the spatial heterogeneity in the crop distribution than can MODIS.Therefore, they are more suitable to be combined with the ZY-3 data.Considering that the availability and quality of the Landsat 8 data are better than those of the HJ-1A/B data, we chose the Landsat 8 as the main data source and the HJ-1A/B as an auxiliary data source to provide temporal change information for ZY-3.By evaluating the fusion results, the STARFM can effectively predict the unavailable ZY-3 NDVIs caused by long recursive cycles (59 days) and cloud contamination.The constructed ZY-3-like NDVI time series data were fine enough to detect small objects and regional heterogeneity with spatial scales less than 30 m.
In this study, the accuracies of the fusion results were also affected by the following factors: 1) the qualities of the ZY-3, Landsat 8 and HJ-1A/B data.2) During the study period, considerable changes with a spatial scale of less than 30 m took place in vegetation growth; however, Landsat 8 and HJ-1A/B data cannot capture such changes.Thus, errors were introduced into the fused images.3) STARFM was proposed for reflectance fusion, so when it was directly used in the NDVI fusion, some deviations may have been produced.4) Due to the interference of atmospheric conditions and other factors, such as the bidirectional reflectance distribution function, after a series of image processing, some noise may be introduced into NDVI images inadvertently.Therefore, avoiding noise introduction will also improve the accuracy of fusion results.
In recent years, an increasing number of nonparametric classifiers have been developed, which would be preferable for use in improving classification accuracy in future studies.High spatial resolution images, such as GF-1/2, can be used with the mentioned method for land cover classification.Thus, the combination of the spatiotemporal fusion model and the advanced classification model can play an important role in regional forest cover mapping, land cover mapping, and the identification of crop types.

Conclusions
The research in this study showed that multisource sensor data can be used for crop classification after spatiotemporal fusion and that the data source was not limited by the sensor platform, which provides a flexible choice for data sources.The classification accuracy with the NDVI temporal features vector was not necessarily higher than that using the NDVI time series itself, which was related to the selection of the classifier.With the increase in the NDVI phases from 4 to 14 and then to 25, the classification accuracy with the use of NDVI-TSCM and NDVI-TFCM were both improved.The classification accuracy with NDVI-TSCM was higher than that with NDVI-TFCM under the RF classifier, while it had the opposite trend under the SVM.With the increase in the NDVI temporal phase, the classification accuracy was gradually improved.If the added NDVI phases were not in the optimal time period for wheat recognition, the classification accuracy was not greatly improved.In addition, this study can provide a good reference for the selection of the optimal time ranges of base image pairs for spatiotemporal fusion methods for high accuracy mapping of crops, and help avoid excessive data collection and processing.

18 Figure 1 .
Figure 1.The location of Shandong Province in China (A); the location of the study area in Shandong Province (B); the study area is shown in one true color (R: red, G: green, B: blue) image using ZY-3 data of 2 December 2014 (C).

Figure 1 .
Figure 1.The location of Shandong Province in China (A); the location of the study area in Shandong Province (B); the study area is shown in one true color (R: red, G: green, B: blue) image using ZY-3 data of 2 December 2014 (C).

Figure 2 .
Figure 2. The flowchart of crop classification based on the spatial and temporal adaptive reflectance fusion model (STARFM).

Figure 2 .
Figure 2. The flowchart of crop classification based on the spatial and temporal adaptive reflectance fusion model (STARFM).

18 Figure 3 .
Figure 3.The reference normalized difference vegetation index (NDVI) time series of typical objects in the study area.

Figure 3 .
Figure 3.The reference normalized difference vegetation index (NDVI) time series of typical objects in the study area.

Figure 4 .
Figure 4. Comparison of observed (a) and predicted NDVIs (b), and the distribution of difference NDVI (c).

Figure 4 .
Figure 4. Comparison of observed (a) and predicted NDVIs (b), and the distribution of difference NDVI (c).

Figure 4 .
Figure 4. Comparison of observed (a) and predicted NDVIs (b), and the distribution of difference NDVI (c).
• 12 13"N and 116 • 22 11"E-116 • 45 00"E.It is situated in a temperate climatic zone and has an annual average temperate of approximately 13.30 • C, an annual average precipitation of nearly 555.5 mm, and an average annual evaporation of 1884.80 mm.The highest elevation in the study area is 27.27 m, and the lowest is 19.20 m.

Table 1 .
List of the data used in this study.

Table 1 .
List of the data used in this study.

Table 2 .
Samples of the different land covers.

Table 3 .
Quantitative evaluation results of the difference images.

Table 3 .
Quantitative evaluation results of the difference images.

Table 3 .
Quantitative evaluation results of the difference images.

Table 5 .
Confusion matrix for the classification results using NDVI-TSCM14 and NDVI-TFCM14 under the support vector machine (SVM) and random forest (RF) classifiers (%).

Table 6 .
Confusion matrix for the classification results using NDVI-TSCM4 and NDVI-TFCM4 under the SVM and RF classifiers (%).

Table 7 .
Quantitative evaluation of wheat statistical area.

Table 8 .
Determination of the date range of important images.