Winter Wheat Mapping Based on Sentinel-2 Data in Heterogeneous Planting Conditions

Monitoring and mapping the spatial distribution of winter wheat accurately is important for crop management, damage assessment and yield prediction. In this study, northern and central Anhui province were selected as study areas, and Sentinel-2 imagery was employed to map winter wheat distribution and the results were verified with Planet imagery in the 2017–2018 growing season. The Sentinel-2 imagery at the heading stage was identified as the optimum period for winter wheat area extraction after analyzing the images from different growth stages using the Jeffries–Matusita distance method. Therefore, ten spectral bands, seven vegetation indices (VI), water index and building index generated from the image at the heading stage were used to classify winter wheat areas by a random forest (RF) algorithm. The result showed that the accuracy was from 93% to 97%, with a Kappa above 0.82 and a percentage error lower than 5% in northern Anhui, and an accuracy of about 80% with Kappa ranging from 0.70 to 0.78 and a percentage error of about 20% in central Anhui. Northern Anhui has a large planting scale of winter wheat and flat terrain while central Anhui grows relatively small winter wheat areas and a high degree of surface fragmentation, which makes the extraction effect in central Anhui inferior to that in northern Anhui. Further, an optimum subset data was obtained from VIs, water index, building index and spectral bands using an RF algorithm. The result of using the optimum subset data showed a high accuracy of classification with a great advantage in data volume and processing time. This study provides a perspective for winter wheat mapping under various climatic and complicated land surface conditions and is of great significance for crop monitoring and agricultural decision-making.


Introduction
Wheat (Triticum aestivum L.) is the third-largest food crop in terms of production globally [1], providing a major source of nutrition for those suffering from nutrient deficiency. China is one of the major wheat-producing areas. In 2017, the wheat planting area in China reached 24,510 kha, ranking third in the world, and more than 98% of the total acreage is winter wheat [2]. Therefore, it is of great significance for the government to obtain accurate information on the planting area, growth and yield of winter wheat for formulating agricultural policies, estimating crop yield and ensuring food security [3]. The traditional method of acquiring the planting area of winter wheat is mainly a manual sampling survey [4], which is not only labor-intensive, time-consuming and expensive but also susceptible to subjective factors. Remote-sensing technology has been widely used in the field of crop identification and mapping [5][6][7] due to its spatial coverage, temporal resolution, near-real-time availability and low cost.
The development of various satellites makes it possible to monitor cropping areas at fine spectral, temporal and spatial scales [8]. The moderate-resolution imaging spectroradiometer (MODIS) can provide a large amount of observation data that are both valuable and indispensable for global vegetation monitoring. It is sufficient for mapping large-scale cropping areas [9,10]; however, medium-spatial-resolution data are challenged by mixed pixels when used for crop characterization in areas with finer-scale land distribution, such as southern China [11]. Recently, high-spatial-resolution satellites such as QuickBird and SPOT5 have become available and provide new opportunities for more accurate mapping of crops [12,13]. High-resolution data can provide more detail on the land tenure system and improve the accuracy of mapping [14]. However, the cost of high-spatial-resolution data limits their general use. With the successful launch of the second Sentinel-2 satellite in March 2017, the Sentinel-2 mission of the European Space Agency is generating unprecedented volumes of data at high spatial (up to 10 m), spectral (13 bands) and temporal (minimum five-day revisit) resolutions, which has led to its wide use in crop identification and mapping [15][16][17].
Using remote-sensing data to monitor crops is mainly based on spectral and phenological differences between crops, so identification methods can be divided into spectral methods, such as remote-sensing-based classification [13,18], mixed pixel decomposition [19] and multi-source information integration [7], and phenological methods, such as time series analysis [8,20,21,22]. There are some other methods such as remote-sensing sampling [23], random forest and support vector machine [24]. With the development of remote-sensing technology, more algorithms have been applied to crop identification. However, studies on winter wheat extraction are still limited in areas that face difficulties and challenges such as [11]: unfavorable weather conditions; frequent cloud cover; complex terrain surface; high degree of fragmentation of cultivated land; and significant differences in crop planting structure between regions. Existing studies on winter wheat extraction in areas with such conditions still have some shortcomings: 1) the methods were simple and the accuracy was not high; 2) few have focused on the remote-sensing mapping of wheat in these areas because of the high degree of surface fragmentation; and 3) few have discussed the differences between extraction methods under different planting conditions.
In 2017, the planting area of winter wheat in Anhui province reached 2822.2 kha, ranking third in China after Henan and Shandong provinces [2]. However, the application of remote-sensing technology in Anhui agricultural production is relatively insufficient compared with other provinces due to the difficulties mentioned above. In view of the important role of Anhui province in China's wheat production and the problems in the remote-sensing extraction of winter wheat, it is urgent to explore better methods for winter wheat mapping using remote-sensing imagery.
Therefore, the objective of the study was to explore the effect of different planting conditions (climate, scale, fragmentation) on the accuracy of winter wheat mapping using remote-sensing data. The specific goals were to (1) select the optimum phenological phase for extracting winter wheat from Sentinel-2 images based on key phenological periods; (2) acquire the optimum remote-sensing screening features of winter wheat based on the optimum phenological period; (3) get the optimum scheme for winter wheat mapping in the areas of interest; and (4) identify the uncertainties and future needs in winter wheat mapping.

Study Area
Anhui province abuts the Yangtze River Delta Economic Zone centered in Shanghai. The province is located in the middle and lower reaches of the Yangtze and Huaihe Rivers and belongs to the East China region (114°54′~119°37′E, 29°41′~34°38′N) (Figure 1). Affected by the monsoon, Anhui has four distinct seasons and is a transitional region between the warm temperate zone and the subtropical zone [25]. The topography and landforms are diverse, with plains, hills and low mountains [26].
Northern and central zones are the main producing areas of winter wheat in Anhui [27]. Typical counties for winter wheat in the two areas were selected as study areas due to the differences in climatic conditions, topography and surface fragmentation.
Lingbi county and Sixian county were selected in northern Anhui (Northern Anhui Counties, NAC). The cultivated land area in northern Anhui accounts for about half of the province's total. The area is mainly flat plain with a mild, semi-humid climate; it suffers frequent drought and flood disasters and is affected by climate change. The annual mean precipitation and annual mean temperature in the area were about 843 mm and 15.4 °C, respectively [27]. The winter wheat planting area in northern Anhui accounts for about 70% of the province's total [28].
Changfeng county and Dingyuan county were chosen in central Anhui (Central Anhui Counties, CAC). Central Anhui has a hilly landform with a mild climate, moderate rainfall and sufficient light. During the study period, the annual mean precipitation was about 951 mm, and the annual mean temperature was 16.2 °C [27]. The difference in climate between the northern and central zones results in large variations in crop types. Wheat extraction is more challenging in this region than in the northern area because of the complex planting structures and discontinuous wheat fields. Generally, winter wheat is planted in late October and harvested in early June of the following year; the winter wheat phenology in NAC and CAC (http://data.cma.cn/site/index.html) is shown in Table 1.

Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 20

Datasets
The Sentinel-2 satellites orbit at an altitude of 786 km and carry a multi-spectral imager (MSI) with 13 spectral bands and a swath width of 290 km; four bands have a spatial resolution of 10 m, six bands 20 m and three bands 60 m. Three of the bands lie in the red-edge spectral region, providing more band choices for vegetation monitoring [16,29,30]. Planet imagery has a spatial resolution of 3 m and four reflective bands (Blue, 0.45-0.51 µm; Green, 0.50-0.59 µm; Red, 0.59-0.67 µm; NIR, 0.78-0.86 µm). It has provided daily global coverage since 2016 [31], which supports change monitoring at different frequencies, and it has a price advantage over other satellite data with similar resolution [32].
In this study, the images were selected by referring to the average phenological period of wheat (http://data.cma.cn/site/index.html) in the area of interest and considering the coverage of available images. We employed five Sentinel-2 images in each of the two study areas and purchased eleven Planet images (each covering 24 × 7 km²) (http://www.kosmos-imagemall.com/) (Table 2).

Calculating the Separability Using Jeffries-Matusita (JM) Distance
Previous studies have shown that Jeffries-Matusita (JM) distance is an effective metric to evaluate the separability of training samples in remote-sensing-based classification [8,33,34].We calculated the separability between winter wheat and other land cover types by JM distance to determine the optimum period imagery.
In NAC, we selected seven main land cover types (winter wheat, water, urban, bare land, grass, forest and others) with 50 training sets for each type based on the five key phenological period images. Oilseed rape and barley account for only 1.5% of the area planted with winter crops and were therefore ignored [27]. In CAC, there was little barley in winter (3.0%), but more oilseed rape (26%) [27]. We therefore chose eight land cover types (winter wheat, water, urban, bare land, grass, forest, oilseed rape and others) with about seventy training sets for each type.
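The JM distance is calculated as:

$$JM(w_i, w_j) = \int_x \left( \sqrt{p(x \mid w_i)} - \sqrt{p(x \mid w_j)} \right)^2 dx \qquad (1)$$

where x represents a random variable, p(x | w) is its class-conditional probability density, and w_i and w_j are the two land cover types under consideration. Under normality assumptions, Equation (1) simplifies to:

$$JM = 2\left(1 - e^{-B}\right)$$

$$B = \frac{1}{8} (\rho_i - \rho_j)^T \left( \frac{\Sigma_i + \Sigma_j}{2} \right)^{-1} (\rho_i - \rho_j) + \frac{1}{2} \ln \left( \frac{\left| (\Sigma_i + \Sigma_j)/2 \right|}{\sqrt{|\Sigma_i| \, |\Sigma_j|}} \right)$$

where ρ_i and ρ_j are the type-specific mean spectral reflectance vectors, and Σ_i and Σ_j are estimates of the type-specific covariance matrices. The JM distance ranges between 0 (low separability) and 2 (high separability). Values of JM > 1.8 indicate good separability between two samples [35].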
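As a minimal sketch of the separability calculation under the normality assumption, the following pure-Python functions estimate a class mean and covariance from training samples and evaluate the JM distance through the Bhattacharyya-type exponent B. The function names, the small Gaussian-elimination helper and the input layout (one list of feature values per sample) are illustrative assumptions, not the implementation used in this study.

```python
import math


def solve_with_logdet(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination; also return ln|det(matrix)|."""
    n = len(matrix)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    logdet = 0.0
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        logdet += math.log(abs(aug[col][col]))
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x, logdet


def mean_and_cov(samples):
    """Sample mean vector and unbiased covariance matrix of a list of feature vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[sum((s[a] - mean[a]) * (s[b] - mean[b]) for s in samples) / (n - 1)
            for b in range(d)] for a in range(d)]
    return mean, cov


def jm_distance(mean_i, cov_i, mean_j, cov_j):
    """JM distance between two Gaussian classes; ranges from 0 to 2."""
    d = len(mean_i)
    avg = [[(cov_i[a][b] + cov_j[a][b]) / 2.0 for b in range(d)] for a in range(d)]
    diff = [mean_i[k] - mean_j[k] for k in range(d)]
    solved, logdet_avg = solve_with_logdet(avg, diff)
    _, logdet_i = solve_with_logdet(cov_i, [0.0] * d)
    _, logdet_j = solve_with_logdet(cov_j, [0.0] * d)
    mahalanobis = sum(diff[k] * solved[k] for k in range(d))
    b = mahalanobis / 8.0 + 0.5 * (logdet_avg - 0.5 * (logdet_i + logdet_j))
    return 2.0 * (1.0 - math.exp(-b))
```

In use, `mean_and_cov` would be applied to the training pixels of winter wheat and of each other land cover type for one image date, and a date would qualify as a candidate optimum period when every pairwise `jm_distance` involving winter wheat exceeds 1.8.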

Description for Spectral Features
Nineteen features were selected in total (Table 3). The reflectance of ten bands was selected as spectral features based on the optimum period imagery of winter wheat, while reflectance values were used to calculate the normalized difference vegetation index (NDVI) [36], enhanced vegetation index (EVI) [37], soil-adjusted vegetation index (SAVI) [38], greenness normalized difference vegetation index (GNDVI) [39], modified normalized difference water index (MNDWI) [40] and normalized difference building index (NDBI) [41]. Previous research stressed the importance of the red-edge bands [16,42,43], so three red-edge indices (NDVI5, NDVI6 and NDVI7) [44] were calculated from the three red-edge bands.
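The nine index features can be illustrated with a short sketch. Because Table 3 is not reproduced here, the formulas below are the commonly published forms of each index and should be read as assumptions: band 8 is taken as the near-infrared band, band 11 as the shortwave-infrared band, the SAVI soil factor is set to 0.5, and the red-edge indices are assumed to be normalized differences between band 8 and bands 5, 6 and 7. The function and key names are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(bands, soil_factor=0.5):
    """Compute index features for one pixel.

    bands: dict of surface reflectance (0 to 1) keyed by Sentinel-2 band name.
    The band assignments and coefficients are assumptions (see lead-in).
    """
    blue, green, red = bands["B2"], bands["B3"], bands["B4"]
    nir, swir = bands["B8"], bands["B11"]
    return {
        "NDVI": normalized_difference(nir, red),
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        "SAVI": (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        "GNDVI": normalized_difference(nir, green),
        "MNDWI": normalized_difference(green, swir),
        "NDBI": normalized_difference(swir, nir),
        # Red-edge variants: assumed form, replacing red with a red-edge band.
        "NDVI5": normalized_difference(nir, bands["B5"]),
        "NDVI6": normalized_difference(nir, bands["B6"]),
        "NDVI7": normalized_difference(nir, bands["B7"]),
    }


# Hypothetical reflectance values for a dense green canopy pixel.
example_pixel = {"B2": 0.04, "B3": 0.06, "B4": 0.05, "B5": 0.10,
                 "B6": 0.30, "B7": 0.40, "B8": 0.45, "B11": 0.20}
example_indices = spectral_indices(example_pixel)
```

Together with the ten band reflectances, these nine values would make up the nineteen-element feature vector of a pixel.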

Description of the Classification Scheme
Four schemes of classification (Table 4) were designed for two purposes [45]: 1) examine the influence of different features on winter wheat extraction and determine their importance; and 2) explore the optimum schemes of winter wheat extraction in the area of interest. Scheme D was generated from the result of feature selection. Random forest (RF) is a popular algorithm for classification and feature selection [46,47]. Recently, random forest has been widely used in many fields because of its high classification accuracy and its robustness to noise and outliers. Moreover, the variable importance metric can be used as an effective tool for feature selection [48,49].
Not all features from imagery are useful in improving classification accuracy. Choosing the most important features for crop identification is a key step in the image analysis process [4]. Random forest can not only perform remote-sensing-based classification but also plays an important role in feature selection [50,51]. The sample set was selected based on the optimum period imagery for wheat extraction in this study. About three hundred samples, evenly divided among winter wheat and the other main land cover types (water, urban, bare land, grass, forest, oilseed rape and others), were selected as the original classification samples in the two study areas through visual interpretation with the help of Google Earth and Planet images. Two-thirds of the training samples were randomly selected from the original sample set. The attributes of the training samples and the features participating in the classification were used to train the classifier. The remaining one-third of the samples, which were not drawn, are called out-of-bag (OOB) data [47,50]. The out-of-bag error generated from the OOB data can be used to evaluate the classification ability of the classifier and to calculate the variable importance (VI) for feature selection. The variable importance score of feature j is calculated as [51]:

$$VI_j = \frac{1}{N} \sum_{i=1}^{N} \left( A^{j}_{Oi} - A^{j}_{Ni} \right)$$

where VI_j represents the importance score of feature j, N represents the number of decision trees generated, A^j_{Ni} represents the OOB error of decision tree i when noise is not added to feature j and A^j_{Oi} represents the OOB error of decision tree i when noise is added to feature j. If the OOB accuracy drops sharply after noise is added to feature j, the feature has a strong influence on the classification result; that is to say, its importance is high.
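The importance score described above, the mean per-tree increase in OOB error once noise is added to a feature, can be sketched as follows. The sketch assumes the per-tree OOB errors have already been obtained from a trained forest; the function names are hypothetical, and the two scores in the usage lines are the NDVI and MNDWI values reported for NAC in the results.

```python
def variable_importance(oob_error_clean, oob_error_noisy):
    """Mean per-tree increase in OOB error after adding noise to one feature.

    oob_error_clean[i]: OOB error of tree i with the feature left intact.
    oob_error_noisy[i]: OOB error of tree i after noise is added to the feature.
    """
    if not oob_error_clean or len(oob_error_clean) != len(oob_error_noisy):
        raise ValueError("need one clean and one noisy error per tree")
    n_trees = len(oob_error_clean)
    return sum(noisy - clean
               for clean, noisy in zip(oob_error_clean, oob_error_noisy)) / n_trees


def rank_features(scores):
    """Return feature names sorted from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


# Scores reported for NAC in this study (highest and lowest feature only).
ranking = rank_features({"MNDWI": 0.07, "NDVI": 3.64})
```

A feature whose score is close to zero, such as MNDWI in NAC, changes the OOB error very little when perturbed and is therefore a candidate for removal.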
In order to determine the influence of different feature variables (Table 3) on the extraction of winter wheat, we applied a random forest algorithm to score them and assess the importance of different features, which was realized with the random forest package in MATLAB 2018 (https://ww2.mathworks.cn/products/matlab.html). There are two important parameters in the random forest function: the number of decision trees, ntree, and the number of features considered at each split node, mtry. In this study, mtry was left at its default value of √N [47,50], where N represents the number of all features. Theoretically, the larger ntree is, the higher the classification accuracy will be, but the computation and time cost will also increase. We found that the OOB error tended to be stable once ntree was greater than 100 under the default mtry, so ntree was set to 100 [52].
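The two parameter choices can be illustrated with a small sketch. Rounding √N down to an integer and the stability rule (all later OOB errors stay within a tolerance) are assumptions made for illustration, and the OOB error curve below is made up, not measured in this study.

```python
import math


def default_mtry(n_features):
    """Square root of the feature count, rounded down (rounding is an assumption)."""
    return max(1, math.isqrt(n_features))


def smallest_stable_ntree(oob_curve, tolerance=0.005):
    """Return the first ntree after which the OOB error stays within tolerance.

    oob_curve: list of (ntree, oob_error) pairs sorted by increasing ntree.
    """
    for idx, (ntree, error) in enumerate(oob_curve):
        later_errors = [e for _, e in oob_curve[idx + 1:]]
        if all(abs(e - error) <= tolerance for e in later_errors):
            return ntree
    raise ValueError("oob_curve is empty")


# Hypothetical OOB error curve for illustration only.
hypothetical_curve = [(10, 0.080), (25, 0.060), (50, 0.050),
                      (100, 0.041), (200, 0.040), (500, 0.039)]
chosen_ntree = smallest_stable_ntree(hypothetical_curve)
mtry = default_mtry(19)  # nineteen features in this study
```

With nineteen features the assumed rounding gives four candidate features per split; the stability rule simply formalizes the visual inspection of the OOB error curve described above.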

Accuracy Assessment
The results were tested from two perspectives: area extraction accuracy and spatial distribution. Percentage error (PE) was employed to quantify the difference in mapped wheat area between the results from Planet and the results from Sentinel-2 as:

$$PE = \frac{\left| Estimated - Reference \right|}{Reference} \times 100\%$$

where Reference represents the wheat area extracted from each Planet sample plot and Estimated represents the area extracted from each Sentinel-2 sample plot.
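A minimal sketch of the area comparison follows. Taking the absolute value of the difference is an assumption about the sign convention, and the helper that converts pixel counts to area, like both function names, is hypothetical.

```python
def area_from_pixels(n_pixels, pixel_size_m):
    """Area in square kilometres covered by n_pixels square pixels."""
    return n_pixels * pixel_size_m ** 2 / 1e6


def percentage_error(estimated_area, reference_area):
    """Absolute relative difference between estimated and reference area, in percent."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return abs(estimated_area - reference_area) / reference_area * 100.0


# Hypothetical pixel counts for one sample plot.
reference = area_from_pixels(1_000_000, 3.0)   # Planet, 3 m pixels
estimated = area_from_pixels(85_500, 10.0)     # Sentinel-2, 10 m pixels
plot_error = percentage_error(estimated, reference)
```

Computing the two areas at each sensor's native pixel size keeps the area comparison independent of the resampling used for the pixel-by-pixel comparison.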
The confusion matrix [53] is a standard means of evaluating the accuracy of classification results from remote-sensing images, and yields the producer's accuracy (PA), user's accuracy (UA) and Kappa coefficient (Kappa). Kappa was calculated as [8,54]:

$$Kappa = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} \left( x_{i+} \, x_{+i} \right)}{N^2 - \sum_{i=1}^{r} \left( x_{i+} \, x_{+i} \right)}$$

where r is the number of rows in the confusion matrix, x_{ii} is the number of pixels in row i and column i, x_{i+} is the total number of pixels in row i, x_{+i} is the total number of pixels in column i, and N is the total number of pixels.
Five sample plots (each 6 × 6 km²) in NAC (Figure 2a) and six sample plots (each 6 × 6 km²) in CAC (Figure 2b) based on Planet imagery were selected for evaluating the result derived from Sentinel-2. Since winter wheat at the heading stage is conducive to visual interpretation, this study applied Planet images (3 m) at the heading stage to extract the distribution of winter wheat in the sample plots, and the results were used as reference data to verify the accuracy. Specifically, Planet imagery was subset according to the sample plot size, then the four spectral bands of Planet were employed as input to obtain the distribution of winter wheat by using the random forest algorithm. The accuracy of the winter wheat extraction in the Sentinel-2 sample plots was validated against the distribution of winter wheat in the Planet sample plots. Since the spatial resolution of Planet is 3 m and that of Sentinel-2 is 10 m, all results from Sentinel-2 were resampled to 3 m using the nearest-neighbor algorithm so that each pixel corresponded spatially to a Planet pixel.
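The accuracy measures can be computed directly from a confusion matrix, as in the sketch below. It assumes that rows hold the classified (Sentinel-2) labels and columns hold the reference (Planet) labels, so that user's accuracy is the diagonal count divided by the row total and producer's accuracy is the diagonal count divided by the column total; the function name and the example counts are hypothetical.

```python
def accuracy_metrics(matrix):
    """Overall accuracy, Kappa, user's and producer's accuracy from a square matrix.

    matrix[i][j]: number of pixels classified as class i whose reference class is j
    (row/column convention is an assumption, see lead-in).
    """
    r = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]
    diagonal = sum(matrix[i][i] for i in range(r))
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    kappa = (n * diagonal - chance) / (n * n - chance)
    users = [matrix[i][i] / row_totals[i] if row_totals[i] else float("nan")
             for i in range(r)]
    producers = [matrix[i][i] / col_totals[i] if col_totals[i] else float("nan")
                 for i in range(r)]
    return {"overall": diagonal / n, "kappa": kappa,
            "user": users, "producer": producers}


# Hypothetical two-class matrix: index 0 = winter wheat, index 1 = other.
example = accuracy_metrics([[45, 5],
                            [10, 40]])
```

For the two-class case used here (winter wheat versus everything else), the first entries of the `user` and `producer` lists are the wheat accuracies of the kind reported in Table 6.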
Five main steps were performed for winter wheat extraction and validation: 1) preprocessing data; 2) screening the optimum period based on five key phenological periods of winter wheat growing seasons; 3) screening the optimum feature for winter wheat extraction; 4) identifying the optimum scheme for winter wheat mapping; and 5) evaluating accuracy with Planet imagery. The workflow applied in this study is shown in Figure 3.

Selection of Optimum Periods
The separability was calculated using JM distance in NAC and CAC (Table 5). The result showed that the separability between winter wheat and non-vegetation types (water, urban, bare land, etc.) was good (JM > 1.8) in all phenological periods in both study areas. However, the separability between winter wheat and other vegetation (forest, grass and oilseed rape) was generally poor. Table 5 shows that only on 7 April 2018 were the JM distances between winter wheat and all other land cover types greater than 1.8, indicating that the heading stage was the optimum period to distinguish winter wheat from other land cover types.

Selection of Optimum Features
We scored the features (Table 3) generated from the Sentinel-2 image of 7 April 2018. The scores for each feature in NAC (Figure 4a) and in CAC (Figure 4b) were calculated by the random forest algorithm.
In NAC, NDVI had the highest score (3.64) and was therefore the key feature for winter wheat extraction, while MNDWI had the lowest score (0.07) and thus had little impact. In CAC, band 6 scored the highest (5.57) and was the most important feature, while NDBI scored the lowest (0.26).

Winter Wheat Mapping in NAC and CAC
For the determination of scheme D, the nineteen features were arranged in descending order according to the importance score (Figure 4). The feature with the lowest score was removed from the feature set by Sequential Backward Selection (SBS) [55], and a classification model was constructed with the remaining features. The size of the optimum feature subset was determined by the classification accuracy, which refers to the prediction accuracy of the classifier on the OOB data (Figure 5). In NAC (Figure 5a), the least important features (those with the lowest importance scores) were deleted first, and the number of features decreased from 19 to 17. As a result, the classification accuracy generally increased, that is, the deletion of redundant features improved classifier performance. Secondly, the number of features was reduced from 17 to 5. The classification accuracy changed little, with an average of 96.86%, which indicated that the classification accuracy remained stable as the number of features participating in the classification was reduced. Finally, when the number of features was reduced from 5 to 1, the classification accuracy dropped sharply because useful features were eliminated. In CAC (Figure 5b), there was a similar trend as in NAC. The classification accuracy fluctuated slightly as the number of features was reduced from 19 to 7, and it decreased greatly when important features were deleted (7 to 1).
The classification accuracy with the first five features in NAC was 95.94% and that with the first seven features in CAC was 93.58%, while the data volume was reduced by more than 60% compared with the original nineteen features. This effectively reduced the data volume while maintaining high classification accuracy. Therefore, scheme D used the first five features (NDVI, NDVI6, band 11, band 2 and band 8) presented in Figure 4a for NAC and the first seven features (band 6, NDVI, GNDVI, band 2, band 11, EVI and band 4) in Figure 4b for CAC.
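The backward elimination described above can be sketched as a simple loop over the ranked feature list. The `evaluate` callable stands in for training a random forest on the given features and returning its OOB accuracy; it, the function names, the subset rule (smallest subset whose accuracy is within a tolerance of the best) and the toy accuracy values are all assumptions for illustration rather than the procedure used in this study.

```python
def sequential_backward_selection(ranked_features, evaluate):
    """Drop the lowest-ranked feature one at a time and record the accuracy.

    ranked_features: feature names ordered from most to least important.
    evaluate: callable taking a list of features and returning an accuracy in [0, 1].
    Returns a list of (n_features, accuracy, features) tuples.
    """
    current = list(ranked_features)
    history = []
    while current:
        history.append((len(current), evaluate(current), tuple(current)))
        current.pop()  # remove the feature with the lowest importance score
    return history


def pick_subset(history, max_drop=0.01):
    """Smallest subset whose accuracy is within max_drop of the best accuracy."""
    best = max(accuracy for _, accuracy, _ in history)
    candidates = [entry for entry in history if best - entry[1] <= max_drop]
    return min(candidates, key=lambda entry: entry[0])


# Toy accuracy curve keyed by subset size (made-up values).
toy_accuracy = {1: 0.70, 2: 0.85, 3: 0.955, 4: 0.96, 5: 0.96, 6: 0.95}
features = ["f1", "f2", "f3", "f4", "f5", "f6"]
history = sequential_backward_selection(features, lambda f: toy_accuracy[len(f)])
n_kept, accuracy, kept = pick_subset(history)
```

The tolerance makes explicit the trade-off discussed above: a slightly lower accuracy is accepted in exchange for a much smaller feature set.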
The random forest classifier was used to extract the planting information of winter wheat in the study areas under the four experimental schemes (Figure 6). In NAC (Figure 6a), the spatial distribution of winter wheat was continuous, and the scale of planting was large. In CAC (Figure 6b), the planting of winter wheat was relatively small and discontinuous, with fragmented landscape and small patches of farmland.

Accuracy of Winter Wheat Maps
With the Planet imagery extraction results as a reference, the five sample plots in NAC (Figure 2a) and six sample plots in CAC (Figure 2b), evenly distributed in the study areas, were employed to verify the accuracy of the four schemes. The differences between the winter wheat distribution map from the Planet imagery and the maps from Sentinel-2 under the four classification schemes for the sample plots in NAC and CAC are shown in Figure 7. It can be seen clearly that schemes A and B generated greater differences than schemes C and D, indicating that schemes C and D had better classification results.
From Table 6, the producer's accuracy and user's accuracy of the four wheat maps in NAC were relatively high, ranging from 91% to 97%, while those in CAC ranged from 72% to 92%. The Kappa coefficient was about 0.85 and the percentage error (PE) was lower than 5% for NAC. The Kappa coefficient was between 0.7 and 0.8 and PE ranged from 12% to 30% in CAC. The overall extraction effect in CAC was inferior to that in NAC, the main producing area. Moreover, in both study areas the accuracy of schemes A and B was lower than that of scheme D, while the result of scheme C was slightly better (0.1 higher in Kappa) than that of scheme D.

Discussion
A range of winter wheat mapping approaches has been developed in previous studies based on remote-sensing imagery [4,7,8,10]. However, many of them have left problems that need to be solved. For example, Zhang et al. paid attention to the influence of features generated from images of different periods on extraction results, but did not discuss the differences between periods, which became the focus of their later work [56]. Some researchers employed multi-spectral data, vegetation indices and phenological metrics to enrich the information available for crop mapping, which may increase computation time with little improvement in accuracy [57,58]. The method demonstrated here addresses these problems by considering both periods and features, and can also be applied to problems associated with large volumes of data.

Winter Wheat Mapping in Heterogeneous Planting Conditions
After analysis, the heading stage was found to be the optimum period for distinguishing winter wheat from other land cover types in both study areas, but the selected features varied. The features contributing most and least to winter wheat extraction were NDVI and MNDWI in NAC, and band 6 and NDBI in CAC, respectively. Based on the Sentinel-2 image at the heading stage, the distribution of winter wheat could be mapped according to the results of feature selection. The producer's accuracy and user's accuracy of winter wheat in NAC were about 95% and 93%, respectively, while those in CAC were about 80% and 78%, respectively. The difference between the results shows the importance of considering the planting conditions for mapping.

Factors Influencing the Accuracy of Winter Wheat Mappings
The difference in mapping accuracy between the two areas may be due to the specific conditions of the two study areas. Located in the northern plain of Anhui province, NAC is an intensive winter wheat planting area with continuous spatial distribution, regular patches and a large planting area. Compared with NAC, the winter wheat planting scale in CAC is relatively small, and most fields are discontinuous and irregularly distributed, making it difficult to identify winter wheat fields. Previous studies have shown that fragmentation has a great impact on crop mapping [6,12]. The higher the land fragmentation level, the more serious the mixed-pixel problem and the lower the mapping accuracy of winter wheat, which may partially explain why the extraction accuracy of winter wheat in NAC was higher than that in CAC. Moreover, crop types and planting patterns determine the complexity of winter wheat extraction. The main winter crops in NAC were barley and wheat, and because the planting area of wheat was so much larger, barley could be neglected. The winter crops in CAC were barley, oilseed rape and wheat. The large area of oilseed rape affects the extraction results and makes wheat mapping in CAC more complex. The mapping accuracy could be improved with a better method to eliminate the influence of other winter crops, such as oilseed rape.

Winter Wheat Mapping Using Optimum Feature Subset
In our study, using the five highest-scoring features in NAC yielded a percentage error of winter wheat extraction lower than 5% and a Kappa greater than 0.83. Using the seven highest-scoring features in CAC yielded a percentage error lower than 25% and a Kappa greater than 0.7. The optimum features of scheme D achieved higher classification accuracy than scheme A, which used all indices, and scheme B, which used all spectral bands. The reason may be that combining the optimum subsets of all feature types exploits multi-source information and retains more useful information than a single feature type. Although scheme C achieved the highest classification accuracy in both study areas, scheme D removed features of low importance and retained only those that contributed significantly to winter wheat extraction, so the workload was reduced and efficiency improved significantly.
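
The idea of retaining only the highest-scoring features can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the procedure used in this study: the function `select_feature_subset`, the feature names, the importance scores, the stand-in `toy_evaluate` function (which replaces training and validating a random forest) and the tolerance rule for choosing the subset size are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_feature_subset(
    importance: Dict[str, float],
    evaluate: Callable[[Sequence[str]], float],
    tolerance: float = 0.01,
) -> Tuple[List[str], List[Tuple[int, float]]]:
    """Backward elimination guided by an importance ranking.

    Starting from all features, the least important feature is dropped
    at each step and the remaining subset is scored with `evaluate`.
    The smallest subset whose score is within `tolerance` of the best
    score is returned (an assumed selection rule).
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    curve = []
    for k in range(len(ranked), 0, -1):
        curve.append((k, evaluate(ranked[:k])))
    best = max(score for _, score in curve)
    k_opt = min(k for k, score in curve if score >= best - tolerance)
    return ranked[:k_opt], curve


# Hypothetical importance scores for six generic features.
scores = {"f1": 0.40, "f2": 0.25, "f3": 0.20,
          "f4": 0.10, "f5": 0.03, "f6": 0.02}


def toy_evaluate(features: Sequence[str]) -> float:
    """Stand-in for classifier accuracy; not a real model evaluation."""
    return 0.5 + 0.45 * sum(scores[name] for name in features)


subset, curve = select_feature_subset(scores, toy_evaluate)
print(subset)
```

In practice, `evaluate` would train a classifier on the given features and return a validation metric such as overall accuracy or Kappa, which is where the savings in data volume and processing time from a smaller subset arise.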

Uncertainty Analysis and Future Needs
In this study, we explored the mapping of winter wheat in heterogeneous planting conditions and obtained good results. However, some uncertainties remain to be addressed in the future. First, this study lacks field investigation data. Second, only four winter wheat planting counties were selected as study areas because cloud cover and bad weather reduced the number of available images, and only one winter wheat growing season (2017–2018) was studied. More work is needed, such as expanding the study area and including multiple growing seasons, to verify the applicability and generality of the conclusions of this study. Third, we used only machine learning methods to extract winter wheat in the study areas. Further work could evaluate the influence of other methods, such as deep learning, on the accuracy of wheat mapping.

Conclusions
An integrative analysis of the optimum period, optimum features and optimum extraction scheme was conducted for winter wheat mapping in Anhui province, China, using high-spatial-resolution Sentinel-2 images. In both study areas, the optimum period for winter wheat extraction was the heading stage. The optimum features were NDVI, NDVI 6, band 11 (1614 nm), band 2 (496 nm) and band 8 (835 nm) for NAC, and band 6 (740 nm), NDVI, GNDVI, band 2 (496 nm), band 11 (1614 nm), EVI and band 4 (665 nm) for CAC. Based on the optimum feature scheme, the random forest classifier produced a Kappa of about 0.85 in NAC and 0.75 in CAC, accompanied by a reduction of more than 60% in the computational cost of image analysis. The wheat maps had high accuracies, which supports their use for depicting the spatial distribution of winter wheat. The value of our research lies in showing that timely and accurate wheat planting information can be obtained with relatively few resources in terms of datasets and timing, which provides a methodological reference and helps fill the gap in wheat studies for areas challenged by a high degree of fragmentation, complex terrain and changeable climate. The results provide a reference for agricultural and government departments in decision-making and in addressing food security issues.
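
For reference, several of the spectral indices named in this study can be written as simple band ratios. The sketch below uses commonly cited formulations of NDVI, GNDVI, EVI, NDWI and NDBI; the exact formulations and band assignments used in this study may differ, and the mapping of Sentinel-2 bands to arguments (blue = band 2, green = band 3, red = band 4, NIR = band 8, SWIR = band 11), the function name `spectral_indices` and the reflectance values are assumptions for illustration.

```python
def spectral_indices(blue, green, red, nir, swir):
    """Return common spectral indices from surface reflectance (0-1).

    Assumed Sentinel-2 band mapping: blue=B2, green=B3, red=B4,
    nir=B8, swir=B11.
    """
    return {
        # Normalized difference vegetation index.
        "NDVI": (nir - red) / (nir + red),
        # Green NDVI, using the green band in place of red.
        "GNDVI": (nir - green) / (nir + green),
        # Enhanced vegetation index with its usual coefficients.
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
        # Normalized difference water index (green/NIR form).
        "NDWI": (green - nir) / (green + nir),
        # Normalized difference built-up index.
        "NDBI": (swir - nir) / (swir + nir),
    }


# Hypothetical reflectances for a dense green canopy.
indices = spectral_indices(blue=0.05, green=0.10, red=0.10,
                           nir=0.50, swir=0.30)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```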

Figure 1. Maps of the study areas.

Figure 2. The distribution of sample plots derived from Planet images. (a) Northern Anhui counties (NAC). (b) Central Anhui counties (CAC). The base images are Sentinel-2 images acquired on 7 April 2018, shown as false-color composites (R: NIR, G: Red, B: Green); the points are labeled with the sample plot numbers.

Figure 3. The workflow of the winter wheat extraction and validation.

Figure 5. The relationship between the number of features and classification accuracy based on sequential backward selection. (a) Northern Anhui counties (NAC). (b) Central Anhui counties (CAC). The red dot marks the number of features in the optimum subset.

Figure 6. Winter wheat mapping with the four schemes during the 2017–2018 growing season. (a) Northern Anhui counties (NAC). (b) Central Anhui counties (CAC). A, B, C and D denote the wheat distribution based on schemes A, B, C and D, respectively.

Figure 7. Detailed difference maps of winter wheat distribution between Sentinel-2 and Planet imagery in the sample plots in NAC (a) and CAC (b). Black indicates differences between the Sentinel-2 and Planet results. The numbers 1–5 or 1–6 are the sample plot numbers. P denotes the wheat distribution in the sample plots derived from Planet imagery; A, B, C and D denote the wheat distribution in the sample plots based on schemes A, B, C and D, respectively.

Table 2. Description of the satellite data used in the study.

Table 4. Design of the experimental schemes.

Table 5. Jeffries–Matusita (JM) distance between other main land cover types and winter wheat.

Table 6. Confusion matrix and percentage error (PE) for NAC and CAC.