An Improved Approach Considering Intraclass Variability for Mapping Winter Wheat Using Multitemporal MODIS EVI Images

Winter wheat is one of the major cereal crops in the world. Monitoring and mapping its spatial distribution has significant implications for agriculture management, water resources utilization, and food security. Generally, winter wheat has distinguished phenological stages during the growing season, which form a unique EVI (Enhanced Vegetation Index) time series curve and differ considerably from other crop types and natural vegetation. Since early 2000, the MODIS EVI product has become the primary dataset for satellite-based crop monitoring at large scales due to its high temporal resolution, huge observation scope, and timely availability. However, the intraclass variability of winter wheat caused by field conditions and agricultural practices might lower the mapping accuracy, which has received little attention in previous studies. Here, we present a winter wheat mapping approach that integrates the variables derived from the MODIS EVI time series taking into account intraclass variability. We applied this approach to two winter wheat concentration areas, the state of Kansas in the U.S. and the North China Plain region (NCP). The results were evaluated against crop-specific maps or statistical data at the state/regional level, county level, and site level. Compared with statistical data, the accuracies in Kansas and the NCP were 95.1% and 92.9% at the state/regional level with R2 (Coefficient of Determination) values of 0.96 and 0.71 at the county level, respectively. Overall accuracies in confusion matrix were evaluated by validation samples in both Kansas (90.3%) and the NCP (85.0%) at the site level. Comparisons with methods without considering intraclass variability demonstrated that winter wheat mapping accuracies were improved by 17% in Kansas and 15% in the NCP using the improved approach. Further analysis indicated that our approach performed better in areas with lower landscape fragmentation, which may partly explain the relatively higher accuracy of winter wheat mapping in Kansas. This study provides a new perspective for generating multiple subclasses as training inputs to decrease the intraclass differences for crop type detection based on the MODIS EVI time series. This approach provides a flexible framework with few variables and fewer training samples that could facilitate its application to multiple-crop-type mapping at large scales. Remote Sens. 2019, 11, 1191; doi:10.3390/rs11101191 www.mdpi.com/journal/remotesensing Remote Sens. 2019, 11, 1191 2 of 24


Introduction
Wheat is the world's third largest food crop in terms of production [1] and is the most widely grown crop globally [2].The wheat growing area covers over 200 million hectares worldwide [3], with winter wheat accounting for more than 80% [4].Maintaining and increasing global winter wheat production is strongly linked to food security [5][6][7][8].Accurate and timely information of temporal and spatial variations in winter wheat areas is therefore essential for crop yield estimation, climate impacts assessment, and agricultural policy-making [9,10].
The development of the remote sensing technology makes it possible to monitor crop areas at fine spectral, temporal, and spatial scales realistically [11,12].Specifically, data with high temporal resolution provides time-series images with daily signatures and has been successfully used for crop area monitoring [13][14][15].Previous studies have shown that optical satellite remote sensing is a viable means to detect winter wheat fields over large-scale and longtime series [4,15].The Moderate Resolution Imaging Spectroradiometer (MODIS) on board the NASA Earth Observing System Terra and Aqua satellite platforms offers unprecedented capabilities for large-area crop mapping by providing global coverage, half-day revisit capacities, and medium spatial resolution [16,17].MODIS Vegetation Index (VI) time-series has been proven to be a powerful tool for crop type characterization and has been successfully used for crop mapping across a wide range of scales and geographic locations [18,19].Some studies have used the MODIS VI data for mapping and monitoring winter wheat across different scales [15,20].
Recently, a variety of algorithms combined with specific features (e.g., spectral, temporal, and phenological features) have been employed in remote sensing-based crop mapping, such as Parallelepiped Classifier [21], Minimum Distance [22], Maximum Likelihood [23], Spectral Angle Mapper [24], Support Vector Machines [25][26][27], Neural Network Classifier [24], Classification and Regression Trees (CARTs) [28], etc.Many factors may act to affect the remote sensing-based mapping accuracy, in which training samples have a more significant impact than the mapping techniques [26,29,30].For a given crop type, the upper and lower bounds thresholds of training datasets mainly depend on intraclass variability, which results from differences in crop environmental status and management practices [14].Different growth conditions and landscape factors (e.g., irrigation, fertility, climate, topography, fragmentation) may cause intraclass differences affecting the spectral signatures of the same crop type [31].Intra-and interclass confusion due to these factors would degrade the ability of the image classifier to produce accurate maps for specific crop types.
Until now, few studies for winter wheat mapping have considered the intraclass features at different stages with different conditions, which might introduce large uncertainties in estimating and mapping crop area [8,14,17].An approach considering the intraclass variability of winter wheat is therefore needed to generate the timely and reliable information on crop mapping areas, which will benefit for ensuring food security in the face of crop yield gaps, climate change, and extreme climate events.
The global cropland datasets showed the spatial distribution of cropland at 30 m spatial resolution, but it did not classify specific crop types [32].For the United States, the 250 m MODIS-based crop types distribution were developed for the period 2001-2013, but did not separate wheat into winter wheat and spring wheat [33].This situation justifies the importance of this study.Therefore, a robust approach for winter wheat mapping at the regional scale is highly needed.
In this study, an improved approach for mapping winter wheat based on intraclass variability was presented that integrated the angles and distances of multidimensional vectors and adopted multiple subclasses as training samples.The presented approach was used for mapping winter wheat based on the MODIS EVI time series data in two study areas; Kansas and the North China Plain region (NCP).Kansas has large areas of consecutive wheat fields which generally range from 30 to 150 ha and has a comprehensive and reliable county-level archive of crop statistics [4].The NCP accounts for approximately two-thirds of China's total wheat area and production [34], representing the most important wheat-producing area in China.Reference data at the administrative level, county level, and site level are also available in this region.
The objectives of this study are to (1) present an improved approach considering intraclass variability for winter wheat mapping based on the MODIS EVI time series images; (2) apply the new approach to two areas of interest, Kansas and the NCP; (3) analyze the effects of landscape structure on remote sensing-based crop mapping accuracy using two landscape metrics, i.e., Fragmentation Index (FRG) and Percentage of the Landscape (PLAND); and (4) identify the uncertainties and future needs in remote sensing-based crop area mapping.

Study Area
Two major winter wheat dominant areas, i.e., Kansas, USA (94-102 • W and 37-40 • N) (Figure 1a) and the NCP, China (111-123 • E and 33-41 • N) (Figure 1b), were selected as the regions of interest.Most US wheat is grown in the Great Plains from Texas to North Dakota, accounting for about 75% of national wheat production (http://www.fao.org/docrep/006/y4011e/y4011e04.html).About 41% of the total US wheat production is of the Hard Red Winter (HRW) class, and most HRW wheat is grown in the central and southern Great Plains states.Kansas lies in the heart of the central Great Plains region and is one of the major winter wheat producing states in the Great Plains [35].The NCP is the largest agricultural area in China [36], also known as the "breadbasket of China", accounting for about 71% of the country's wheat production [34,[37][38][39].It covers two metropolises (Beijing and Tianjin) and five provinces (i.e., Anhui, Hebei, Henan, Jiangsu, and Shandong) [40]. the most important wheat-producing area in China.Reference data at the administrative level, county level, and site level are also available in this region.
The objectives of this study are to 1) present an improved approach considering intraclass variability for winter wheat mapping based on the MODIS EVI time series images; 2) apply the new approach to two areas of interest, Kansas and the NCP; 3) analyze the effects of landscape structure on remote sensing-based crop mapping accuracy using two landscape metrics, i.e., Fragmentation Index (FRG) and Percentage of the Landscape (PLAND); and 4) identify the uncertainties and future needs in remote sensing-based crop area mapping.

Study Area
Two major winter wheat dominant areas, i.e., Kansas, USA (94-102° W and 37-40° N) (Figure 1a) and the NCP, China (111-123° E and 33-41° N) (Figure 1b), were selected as the regions of interest.Most US wheat is grown in the Great Plains from Texas to North Dakota, accounting for about 75% of national wheat production (http://www.fao.org/docrep/006/y4011e/y4011e04.html).About 41% of the total US wheat production is of the Hard Red Winter (HRW) class, and most HRW wheat is grown in the central and southern Great Plains states.Kansas lies in the heart of the central Great Plains region and is one of the major winter wheat producing states in the Great Plains [35].The NCP is the largest agricultural area in China [36], also known as the "breadbasket of China", accounting for about 71% of the country's wheat production [34,[37][38][39].It covers two metropolises (Beijing and Tianjin) and five provinces (i.e., Anhui, Hebei, Henan, Jiangsu, and Shandong) [40].
Kansas has a temperate continental climate with a strong east-west precipitation gradient and a strong north-west-south-east temperature gradient.The annual mean precipitation ranges from < 450 mm in the west to > 1,200 mm in the southeast, whereas the annual mean temperature varies from < 11 °C in the northwest to > 15 °C in the southeast.On the NCP, the elevation is less than 50 m above sea level [34].The plain has a temperate, subhumid, and continental monsoon climate [41].The annual rainfall varies between 480 mm in the north and 850 mm in the south of the NCP [10].The annual mean temperature is around 12.2 °C [38], while the minimum (January) and maximum (July) monthly average temperature were -6 to 0 °C and 25 to 28 °C, respectively [39]. (a)

Crop Distribution Data
To train and validate the proposed approach, we collected the Cropland Data Layer (CDL) from CropScape (http://nassgeodata.gmu.edu/CropScape/),developed by the United States Department of Agriculture National Agricultural Statistics Service (USDA-NASS).The spatial resolution of the CDL varies from 30 to 56 m, depending on the imagery source [33,44].Winter wheat was mapped at an annual interval since 2006 for Kansas, with overall accuracies over 90% for both producer's accuracy and user's accuracy in 2017 CDL (https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/metadata_ks17.htm).The producer's accuracy is a measure of the omission error and is defined as the number of correctly classified pixels of the class relative to the total number of pixels of that class used in the assessment.The user's accuracy is a measure of the commission error associated with a class, which is derived from the number of pixels correctly allocated to a class relative to the total number of pixels predicted to belong to that class [45].For this study, we used 2017 USDA CDL for Kansas as the reference map (Figure 1a), which provided the training and verification samples at the site level.For the NCP, no such crop cover data is available.Kansas has a temperate continental climate with a strong east-west precipitation gradient and a strong north-west-south-east temperature gradient.The annual mean precipitation ranges from < 450 mm in the west to > 1,200 mm in the southeast, whereas the annual mean temperature varies from < 11 • C in the northwest to > 15 • C in the southeast.On the NCP, the elevation is less than 50 m above sea level [34].The plain has a temperate, subhumid, and continental monsoon climate [41].The annual rainfall varies between 480 mm in the north and 850 mm in the south of the NCP [10].The annual mean temperature is around 12.2 • C [38], while the minimum (January) and maximum (July) monthly average temperature were −6 to 0 • C and 25 to 28 • C, respectively [39].

Crop Distribution Data
To train and validate the proposed approach, we collected the Cropland Data Layer (CDL) from CropScape (http://nassgeodata.gmu.edu/CropScape/),developed by the United States Department of Agriculture National Agricultural Statistics Service (USDA-NASS).The spatial resolution of the CDL varies from 30 to 56 m, depending on the imagery source [33,44].Winter wheat was mapped at an annual interval since 2006 for Kansas, with overall accuracies over 90% for both producer's accuracy and user's accuracy in 2017 CDL (https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/metadata_ks17.htm).The producer's accuracy is a measure of the omission error and is defined as the number of correctly classified pixels of the class relative to the total number of pixels of that class used in the assessment.The user's accuracy is a measure of the commission error associated with a class, which is derived from the number of pixels correctly allocated to a class relative to the total number of pixels predicted to belong to that class [45].For this study, we used 2017 USDA CDL for Kansas as the reference map (Figure 1a), which provided the training and verification samples at the site level.For the NCP, no such crop cover data is available.

Statistical Data and Agrometeorological Stations Data
Statistical data at the administrative level were obtained to evaluate the mapped winter wheat areas in Kansas and the NCP.The 2017 winter wheat production areas at the county and state levels were collected from USDA-NASS website (https://quickstats.nass.usda.gov/)for Kansas.The 2017 winter wheat phenology information for Kansas was obtained from crop progress and condition reports at USDA-NASS website (https://www.nass.usda.gov/Charts_and_Maps/Crop_Progress_&_Condition).For the NCP, winter wheat planting statistics at the county level and provincial levels acquired from the Ministry of Agriculture of the People's Republic of China (http://202.127.42.157/moazzys/nongqingxm. aspx/).Moreover, we collected phenological and locations information for the winter wheat crop from agrometeorological stations (http://data.cma.cn)across the NCP (Figure 1b).

Methods
The overall process diagram of this study is presented in Figure 2. Statistical data at the administrative level were obtained to evaluate the mapped winter wheat areas in Kansas and the NCP.The 2017 winter wheat production areas at the county and state levels were collected from USDA-NASS website (https://quickstats.nass.usda.gov/)for Kansas.The 2017 winter wheat phenology information for Kansas was obtained from crop progress and condition reports at USDA-NASS website (https://www.nass.usda.gov/Charts_and_Maps/Crop_Progress_&_Condition).For the NCP, winter wheat planting statistics at the county level and provincial levels acquired from the Ministry of Agriculture of the People's Republic of China (http://202.127.42.157/moazzys/nongqingxm.aspx/).Moreover, we collected phenological and locations information for the winter wheat crop from agrometeorological stations (http://data.cma.cn)across the NCP (Figure 1b).

Winter Wheat Crop Calendars
The typical crop season length of winter wheat is about seven to nine months, including the prewinter growing period.For defining the time ranges of EVI time series, crop calendars acquired from USDA-NASS and agrometeorological stations for two study areas were used (Table 1).In Kansas, winter wheat is planted in mid-September, and 80% of winter wheat planting is done before mid-October.Harvest begins from mid-June, and 80% of wheat areas are harvested before late July in 2017.In the NCP, there are some differences in winter wheat phenology between the northern and the southern areas.Winter wheat is usually sown in October, but the earliest planting dates occur in late September.Winter wheat overwinters from late December to the next February and greens up in late February.In spring, winter wheat starts growing rapidly and is generally harvested in mid-June [46].  1 Green color represents sowing, light green represents the development of winter wheat, gray represents over-winter, and orange represents harvesting.E, M, and L represent the early 10 days, the

Winter Wheat Crop Calendars
The typical crop season length of winter wheat is about seven to nine months, including the prewinter growing period.For defining the time ranges of EVI time series, crop calendars acquired from USDA-NASS and agrometeorological stations for two study areas were used (Table 1).In Kansas, winter wheat is planted in mid-September, and 80% of winter wheat planting is done before mid-October.Harvest begins from mid-June, and 80% of wheat areas are harvested before late July in 2017.In the NCP, there are some differences in winter wheat phenology between the northern and the southern areas.Winter wheat is usually sown in October, but the earliest planting dates occur in late September.Winter wheat overwinters from late December to the next February and greens up in late February.In spring, winter wheat starts growing rapidly and is generally harvested in mid-June [46].  1 Green color represents sowing, light green represents the development of winter wheat, gray represents over-winter, and orange represents harvesting.E, M, and L represent the early 10 days, the middle 10 days, and the last 10 days of a month, respectively.

Data Preprocessing
The MODIS Reprojection Tool (MRT) (http://edcdaac.usgs.gov/datatools.asp)was used to generate EVI mosaics spanning three MODIS scenes (h09v05, h10v04, h10v05) for Kansas and five scenes (h26v04, h26v05, h27v04, h27v05, h28v05) for the NCP region for each 16-day composite images, respectively [36].Based on the winter wheat phenology at two areas, EVI time series for the growing season for Kansas from mid-September 2016 to mid-July 2017 and for the NCP from late September 2011 to mid-June 2012 were used (Table 1).Images were reprojected to an Albers Equal-Area Conic projection by the nearest neighbor resampling method using the MRT [37].Preprocessed data was then subset to the study boundaries (Kansas and the NCP), resulting in MODIS EVI time series stacks that cover the whole growth period of winter wheat for each study area.

EVI Time Series Reconstruction by a Savitzky-Golay Filter
In this study, the Savitzky-Golay (S-G) filter was used to reconstruct the essential shape of the EVI time-series curve [47].The S-G smoothing filtering is based on an asymmetric Gaussian function-fitting, also known as least squares or digital smoothing polynomial, which can be used to reduce the random noise from time series data [48,49].This filter is widely used for the reconstruction of time series of remote sensing vegetation index [50].Invalid points affected by external factors in the EVI time series will be eliminated during the S-G filtering course.We used ENVI software extensions to implement the S-G filter to perform an image-based time series filtering.The algorithm can be summarized as follows [47]: where f i represents the original EVI value in time-series; g i is the smoothed EVI value, which is the linear combination of C n and f i ; n is the width of the moving window to perform filtering, and nL and nR are the left and right edge of the signal component, respectively.Originally, if C n is a constant defined as C n = 1/(nL + nR + 1), then the S-G filtering becomes a moving window smoothing.The idea of Savitzky-Golay method is to find filtering coefficients for C n that preserves higher moments.Therefore, in Eq. ( 2), the C n is not a constant but a polynomial fitting function.Then a least squares fit is solved ranging from nL to nR to obtain C n .For a specific dataset of a time-series in a moving window, we defined the fitting function as a quadratic polynomial for a specific f i [51,52]: where t corresponds to the day of the year in the EVI time series.The S-G filtering is defined as a weighted-moving-average with weighting given as a polynomial of a certain degree.The filter can use any number of points for this weighted average.The returned coefficients, when applied to a signal, perform a polynomial least squares fit within the filter window.This polynomial is designed to preserve the high moments within the data and reduce the bias introduced by the filter.After the S-G filtering, the EVI time series were constructed for the two study areas (Figure 3) [47].
Where  corresponds to the day of the year in the EVI time series.
The S-G filtering is defined as a weighted-moving-average with weighting given as a polynomial of a certain degree.The filter can use any number of points for this weighted average.The returned coefficients, when applied to a signal, perform a polynomial least squares fit within the filter window.This pol

Generating Subclasses for the Two Study Areas
We randomly generated both training and validation sample locations for winter wheat and validation samples for no-winter wheat in the two study areas (Table 2).The 2017 CDL map was reprojected to Albers Equal-Area Conic projection and resampled to a 250 m spatial resolution by the nearest neighbor resampling method using ArcGIS ™ 10.1 to correspond to the MODIS EVI time-series.The random samples for the winter wheat sites were obtained from the CDL map for Kansas and randomly divided into two sets, i.e., training samples and validation samples.Based on the Google Earth images, samples in the NCP were selected combined with the recorded agrometeorological stations, and all were randomly divided into training and validation samples.Winter wheat has unique phenological characteristics different from other crops, which might be easier to identify and differentiate from other land features on remote sensing images.However, the EVI profiles may show different patterns for the same crop type as a result of intraclass variability resulting from regional variations in environmental conditions and management practices [8,14,17].EVI curve morphology is affected by many factors, such as planting density, winter wheat varieties, climate factors, soil types, topographical conditions, etc.
In this study, the intraclass variability of winter wheat was fully considered in the process of building the subclass sets of training samples.The over-wintering period is an obvious physical feature of winter wheat fields, which divides the entire winter wheat growth period into two phases and results in double peaks in the EVI curve.In Figure 4, the EVI peak values before the wintering phase change from 0.1 to nearly 0.65 and the second peak values vary from about 0.35 to around 0.85.The wide differences at each EVI peak exhibit obvious winter wheat intraclass variability, which was adopted as the unique features to extract subclass training samples.Selecting these two EVI peaks as the segmentation points to divide subclasses is more appropriate than other points.
The training samples were randomly selected across the study area, which represent the winter wheat crop for the whole study area.Figure 4 shows that the EVI values at two peaks were distributed roughly even from low to high.Taking the medians of EVI peak values as the thresholds can divide the total training samples into each subclass with similar amounts.The median of second EVI peak values was 0.53, which was used for dividing all training samples to two first-level subclasses.The thresholds 0.3 and 0.35 were the medians of two first-level subclasses and subdivided training sets into four second-level subclasses.Thus, the segmentation thresholds of four subclasses in two study areas are shown in Table 3.

Calculating the Separability of Subclasses Using Jeffries-Matusita (JM) Distance
In combination with the Jeffries-Matusita (JM) Distance, we provided a measure of the overall separability between four subclasses and other land cover types [53].JM distance has been demonstrated to be an effective metric for evaluating the separability of training samples in remote sensing-based classification [14,54].We examined the JM distances using the EVI time series for each pair of types (Table 4) to evaluate the feasibility of segmentation thresholds of each subclass.
To calculate the JM distance, we randomly selected six main land cover types (not including winter wheat) as the training sets from CDL maps in Kansas.Combining with Google Earth images and the GlobeLand30 dataset (http://www.globeland30.org/GLC30Download/index.aspx), five major land cover types were also selected in the NCP.Land cover types obtained for calculating JM  In combination with the Jeffries-Matusita (JM) Distance, we provided a measure of the overall separability between four subclasses and other land cover types [53].JM distance has been demonstrated to be an effective metric for evaluating the separability of training samples in remote sensing-based classification [14,54].We examined the JM distances using the EVI time series for each pair of types (Table 4) to evaluate the feasibility of segmentation thresholds of each subclass.
To calculate the JM distance, we randomly selected six main land cover types (not including winter wheat) as the training sets from CDL maps in Kansas.Combining with Google Earth images and the GlobeLand30 dataset (http://www.globeland30.org/GLC30Download/index.aspx), five major land cover types were also selected in the NCP.Land cover types obtained for calculating JM distances for two study areas are shown in Table 4.The calculation of JM distance between two types is [14]: where, x represents a span of EVI time series values, and C j and C k are the two types under consideration.
Under normality assumptions, Equation (3) reduces to JM = 2 1 − e −B , where where, µ j and µ k correspond to type-specific, expected EVI values, and j and k are unbiased estimates for the type-specific covariance matrices.The range of JM distance distribution is between 0 to 2. Larger JM distances indicate more distinct distributions between two types, which favors successful type identification.

Calculating Standard Vectors for Two Study Areas
Considering the EVI time series as n-dimensional vectors, the average of each subclass training dataset was calculated as the standard vector.We calculated the standard vector in each subclass as: where n is the dimension of EVI time series, i = n − 1.
Based on the above equation, we calculated four standard vectors (e.g., subclass 1, subclass 2, subclass 3, and subclass 4) for the two study areas, respectively.The differences between four standard vectors illustrating the intraclass variability of winter wheat are shown in Figure 5.
Where,  and  correspond to type-specific, expected EVI values, and ∑  and ∑  are unbiased estimates for the type-specific covariance matrices.The range of JM distance distribution is between 0 to 2. Larger JM distances indicate more distinct distributions between two types, which favors successful type identification.

Calculating Standard Vectors for Two Study Areas
Considering the EVI time series as n-dimensional vectors, the average of each subclass training dataset was calculated as the standard vector.We calculated the standard vector in each subclass as: Where  is the dimension .

Calculating Two Parameters
Based on the standard vector calculated for each subclass (Figure 5), two parameters cosθ and D were generated for both study areas in ENVI-IDL environment.Two parameters (angle and distance) were calculated between the total EVI time series for the whole study areas and each standard vector, respectively, as [55-58] (7): V is EVI time series vector, → V s is standard vector, i = n − 1, n is the dimension of the vector, θ and D are the angle and distance between the EVI time series vectors and four standard vectors, respectively.
We generated eight parameters, including four angle parameters (e.g., cosθ 1 , cosθ 2 , cosθ 3 , and cosθ 4 ), and four distance parameters (e.g., D 1 , D 2 , D 3 , and D 4 ) were then generated for each study area.Parameters with the same number label were calculated from the same standard vector.For example, the parameter cosθ 1 and D 1 were computed based on the standard vector extracting from subclass 1.

The Sensitivity Tests to Thresholds of Parameters
We calculated the ranges of the angles and distances between the standard vectors and relative subtraining samples based on Equation (7) to determine the thresholds.We also performed a sensitivity study to test the effects of thresholds on the mapping results.Eight tests were examined for each study area based on the same training samples (Table 5).The optimum thresholds of four subclasses were determined by the evaluation of the sensitivity test.

The Algorithm to Extract Winter Wheat Mapping
Nonvegetated areas were masked out based on the smoothed EVI time series.According to the training samples, maximum EVI values of winter wheat were determined to be greater than 0.35 at both study areas [59,60].Four submaps were extracted based on individual thresholds of the four subclasses.We integrated all four submaps and areas under the mask using Equation (8) to generate the winter wheat maps in both study areas.
where thresholds a 1 ~a4 , d 1 ~d4 were determined according to subclass training samples.

Statistical Analysis
We evaluated our results using the statistical data at the state/regional level for the two study areas.Percentage Error (PE) was used to quantify the differences of winter wheat mapping areas between the statistical data and our results as: The accuracy was calculated as: where Observed and Estimated represent the statistical data and results in this study.Root mean square error (RMSE) and coefficient of determination (R 2 ) were used to compare the statistical data and estimated winter wheat areas at the county level for the two study areas as: where n represents the number of counties; y i and x i are the statistical and estimated data, respectively.
For the site-level evaluation, the confusion matrix was created, including the overall accuracy (OA), Kappa coefficient (KAPPA), producer's accuracy (PA), and user's accuracy (UA).The overall accuracy (Equation 13) represents the percentage of validation samples that are correctly identified [61].The Kappa coefficient measures the agreement between observations and prediction results.Kappa coefficient value of 1 represents perfect agreement, while a value of 0 represents no agreement.(the definitions of producer's accuracy and user's accuracy have been explained in Section 2.2.2) Overall accuracy and the Kappa coefficient were computed as following [62]: where r is the number of rows in the matrix, x ii is the number of observations in row i and column i, x i+ and x +i are the marginal totals of row i and column i, respectively, and N is the total number of observations.

Landscape Metrics Analysis
Remote sensing-based crop mapping accuracy achieved based on moderate resolution data is related to the spatial heterogeneity of the observed cropland [28].In this study, we quantified the relationship between landscape metrics and winter wheat mapping accuracy.We reclassified all land cover types into two classes (winter wheat and others) using ArcGIS ™ 10.1 based on the CDL dataset for Kansas.We then generated individual maps for 75 counties from the reclassified map and calculated the percentage errors of our results based on the CDL dataset using Equation (5).Two landscape fragmentation metrics were calculated in FRAGSTATS version 4.2 to characterize the landscape structures for these 75 counties.As a metric of spatial configuration, the fragmentation index (FRG) was calculated based on the number of patches and pixels of a given class [63].Percentage of the landscape (PLAND) is the other class metric, which is related to landscape fragmentation because it measures the fraction or proportional abundance of a particular patch type [64].The relationships between landscape metrics and extraction errors of our improved approach at the county level were analyzed using a linear regression model.The metrics were calculated as follows: where n is the number of winter wheat patches, and m is the number of pixels composing the patches of winter wheat.
where i = 1, . . ., s. s is the total number of patch types, a ij is an area of the jth patch that belongs to cover type i, A is the total area of the landscape.

The Approach Integrated the Angles and Distances without Considering Intraclass Variability
To validate the importance of intraclass variability on winter wheat mapping accuracy, we conducted a comparative experiment to map winter wheat for both study areas using the same training samples.The same approach described above was used (Section 3.5) but without considering intraclass variability.
In the comparative experiment, we did not divide subclasses.Based on Equation ( 6), we calculated the standard vector using all training samples for both study areas, respectively.Equation ( 7) in Section 3.5.2was used for generating two parameters (cosθ and D).Equation ( 8) was reduced to the following: where thresholds a and d were determined according to all training samples.

The Traditional Classification Methods without Considering Intraclass Variability
To further compare the results of our approach to existing methods, we generated winter wheat maps using three traditional classification methods including the maximum likelihood (MLC) [23], support vector machine (SVM) [25] and artificial neural network (ANN) [24].Based on the training samples shown in Table 4, we conducted winter wheat mapping using these three traditional classification methods for two study areas, respectively.The accuracies were evaluated using the same validation samples.

Separability Comparisons Based on the Jeffries-Matusita (JM) Distance
The separabilities between each winter wheat subclass and other land cover types in the time-series EVI data were investigated using the JM distance in Kansas and the NCP (Table 6).The JM distances among four subclasses are higher than 1.92 in both study areas, indicating that four subclasses are separable under the determined segmentation in this study.In addition, all four subclasses of winter wheat had JM distances over 1.98 when compared with each of the other land cover types.

Sensitivity Study for Testing Thresholds of Parameters
Accuracies based on CDL maps in Kansas and statistical data in the NCP (in Section 3.6) were used to evaluate the effects of the thresholds of parameters on mapping accuracies in two study areas (Table 7).The thresholds with the best performance (Test 6) for each study area were selected as the optimum parameters.Thus, the winter wheat mapping distribution was generated based on the optimum parameters in both study areas.

Winter Wheat Distribution Mapping for Kansas and the NCP
We used the optimal parameter values (Table 5) from the sensitivity study (Test 6) to map winter wheat in Kansas and the NCP (Figure 6).For Kansas, the spatial distribution of winter wheat (Figure 6a) is consistent with that derived from the CDL maps developed based on the Landsat images (Figure 1a).Winter wheat fields mainly concentrate in the middle and west area and distribute sporadically in the southeast of Kansas.There are only minor areas of winter wheat cultivation in the Northeast.For the NCP, winter wheat is the predominant crop type in the summer growing season, accounting for more than 90% of the farmland in this area [65].The primary region of winter wheat cultivation is in south-central Hebei province, almost the entire Henan province, the west and southwest of Shandong province, and the north of Anhui and Jiangsu provinces (Figure 6b).sporadically in the southeast of Kansas.There are only minor areas of winter wheat cultivation in the Northeast.For the NCP, winter wheat is the predominant crop type in the summer growing season, accounting for more than 90% of the farmland in this area [65].The primary region of winter wheat cultivation is in south-central Hebei province, almost the entire Henan province, the west and southwest of Shandong province, and the north of Anhui and Jiangsu provinces (Figure 6b). (a)

Evaluation of Winter Wheat Maps at the State/Regional Level
For evaluating the winter wheat distribution maps, we calculated percentage errors and accuracies for Kansas and the NCP (Table 8).The evaluation results showed favorable performances in the two areas with percentage errors lower than 10%, i.e., the accuracies were both higher than 90% as compared with the statistical data at the state/regional scale.Specifically, the winter wheat mapping accuracy in Kansas (95.1%) was slightly higher (2.2%) than that in the NCP (92.9%).

Evaluation of Winter Wheat Maps at the State/Regional Level
For evaluating the winter wheat distribution maps, we calculated percentage errors and accuracies for Kansas and the NCP (Table 8).The evaluation results showed favorable performances in the two areas with percentage errors lower than 10%, i.e., the accuracies were both higher than 90% as compared with the statistical data at the state/regional scale.Specifically, the winter wheat mapping accuracy in Kansas (95.1%) was slightly higher (2.2%) than that in the NCP (92.9%).Moreover, the accuracy at the state level in this study was 99.2% compared with the Landsat-derived CDL maps for Kansas [66,67].The estimated winter wheat areas were slightly larger than those from statistical data partially because the farmland's fragmentation resulted in some discrete pixels of other land types being assigned to winter wheat, which cannot be detected at a 250 m spatial resolution [18].

Evaluation of Winter Wheat Maps at the County Level
We compared our winter wheat distribution maps in 2017 for Kansas and 2012 for the NCP against the ground reference data at the county level (75 counties for Kansas and 33 counties for the NCP).The results showed that our estimated winter wheat areas are in good agreement with that of the county-level for Kansas (Figure 7a), with a coefficient of determination (R 2 ) as high as 0.97.For the NCP, our results showed an R 2 of 0.71 as compared with census based on 33 counties.The RMSE value in Kansas was about one-fourth of that in the NCP, illustrating that the winter wheat mapping approach performed better in Kansas than in the NCP.
We compared our winter wheat distribution maps in 2017 for Kansas and 2012 for the NCP against the ground reference data at the county le

Evaluation of Winter Wheat Maps at the Site Level
We further evaluated our winter wheat maps at the site level based on randomly selected sample units derived from the CDL for Kansas and agrometeorological stations for the NCP [62,68].The evaluation samples were selected across the entire study area to ensure they reflected diverse environmental conditions and management practices [18].These samples were independent of the training data [69].
Confusion matrix suggested that the overall accuracies are 90.3% and 85.0% for Kansas and the NCP, respectively.The Kappa coefficient was 0.81 for Kansas, 0.11 higher than that for the NCP.The producer's and user's accuracy in Kansas were 87% and 93.2%, respectively, slightly higher than those for the NCP (Table 9).

Correlation between Landscape Metrics and Winter Wheat Mapping Accuracy
For estimating the effects of landscape fragmentation on the mapping accuracy, we calculated two class-scale fragmentation metrics (FRG and PLAND) for 75 counties based on the CDL map in Kansas.We divided the calculated landscape metrics into five categories according to the fragmentation levels and examined the correlations between the metrics and the percentage errors of our winter wheat maps at the county level (Table 10).Our analysis showed a strong positive correlation (r = 0.99) between the winter wheat FRG and the percentage errors, indicating a better performance for our approach in areas with lower fragmentation.Similarly, higher PLAND values correspond to higher mapping accuracies.When the PLAND value was less than 1%, the average percentage error reaches 59.4%; when the PLAND values were larger than 20%, the average percentage errors were lower than 10%.−0.79 1 The percentage errors were calculated between CDL maps and our results at the county level using Formula (9); * P < 0.05.

Comparisons with Other Methods without Considering Intraclass Variability
Table 11 shows the overall accuracies and Kappa coefficients of winter wheat mapping using methods with and without considering intraclass variability in the two study areas.The overall accuracies were calculated based on the same validation samples using the confusion matrix in each study area.Similar performance was detected by SVM and ANN methods in both study areas.The MLC method showed the lowest overall accuracy (73% in Kansas).In the NCP, overall accuracy is lower than 70% using the integrated angles and distances of the EVI time series without considering intraclass variability.The improved approach presented in this study performed well, with overall accuracies over 85% in both study areas.As a result, the overall accuracies in Kansas and the NCP have been improved by 17% and 15% with Kappa coefficients increased by 0.35 and 0.38, respectively, using the improved approach.The comparisons among methods with and without considering the intraclass variability reveal that our approach can effectively improve winter wheat mapping accuracy.

Winter Wheat Mapping Approach Considering Intraclass Variability
A range of crop type mapping approaches has been developed in previous studies based on the similarities of the same class and dissimilarities of different categories [12,14,70,71].However, few of them have included intraclass variability caused by the differences in environmental status and management practices [22].Characterizing the intraclass variability (e.g., crop phenology, growth conditions, field management) over large areas remains a big challenge [8,14,17].The improved quality of remote sensing images (spectral, spatial, and temporal resolutions) led to an increase not only in the interclass variability but also in the intraclass variability [31], which might compromise the mapping accuracy when the intraclass variability is more apparent than interclass differences.Therefore, considering intraclass variability is necessary for individual crop type identification.
Although high-resolution remote sensing imagery provides the possibility to identify the differences within a class [12], it is still difficult to characterize such factor-driven differences using the traditional crop type mapping methods [31,71].In our study, the intraclass variability was considered for winter wheat areas mapping based on MODIS EVI time series data.We found wide differences among all MODIS EVI time series curves of winter wheat samples at both study areas (Figure 4).Two peaks of the EVI time series curve were used to express the intraclass discrepancy of training samples for introducing intraclass variability into the mapping approach.We divided all training samples into four subclasses based on the differences of EVI values at the two peaks, and then introduced two parameters (the angle and distance of multidimensional vector) to describe the characteristics of subclasses.JM distances were evaluated between each subclass of winter wheat and other land cover types, which showed that all subclasses were highly distinguishable from others.The evaluation of results considering intraclass variability suggested that the two parameters were important for characterizing the intraclass variability.Our mapping accuracies of winter wheat were both over 92% (Table 10) compared with statistical data at the regional/state level.
We also compared our results with methods without considering the intraclass variability of winter wheat.Considering intraclass variability improved the winter wheat mapping accuracies by 17% in Kansas and 15% in the NCP, respectively.Our results demonstrated the importance of introducing subclass training samples into crop mapping.This study also showed that the MODIS 250 m EVI data could serve as a powerful tool to assess the intraclass variability for crop type identification [14].

Factors Influencing the Accuracy of Winter Wheat Maps
The differences in mapping accuracies at two study areas might be attributed to the specific conditions of these two study areas.Compared with Kansas, winter wheat fields in the NCP are difficult to identify partially due to the relatively small field sizes [20,72], irregular patterns of crop system [65], and relatively fewer training samples.Same as most of the cropland areas in China, croplands in the NCP are split into many small parcels allocated to households, and farmers have their freedom on planting under private land use rights [73].Our landscape metrics analysis demonstrated the impacts of cropland fragmentation on the mapping accuracies [20].Higher land fragmentation level generally corresponds to lower winter wheat mapping accuracies, which may partly explain the relatively higher mapping accuracies in Kansas.Moreover, due to the specific land allocation policy in China, croplands in the same region might be dominated by different cropping patterns or growing times, which shows a high degree of fragmentation on remote sensing images and makes it more complicated for crop mapping.This also suggested that the MODIS EVI 250 m data has a significant advantage for crop mapping in intensively managed landscapes with a low degree of fragmentation [70].In the U.S. Central Great Plains, the field sizes are generally larger than 30 ha, which spatially corresponds to five or more 250 m pixels [18].In addition, the lack of training samples in the NCP might be another factor influencing the mapping accuracy.The CDL map at 30 m spatial resolution provides a useful dataset for generating enough effective training samples for crop mapping.In China, the existing crop mapping products are very limited for choosing training samples.In this study, training samples collected from the agrometeorological stations have coarser spatial resolution than those extracted from the CDL map for Kansas.The mapping accuracy in the NCP could be improved as more ground information or more high-resolution crop distribution maps are available.

Comparison with Other Studies
Currently, intraclass variability has not been well considered in crop mapping.Few efforts have been made to use areas segmentation to analyze and reduce intraclass differences.For example, Wardlow et al. calculated and compared average VI profiles for each crop type at the ASD-USDA NASS Agricultural Statistics District level to assess their intraclass regional variations based on MODIS data [14].They detected that the regional intraclass variations reflect the state's climate and planting date gradient across Kansas.In another study [18], considering the notable regional differences in crop area and locations, Wardlow et al. presented a hierarchical crop mapping approach using a decision tree classifier and identified individual crop types including winter wheat based on the crop/noncrop maps with accuracies ranging from 84% to 94%.Different from Wardlow et al. [18], our approach is capable of detecting winter wheat from MODIS products without the need for a cropland-based map or a large number of field sites as inputs, and therefore has potential to be applied over large areas where crop maps and site-level observations are scarce.Using MODIS 250 m data, Massey et al. divided the U.S. cropland areas into 299 subzones for reducing the intraclass variability and classified wheat-barley with 84.2% and 74.5% in producer's accuracy and user's accuracy, respectively [33].In this study, we divided the training samples into four subclasses and obtained higher producer's accuracy (87.0%) and user's accuracy (93.2%).
For the NCP, a large number of studies have focused on winter wheat mapping.For example, Pan et al. constructed a Crop Proportion Phenology Index (CPPI) to estimate the winter wheat area using MODIS EVI time series and achieved accuracies with R 2 varied from 0.5 to 0.9 and RMSE ranging from 5% to 20% compared with the Landsat-based maps [20].Their study highlighted the importance of the representativeness, quality, and quantity of samples for the mapping accuracies at a large scale because the agricultural areas may have diverse planting structures, climate conditions, and topographies.Notably, this study also highlighted that dividing the large study area into different subregions based on various factors and selecting different samples in each subregion would be helpful and need to be tested in future work [20].Tao et al. also performed winter wheat mapping on the NCP and showed the accuracies in seven provinces to range from 87 to 96% compared with statistical data [72].In this study, our mapping accuracy for the NCP region was 92.9%, and the R 2 was 0.71 compared with the county-level census, falling within the range reported in previous studies.Our accuracies were calculated based on the entire NCP region (vs.seven provinces in Tao et al. [60]) or county-level census (33 counties vs. samples from the Landsat-based maps in Pan et al. [20]).Moreover, our study introduced the subclasses as training samples for winter wheat mapping, combined with two parameters, i.e., the angles and distances based on EVI time series data, which represents an advanced approach compared to previous studies.The introduction of the angles and distances allowed us to capture the unique phenological features of winter wheat and achieved a better mapping performance.

Uncertainty Analysis and Future Needs
In this study, we demonstrated the advantages of introducing intraclass variability to winter wheat mapping.However, some uncertainties should be addressed in the future.First, the parameter thresholds used for winter wheat mapping are different for Kansas and the NCP region, and are related to crop field management, the degree of cropland fragment, crop condition, climate, etc.These factors should be quantified in the future to examine how they affect the setting of thresholds under different conditions.Second, we envision that our approach has the potential to be well suited for relatively wide agroclimatic regions.Further efforts are needed to evaluate the feasibility in other winter wheat areas where the environmental conditions may be substantially different from those in our two study areas.Third, more work is needed for extending this approach to other crop types such as rice.In addition, although the 250 m MODIS data used can be applied to agricultural areas where winter wheat is the dominant crop, finer resolution satellite data such as Sentinel-2 MSI are needed to provide imagery with a high spatial resolution for crop mapping over regions where highly fragmented crop fields occur.As more field observations and high-resolution images become available, the performance of this approach could be further enhanced in future research.

Conclusions
In this study, we presented an improved approach for winter wheat mapping based on the MODIS EVI time series, in which intraclass variability effects were considered.To the best of our knowledge, this study offered the first attempt to introduce the subclass training samples for interpreting the intraclass variability based on the angles and distances of EVI time series in winter wheat area mapping.Comparison with other methods showed that our approach considering intraclass variability improved the winter wheat mapping accuracies by 17% in Kansas and 15% in the NCP, respectively.The mapping results compared well with those from multiscale ground information at the two areas of interest.Further analysis demonstrated that the mapping accuracy could be significantly affected by landscape fragmentation.This study provides a new perspective for decreasing the influences of intraclass variability in cropland mapping.Our winter wheat identification algorithm can be applied to other agricultural areas where ground information is available for parametrization and validation.In addition, the derived winter wheat maps can be used to drive crop or ecosystem models for investigating regional variations in crop productivity and associated environmental consequences.

Figure 1 .
Figure 1.Maps of study areas.(a) Major crop distribution across Kansas (Image based on data from the 2017 National Agricultural Statistics Service Cropland Data Layer).(b) North China Plain (NCP).

Figure 2 .
Figure 2. Overall process diagram of this study.

Figure 3 .Figure 3 .
Figure 3. Enhanced vegetation index (EVI) time series before and after S-G filter smoothing.

Figure 4 .
Figure 4. EVI curves of training samples.(a) Kansas.(b) NCP. and  are the first and second peaks of EVI time series, respectively.

Figure 4 .
Figure 4. EVI curves of training samples.(a) Kansas.(b) NCP.EVI i and EVI j are the first and second peaks of EVI time series, respectively.

Figure 5 .
Figure 5. Four standard vectors in Kansas (a) and the NCP plain (b).

4 Figure 5 .
Figure 5. Four standard vectors in Kansas (a) and the NCP plain (b).

Figure 6 .
Figure 6.Winter wheat mapping results for (a) Kansas in 2017 and (b) the NCP in 2012.

Figure 6 .
Figure 6.Winter wheat mapping results for (a) Kansas in 2017 and (b) the NCP in 2012.

Figure 7 .
Figure 7. Winter wheat mapping verification at the county level (each point represents a county).(a) Kansas.(b) the NCP.The solid line is the 1:1 line.

wheat mapping in Kansas/the NCP region Accuracy assessment in Kansas/the NCP region
Figure 2. Overall process diagram of this study.

Table 2 .
The numbers of site-level training and verification samples for the two study areas.

Table 3 .
Segmentation points and training samples of four subclasses in the two study areas1.EVI i and EVI j are the first and second peak of the EVI time series, respectively.3.4.2.Calculating the Separability of Subclasses Using Jeffries-Matusita (JM) Distance

Table 4 .
Training sets for evaluating the Jeffries-Matusita (JM) distance among different types for two study areas.

Table 5 .
The settings of thresholds of parameters for sensitivity study in two study areas1 .
1 EVI values were multiplied by 10 4 to save calculation time.Parameters a 1 ∼ a 4 and d 1 ∼ d 4 represent the cosine of the angle and distance between EVI vectors of training samples and four standard vectors, respectively.

Table 6 .
Jeffries-Matusita (JM) distances for all pairs in Kansas and the NCP.

Table 7 .
The accuracies of the sensitivity study.

Table 8 .
Winter wheat distribution mapping evaluation at the state/regional level.

Table 9 .
Confusion matrix for Kansas and the NCP 1 .

Table 10 .
Correlations between landscape metrics and percentage errors at the county level 1 .

Table 11 .
Comparisons of overall accuracies between different methods using confusion matrix.Approach (without intraclass) does not consider the intraclass variability.