Evaluation of the 2010 MODIS Collection 5.1 Land Cover Type Product over China

Although the MODIS Collection 5.1 Land Cover Type (MODIS v5.1 LCT) product is one of the most recent global land cover datasets and has the shortest updating cycle, evaluations of this collection have not been reported. Given the importance of evaluating global land cover data for producers and potential users, the IGBP (International Geosphere-Biosphere Programme) layer of the 2010 MODIS v5.1 LCT product was evaluated against two grid maps at resolutions of 100 m and 500 m, which were derived by rasterizing the 2010 data from the national land use/cover database of China (NLUD-C). The comparison used a new legend of nine classes constructed from the class definitions in the IGBP and NLUD-C legends. The overall accuracies of the aggregated classification data were 64.62% and 66.42% at the sub-pixel and pixel scales, respectively. These accuracies differed significantly between regions. Specifically, high-quality data were obtained more easily for regions with a single land cover type, such as Xinjiang Province and the northeast plain of China. The lowest accuracies were obtained for the middle of China, including Ningxia, Shaanxi, Chongqing, Yunnan and Guizhou. At the sub-pixel scale, relatively high producer and user accuracies were obtained for cropland, grass and barren regions; the highest producer accuracy was obtained for forests, and the highest user accuracy was obtained for water bodies. Shrublands and wetlands were associated with low producer and user accuracies of less than 10% at both the sub-pixel and pixel scales. Based on dominant-type reference data, the errors were classified as mixed-pixel errors and labeling errors. Labeling errors primarily originated from misclassification between grassland and barren lands. Mixed-pixel errors increased as the pixel diversity increased and as the percentage of dominant-type sub-pixels decreased. 
Overall, mixed pixels were sources of error for most land cover types other than grassland and barren lands; whereas labeling errors were more prevalent than mixed pixel errors when considering all of the land cover data over China, due to the large amount of misclassification between the pure pixels of grassland and barren lands. Next, the accuracy of cropland/natural vegetation mosaics was assessed based on the qualitative (a mosaic of croplands, forests, shrublands, and grasslands) and quantitative (no single component composes more than 60% of the landscape) parts in the definition, which resulted in accuracies of 91.43% and less than 19.26%, respectively. These results are summarized with their implications for the development of the next generation of MCD12Q1 data and with suggestions for potential users of MCD12Q1 v5.1.


Introduction
Land cover research is important because land is the material base for human activities, particularly in countries with large populations, such as China and India. Vegetation growing on land provides humans with food, fuel and fiber [1]. Buildings for human habitation are primarily constructed on the land surface. Water resources that flow on the land surface are essential for all forms of life. However, humans alter the land surface by converting natural vegetation to agriculture, urban development, inundated lands, reservoirs and tree plantations [2]. The anthropogenic modification of land cover is one of the most important sources of global land cover change, particularly due to current rapid population growth and economic development [3]. Alterations in global land cover also affect the Earth's climate, biogeochemistry and biodiversity through terrestrial surface processes, such as energy exchange and the water, carbon and biogeochemical cycles. Climate, in turn, influences the distribution of land cover classes; thus, land cover will respond to the changing climate. To thoroughly analyze the interactions between climate and land cover using regional- to global-scale Earth system models, accurate global land cover information is required [4,5]. Land cover data are also useful for planning and practicing land resource management and for weather forecasting [6]. Accurate and updated land cover data provide policy makers and the scientific community with important information regarding the state of land cover.
Remote sensing has now become a basic source for mapping global land cover data since the first global land cover map was compiled and produced from remote sensing data [7]. In the past two decades, many remote sensing-based global land cover datasets have been produced for different national or international initiatives. These datasets can be divided into three categories based on the resolution of the remote sensing data used: (1) coarse resolutions equal to or greater than 1 km [8][9][10][11][12][13]; (2) moderate resolutions between 100 m and 1 km [14][15][16]; and (3) datasets based on Landsat satellite 30-m data [17]. The spatial resolution of the remote sensing data used in global cover mapping gradually increases with time. Most coarse spatial resolution global land cover datasets are one-time datasets, and newer datasets are mainly derived from the MODIS 1-km monthly product from 2003 [13]. However, the global land cover products that are derived from MODIS 500-m data and MERIS 300-m data are updated annually [18] and at four-year intervals [16,19], respectively. Datasets based on Landsat satellite data are considered as next-generation global land cover datasets because of their finer spatial resolution and can provide sufficient spatial and thematic details for global change studies [20]. Although the first global land cover datasets derived from Landsat satellite data have been obtained, several challenges must be overcome before producing a high-quality product, such as the unavailability of consistent satellite data with global coverage. From the perspective of the land cover user community, land cover data should be current (no more than 10 years old) and periodically updated and improved [21]. According to these criteria, the two global land cover products derived from moderate spatial resolution satellite data are more suitable for real-time applications for the user community of global land cover maps than datasets from the other two categories.
Evaluation is important for obtaining accurate and credible applications of global land cover products [22,23] and is a continuous process that must be performed in parallel with the derivation of new global land cover datasets. Global land cover products have usually been evaluated in three ways: data were evaluated by producers, who selected sample sites all over the world [24]; different datasets were compared without any reference data [25]; and regional subsets of global land cover data were evaluated by regional scientists based on regional land cover data [26]. This paper adopted the third method. Existing research has primarily focused on evaluating coarse resolution global land cover datasets. Those studies have provided useful information for the user community and for producers [27][28][29][30], including accuracy assessments, spatial agreements and disagreements between different datasets, and error source analyses, and they are therefore essential for users and producers [31][32][33][34][35]. However, research evaluating the two moderate resolution global land cover products is relatively scarce.
The MODIS v5.1 LCT product is one of the most recently available global land cover products and has the shortest updating cycle. A comprehensive accuracy assessment of the MODIS v5.1 LCT product is required to highlight regional differences in its overall accuracy and in its thematic accuracies. This assessment is also important to allow producers and potential users to understand the strengths and weaknesses of the product. For example, differences in data quality between regions can be remarkable. Furthermore, while the total areas of the land cover types in the classification and reference data may be similar, the spatial distribution of one land cover type in the two datasets can be vastly different. This difference varies between land cover types, leading to differences in the class-specific accuracies. For the producers of MODIS LCT products, it is useful to obtain information regarding the spatial agreement and disagreement between the MODIS land cover data and the reference data. The areas of spatial disagreement may require additional training data to generate the next collection of the MODIS LCT product. Users, in turn, could directly utilize the data from areas of agreement and for classes with high thematic accuracies, but might have to replace these data with other available and more accurate datasets in areas of disagreement or for classes with low class-specific accuracies.
The goal of this paper is to highlight the general patterns of disagreement between the MODIS v5.1 LCT product IGBP (International Geosphere-Biosphere Programme) layer over China and the national land use/cover database of China (NLUD-C) and to analyze how and where mixed pixels influence mapping accuracy in support of future efforts to provide improved land cover mapping of China in the next collection of the MODIS LCT product. In this study, we used the evaluation results from the MODIS land cover data to explore specific challenges in global land cover mapping with moderate spatial resolution remote sensing images that are primarily based on class accuracy statistics and the qualitative analysis of the error distribution over China.

Classification Data
The MODIS Land Cover Type product was derived for scientific applications that require land cover information at regional to global scales [11]. The MODIS v5.1 LCT product (MCD12Q1 v5.1) provides global land cover maps at a spatial resolution of 500 m using five classification systems and with annual time steps from 2001 to 2011 [15]. In this research, we used the IGBP layer of the MODIS land cover data, which contains 17 land cover classes.
The MODIS LCT data were downloaded from NASA [36]. We acquired the data in the Hierarchical Data Format (HDF) on the MODIS sinusoidal grid and projected them to the Albers projection system (Table 1) that was adopted in the NLUD-C. The data were downloaded as 40 tiles and mosaicked together using the MODIS Re-projection Tool. Then, the mosaic was clipped according to a polygon boundary of China to include Mainland China, Hainan and Taiwan.
The validation of land cover data requires similar classification data derived from independent sources. These independent products must be considerably more accurate than the products that are being evaluated. The reference data used in this research were from 2010 and were obtained from the NLUD-C. The NLUD-C was constructed to provide accurate statistics for 25 Level 2 land cover classes using the NLUD-C nomenclature at a scale of 1:100,000. The database was produced by experts who visually interpreted Landsat MSS/TM/ETM+ images and manually delineated the boundaries of the objects in a GIS environment (similar to the CORINE Land Cover Project) [37]. All experts were from institutes of the Chinese Academy of Sciences (CAS) in different provinces, were familiar with the local land cover conditions and had extensive experience in visual interpretation. The NLUD-C provides 6 land cover datasets from the 1980s to 2010. The NLUD-C is updated by manually observing the land cover changes [38]. According to the sampled field investigations, both the overall and the Level 1 class accuracies of the land cover changes exceed 90% [39]. The classification errors in the extraction of land cover changes were corrected after the field investigation. More information regarding the NLUD-C is provided in [38]. 
Two grid maps at spatial resolutions of 100 m and 500 m were obtained by rasterizing the 2010 data from NLUD-C and were used as reference data for evaluating the IGBP layer of the 2010 MODIS v5.1 LCT product over China. The 100-m reference data were derived because the smallest polygon size in NLUD-C was approximately 4 × 4 pixels. During this process, the value of the grid was decided by using the maximum area principle, in which the class type of the pixel possesses the highest area percentage according to the sub-pixels within the extent of the pixel. The data were evaluated by comparing the classification and reference data. This comparison can only be accomplished when the two types of data use an identical legend. However, the legend for the classification data included the IGBP classification system, which was composed of 17 categories. In contrast, the legend for the reference data consisted of 25 categories. Thus, a new legend comprised of 10 categories was constructed, as shown in Table 2. The new legend reduces the thematic details of the two original legends. The greatest change that occurred when converting between the IGBP classification system and the new legend occurred in the vegetation classes. Seven natural vegetation classes in the IGBP system were aggregated into one class, namely the forest class. Among these classes, woody savannas and savannas are unique and cannot be found in the NLUD-C classification system. Based on the definition of forests (natural or artificial forests with forest canopy cover greater than 30%) and sparse woodlands (lands with forest canopy cover of 10%-30%) in the NLUD-C nomenclature and the definition of woody savannas and savannas in the IGBP legend, aggregating woody savannas and savannas into forests was optimal. Croplands/natural vegetation mosaics are common in low-resolution land cover maps. 
A category corresponding to this mixed category cannot be found in the NLUD-C classification system; thus, the mosaic type was evaluated separately. The new legend was specifically established for comparison; thus, classification based on this new legend is not suggested. The aggregated classification and reference data are presented in Figure 1.
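The maximum area principle used above to derive the 500-m reference grid from the 100-m grid can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function name `dominant_type`, the use of class code 0 as no data, and the tie-breaking rule (the lowest class code wins) are assumptions made for the example.

```python
import numpy as np

def dominant_type(ref_100m, block=5, nodata=0):
    """Aggregate a fine class grid to coarse pixels by the maximum area principle.

    Each coarse pixel covers block x block fine pixels and receives the class
    that occupies the most fine pixels. Class code `nodata` is ignored
    (hypothetical convention); ties go to the lowest class code.
    """
    rows, cols = ref_100m.shape
    assert rows % block == 0 and cols % block == 0
    # Rearrange so that the sub-pixels of each coarse pixel lie on the last axis.
    blocks = (ref_100m.reshape(rows // block, block, cols // block, block)
                      .swapaxes(1, 2)
                      .reshape(rows // block, cols // block, block * block))
    n_classes = int(ref_100m.max()) + 1
    # counts[i, j, k] = number of sub-pixels of class k inside coarse pixel (i, j)
    counts = np.stack([(blocks == k).sum(axis=-1) for k in range(n_classes)],
                      axis=-1)
    counts[..., nodata] = 0                      # do not let no-data win
    dominant = counts.argmax(axis=-1)
    dominant[counts.sum(axis=-1) == 0] = nodata  # coarse pixel without any data
    return dominant

# Toy example: one coarse pixel with 13 sub-pixels of class 1 and 12 of class 2,
# next to a coarse pixel that is entirely class 3.
left = np.array([1] * 13 + [2] * 12).reshape(5, 5)
right = np.full((5, 5), 3)
print(dominant_type(np.hstack([left, right])))
```

With a 5 × 5 block, the left pixel is labeled class 1 even though class 2 covers almost half of it, which is exactly the loss of sub-pixel detail that the sub-pixel comparison later in the paper is designed to expose.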

Dominant-Type and Sub-Pixel Confusion Matrices
A confusion matrix can provide a site-specific assessment of the classification data that corresponds to the ground conditions [40] and is the most commonly used validation method. However, forming a reliable confusion matrix is difficult, because this matrix is related to several factors, such as sample design, reference data accuracy and the registration of the datasets [41]. Sample design is not required in this research, because reference data cover the entire country. As previously indicated, the accuracy of the reference data is sufficiently high for evaluating the MODIS land cover data.
A confusion matrix is established from pixels, the size of which is generally identical to the spatial resolution of the classification map. In this research, however, the spatial resolution of the classification data is 500 m, while that of the reference data is 100 m. Under these conditions, two types of confusion matrices were established, namely a dominant-type confusion matrix and a sub-pixel confusion matrix. The pixel size in the dominant-type confusion matrix is 500 m, while that in the sub-pixel confusion matrix is 100 m. The processing flows of the dominant-type confusion matrix and the sub-pixel confusion matrix are presented in Figure 2. Thus, two sets of overall, producer and user accuracies were derived from the two matrices. The overall accuracy at the sub-pixel level cannot reach 100% as long as mixed pixels exist, because the resolution of the classification data (500 m) is lower than that of the 100-m reference data. The reference-dominant land cover type accuracy (RDA) is the maximum that the overall accuracy at the sub-pixel level can achieve, namely the accuracy obtained by comparing the 500-m dominant land cover type reference data with the 100-m reference data.
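The relationship between the sub-pixel accuracy, the dominant-type accuracy and the RDA can be illustrated with a small sketch. It assumes that classes are coded as consecutive integers starting at 0 and that each 500-m pixel covers exactly 5 × 5 reference pixels; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def confusion(classified, reference, n_classes):
    """Confusion matrix: rows are classification classes, columns reference classes."""
    m = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(m, (classified.ravel(), reference.ravel()), 1)
    return m

def overall_accuracy(m):
    """Percentage of pixels on the diagonal of a confusion matrix."""
    return 100.0 * np.trace(m) / m.sum()

def upsample(grid_500, block=5):
    """Repeat every 500-m label over its block x block 100-m sub-pixels."""
    return np.repeat(np.repeat(grid_500, block, axis=0), block, axis=1)

def three_accuracies(cls_500, ref_500, ref_100, n_classes, block=5):
    """Return (sub-pixel accuracy, dominant-type accuracy, RDA) in percent."""
    sub = confusion(upsample(cls_500, block), ref_100, n_classes)
    dom = confusion(cls_500, ref_500, n_classes)
    rda = confusion(upsample(ref_500, block), ref_100, n_classes)
    return overall_accuracy(sub), overall_accuracy(dom), overall_accuracy(rda)

def block_of(n0, n1):
    """Toy 5 x 5 reference block with n0 sub-pixels of class 0 and n1 of class 1."""
    return np.array([0] * n0 + [1] * n1).reshape(5, 5)

# Three 500-m pixels: mixed (dominant 0), pure class 1, mixed (dominant 0).
ref_100 = np.hstack([block_of(20, 5), block_of(0, 25), block_of(15, 10)])
ref_500 = np.array([[0, 1, 0]])   # dominant-type reference
cls_500 = np.array([[0, 1, 1]])   # third pixel is mislabeled
sub_acc, dom_acc, rda = three_accuracies(cls_500, ref_500, ref_100, n_classes=2)
print(round(sub_acc, 2), round(dom_acc, 2), round(rda, 2))
```

In this toy case the RDA is below 100% purely because two of the three pixels are mixed, and the sub-pixel accuracy stays below the RDA, which is the upper bound described above.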

Mixed Pixel and Pure Pixel
Mixed pixels are a primary source of error when classifying low-resolution remote sensing images, and the number of mixed pixels generally increases with decreasing spatial resolution. In theory, whether a pixel is mixed is determined from the field information that corresponds to the pixel. In practice, obtaining field information for all of the classification data is impossible; thus, mixed pixels in this research were determined from the 100-m reference pixels within the extent of each pixel in the classification data. A pixel in the classification data was considered a mixed pixel when the corresponding pixels in the 100-m reference data belonged to more than one land cover type. Mixed pixels were extracted by calculating the local diversity of each pixel in the classification map according to the reference data. The local diversity of a pixel was measured by the number of land cover classes that occur in the corresponding reference sub-pixels within the extent of the classification pixel. In addition, a mixed pixel was defined based on the number of dominant-type sub-pixels within a classification pixel. If the number of land cover types that occur in the reference data is greater than 1 or the number of dominant-type sub-pixels is less than 25, the pixel is labeled as a mixed pixel. In theory, the number of land cover classes in the 100-m reference data must be greater than 1 if fewer than 25 dominant-type sub-pixels occur. However, an exception occurs for pixels located on the boundary of China, for which fewer than 25 of the corresponding reference pixels may have a value. For these exceptions, a pixel is considered a mixed pixel when the number of dominant-type sub-pixels is less than 25. The opposite of a mixed pixel is a pure pixel: all corresponding reference pixels within the extent of a pure pixel belong to one type of land cover. 
The distributions of the two types of pixels are presented in Figure 3A. According to the dominant-type reference data or the 500-m reference data, correctly-classified pixels were divided into correctly-classified mixed pixels (CMs) and correctly-classified pure pixels (CPs). In addition, misclassified pixels were divided into misclassified mixed pixels (WMs) and misclassified pure pixels (WPs). The four types of pixels are presented in Figure 3B, C. In this paper, the mixed pixel error was computed as (WMs/sum) × 100%, whereas the labeling error was computed as (WPs/sum) × 100%, and the sum is the number of total pixels in the 500-m classification data.
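The mixed/pure rule and the four pixel types defined above can be sketched as follows. This is an illustrative reading of the definitions, not the authors' code: the function name `split_errors`, the convention that negative codes mark reference sub-pixels without data, and the tie-breaking of the dominant type toward the lowest class code are assumptions. Classification pixels with no reference data at all are not handled.

```python
import numpy as np

def split_errors(cls_500, ref_100, block=5):
    """Count CP, CM, WP and WM pixels and derive the two error percentages.

    cls_500: 2-D array of class codes (0, 1, ...) at 500 m.
    ref_100: 2-D array of class codes at 100 m; negative values are treated
             as no data (hypothetical convention for boundary pixels).
    """
    rows, cols = cls_500.shape
    # Sub-pixels of each classification pixel along the last axis.
    sub = (ref_100.reshape(rows, block, cols, block)
                  .swapaxes(1, 2)
                  .reshape(rows, cols, block * block))
    n_classes = int(max(ref_100.max(), cls_500.max())) + 1
    counts = np.stack([(sub == k).sum(axis=-1) for k in range(n_classes)],
                      axis=-1)
    diversity = (counts > 0).sum(axis=-1)    # local diversity
    dominant = counts.argmax(axis=-1)        # dominant-type reference class
    n_dominant = counts.max(axis=-1)         # number of dominant-type sub-pixels
    # Mixed if more than one class occurs or fewer than 25 dominant sub-pixels.
    mixed = (diversity > 1) | (n_dominant < block * block)
    correct = cls_500 == dominant
    n = {
        "CP": int((correct & ~mixed).sum()),
        "CM": int((correct & mixed).sum()),
        "WP": int((~correct & ~mixed).sum()),
        "WM": int((~correct & mixed).sum()),
    }
    total = cls_500.size
    mixed_pixel_error = 100.0 * n["WM"] / total
    labeling_error = 100.0 * n["WP"] / total
    return n, mixed_pixel_error, labeling_error

def block_of(n0, n1):
    """Toy 5 x 5 reference block with n0 sub-pixels of class 0 and n1 of class 1."""
    return np.array([0] * n0 + [1] * n1).reshape(5, 5)

# Four classification pixels: CP, WP, WM and CM, in that order.
ref_100 = np.hstack([block_of(25, 0), block_of(25, 0),
                     block_of(15, 10), block_of(20, 5)])
cls_500 = np.array([[0, 1, 1, 0]])
print(split_errors(cls_500, ref_100))
```

Because a sub-pixel without data matches no class, a boundary pixel with fewer than 25 valid sub-pixels is flagged as mixed even when only one class occurs, which mirrors the boundary exception described above.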

Results and Discussion
This section discusses a sub-pixel level comparison of the classification data and 100-m reference data, including spatial agreement/disagreement and class-specific accuracies. Then, a pixel-by-pixel comparison of the classification data and dominant-type reference data is performed with an analysis of the factors that are related to classification error. These discussions are implemented for regions where crop/natural vegetation mosaics are eliminated. Finally, the mosaic type of croplands and natural vegetation is evaluated independently.

Spatial Agreement and Disagreement Analysis
The spatial agreement between the aggregated classification and reference data can be measured by the overall classification accuracy in the sub-pixel confusion matrix, which is 64.62%. However, overall accuracy is a national average and does not reflect regional differences. A direct comparison of the classification and reference data is presented in Figure 4A. The accuracies in different regions differed considerably. High agreement was observed for regions with a single land cover type, such as the vast barren lands in northwest China, the vast forestlands in northeast China and the croplands of the Chengdu plain. By contrast, high disagreement was observed for regions with varied types, such as the barren grasslands in the northwestern Tibetan plateau, the agro-pastoral zone in the southeastern region of Inner Mongolia and the crop/natural vegetation mixing region in the second steppe of China (an area with an elevation of 2000 m to 4000 m). An accuracy map of the provinces was derived to indicate specific regional differences. The overall accuracies ranged from 36.11% to 76.52% (Figure 4B). The highest accuracy was obtained for Xinjiang Province due to its vast barren lands. The most obvious low-accuracy region comprised five provinces in the middle of China, namely Ningxia, Shaanxi, Chongqing, Guizhou and Yunnan. All five provinces are located in the second steppe of China, a region that features varied land cover types. The low accuracies of the varied-type regions are caused by mixed pixels, and an analysis of mixed pixels is presented in Section 3.3. For classification data derived from pixel-based algorithms, the difference between the RDA and the sub-pixel accuracy is a better measure of the classification results than the sub-pixel accuracy alone, because the RDA represents the maximum accuracy that can be achieved at the sub-pixel level. The RDA was 88.78%, which was 24.16 percentage points greater than the sub-pixel accuracy. 
The conditions of the provinces are depicted in Figure 5, where the sub-pixel accuracies gradually decreased from left to right. The RDA values of nearly all 32 provinces were approximately 80%, with an obvious decrease in the sub-pixel accuracies, which implied that considerable differences occurred between sub-pixel accuracies in the different provinces due to the pixel-based algorithm, and not the RDA, which represents the physical conditions of the landscape.

Spatial and Class Distributions of the Error in the Sub-Pixel Confusion Matrix
The user accuracies of the different classes are of significant value, because they enable users and scientists from different fields to determine whether the data for each class are of high quality. For each aggregated class in the classification map, Table 3 shows the percentages of pixels that belonged to each of the nine classes in the 100-m reference data. The water body and shrubland land cover types had the best and the worst qualities, respectively. Shrublands, most of which were actually grass, had a user accuracy at the sub-pixel level of 3.74%, which corresponds to the largest commission error. In addition, wetlands had low user accuracy due to the misclassification of croplands, forests and water bodies as wetlands. Misclassification between water bodies and wetlands is serious and may result from the time series of images used in the MODIS LCT IGBP product, whereas the NLUD-C did not use multi-temporal images. Paddy fields, which belong to croplands in the aggregated classes, may be a source of the misclassification between croplands and wetlands. Forests may be classified as wetlands, because some wetlands may have vegetation growing on them. The data for classes such as shrublands and wetlands should be used with caution, because most of the pixels in these classes were misclassified. Barren land accounted for a high percentage of the pixels classified as snow and ice, due to the similar spectral features of the two classes. Table 3 presents the commission errors for each class in the classification data, and the spatial distributions of the commission errors for six classes are depicted in Figure 6. The classes misclassified as croplands were primarily urban and built-up, forestlands and grasslands (Table 3) that were dispersed in cropland regions (Figure 6A). Among them, the misclassified urban and built-up pixels in the North China Plain were mainly rural settlements that were dispersed and small in area. 
The grasslands that were misclassified as shrublands occupied 60.98% of the shrublands and were likely responsible for the low user accuracy of shrublands. This result indicated that the spectral features of the grasslands and shrublands could easily be confused. Classes that were misclassified as forests primarily consisted of croplands, shrublands and grasslands distributed in southwestern China (Figure 6C). These types of land were relatively dispersed and intermixed. Other regions, such as the northeastern plains of China and Taiwan, were mostly correctly classified (Figure 6C). Classes misclassified as grasslands primarily consisted of croplands and barren areas. These types of land were located in relatively concentrated areas, with distinct borders between them. Misclassification between croplands and grasslands occurred in the agro-pastoral zone of China (Figure 6D), where croplands and grasslands are distributed alternately, resulting in numerous mixed cropland and grassland pixels. Classes that were misclassified as urban and built-up primarily consisted of croplands, potentially due to mixed pixels. The regions around cities in China were primarily occupied by croplands; thus, mixed pixels of the two categories were common in these regions. Misclassification between built-up land and croplands was intensive in the North China Plain, where rural residential areas were densely distributed (Figure 6E). For barren areas, the misclassification between barren land and grassland is notable due to the broad distribution of the barren regions. The misclassification between grasslands and barren lands is likely the main cause of the considerably low overall accuracy in northwestern Tibet, as shown in Figure 6F.
The high user accuracy of water bodies does not indicate that most of the water bodies in the reference data were correctly classified, because the user accuracy only indicates the probability that a pixel from a land cover map matches the real-world or reference data. In fact, water bodies had high user accuracy and lower producer accuracy (Figure 7), which indicates that many of the water bodies in the reference data were misclassified (e.g., as wetlands). The urban and built-up class behaved similarly. The shrubland and wetland classes both had low user and producer accuracies, which indicated that the two categories were easily confused with the other classes. The classes with higher producer accuracies than user accuracies (i.e., forests and snow and ice) may include a considerable percentage of pixels that belong to other land cover types. Overall, the highest producer accuracy was obtained for forests, and the highest user accuracy was obtained for water bodies, both of which exceeded 80%.
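The distinction between user and producer accuracy discussed above follows directly from the rows and columns of a confusion matrix. The sketch below assumes that rows are classification classes and columns are reference classes; the function name `class_accuracies` and the two-class counts are hypothetical and were chosen only to reproduce the pattern of a high user accuracy combined with a lower producer accuracy.

```python
import numpy as np

def class_accuracies(m):
    """Return (user, producer) accuracies in percent for each class.

    m[i, j] = number of pixels classified as class i whose reference class is j.
    User accuracy     = diagonal / row sum    (100% minus commission error).
    Producer accuracy = diagonal / column sum (100% minus omission error).
    Every class is assumed to occur at least once in both datasets.
    """
    m = np.asarray(m, dtype=float)
    diag = np.diag(m)
    user = 100.0 * diag / m.sum(axis=1)
    producer = 100.0 * diag / m.sum(axis=0)
    return user, producer

# Hypothetical counts for two classes (0 = water, 1 = wetland).
m = [[90, 10],    # classified as water:   90 water, 10 wetland in the reference
     [60, 40]]    # classified as wetland: 60 water, 40 wetland in the reference
user, producer = class_accuracies(m)
print("user:", user, "producer:", producer)
```

In this made-up case, most pixels mapped as water really are water, yet a large share of the reference water is mapped as wetland, so the user accuracy of water exceeds its producer accuracy.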

Analysis of Error According to the Dominant-Type Reference Data
The overall accuracy in a dominant-type confusion matrix is generally greater than that in a sub-pixel confusion matrix [26]. In this research, the overall accuracy at the pixel scale was 66.42%, which was 1.80 percentage points higher than that at the sub-pixel level. The class-specific producer accuracy according to the dominant-type reference data differed greatly between the aggregated classes (Figure 8). At the sub-pixel scale, shrublands and wetlands had the lowest producer accuracies, due in part to the low percentage of pure pixels in the two classes.
The omission error, which is 100% minus the producer accuracy, can be divided into two parts at the pixel scale: the mixed pixel error and the labeling error. Figure 8A presents the labeling error (WPs/sum), the mixed pixel error (WMs/sum) and the producer accuracy (the sum of CPs/sum and CMs/sum) for each land cover type. The greatest mixed pixel error was obtained for shrublands, followed by wetlands and urban and built-up lands. This result can be attributed to the high percentage of mixed pixels in the three land cover types (74.22% for shrublands, 64.32% for wetlands and 69.82% for urban and built-up). In addition, this result reflects the high percentage of WMs in the mixed pixels (Figure 8B). The lowest mixed pixel error was obtained for barren regions, because they had the lowest percentage of mixed pixels (17.31%). In contrast with the mixed pixel error, wetlands had the greatest labeling error, followed by shrublands, primarily due to the high percentage of WPs in the pure pixels (97.72% for wetlands and 98.63% for shrublands) (Figure 8C). The smallest labeling error was obtained for the forests, primarily due to the low percentage of WPs in the pure pixels (Figure 8C). A high percentage of pure pixels does not guarantee a high labeling error. For example, the barren class had the highest percentage of pure pixels (up to 82.69%), whereas the labeling error for the barren class was much lower than those of the wetlands and shrublands. For all types of land cover, except the grasslands and barren lands, the mixed pixel error was greater than the labeling error. Although the difference was small, these exceptions resulted from the high percentage of pure pixels. The classification error in low spatial resolution remote sensing images is generally expected to be caused primarily by mixed pixels; however, the mixed pixel error was lower than the labeling error for the entire aggregated classification, with a percentage of WPs of 20.18% and a percentage of WMs of 13.4%. 
Overall, mixed pixels were sources of error for most of the land cover types, whereas labeling errors were the main classification error for the MODIS land cover data over China, due to the large number of misclassified pure pixels in the grasslands and barren lands.
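As a consistency check on the figures reported above, the labeling error and the mixed pixel error for the entire aggregated classification should add up to the complement of the pixel-scale overall accuracy. Using only the values stated in the text:

```latex
\begin{align*}
\frac{\mathrm{WPs}}{\mathrm{sum}} + \frac{\mathrm{WMs}}{\mathrm{sum}}
  &= 20.18\% + 13.40\% = 33.58\%, \\
100\% - 33.58\% &= 66.42\%,
\end{align*}
```

which matches the overall accuracy of 66.42% obtained at the pixel scale.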
The fact that the labeling error exceeded the mixed pixel error only implies that more WPs than WMs occurred in the 500-m classification data; the proportion of misclassified pixels was nevertheless lower among the pure pixels (WPs/pure) than among the mixed pixels (WMs/mixed) for all land cover types. Figure 9 presents the percentages of correctly-classified pixels among the pure pixels and among the mixed pixels. The lowest percentages were obtained for the shrublands and wetlands. In addition, the blue line was always above the red line, which indicated that mixed pixels were more likely to be classified incorrectly than pure pixels. The differences between the blue and red points for the water bodies and barren regions were greater than those for the other land cover types; that is, the effects of mixed pixels are particularly obvious for water bodies and barren regions relative to the other types of land cover.
Labeling errors can provide important information for improving classification accuracy. Although up to 70 types of labeling errors are possible when comparing the aggregated classification and dominant-type reference data, two types of misclassification were the main sources of labeling errors: the misclassification of grassland as barren land, representing up to 33.82% of the error, and the misclassification of barren land as grassland, representing up to 18.67%. Figure 10 presents the spatial distributions of these two types of misclassification over China. The regions that are encircled by two rectangles in Figure 10A represent the regions where most of the labeling errors occurred. Both of these regions are transition regions between grassland and barren land, which were referred to as barren grasslands. A continuous classification approach may be more suitable for classifying land cover transition zones, because land cover types tend to exhibit continuous, rather than categorical or discrete, variations in transition zones [20]. The percentages of the other misclassifications were all relatively small (less than 10%). 
However, the misclassification between croplands and grasslands should be highlighted, because this misclassification mainly occurred in the agro-pastoral zone of China ( Figure 10B). The algorithm used in MODIS Collection 5.1 consists of a decision tree technique that is highly sensitive to the training data used in the classification estimation stage [11]. For these regions with a high labeling error, more training samples are required to improve the classification accuracy.
The mixed pixel effect is an important source of error in the classification of remote sensing images with low spatial resolution, so the influence of mixed pixels on accuracy deserves particular attention. Figure 11 shows the relationship between the accuracy at the pixel scale and the local diversity of the pixels, as well as the relationship between the accuracy and the number of dominant-type sub-pixels within a classification pixel. The accuracy decreased as the diversity value increased, which implies that more complex classification pixels have a greater probability of misclassification (Figure 11A). Figure 11B shows that the accuracy increases as the number of dominant-type sub-pixels within the classification pixel increases.

Validation of Croplands/Natural Vegetation Mosaics
The croplands/natural vegetation mosaic is a special category, because it generally exists in classification systems for low spatial resolution images. Pixels that are composed of several land cover types may have different spectral and texture features than single-type pixels. Therefore, the mosaic category is treated as an independent class.
The proportion of mosaic pixels in the aggregated classification was 3.82%, most of which were distributed in northeastern and southwestern China (Figure 1). In contrast with the single-type classes, the croplands/natural vegetation mosaic is defined as land with a mosaic of croplands, forests, shrublands and grasslands, in which no one component composes more than 60% of the landscape [42]. This definition can be divided into qualitative and quantitative parts. The qualitative part requires that the sub-pixels belong only to cropland or the three natural vegetation categories. The quantitative part requires that the percentage of any single type not exceed 60%. The percentage of classes other than croplands and natural vegetation in the reference data of the mosaic region is 8.57%, which is very low. Thus, according to the qualitative definition, the accuracy is relatively high (91.43%). However, according to the quantitative definition, the results are less encouraging. In this paper, we applied three indices to assess the quantitative quality of the mosaic class: the pixel diversity, the dominant land cover type and the area percentage of the dominant land cover type. The pixel diversity, which is the number of land cover types that occur in the corresponding reference sub-pixels within a classification pixel, should be 2, 3 or 4 according to the definition. The percentage of pixels that satisfy this condition in the aggregated classification is 67.61%. Next, the dominant land cover type of a mosaic pixel should be one of the four land cover classes, and the percentage of pixels with one of the four classes as the dominant land cover type is 92.92%. Finally, a pixel in the aggregated classification corresponds to 25 pixels in the reference data (each of which is called a sub-pixel).
Thus, the number of dominant-type sub-pixels cannot exceed 15 (60% of 25) under the condition that no single component composes more than 60% of the landscape. Although the percentages of pixels that satisfied the first two conditions were not especially low, the percentage of pixels satisfying the third condition must also be considered. Only 21.14% of pixels in the mosaic region satisfied the condition that the percentage of the dominant land cover type is less than or equal to 60%. The third condition in the definition of the croplands/natural vegetation mosaic is therefore the main factor that limits the accuracy of the mosaic type. This result suggests that a pixel-based algorithm is unlikely to obtain high accuracy for land cover types that are defined quantitatively. The percentage of pixels that simultaneously satisfied the three conditions was 19.26%. However, the percentage of pixels that completely satisfy the definition of the croplands/natural vegetation mosaic should be less than 19.26%, because the full composition of the land cover types within the pixels was not considered; in particular, the non-dominant types may include land cover types other than the four specified types.
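The three conditions can be sketched for a single mosaic pixel as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used in the study: the function name `check_mosaic_pixel`, the string labels and the example pixels are hypothetical, and ties for the dominant type are broken by first occurrence.

```python
from collections import Counter

# The four component types named in the mosaic definition; the string labels
# themselves are illustrative.
MOSAIC_TYPES = frozenset({"cropland", "forest", "shrubland", "grassland"})


def check_mosaic_pixel(sub_pixels, max_dominant_percent=60):
    """Evaluate the three quantitative conditions for one mosaic pixel.

    sub_pixels: the reference labels under one classification pixel
    (25 labels when a 500-m pixel is compared with 100-m reference data).
    """
    counts = Counter(sub_pixels)
    dominant, dominant_count = counts.most_common(1)[0]
    checks = {
        # Condition 1: pixel diversity is 2, 3 or 4.
        "diversity_ok": 2 <= len(counts) <= 4,
        # Condition 2: the dominant type is one of the four mosaic types.
        "dominant_type_ok": dominant in MOSAIC_TYPES,
        # Condition 3: dominant share <= 60% (integer arithmetic avoids
        # floating-point rounding at the boundary, e.g. 15 of 25).
        "dominant_share_ok": dominant_count * 100 <= max_dominant_percent * len(sub_pixels),
    }
    checks["all_three_ok"] = all(checks.values())
    # Stricter check not covered by the three conditions: every component,
    # dominant or not, must be one of the four mosaic types.
    checks["components_ok"] = set(counts) <= MOSAIC_TYPES
    return checks


borderline = ["cropland"] * 15 + ["forest"] * 6 + ["grassland"] * 4
too_dominant = ["cropland"] * 20 + ["forest"] * 5
with_water = ["cropland"] * 12 + ["forest"] * 8 + ["water"] * 5

for name, pixel in [("borderline", borderline),
                    ("too_dominant", too_dominant),
                    ("with_water", with_water)]:
    print(name, check_mosaic_pixel(pixel))
```

The `with_water` pixel passes all three conditions but contains a component outside the four mosaic types, which is the situation that makes the fully compliant share smaller than the share satisfying the three conditions.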

Conclusions
The results of this evaluation of the 2010 MODIS LCT product IGBP layer provide important information for producers and potential users. For potential users, the agreement and disagreement analyses identify regions with high-quality and low-quality data, and the sub-pixel error matrix provides type-specific accuracies and the distribution of commission errors for each land cover type. The producer accuracies at the pixel level identify the land cover types with very low producer accuracy, which require the producers' attention. The spatial and class distributions of labeling errors are important for producers seeking to improve the classification accuracy in the next generation of MODIS land cover data.
In this study, a comprehensive evaluation of the IGBP layer of the MODIS Collection 5.1 Land Cover Type product over China was conducted. A new land cover legend with nine classes was constructed, which excludes the mosaic type of the IGBP legend. Forestland is a general category that contains seven land cover types, as shown in the IGBP legend. Savannas and woody savannas are included in forests based on their definitions. All of the comparative results presented in this paper were based on this new legend.
The overall accuracy was 64.62% at the sub-pixel level, which was higher than that of the Version 2 land cover data of MODIS [30]. However, the sub-pixel accuracies differed significantly among regions. The most accurate provinces were Xinjiang, which features vast barren lands, and Heilongjiang and Jilin, which feature vast forestlands. The provinces with the lowest accuracy were five provinces in the middle of China with varied land cover types. Shrublands and wetlands both had low producer and user accuracies at the sub-pixel scale. Misclassifications between croplands and grasslands were mainly distributed in the agro-pastoral zone. Confusion between croplands and residential settlements was concentrated in the North China Plain. Misclassified forests were concentrated in Yunnan Province, and misclassifications between barren lands and grasslands occurred in barren grasslands, such as those in northwestern Tibet.
Based on the dominant-type reference data, we divided the omission error into the mixed pixel error, which is the percentage of WMs, and the labeling error, which is the percentage of WPs at the pixel level. At the pixel level, shrublands and wetlands exhibited the lowest producer accuracies. For all land cover types except grasslands and barren lands, the mixed pixel error was higher than the labeling error. The exception for these two types is partly explained by their high percentage of pure pixels, which also increased the overall labeling error. The classification error increased as the number of land cover types in the reference data increased and as the patch size of the dominant land cover type decreased. In other words, the classification error increased as the landscape heterogeneity increased. Although it was easier to misclassify mixed pixels than pure pixels, the labeling error was still greater than the mixed pixel error due to the high percentage of pure pixels. The main source of labeling errors was the misclassification between grasslands and barren lands in the transition zones. This finding suggests that flexible techniques based on continuous field characteristics may be preferable to "hard" classification approaches in transition zones. Another important source of labeling errors was the misclassification between croplands and grasslands in the agro-pastoral zone. Thus, more training samples should be selected in regions with high labeling errors. In highly heterogeneous regions where mixed pixel errors are concentrated, higher resolution data are preferable to coarse resolution images if additional training samples cannot improve the accuracy.
Croplands/natural vegetation mosaics are a special type of land cover in the IGBP legend that is not found in the reference data. Based on the definition of the mosaic class in the IGBP legend, its accuracy should be less than 19.26%. The primary factor that limits the accuracy of the mosaic type is that, by definition, no one component can compose more than 60% of the landscape. Definitions of land cover types are intended to be quantitative; however, such low accuracy raises the question of whether the quantitative condition is considered in the classification of the MODIS land cover data. Classification technologies that are based on quantitative definitions of land cover types should be the primary focus. This paper is focused on an evaluation of the 2010 MODIS v5.1 LCT product. Thus, the confidence scores of the MODIS LCT product itself and comparisons among the evaluation results for different years are not considered in this paper. However, these types of comparison are very important for producers and users, and future research should address them in evaluating the MODIS v5.1 LCT product.