Evaluation of the Consistency of MODIS Land Cover Product (MCD12Q1) Based on Chinese 30 m GlobeLand30 Datasets: A Case Study in Anhui Province, China

ISPRS Int. J. Geo-Inf. 2015, 4 (Open Access)

Land cover plays an important role in the climate and biogeochemistry of the Earth system. Producing and evaluating global land cover (GLC) data is therefore of great significance when the data are applied at a specific spatial scale. The objective of this study is to evaluate and validate the consistency of the Moderate Resolution Imaging Spectroradiometer (MODIS) land cover product (MCD12Q1) at a provincial scale (Anhui Province, China) based on the Chinese 30 m GLC product (GlobeLand30). A harmonization method is first used to reclassify the land cover types between the five classification schemes of MCD12Q1 (International Geosphere Biosphere Programme (IGBP) global vegetation classification, University of Maryland (UMD), MODIS-derived Leaf Area Index and Fractional Photosynthetically Active Radiation (LAI/FPAR), MODIS-derived Net Primary Production (NPP), and Plant Functional Type (PFT)) and the ten classes of GlobeLand30, based on the knowledge rule (KR) and the C4.5 decision tree (DT) classification algorithm. Five harmonized land cover types are derived (woodland, grassland, cropland, wetland and artificial surfaces), and four evaluation indicators (area consistency, spatial consistency, classification accuracy and landscape diversity) are applied in the three sub-regions of Wanbei, Wanzhong and Wannan. The results indicate that the consistency of IGBP is the best among the five schemes of MCD12Q1 according to the correlation coefficient (R). The “woodland” of LAI/FPAR is the worst, with a spatial similarity (O) of 58.17% due to the misclassification between “woodland” and “others”. The consistency of NPP is the worst among the five schemes, as the agreement varied from 1.61% to 56.23% in the three sub-regions. Furthermore, with the biggest difference of diversity indices between LAI/FPAR and GlobeLand30, the consistency of LAI/FPAR is the weakest.
This study provides a methodological reference for evaluating the consistency of different GLC products derived from multi-source and multi-resolution remote sensing datasets on various spatial scales.

In this study, GlobeLand30 and MCD12Q1 are compared, and the consistency of MCD12Q1 in Anhui Province, China, is analyzed by normalizing the land cover classification types of the two products based on the knowledge rule (KR) and the C4.5 decision tree (DT) classification algorithm. In addition, the study area is divided into three sub-regions that have different dominant land cover types. Four evaluation indicators (area consistency, spatial consistency, classification accuracy, and landscape diversity indices) are selected to evaluate the consistency of the five land cover classification schemes of MCD12Q1 in the three sub-regions. The novelty of this study is as follows:
(1) The land cover classifications of the two GLC products are harmonized based on the KR and the C4.5 DT classification algorithm. Five land cover types are derived from the two datasets to evaluate the consistency of the five classification schemes of MCD12Q1: "woodland", "grassland", "cropland", "wetland" and "artificial surfaces".
(2) Three sub-regions of the study area with different spatial heterogeneities are used to evaluate the consistency of the five classification schemes of MCD12Q1 at a regional scale. The results show that the consistency of woodland in the LAI/FPAR scheme is the worst. This can provide a methodological reference for selecting a classification scheme from MCD12Q1 at a regional scale.
(3) The results show that the higher the landscape heterogeneity, the larger the landscape diversity indices and the lower the consistency.

Study Area
Anhui Province, China, is located in the mid-latitude zone, between longitudes 114°54′E and 119°37′E and latitudes 29°41′N and 34°38′N. The province lies in the transition zone between the subtropical and temperate zones, with a mild and humid climate characterized by four distinct seasons. Two major river systems, the Yangtze and the Huaihe, divide the province into the Wanbei, Wanzhong and Wannan regions, which form three natural areas with distinctly different geographical features (Figure 1). The Wanbei region lies north of the Huaihe River and belongs to the North China Plain. The Wanzhong region lies between the Huaihe River and the Yangtze River and belongs to the agro-ecological zone of the Yangtze River Plain. Finally, the Wannan region lies south of the Yangtze River and belongs to the Tianmu Mountains-Huaiyushan montane evergreen broadleaf ecological zone.

Data Sources and Preprocessing
The datasets used in this study are MCD12Q1 and GlobeLand30 (Table 2), with the former serving as the data to be verified and the latter as the reference data. The overall accuracy of "74.8% ± 1.3%" indicates the accuracy range of the five data layers of MCD12Q1, e.g., the overall accuracy of IGBP is 74.8%. With the finest spatial resolution of 30 m, GlobeLand30 is currently considered the most suitable GLC product for evaluating the consistency of MCD12Q1. In addition to these two datasets, shapefile-format data on China's provincial administrative regions and China's state sector datasets are also employed.

MODIS Dataset
As Table 2 shows, the land cover data are obtained from MCD12Q1, a Level 3 product of the MODIS land cover datasets. This product is derived from the MODIS output first released by the United States National Aeronautics and Space Administration (NASA) at the end of 2008, in which yearly observations from the Terra and Aqua satellites are processed to depict land cover types. The chosen dataset consists of five land cover classification systems: the IGBP global vegetation classification scheme [32]; the UMD vegetation classification scheme, based on a modified IGBP classification system [33]; the LAI/FPAR scheme adopted by the MODIS Leaf Area Index and Fractional Photosynthetically Active Radiation (LAI/FPAR) products (MOD15) [34,35]; the NPP scheme adopted by the MODIS net primary productivity (NPP) product (MOD17) [36]; and the Plant Functional Type (PFT) land cover classification scheme [37]. In our study, the five MCD12Q1 data layers updated in 2014 are selected. The chosen data cover the period from 1 January 2010 to 31 December 2010, and the tile numbers are h27v05, h27v06, h28v05 and h28v06.

GlobeLand30 Data
The GlobeLand30 data are produced from multispectral images of the American land resources satellite (Landsat) Thematic Mapper (TM5) and Enhanced Thematic Mapper Plus (ETM+), images from China's Environmental Disaster Monitoring and Forecasting Small Satellite Constellation (HJ-1A/B), and other 30 m multispectral images. These data are subsequently improved with other reference data that support processes such as sample selection and auxiliary classification. The integration of pixel- and object-based methods with knowledge (POK), which combines pixel-level and object-oriented classification, is used to control quality [38]. The data are processed in the order of water, wetland, and ice and snow to reduce the "same object, different spectra" phenomenon. China donated this global 30 m land cover dataset to the United Nations on 23 September 2014. The product is based on the World Geodetic System (WGS) 84 coordinate system and the Universal Transverse Mercator (UTM) projection, and comprises ten land cover types: "water bodies", "wetland", "artificial surfaces", "tundra", "permanent snow and ice", "grassland", "barren land", "cultivated land", "shrubland" and "forest" (Table 3). Furthermore, the overall accuracy (OA) of the GlobeLand30-2010 data is 83.50% and the Kappa coefficient (K) is 0.78. Several studies have analyzed and applied the GlobeLand30 data, and their results show good performance [39,40]. In the present study, the GlobeLand30 product for the 2010 reference year is selected, covering the period from 1 January 2010 to 31 December 2010, with map numbers N50_25 and N50_30.

Data Preprocessing
As the original MCD12Q1 product is stored in hierarchical data format (HDF) with the sinusoidal (SIN) projection, data preprocessing is necessary, including format conversion, reprojection, resampling, image mosaicking, and sub-area masking. The MODIS Reprojection Tool (MRT) is employed for this purpose. The MODIS HDF data are converted into GeoTIFF; at the same time, the projection is converted from SIN to WGS84/UTM, and the image mosaicking and subsetting are completed. In contrast, the acquired GlobeLand30 data are only mosaicked and subset in ENVI 4.7 (the ENvironment for Visualizing Images software), because they are already in GeoTIFF format and UTM projection. Additionally, to compare MCD12Q1 and GlobeLand30, GlobeLand30 is resampled to a spatial resolution of 500 m using the nearest neighbor method, which, unlike bilinear interpolation and cubic convolution, essentially preserves the gray values of the original image.

Table 3. Definitions of the ten land cover types of GlobeLand30.

| Type | Definition |
|---|---|
| Cultivated land | Lands used for agriculture, horticulture and gardens, including paddy fields, irrigated and dry farmlands, vegetable and fruit gardens. |
| Forest | Lands with trees, with vegetation cover over 30%, including deciduous and coniferous forests, and sparse woodlands with cover from 10% to 30%, etc. |
| Grassland | Lands covered by natural grass with cover over 10%, etc. |
| Shrubland | Lands with shrub cover over 30%, including deciduous and evergreen shrubs, and desert steppe with cover over 10%, etc. |
| Wetland | Lands covered with wetland plants and water bodies, including inland marsh, lake marsh, river floodplain wetland, forest/shrub wetland, peat bogs, mangrove and salt marsh, etc. |
| Water bodies | Water bodies in the land area, including river, lake, reservoir and fish pond, etc. |
| Tundra | Lands covered by lichen, moss, hardy perennial herb and shrubs in the polar regions, including shrub tundra, herbaceous tundra, wet tundra and barren tundra, etc. |
| Artificial surfaces | Lands modified by human activities, including various habitation, industrial and mining areas, transportation facilities, and interior urban green zones and water bodies, etc. |
| Barren land | Lands with vegetation cover lower than 10%, including desert, sandy fields, Gobi, bare rocks, saline and alkaline lands, etc. |
| Permanent snow and ice | Lands covered by permanent snow, glacier and icecap. |
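The nearest neighbor resampling mentioned in the preprocessing step can be illustrated with a minimal sketch. It is not the ENVI implementation; the function name `resample_nearest`, the synthetic raster and its class codes are assumptions made for illustration. Each coarse cell simply takes the value of the fine pixel nearest to its centre, so no new class codes are created.

```python
import numpy as np

def resample_nearest(src, src_res, dst_res):
    """Nearest-neighbour resampling of a categorical raster to a coarser grid."""
    scale = dst_res / src_res
    # Number of whole output cells that fit inside the source extent.
    n_rows = int(src.shape[0] * src_res // dst_res)
    n_cols = int(src.shape[1] * src_res // dst_res)
    # Centre of each output cell, expressed in source pixel coordinates.
    rows = np.floor((np.arange(n_rows) + 0.5) * scale).astype(int)
    cols = np.floor((np.arange(n_cols) + 0.5) * scale).astype(int)
    rows = np.clip(rows, 0, src.shape[0] - 1)
    cols = np.clip(cols, 0, src.shape[1] - 1)
    return src[np.ix_(rows, cols)]

# Synthetic 30 m class raster (codes are illustrative, not GlobeLand30 values).
rng = np.random.default_rng(0)
fine = rng.integers(1, 6, size=(100, 100)).astype(np.uint8)
coarse = resample_nearest(fine, src_res=30.0, dst_res=500.0)
print(coarse.shape)
```

Because only existing pixel values are copied, the class codes survive unchanged, which is the property that motivates choosing this method over bilinear or cubic interpolation for categorical maps.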

Arrangement of Sections
The conceptual diagram and the designed method for evaluating the consistency of MCD12Q1 are shown in Figure 2. After the data sources and preprocessing described in Section 3, the land cover classification is harmonized based on the knowledge rule to reclassify the land cover types of MCD12Q1 and GlobeLand30 (Section 4.2). The C4.5 decision tree classification algorithm is then introduced to complete the classification (Section 4.3). Finally, consistency is evaluated in Section 4.4 through four indicators: area consistency (Section 4.4.1), spatial consistency (Section 4.4.2), classification accuracy (Section 4.4.3) and landscape diversity (Section 4.4.4).

Harmonization of Land Cover Classification
A unified classification system is necessary when comparing different GLC products. Since the classification schemes and categorical scales differ between MCD12Q1 and GlobeLand30, the land cover categories must be reclassified before consistency is evaluated. Hou et al. proposed a land cover classification method based on the knowledge rule (KR), in which accurate land use/land cover (LULC) data were established from the MODIS Normalized Difference Vegetation Index (NDVI), the Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), the US Geological Survey (USGS) classification system and two land use maps of China [41]. Ren et al. integrated and compared the IGBP, UMD, LAI/FPAR, NPP and PFT schemes and a land cover classification scheme based on terrestrial ecosystem features, using six types: "farmland", "forest", "grassland", "water bodies and wetland", "settlement" and "wilderness" [42]. Following the KR-based land cover classification method in [41] and the classification system in [42], the harmonization of land cover classification is performed in our study (Table 4). Five land cover types are obtained: "woodland", "grassland", "cropland", "wetland" and "artificial surfaces". In addition, no digital number (DN) values exist for the "tundra" type of GlobeLand30, so it is removed.

Figure 3 illustrates the KR-based harmonization method, which is used to harmonize the land cover classification types between the five MCD12Q1 data layers and GlobeLand30. First, the two datasets are prepared to produce the clustering center of each land cover type. The 16-day composite MODIS NDVI dataset of 2010 at 500 m resolution provides the feature space for the classification, and the SRTM DEM at 90 m resolution provides auxiliary information for improving classification accuracy. Since the input features of the MCD12Q1 product comprise the nadir BRDF-adjusted reflectance (NBAR) data, the land surface temperature (LST) data and the enhanced vegetation index (EVI) data [18], the product has seasonal land cover region (SLCR) characteristics, i.e., the phenological features and the primary productivity within one SLCR are the same and differ distinctly from those in other SLCRs. In addition, K-Means unsupervised classification of the multi-temporal MODIS NDVI datasets expresses the key classification information and characterizes the quantitative traits of each category based on SLCR. The SRTM DEM data improve the separation between categories by adding elevation. Moreover, the phenological variation characteristics of the five MCD12Q1 data layers and GlobeLand30 form feature vectors through the database attribute table. These feature vectors represent the clustering centers of each category. Subsequently, the Euclidean distance between the feature vector of each category and the clustering center of each land cover type is computed, and each category is assigned to the land cover type at the minimum distance. The mapping matrix is thus built through the mutual mapping of the different classification systems, and the KR is then established as shown in Table 4. A logical look-up table is constructed to harmonize the land cover classes. "Barren land" is not present in Wanbei and Wannan and occupies a very small portion of the total area, so it is ignored when evaluating the area consistency and treated as "others". Consequently, five categories are harmonized between the five MCD12Q1 data layers and GlobeLand30: "woodland" is defined as the woody plant community; "grassland" is defined as plant communities dominated by annual or perennial herbaceous vegetation; "cropland" is defined as artificially cultivated vegetation cover grown for harvest; "wetland" is defined as vegetated and non-vegetated surfaces saturated with water for a long time; and "artificial surfaces" refers to lands modified by human activities.
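The minimum Euclidean distance assignment described above can be sketched as follows. This is a minimal illustration, not the authors' workflow: the function name `assign_to_nearest_center`, the class names `class_a` and `class_b`, and all feature values are hypothetical.

```python
import numpy as np

def assign_to_nearest_center(class_vectors, center_vectors):
    """Return, for each source class, the index of the closest harmonized type."""
    # Pairwise Euclidean distances, shape (n_classes, n_centers).
    diff = class_vectors[:, None, :] - center_vectors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    return dist.argmin(axis=1)

harmonized = ["woodland", "grassland", "cropland"]
# Hypothetical clustering centers: each row is one feature vector
# (for example two seasonal NDVI means and a scaled elevation).
centers = np.array([[0.80, 0.60, 0.70],
                    [0.50, 0.20, 0.40],
                    [0.70, 0.15, 0.10]])
# Hypothetical feature vectors of two source classes to be harmonized.
source_classes = {"class_a": [0.78, 0.55, 0.65],
                  "class_b": [0.68, 0.18, 0.12]}

vectors = np.array(list(source_classes.values()))
idx = assign_to_nearest_center(vectors, centers)
mapping = {name: harmonized[i] for name, i in zip(source_classes, idx)}
print(mapping)
```

The resulting dictionary plays the role of one row set of the mapping matrix: each source class is linked to exactly one harmonized type.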

C4.5 Decision Tree Classification
The decision tree classification technique is one of the important toolsets in data mining and has been widely used in many fields, such as biology, computer science and technology, clinical medicine, geology, and management science and engineering [43][44][45][46][47]. Following top-down induction, a DT starts from a root node, which is then split recursively into branches [48]. Univariate decision trees such as the C4.5 algorithm, which use only one feature at each internal node, are the most popular methods because of their low computational complexity [49]. In this study, we adopt the C4.5 algorithm, guided by expert knowledge.
The classification process is divided into four steps: defining the classification rules, constructing the decision tree, implementing the decision tree and evaluating the classification results. Here, the classification rules are obtained from the KR in Section 4.2 (Table 4). The new land cover types are then extracted by the supervised decision tree operation. Specifically, the DT is built from the corresponding DN values in the datasets, and the classification is performed in ENVI.
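Applying rules keyed on DN values amounts to a look-up from source class codes to harmonized codes. The sketch below shows that step only; it is not the ENVI decision tree, and the `RULES` table is a hypothetical stand-in rather than the content of Table 4.

```python
import numpy as np

# Hypothetical rule table: source DN -> harmonized code.
# Harmonized codes: 1 woodland, 2 grassland, 3 cropland,
# 4 wetland, 5 artificial surfaces, 0 others.
RULES = {1: 1, 2: 1, 10: 2, 12: 3, 11: 4, 13: 5}

def reclassify(dn, rules, fill=0):
    """Map every pixel's DN to a harmonized code; unlisted DNs become `fill`."""
    lut = np.full(256, fill, dtype=np.uint8)  # one entry per 8-bit DN
    for src, dst in rules.items():
        lut[src] = dst
    return lut[dn]

dn = np.array([[1, 12, 13],
               [10, 11, 16]], dtype=np.uint8)
harmonized_map = reclassify(dn, RULES)
print(harmonized_map)
```

Each rule tests a single feature (the DN), which matches the univariate character of the tree: one attribute per decision.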

Evaluation of Consistency
"Consistency" is defined here as the similarity of the classification results of the two land cover products. Four evaluation indicators are selected to evaluate the consistency between MCD12Q1 and GlobeLand30.

Area Consistency
Pearson's correlation coefficient (R) [50] and the percentage disagreement (PD) [51] are used to evaluate the area consistency. Specifically, R (Equation (1)) represents the overall degree of correlation and PD (Equation (2)) represents the degree of agreement for each type. R measures the correlation between two datasets and is used to identify a linear correlation, over the land cover types, between the MCD12Q1 data under validation (for each of the five classification schemes) and the GlobeLand30 reference data. Similarly, the PD describes the relative difference between the results for the same classification type in the five MCD12Q1 data layers and in the GlobeLand30 reference data, i.e., the degree of consistency between the two datasets. Thus, the smaller the value of |PD|, the closer the results and the better the consistency.

$$R = \frac{\sum_{k=1}^{n}(x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k-\bar{x})^2\,\sum_{k=1}^{n}(y_k-\bar{y})^2}} \tag{1}$$

where n is the number of classes; $x_k$ and $y_k$ are, respectively, the total area of type k in MCD12Q1 and GlobeLand30; and $\bar{x}$ and $\bar{y}$ are, respectively, the average area of all land cover categories in MCD12Q1 and GlobeLand30.
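As a worked illustration of the overall correlation R, the sketch below computes Pearson's coefficient from per-class areas, using the symbols defined above (class areas x_k and y_k and their means). The area values are invented for illustration and are not taken from the paper's tables.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation between two vectors of per-class areas."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean class area
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

# Illustrative areas (km^2) of the five harmonized types; not paper values.
mcd_area = [120.0, 30.0, 900.0, 60.0, 90.0]
ref_area = [150.0, 20.0, 850.0, 80.0, 100.0]
r = pearson_r(mcd_area, ref_area)
print(f"R = {100 * r:.2f}%")
```

A negative R, as reported for some schemes in Wanbei, means that classes with above-average area in one product tend to have below-average area in the other.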

Spatial Consistency
A pixel-by-pixel comparison is adopted to verify the accuracy of spatial positions. Only the spatial consistency of "woodland" is considered in this study. First, each of the five classification datasets is divided into the two categories "woodland" and "non-woodland" by binarization, and each of the five MCD12Q1 classification results is then overlaid on that of the GlobeLand30 product. This process yields four new types of classification data: "woodland/woodland", "woodland/non-woodland", "non-woodland/woodland" and "non-woodland/non-woodland". These types describe the spatial consistency of "woodland" between the two datasets. Second, the spatial similarity between the five MCD12Q1 classification results and the reference data is analyzed at two levels (provincial and regional, i.e., the three sub-regions), as expressed by the following formula [52]:

$$O = \frac{A}{A + B + C} \times 100\% \tag{3}$$

where O is the spatial similarity coefficient, and A, B and C are, respectively, the total numbers of pixels of three types of classification data: woodland/woodland, woodland/non-woodland, and non-woodland/woodland.
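The binarize-and-overlay procedure can be sketched in a few lines. The sketch assumes that O is the share of pixels labeled woodland by both products among all pixels labeled woodland by at least one of them, i.e., A / (A + B + C); the function name and the tiny example maps are hypothetical.

```python
import numpy as np

def spatial_similarity(test_map, ref_map, target):
    """Spatial similarity O (%) of one class between two co-registered maps."""
    t = (test_map == target)        # binarized map under evaluation
    r = (ref_map == target)         # binarized reference map
    a = np.count_nonzero(t & r)     # woodland / woodland
    b = np.count_nonzero(t & ~r)    # woodland / non-woodland
    c = np.count_nonzero(~t & r)    # non-woodland / woodland
    if a + b + c == 0:
        return float("nan")         # class absent from both maps
    return 100.0 * a / (a + b + c)

# Tiny illustrative maps: 1 = woodland, 0 = anything else.
mcd = np.array([[1, 1, 0],
                [0, 1, 0]])
ref = np.array([[1, 0, 1],
                [0, 1, 0]])
o_value = spatial_similarity(mcd, ref, target=1)
print(o_value)
```

Pixels that are non-woodland in both maps do not enter the ratio, so a large non-woodland background cannot inflate the score.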

Accuracy Verification
Accuracy verification in the present study consists of the producer's accuracy (PA), user's accuracy (UA) and overall accuracy (OA) derived from the confusion matrix [53]. The PA is the probability that the classifier has labeled a pixel as Class i given that the ground truth is Class i (Equation (4)). The UA is the probability that a pixel is Class i given that the classifier has labeled it as Class i (Equation (5)). The OA is calculated by summing the number of correctly classified pixels and dividing by the total number of pixels (Equation (6)). The formulas used to calculate the PA, UA and OA are as follows:

$$PA_i = \frac{p_{ii}}{p_{+i}} \times 100\% \tag{4}$$

$$UA_i = \frac{p_{ii}}{p_{i+}} \times 100\% \tag{5}$$

$$OA = \frac{\sum_{i=1}^{n} p_{ii}}{P} \times 100\% \tag{6}$$

The Kappa coefficient (K) measures the agreement between two datasets via a discrete multivariate technique [54]. Because it accounts for the possibility of chance agreement between the two datasets, K reflects the classification accuracy of land cover products more exactly (Equation (7)).

$$K = \frac{N\sum_{i=1}^{n} p_{ii} - \sum_{i=1}^{n} p_{i+}\,p_{+i}}{N^2 - \sum_{i=1}^{n} p_{i+}\,p_{+i}} \tag{7}$$

where $p_{ii}$ is the number of pixels for which type i in the MCD12Q1 classification result agrees with type i in GlobeLand30, i.e., the number of correctly classified pixels; P is the sum of all pixels in the GlobeLand30 classification result; $p_{i+}$ is the sum of type i in the MCD12Q1 classification result (row i); $p_{+i}$ is the sum of type i in the GlobeLand30 classification result (column i); n is the number of land cover types; and N is the number of pixels used for the accuracy evaluation.
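The four accuracy measures can be computed directly from a confusion matrix. The sketch below assumes the layout described above, with rows holding the MCD12Q1 result and columns holding the GlobeLand30 reference, and uses the standard definitions of producer's accuracy, user's accuracy, overall accuracy and Kappa. The 2 x 2 matrix is invented for illustration.

```python
import numpy as np

def accuracy_metrics(cm):
    """PA and UA per class, plus OA and Kappa, from a square confusion matrix.

    Rows are the evaluated map, columns are the reference map.
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()               # total number of evaluated pixels
    diag = np.diag(cm)         # correctly classified pixels per class
    row = cm.sum(axis=1)       # totals per class in the evaluated map
    col = cm.sum(axis=0)       # totals per class in the reference map
    with np.errstate(divide="ignore", invalid="ignore"):
        pa = diag / col        # NaN where a class is absent from the reference
        ua = diag / row        # NaN where a class is absent from the evaluated map
    oa = diag.sum() / n
    chance = (row * col).sum()
    kappa = (n * diag.sum() - chance) / (n ** 2 - chance)
    return pa, ua, float(oa), float(kappa)

# Illustrative two-class matrix (pixel counts), not data from Table 8.
cm = [[50, 10],
      [5, 35]]
pa, ua, oa, kappa = accuracy_metrics(cm)
print(pa, ua, oa, kappa)
```

A class that a scheme lacks altogether, such as "cropland" in NPP, yields an empty row, so its user's accuracy is undefined and is returned as NaN here rather than as zero.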

Landscape Diversity
In the present paper, two landscape diversity indices are selected [55] to analyze the characteristics of the landscape mosaic: the modified Simpson's diversity index (MSIDI) (Equation (8)) and the modified Simpson's evenness index (MSIEI) (Equation (9)). The landscape diversity indices of the five MCD12Q1 layers and of GlobeLand30 are measured and compared. MSIDI is applied to quantify ecological community and landscape diversity. MSIEI, as a complement to patch dominance, reflects the ratio between the observed diversity of the area proportions of the patch types and its maximum possible value in the landscape. The classification results of the MCD12Q1 and GlobeLand30 products are recorded in 8-bit unsigned integer format for use in the Fragstats 4.2 software program. The diversity index describes the species diversity of animals and plants in an area and is widely used in landscape ecology. Diversity index values are mainly affected by the richness and evenness of the landscape composition, which, respectively, describe the diversity of landscape composition and of landscape structure. In the present study, the landscape diversity metrics in Fragstats are applied to compare the Wanbei, Wanzhong and Wannan regions. Simpson's diversity index originates from the measurement of biological communities. It is widely applied, and its value represents the probability that two randomly selected grid units belong to different patch types.
The MSIDI is expressed by

$$MSIDI = -\ln \sum_{i=1}^{m} p_i^2 \tag{8}$$

The MSIEI is given by

$$MSIEI = \frac{-\ln \sum_{i=1}^{m} p_i^2}{\ln m} \tag{9}$$

where $p_i$ is the area proportion of patch type i in the landscape, with the total area of the landscape not including background values; and m is the number of patch types in the landscape. Both indicators are dimensionless. The index ranges are MSIDI ≥ 0 and 0 ≤ MSIEI ≤ 1. When the whole landscape contains only one patch type, MSIDI = 0 and MSIEI = 0. As the number of patch types increases and their area proportions become more equal, the value of MSIDI increases. When all patch types have the same proportion in the landscape, MSIEI = 1. As the proportions of the patch types become more unbalanced, MSIEI approaches zero.
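The two indices can be computed from the class proportions of a raster. The sketch below follows the definitions above (MSIDI as the negative logarithm of the summed squared proportions, MSIEI as MSIDI divided by ln m). It is not Fragstats code; the function name and the optional `background` argument are assumptions.

```python
import numpy as np

def modified_simpson(class_map, background=None):
    """Return (MSIDI, MSIEI) for a categorical raster."""
    values, counts = np.unique(class_map, return_counts=True)
    if background is not None:
        counts = counts[values != background]  # exclude background pixels
    p = counts / counts.sum()                  # area proportion of each patch type
    m = p.size                                 # number of patch types present
    # ln(1 / sum(p^2)) equals -ln(sum(p^2)) and avoids a negative zero.
    msidi = float(np.log(1.0 / (p ** 2).sum()))
    # By convention the evenness is 0 when only one patch type is present.
    msiei = msidi / float(np.log(m)) if m > 1 else 0.0
    return msidi, msiei

# Illustrative landscapes: one perfectly even, one with a single type.
even_map = np.array([[1, 1, 2, 2],
                     [3, 3, 4, 4]])
single_map = np.ones((3, 3), dtype=np.uint8)
print(modified_simpson(even_map))
print(modified_simpson(single_map))
```

The even landscape gives MSIDI = ln 4 and MSIEI = 1, while the single-type landscape gives zero for both, matching the limiting cases stated in the text.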

Validation of Classification Accuracy of GlobeLand30
Since the GlobeLand30 data are used as the reference data, their classification accuracy must first be evaluated to ensure their reliability. The statistics of the primary land cover types are used, including the crop and forest areas from the Anhui Statistical Yearbook [56] and the wetland area from the second China wetland survey [57]. The fractional error is used to validate the accuracy of GlobeLand30; it is the ratio of the absolute difference between the actual area and the estimated area to the actual area.
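The fractional error defined above is a one-line calculation. The sketch below expresses it as a percentage; the function name and the example areas are hypothetical and are not values from Table 5.

```python
def fractional_error(estimated_area, actual_area):
    """Fractional error in percent: |actual - estimated| / actual * 100."""
    if actual_area <= 0:
        raise ValueError("actual_area must be positive")
    return 100.0 * abs(actual_area - estimated_area) / actual_area

# Hypothetical areas in km^2, for illustration only.
print(fractional_error(estimated_area=90.0, actual_area=100.0))
```

Because the absolute difference is used, over- and under-estimation of the same magnitude give the same error.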
As shown in Table 5, except for "woodland" in Wanbei and "wetland" in Wanbei and Wannan, all fractional errors are less than 30%. Furthermore, "woodland" is taken as a case to compare MCD12Q1 and GlobeLand30. In Wanbei, the "woodland" area derived from the five classification schemes of MCD12Q1 is 42.75 km² for IGBP, 53.50 km² for UMD, 8.75 km² for LAI/FPAR, 52.75 km² for NPP and 57.00 km² for PFT, whereas it is 106.25 km² for GlobeLand30, which is much closer to the yearbook statistics. Therefore, GlobeLand30 has a higher classification accuracy and can be used as the reference data for evaluating the consistency of MCD12Q1.

Table 5. Accuracy validation for "woodland", "cropland" and "wetland" of GlobeLand30.

Region | Land Cover Type | Area (km²) | Yearbook Statistics | Fractional Error (%)

Evaluation of Area Consistency
Table 6 compares the R values of IGBP, UMD, LAI/FPAR, NPP and PFT in Wanbei, Wanzhong, Wannan and Anhui Province as a whole. Notably, all the R values of IGBP, UMD and PFT exceed 97% in the three sub-regions, so their consistency is good overall. Because its R values vary only slightly, from 98.22% to 99.61%, across the three sub-regions and the whole of Anhui Province, IGBP shows the best compatibility and robustness, indicating that its consistency is the best at a regional scale. This result agrees with reference [42], which shows that the areas of the IGBP land cover types are closer to those of the land cover classification scheme based on terrestrial ecosystem features and remote sensing (TEFRS).
In contrast, the R values of LAI/FPAR and NPP are poor and vary markedly between regions. For LAI/FPAR, R decreases from 56.40% in Wannan to −17.78% in Wanbei. For NPP, R varies from −31.47% in Wanbei to 69.26% in Wannan. To analyze the high regional variation of LAI/FPAR and NPP, PD is used to provide the degree of agreement for each type in the five schemes. Combining R and PD gives a more detailed evaluation of the consistency of the five schemes.
Figure 4 compares the area consistency of the five types derived from the five schemes. As the blue and orange columns show, the PD values of IGBP and UMD are smaller than those of the other three schemes for each type in the three sub-regions, indicating better consistency. "Woodland" and "grassland" of UMD (orange column) perform best among the five schemes in the three sub-regions, which highlights that the consistency of UMD is the best for "woodland" and "grassland". The study in [58] reports the same result, namely that UMD best reflects the temporal and spatial distribution of grassland.
The grey column shows the PD of LAI/FPAR: the classification results of "woodland" in the three sub-regions are the worst among the five schemes, particularly in Wanbei, which has fewer forest resources. This result is also similar to the conclusion of reference [58]. Moreover, the consistency of "artificial surfaces" in the three sub-regions is also the worst among the five schemes.
Because NPP has no "cropland" class, its consistency for this type is poor, as the green column shows. The consistency of "grassland" is likewise poor. Conversely, the PD values of "woodland", "wetland" and "artificial surfaces" are similar to those of UMD and PFT.
The purple column shows that the consistency of "grassland" of PFT is worse than that of its other four land cover types. The regional variation is high because the PD differs greatly between the three sub-regions, whereas the PD values of all the other land cover types are low in the three sub-regions. Specifically, the PD in Wannan is the best among the five schemes, but that in Wanzhong is the worst.

Analysis of Spatial Consistency
Only the spatial consistency of "woodland" is considered in this study. First, the spatial distribution of woodland differs greatly between the three sub-regions: it occupies only a small share of Wanbei, accounts for nearly one fifth of Wanzhong, and covers most of Wannan. The classification results indicate that there is a significant difference in "woodland" between the five schemes. The O of LAI/FPAR is the lowest, at 58.17% (Table 7), indicating weak spatial consistency for "woodland". The O of NPP reaches 86.70%, followed by 85.68% for PFT, which shows that NPP has the best spatial consistency of "woodland" among the five schemes. Figure 5 compares the spatial consistency of "woodland" between the five data layers of MCD12Q1 and GlobeLand30 at provincial and regional scales (Anhui Province and the three sub-regions). It illustrates the change of land cover types in the same area from the GlobeLand30 data to each of the five MCD12Q1 data layers. According to the pixel-by-pixel comparison, the classification errors of "woodland" appear mainly in Wannan and the south of Wanzhong, where most of the forest area lies. As shown in Figure 5, "cropland" is misclassified as "woodland" in many areas, which seriously affects the consistency of these schemes. Reference [42] reports a related finding for Gansu Province, China: from 2001 to 2009 the forest area increased in NPP, PFT and LAI/FPAR, whereas it showed a decreasing trend in IGBP and UMD. Moreover, Figure 5 reveals that NPP avoids the misclassification of "woodland" as "cropland" because it has no "cropland" class, but "woodland" is categorized as "grassland" in certain places.
Consequently, another driving factor affecting the consistency of "woodland" is analyzed. The "woodland" of IGBP, UMD, LAI/FPAR and PFT is categorized as "cropland" to a great degree in Wanbei and Wanzhong. Taking both factors together, the misclassification between "woodland" and "cropland" has a large influence on the consistency of "woodland" in IGBP, UMD, LAI/FPAR and PFT. Moreover, other inconsistent "woodland" areas of IGBP, UMD, NPP and PFT are mainly concentrated around the Yangtze River and Huaihe River basins, the two natural dividing lines of the three sub-regions of Anhui Province. In these areas, the "wetland" of the four schemes is classified as "woodland". As shown in Table 7, the consistency of "woodland" of LAI/FPAR is the weakest, in accordance with Figure 5. Notably, in addition to the above driving factors, the misclassification between "others" and "woodland" is also a main cause of the worst consistency of LAI/FPAR. Since "barren land" is treated as "others" in the harmonization of land cover classification (Section 4.2), the "woodland" type is distinctly misclassified as "barren land", which is the main cause of the poor performance of LAI/FPAR.

Comparison of Classification Accuracy
The OA of IGBP, UMD and PFT in the three sub-regions is more than 71%, while that of LAI/FPAR and NPP is less than 57% (Table 8). Considering that "cropland" is absent from NPP, the accuracy of its other types is still similar to that of IGBP, UMD and PFT. In general, the OA of LAI/FPAR is the worst, and the extremely low accuracy of "artificial surfaces" leads to its weak consistency, as reflected by the smallest K value. Moreover, the OA values of NPP vary from 1.61% in Wanbei to 56.23% in Wannan, whereas the OA of the other schemes varies by less than 10% across the three sub-regions, indicating that the consistency of NPP is the worst among the five schemes. Meanwhile, the accuracies of "wetland" are good for all five schemes, but the differences for "woodland" are large in the three sub-regions. Reference [59] reports the same situation: the uncertainty of MODIS data is small for water bodies but large for forestland.
Since the GlobeLand30 data have a finer spatial resolution, the larger number of landscape patches and the more even area proportions lead to larger diversity indices. This indicates that, for the same area, GlobeLand30 contains more landscape patches than the five MCD12Q1 data layers. Meanwhile, GlobeLand30 has a more balanced proportion of different patch types in the landscape. In comparison with the other schemes, LAI/FPAR shows higher values, and the difference between the LAI/FPAR scheme and the GlobeLand30 data is the lowest in the three sub-regions, which reveals that the consistency of LAI/FPAR is the best in terms of the number of landscape patches and the evenness of area proportions.
More significantly, the diversity indices of the three sub-regions are ordered as follows: Wanbei < Wanzhong < Wannan. The same pattern appears in Section 5.3, where the K values of Wanbei are the smallest among the three sub-regions (Table 8), which corresponds to the rank of the diversity indices. Specifically, "cropland" occupies a distinctly large proportion of Wanbei, and the other types present a fragmented distribution. For example, as Figure 6 shows, the classification results of "grassland" and "wetland" for Suzhou City in Wanbei in the five MCD12Q1 data layers are distinctly different from those of GlobeLand30. It can be concluded that the higher the landscape heterogeneity, the larger the diversity indices and the weaker the consistency.
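The link drawn above between diversity indices and the evenness of area proportions can be illustrated with a minimal sketch. The passage does not name the specific indices, so the Shannon diversity and Shannon evenness indices are assumed here for illustration; the function names and the area shares are hypothetical and are not taken from Table 8 or from the study data.

```python
import math

def shannon_diversity(areas):
    """Shannon diversity H = -sum(p_i * ln p_i) over classes with non-zero area.

    `areas` holds the area (or pixel count) of each land cover class.
    Assumed index; the source text does not specify which indices it uses.
    """
    total = sum(areas)
    proportions = [a / total for a in areas if a > 0]
    return -sum(p * math.log(p) for p in proportions)

def shannon_evenness(areas):
    """Shannon evenness H / ln(m), where m is the number of classes present."""
    m = sum(1 for a in areas if a > 0)
    if m < 2:
        return 0.0  # a single-class landscape has no evenness to measure
    return shannon_diversity(areas) / math.log(m)

# Hypothetical area shares for five harmonized classes (illustrative only).
cropland_dominated = [90, 4, 3, 2, 1]   # one class covers most of the region
balanced = [20, 20, 20, 20, 20]         # all classes cover equal area

print(shannon_diversity(cropland_dominated), shannon_evenness(cropland_dominated))
print(shannon_diversity(balanced), shannon_evenness(balanced))
```

Under these assumptions, a region dominated by a single class yields lower diversity and evenness values than a region with balanced class proportions, which is the mechanism the text invokes when comparing area proportions across data layers.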

Conclusions
An evaluation of the consistency of the five land cover classification schemes of MCD12Q1 is performed based on the 30 m Chinese GlobeLand30-2010 GLC product in Anhui Province. The land cover classifications are first harmonized using two primary methods, the knowledge rule and the C4.5 decision tree classification algorithm. In general, the five classification schemes of MCD12Q1 show strong consistency with GlobeLand30 in terms of area percentage, spatial distribution, classification accuracy and landscape diversity.
(1) The R of IGBP differs only slightly, ranging from 98.29% to 99.63%, across the three sub-regions and Anhui Province as a whole. Thus, IGBP shows the best compatibility and robustness, indicating that its consistency is the best on a regional scale. The PD of IGBP and UMD is the smallest for each type in the three sub-regions, indicating that both IGBP and UMD present the best consistency.
(2) The O of LAI/FPAR has the lowest value, 58.17%, and its spatial consistency for "woodland" is weak. Conversely, NPP has the best spatial consistency for "woodland" among the five schemes, with an O of 86.70%. Specifically, the misclassification between "others" and "woodland" is the main cause of the worst consistency of LAI/FPAR; that is, the "woodland" type of GlobeLand30 is largely misclassified into the "barren land" of LAI/FPAR.
(3) Both per-class accuracy measures of "artificial surfaces" for the LAI/FPAR scheme are the lowest in the three sub-regions in comparison with the four other schemes of MCD12Q1 (Table 8). Furthermore, the OA of NPP varies greatly across the three sub-regions, indicating that the consistency of NPP is the worst among the five schemes.
(4) The landscape diversity indices of Wanbei, Wanzhong and Wannan differ from each other, which shows the obvious spatial heterogeneity of the three sub-regions. The consistency of LAI/FPAR is found to be the best in terms of landscape diversity in the three sub-regions. Meanwhile, the more heterogeneous the landscape is, the weaker the consistency of the five land cover classification results of MCD12Q1 becomes.
Given the great significance of evaluating and validating the quality and consistency of different GLC products, this study evaluates the consistency of the 500 m MCD12Q1 product based on the 30 m GlobeLand30 product on a provincial scale. This study can provide a methodological reference for evaluating the consistency of different GLC products derived from multi-source and multi-resolution remote sensing datasets on various spatial scales.
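The accuracy terms used in conclusion (3) can be illustrated with a short sketch that derives them from a confusion matrix. It assumes that OA is the overall accuracy, that K is Cohen's kappa coefficient, and that the per-class measures are the producer's and user's accuracies; the function name and the 2 x 2 matrix below are hypothetical and do not reproduce any value in Table 8.

```python
def accuracy_metrics(matrix):
    """Compute OA, kappa, and per-class accuracies from a confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    (e.g., GlobeLand30) and whose mapped class is j (e.g., an MCD12Q1 scheme).
    Assumes K in the text denotes Cohen's kappa.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    correct = sum(matrix[i][i] for i in range(k))

    oa = correct / n
    # Agreement expected by chance from the marginal totals.
    pe = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe < 1 else 0.0

    # Producer's accuracy: correct / reference total; user's: correct / mapped total.
    producers = [matrix[i][i] / row_totals[i] if row_totals[i] else float("nan")
                 for i in range(k)]
    users = [matrix[i][i] / col_totals[i] if col_totals[i] else float("nan")
             for i in range(k)]
    return oa, kappa, producers, users

# Hypothetical two-class example (illustrative counts only).
example = [[40, 10],
           [5, 45]]
oa, kappa, producers, users = accuracy_metrics(example)
print(oa, kappa, producers, users)
```

A class that is rarely mapped correctly lowers both of its per-class accuracies and pulls kappa down more strongly than OA when the class is small, which is one way to read the weak result for "artificial surfaces" reported above.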

Figure 1. Location of Anhui Province and the Wanbei, Wanzhong and Wannan regions.

Figure 2. Conceptual diagram and the general workflow.

Figure 3. The flow chart of harmonizing land-cover classification based on KR.

Table 1. Temporal and spatial resolutions of the produced GLC products.

Table 4. A total of five harmonized land cover types between MCD12Q1 and GlobeLand30.

Table 6. Comparison of R values of the five classification schemes of MCD12Q1.

Table 7. Spatial similarity of five classification schemes of MCD12Q1 in Anhui Province.

Table 8. Comparison of the classification accuracy of five classification results.