Index for the Consistent Measurement of Spatial Heterogeneity for Large-Scale Land Cover Datasets

Abstract: Recognizing land cover heterogeneity is essential for the assessment of spatial patterns to guide conservation planning. One of the top research priorities is the quantification of land cover heterogeneity using effective landscape metrics. However, due to the diversity of land cover types and their varied distribution, a consistent, larger-scale, and standardized framework for heterogeneity information extraction from this complex perspective is still lacking. Consequently, we developed a new Land Cover Complexity Index (LCCI), which is based on information theory. The LCCI contains two foundational aspects of heterogeneity, composition and configuration, thereby capturing more comprehensive information on land cover patterns than any single metric approach. In this study, we compare the performance of the LCCI with that of other landscape metrics at two different scales, and the results show that our newly developed indicator more accurately characterizes and distinguishes different land cover patterns. The LCCI provides an alternative way to measure the spatial variation of land cover distribution. Classification maps of land cover heterogeneity generated using the LCCI provide valuable insights and implications for regional conservation planning. Thus, the LCCI is shown to be a consistent indicator for the quantification of land cover heterogeneity that functions in an adaptive way by simultaneously considering both composition and configuration.


Introduction
Recognizing land cover spatial heterogeneity is crucial for ecological process modeling, spatial pattern understanding, and environmental change analysis [1][2][3][4]. Land cover heterogeneity is a key concept of land system science, a discipline that has long focused on regional structure and patterns. The heterogeneity of land cover can be quantitatively described in different forms, such as fragmentation [5], diversity [6], connectivity [7], and complexity [8]. Recently, some studies have focused on land cover heterogeneity in terms of land surface parameterization and land cover classification quality [9,10]. The need for consistent and accurate information on land cover heterogeneity to support large-scale geospatial applications has been increasingly acknowledged and emphasized [11]. Therefore, it is necessary to extract standardized land cover heterogeneity information at fine resolution and large scales to meet the requirements of scientists and policy-makers.
In this study, a readily applicable measure is proposed to address the lack of a consistent and standardized framework for heterogeneity information extraction at large spatial scales. Specifically, information-theoretical metrics were fused into a consistent indicator, the Land Cover Complexity Index (LCCI), for large-scale land cover heterogeneity quantification at 1 km resolution. The main objectives of this study are as follows: (1) to describe a methodology suitable for quantifying the characteristics of large-scale land cover heterogeneity; (2) to build a database of continental land cover heterogeneity for large-scale geospatial sampling and ecological assessment; and (3) to discover the heterogeneous distribution characteristics of the different continents.
This paper is organized as follows. In Section 2, we illustrate the inconsistency problems in large-scale quantifications of land cover heterogeneity and give a solution. Section 3 introduces the key concepts for the construction of methodology in this paper. In Section 4, we present the quantification results and compare LCCI with different single metric approaches using path analysis. Section 5 provides a summary of our results and discussion and includes our conclusions.

Inconsistency Problems at Large Scales
Spatially continuous coverage of Earth observation data and the rapid development of geoinformation technologies encourage land cover heterogeneity research aimed at gaining more robust and continuous information. Such research facilitates further understanding of ecological processes and supports monitoring of the distribution of natural resources and their dynamics. Moreover, land cover heterogeneity is becoming increasingly helpful in large-scale geospatial research, especially for surface parameterization and spatial sampling. Therefore, the scientific measurement of large-scale heterogeneity requires new quantitative methods. Because of the complexity and variation of land cover types across a large area, measuring heterogeneity with a single index can yield low values for areas where the heterogeneity is actually high.
As shown in Figure 1a, there are three areas in the world with different degrees of heterogeneity. Region a and region b have the same composition, but due to differences in their configuration, the degree of heterogeneity of region b is greater than that of region a. Similarly, region b and region c have comparably complex configurations, but region c is more diverse than region b. Thus, the heterogeneity of region c is higher than that of region b. However, with traditional landscape metrics such as Shannon's diversity index (SHDI) and edge density (ED), shown in Figure 1b, these differences are indistinguishable. These circumstances result in inconsistent heterogeneity values for large-scale quantifications of land cover heterogeneity that persist even if other landscape metrics are utilized. The fundamental reason is that the composition and the configuration are not considered simultaneously. To quantify land cover heterogeneity more accurately, thereby meeting the demand for large-scale applications, efficiently combining the composition and the configuration indices is critical.
In other words, because a single indicator fails to factor in both composition and configuration, the land cover heterogeneity it quantifies is not consistent with the true degree of heterogeneity. Over time, the land cover type of an area may be converted to a different land cover type, and in such cases, the compositional diversity of the land cover increases, whereas the configuration remains the same. The change may thus actually enhance the heterogeneity of this area, and the quantified value needs to reflect this change. For instance, in Figure 2, the type of conversion increases the heterogeneity of the land cover, but using only one configuration index fails to capture the difference. Adopting two different indices simultaneously may resolve the problem but is not suitable for large-scale monitoring.
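The contrast among the three regions can be reproduced on toy grids. The sketch below is illustrative only: the 4 × 4 grids `region_a`, `region_b`, and `region_c` are hypothetical stand-ins for the regions in Figure 1, SHDI is computed as Shannon's diversity index with the natural logarithm, and the count of unlike four-neighbour cell pairs is used as a simplified proxy for ED rather than the full edge-length-per-area definition.

```python
import math
from collections import Counter


def shdi(grid):
    """Shannon's diversity index of the class proportions in a grid."""
    counts = Counter(value for row in grid for value in row)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def unlike_edges(grid):
    """Number of four-neighbour cell pairs with different classes (ED proxy)."""
    rows, cols = len(grid), len(grid[0])
    edges = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and grid[r][c] != grid[r][c + 1]:
                edges += 1
            if r + 1 < rows and grid[r][c] != grid[r + 1][c]:
                edges += 1
    return edges


# Hypothetical toy regions (not data from the study).
region_a = [[0, 0, 1, 1] for _ in range(4)]                                # two blocks
region_b = [[(r + c) % 2 for c in range(4)] for r in range(4)]             # two classes, checkerboard
region_c = [[(r % 2) * 2 + (c % 2) for c in range(4)] for r in range(4)]   # four classes, same layout

for name, grid in (("a", region_a), ("b", region_b), ("c", region_c)):
    print(name, round(shdi(grid), 3), unlike_edges(grid))
```

Under these assumptions, regions a and b share the same SHDI although b has far more edges, and regions b and c share the same edge count although c contains twice as many classes, so neither index alone orders the three regions.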

Solution: Land Cover Complexity Index (LCCI) Design
With LCCI, we aim to construct a consistent indicator of land cover heterogeneity. Land cover composition and configuration express the complexity of land cover categories and adjacencies/spatial distributions, respectively. A comprehensive measurement of land cover heterogeneity should, therefore, account for both. To achieve this, we (i) constructed a co-occurrence histogram to express both the composition and the configuration in a single histogram, (ii) quantified two entropy-based indices called marginal entropy and conditional entropy, (iii) calculated the relative mutual information to measure the difference between the two entropy-based indices, and (iv) combined the entropy-based indices to create the LCCI. Simultaneously considering different aspects of heterogeneity can consolidate the results and enhance robustness. Moreover, entropy-based indices not only record the correlation between configuration and composition, but, based on our empirical tests, also capture details of the complexity of small patch distribution. Accounting for the asymmetry of data confidence, we combined the information-theoretical metrics into a complex fusion technique, which is based on the dependence between random variables.
The flow chart in Figure 3 details the heterogeneity quantification process.
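Step (i) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cooccurrence`, the integer class coding from 0 to `n_classes - 1`, and the choice to count every adjacency in both orders (which makes the histogram symmetric) are assumptions made here.

```python
import numpy as np


def cooccurrence(grid, n_classes):
    """Joint probabilities P[i, j] of classes in eight-connected neighbouring cells."""
    grid = np.asarray(grid)
    counts = np.zeros((n_classes, n_classes), dtype=np.int64)
    rows, cols = grid.shape
    # These four offsets visit every unordered eight-connected pair exactly once.
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        r0, r1 = 0, rows - dr
        c0, c1 = max(0, -dc), cols - max(0, dc)
        first = grid[r0:r1, c0:c1].ravel()
        second = grid[r0 + dr:r1 + dr, c0 + dc:c1 + dc].ravel()
        np.add.at(counts, (first, second), 1)   # adjacency recorded as (i, j)
        np.add.at(counts, (second, first), 1)   # and in the reverse order (j, i)
    return counts / counts.sum()


# Hypothetical 2 x 2 unit with two classes.
P = cooccurrence([[0, 0], [1, 1]], n_classes=2)
print(P)
```

The row and column sums of `P` give the class composition as seen through adjacencies, while the off-diagonal mass reflects how often unlike classes touch, which is why a single histogram can carry both aspects.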

Entropy-Based Fundamental Index of Spatial Heterogeneity Measurements
To obtain the diversity (composition) and the adjacency (configuration) characteristics of land cover heterogeneity simultaneously, a bivariate approach for the analysis of two fundamental variables is needed [39]. To describe each land cover unit mathematically, we extracted a bivariate co-occurrence histogram, rendered in color, from the gray-level co-occurrence matrix for the subsequent calculations (Figure 4). Each bin of the co-occurrence histogram is a land cover type adjacency feature extracted from two neighboring grid cells, called Pij. We adopted eight-connectivity as the adjacency rule and distinguished bins that have the same land cover types but a different neighboring order as Pij and Pji. Basic information-theoretical metrics applied to quantify land cover heterogeneity are as follows [16]:

$$H(y) = -\sum_{j} P_{j} \log P_{j}$$

$$H(y|x) = -\sum_{i} \sum_{j} P_{ij} \log \frac{P_{ij}}{P_{i}}$$

$$H(x, y) = -\sum_{i} \sum_{j} P_{ij} \log P_{ij}$$

where Pi and Pj are the marginal probabilities obtained by summing Pij over j and over i, respectively; H(y) is the standard Shannon entropy calculated based on pairs of cells, not single cells; H(y|x) is the conditional entropy based on the joint probabilities Pij and the second-order probabilities P(i→j) = Pij/Pi; and H(x,y) is the joint entropy computable directly from Pij, which equals H(y) plus H(y|x), measuring the overall complexity of the land cover pattern. These three metrics describe composition entropy, configuration entropy, and joint entropy of the land cover patterns, respectively. Many previous studies have shown that land cover composition and configuration are highly correlated, but the nature of the correlation, i.e., its linearity, has not been determined [40]. The ability of the mutual entropy to capture the dependence or the relevance between land cover composition and configuration has recently led to attempts to employ it in landscape complex ordering and pattern classification [41]:

$$I(y, x) = H(y) - H(y|x)$$

$$U = \frac{I(y, x)}{H(y)}$$

where I(y,x) is the uncertainty of variable y reduced by knowing variable x and U is the relative mutual information, which measures the difference between composition and configuration. Certain land cover patterns have high diversity (composition) and low fragmentation (configuration), which can result in high values of U. In this study, we adopted it as a measure of compositional confidence.
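The metrics above can be computed directly from a co-occurrence matrix. The following is a minimal sketch under stated assumptions: the function name `entropy_metrics` and the dictionary keys are hypothetical, the logarithm base defaults to 2 (the text does not state the base), and the input is assumed to be a symmetric matrix of counts or probabilities so that both marginals coincide.

```python
import numpy as np


def entropy_metrics(P, base=2.0):
    """Return H(y), H(y|x), H(x,y), I(y,x) and U for a co-occurrence matrix P."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum()                        # normalise counts to joint probabilities
    P_i = P.sum(axis=1, keepdims=True)     # marginal probability of the first cell
    P_j = P.sum(axis=0)                    # marginal probability of the second cell

    def log(values):
        return np.log(values) / np.log(base)

    nz = P > 0                             # empty bins contribute nothing
    H_y = float(-(P_j[P_j > 0] * log(P_j[P_j > 0])).sum())
    H_xy = float(-(P[nz] * log(P[nz])).sum())
    cond = (P / np.where(P_i > 0, P_i, 1.0))[nz]   # P(i -> j) = P_ij / P_i
    H_y_given_x = float(-(P[nz] * log(cond)).sum())
    I = H_y - H_y_given_x                  # mutual information
    U = I / H_y if H_y > 0 else 0.0        # relative mutual information
    return {"H_y": H_y, "H_y_given_x": H_y_given_x, "H_xy": H_xy, "I": I, "U": U}


# Hypothetical adjacency counts for a unit with two classes.
metrics = entropy_metrics([[2, 4], [4, 2]])
print(metrics)
```

For a symmetric matrix the returned values satisfy H(x,y) = H(y) + H(y|x), and U ranges from 0 (adjacent classes are independent) to 1 (each class only neighbours itself).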
ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW

Fusion of Entropy-Based Indices
A fusion method was adopted to combine the two entropy-based indices that describe the different aspects of heterogeneity, H(y) and H(y|x), into one final indicator. As information theory was used as a consistent framework, normalization of the index values to ensure a comparable range for the indices was not necessary.
The fusion process was designed because the reliability of the two indices needs to be adjustable to accommodate intricate land cover patterns. Indices may be imperfect; therefore, the expert should consider their imperfections, thereby specifying a partial trust [42]. In the fusion process, indices with partial trust are weighted in the result in proportion to their trust. Diverse composition may lead to a complex configuration, and both composition entropy and configurational complexity generally function as trust indicators for building the final indicator. However, in some real-world situations, the land cover has a high diversity value but a geometrically simple pattern, which causes the composition index to fail as a trust indicator (the heterogeneity is low whereas the diversity value is high). Furthermore, both indices fail to capture small patch patterns. To improve the accuracy of the heterogeneity quantification in these situations, we introduced a relative mutual information parameter U to judge the trust of the indicators. Empirical studies revealed that U is an appropriate reference for measuring the difference between composition and configuration [41]. We applied the following rule: both H(y) and H(y|x) are considered to express different characteristics of land cover heterogeneity. However, if the U value is far greater than 0, we assume that H(y) does not accurately describe the land cover heterogeneity and that the actual heterogeneity is lower than described by the composition layer. We define (1 − U) as an adaptive weight to adjust this difference at a 1 km scale based on expert experience and a large number of experiments. If U is close to 0, the land cover pattern is more complex than the single layer result indicates.

The fusion method results in higher values for intricate land cover patterns. In this case, H(y) and H(y|x) can be equally trusted, and U may be used as relative information gain to increase the value of land cover complexity because it measures the interaction of composition and configuration.
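The behaviour of the adaptive weight can be illustrated on two contrasting co-occurrence matrices. The sketch below only evaluates U and (1 − U); it does not reproduce the final fusion formula, which is not given in this text. The matrices `blocky` and `mixed` and the function name `relative_mutual_information` are hypothetical, and base-2 logarithms are assumed.

```python
import numpy as np


def relative_mutual_information(P):
    """U = I(y, x) / H(y) for a symmetric co-occurrence matrix P."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum()

    def H(p):
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    H_y = H(P.sum(axis=0))
    H_y_given_x = H(P.ravel()) - H(P.sum(axis=1))   # chain rule: H(x,y) - H(x)
    return (H_y - H_y_given_x) / H_y


# Both hypothetical units contain three equally abundant classes.
blocky = np.array([[30, 1, 1], [1, 30, 1], [1, 1, 30]])   # large, simple patches
mixed = np.ones((3, 3))                                    # classes fully interspersed

for name, P in (("blocky", blocky), ("mixed", mixed)):
    U = relative_mutual_information(P)
    print(name, "U =", round(U, 3), "weight (1 - U) =", round(1 - U, 3))
```

The two units have identical composition entropy, yet the blocky unit yields a large U and hence a small weight on H(y), whereas the interspersed unit yields U close to 0 and keeps the full weight, which matches the rule described above.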

Datasets and Quantization Scheme
The GlobeLand30 dataset, with a fine resolution of 30 m, was used in this study to extract the characteristics of land cover heterogeneity. The dataset comprises ten first-level classes, namely cultivated land, forest, grassland, shrubland, wetland, water bodies, tundra, artificial surfaces, bare land, and permanent snow/ice, for the years 2000 and 2010. In this study, we used the 2010 map, GlobeLand30-2010 (see http://www.globallandcover.com for the detailed legend). The maps were extracted from Landsat and HJ-1 satellite images through a pixel-object-knowledge-based (POK-based) approach, and preliminary validation indicated an overall classification accuracy of greater than 80% for 2010.
The quantification of land cover heterogeneity was carried out within ArcGIS version 10.1 [43] using Python scripts (http://www.python.org). The GlobeLand30 data were first partitioned into regular 1 km × 1 km units, which served as the basic units for the subsequent quantification and analysis at a smaller computational cost. The 1 km × 1 km size was determined empirically for large-scale landscape analysis [44], and each unit covered approximately 34 × 34 GlobeLand30 pixels. These simple units made up a cell array, and each unit contained complex content. In the next step, we employed these units as the elementary units for aggregating heterogeneity at the national level. Subsequently, we extracted composition, configuration, and complexity information separately from the units with valid data. Finally, the land cover complexity results were mapped for intuitive perception.
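The partitioning step can be sketched with NumPy. This is an assumption-laden illustration rather than the ArcGIS workflow used in the study: the function name `split_into_units` is hypothetical, the raster is taken to be a 2-D array of class codes, and incomplete units at the right and bottom edges are simply discarded.

```python
import numpy as np


def split_into_units(raster, unit=34):
    """Cut a 2-D class raster into non-overlapping unit x unit blocks.

    Returns an array of shape (n_rows, n_cols, unit, unit); partial blocks
    at the raster edges are dropped (an assumption of this sketch).
    """
    raster = np.asarray(raster)
    n_rows, n_cols = raster.shape[0] // unit, raster.shape[1] // unit
    trimmed = raster[:n_rows * unit, :n_cols * unit]
    return trimmed.reshape(n_rows, unit, n_cols, unit).swapaxes(1, 2)


# Hypothetical raster of 70 x 105 pixels with ten class codes.
rng = np.random.default_rng(0)
raster = rng.integers(0, 10, size=(70, 105))
units = split_into_units(raster)
print(units.shape)
```

Each `units[r, c]` block can then be passed to the heterogeneity calculation independently, which keeps the memory footprint per unit small.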
Real land cover data at two scales, local and continental, were selected for the validation of our entropy-based fusion model, which was performed by examining the ability of the LCCI to quantify the complexity characteristics of land cover. Because the fusion method aims to quantify the complexity of land cover, we compared the performance of the LCCI with two indices proposed to measure two fundamental aspects of land cover heterogeneity. Furthermore, we compared the performance of the LCCI with one of the commonly used indices that is strongly correlated with it to validate the superior performance of the LCCI regarding the extraction of comprehensive land cover heterogeneity information.

Validation of LCCI
In total, 36 types of real land cover patterns from different parts of the globe were used as the evaluation dataset to test the consistency of our model for the quantification of land cover heterogeneity. The evaluation data had two characteristics: first, they represented different degrees of heterogeneity and thus complex land cover patterns; second, their differences in heterogeneity could be captured by the naked eye. For each of the 36 pattern types, we calculated the LCCI and the two most commonly used heterogeneity metrics: ED as the configurational metric and SHDI as the compositional metric. For each indicator, we sorted the values into quintiles (20-percentile), referred to as classes 1-5. Class 1 represents the initial 20-percentile, expressing the lowest heterogeneity of land cover. Class 5 represents the last 20-percentile, expressing the most complex distribution of land cover. Figure 5 shows the tiles that fell within each quintile for each indicator, in order of increasing value. The evaluation dataset is marked using the ordering label of H(y).
The evaluation data illustrated in Figure 5 display distinctly different heterogeneities of land cover for the three different metrics. Based on visual inspection, SHDI seemed to increase with increasing diversity of the land cover distribution. In addition, different land cover heterogeneities had similar SHDI values (see class 2, #12 and #9). Edge density was selected to contrast SHDI not only because it measures the spatial configuration of heterogeneity but also because it extracts complex boundary information, similar to the co-occurrence histogram, which connects different ecological interactions with a variety of mosaic types. Based on visual inspection, the increase of ED seemed to correspond with an increase in the complexity of the land cover configuration heterogeneity. The land cover distribution changed from simple to complex, but units with similar values that were assigned to the same heterogeneity class did, in some cases, show different degrees of heterogeneity (see class 1, #2 and #19). Several such discrepancies are apparent in Figure 5.
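The quintile assignment used for Figure 5 can be sketched as follows. The function name `quintile_classes` and the sample values are hypothetical, and the treatment of values lying exactly on a cut point (assigned to the lower class) is an assumption, since the text does not specify it.

```python
import numpy as np


def quintile_classes(values):
    """Assign each value to class 1-5 according to the 20/40/60/80 percentiles."""
    values = np.asarray(values, dtype=float)
    cuts = np.percentile(values, [20, 40, 60, 80])
    # A value equal to a cut point falls into the lower class.
    return np.searchsorted(cuts, values, side="left") + 1


# Hypothetical index values for 36 evaluation tiles.
rng = np.random.default_rng(1)
index_values = rng.uniform(0.0, 3.0, size=36)
classes = quintile_classes(index_values)
print(np.bincount(classes)[1:])   # number of tiles per class
```

Applying the same function to the LCCI, SHDI, and ED values of the same tiles gives three class labels per tile, which is the comparison Figure 5 presents visually.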
Thus, heterogeneity information derived from SHDI and ED did not fully capture the inherent complexity characteristics (see class 1, where ED was predominantly classified by percentile class 1, and class 4, where SHDI showed obviously different patterns). LCCI, on the other hand, was able to distinguish the different land cover patterns more distinctly (when comparing LCCI, SHDI, and ED in each classification, LCCI classified each percentile homogeneously). These results show that the LCCI performed better than the SHDI and the ED indices, indicating its suitability for quantifying land cover heterogeneity. It captures richer heterogeneity information than single indices by considering both land cover composition and configuration information. The purpose of this validation test is not to discredit the single indices but rather to show that they only partially quantify land cover heterogeneity information.

Relationships between LCCI and Landscape Metrics
In addition to SHDI and ED, seven metrics related to patch shape, size, and connectivity were selected for path analysis in order to further evaluate the quality of LCCI. The landscape metrics selected include patch density (PD), largest patch index (LPI), patch cohesion index (COHE), aggregation index (AI), fractal dimension (FRAC), landscape division index (DIVISI), and splitting index (SPLIT). These metrics are commonly used as heterogeneity measures in regional and local scale studies [45][46][47]. The selected metrics were calculated with FRAGSTATS 4.2 at the landscape level. Path analysis was performed using the structural equation modeling module of Amos [48]. Path analysis can decompose the interaction between dependent and independent variables (correlations) into direct (path coefficient) and indirect effects (indirect path coefficient).
The correlation and path coefficients between LCCI and the nine landscape metrics are listed in Table 1. The correlation analysis results show that LCCI was strongly correlated with both ED and PD and moderately correlated with SHDI. In addition, LCCI was strongly negatively correlated with AI and COHE, two metrics classified as aggregation indices. The path analysis results indicate that LCCI had the highest direct path coefficient with ED, which means that a close relationship between LCCI and ED existed. The direct path coefficient between AI and LCCI was greater than 0.5, indicating that AI had a large direct effect on LCCI. ED is a configuration metric of land cover heterogeneity and measures the boundary abundance, and AI is an aggregation metric that measures the land cover complexity. Higher complexity of land cover was thus reflected by higher ED and AI values. The direct path coefficient of PD was low at 0.115, but the indirect path coefficient mediated by ED was high at 0.683. Similar coefficients were observed for SHDI, DIVISI, and SPLIT mediated by ED. This behavior of PD, SHDI, DIVISI, and SPLIT implies that their influence upon LCCI through ED was relatively important. The direct path coefficient and the indirect path coefficient of COHE mediated by ED were −0.222 and −0.657, respectively, which suggests a negative relationship between COHE and LCCI. FRAC was weakly correlated with LCCI, as shown by direct and indirect path coefficients of less than 0.5. LPI was negatively correlated with LCCI, and all of its indirect path coefficients were less than 0.5, which implies a weak relationship with LCCI.
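The decomposition of a correlation into direct and indirect effects can be sketched with the classical single-equation form of path analysis, in which the direct path coefficients are standardized regression coefficients. This is an assumption: the model specified in Amos may differ, the function name `path_decomposition` is hypothetical, and the data below are synthetic stand-ins rather than values from Table 1.

```python
import numpy as np


def path_decomposition(X, y):
    """Split corr(x_i, y) into a direct effect and indirect effects via other x_j.

    Returns (r_xy, direct, indirect) where indirect[i, j] = r_ij * direct[j].
    """
    data = np.column_stack([np.asarray(X, dtype=float), np.asarray(y, dtype=float)])
    R = np.corrcoef(data, rowvar=False)
    R_xx, r_xy = R[:-1, :-1], R[:-1, -1]
    direct = np.linalg.solve(R_xx, r_xy)      # standardized regression coefficients
    indirect = R_xx * direct[np.newaxis, :]   # effect of x_i routed through x_j
    np.fill_diagonal(indirect, 0.0)
    return r_xy, direct, indirect


# Synthetic predictors named after the metrics for readability only.
rng = np.random.default_rng(42)
ed = rng.normal(size=300)
pd_ = 0.8 * ed + 0.6 * rng.normal(size=300)
shdi = rng.normal(size=300)
lcci = 0.7 * ed + 0.1 * pd_ + 0.3 * shdi + 0.2 * rng.normal(size=300)

r_xy, direct, indirect = path_decomposition(np.column_stack([ed, pd_, shdi]), lcci)
print("direct:", direct.round(3))
print("PD via ED:", round(indirect[1, 0], 3))
```

In this synthetic setup the second predictor has a small direct coefficient but a large indirect coefficient through the first, which is the same qualitative pattern reported for PD and ED, and each correlation equals its direct effect plus the sum of its indirect effects.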

Large-Scale Application: Example Africa
Based on the entropy-based fusion method, we developed a comprehensive index to quantify the complexity of land cover in local environments. The path analysis results shown in Table 1 suggest that ED had the closest relationship with LCCI. This strong collinearity was expected because both metrics are boundary-based. The moderate correlation of SHDI and LCCI makes SHDI a complementary index able to capture the different aspects of land cover heterogeneity, showing strikingly disparate land cover patterns. The fusion LCCI not only captures the spatial difference of the land cover configuration but also takes the diversity of land cover type information into account, even for locations that are classified as having similar heterogeneities by the traditional landscape metric ED (Figure 6). The aggregation results at the country level suggest that the LCCI captures richer heterogeneity information than the single metric ED (see Table 2; South Africa and Somalia have similar fragmentation levels but different diversity levels, which result in different complexity levels). The aggregation measure indicates that consistent measures of land cover heterogeneity may be more suitable for large-scale conservation planning or spatial sampling parameterization.
We identified five heterogeneity levels in Africa using the LCCI (Table 3). Variable heterogeneity levels reveal the uneven distribution of spatial patterns in Africa (Figure 7). Most high-heterogeneity areas are concentrated in natural or mixed areas. Artificial and agricultural area heterogeneity is low due to human-made planning. In general, Africa's land cover heterogeneity is not evident. On the coast of East Africa, the heterogeneity shows both high diversity and fragmentation, indicating that the coastal region has a complex pattern and that more attention should be paid to it in both planning and monitoring.

Summary and Conclusions

Summary and Conclusions
Usually, land cover heterogeneity appears to be captured easily by landscape metrics. Numerous indices have been used to quantify land cover heterogeneity by describing features such as density, texture, size, and area. However, choosing an appropriate method for robust quantification on a global scale is still challenging because no single index can adequately account for the whole spectrum of spatial characteristics [36]. In this study, a consistent indicator for large-scale land cover heterogeneity quantification was developed based on information theory. This measure extracts more comprehensive information to distinguish the spatial variation of the land cover distribution at the continental level. Our experimental results suggest that the LCCI, a standardized and harmonized indicator, may be a good candidate parameter for large-scale geospatial sampling that takes heterogeneity features into consideration [49][50][51]. One advantage of the LCCI is its consistent information-theoretic framework, which eliminates the need for standardization, whereas landscape metrics have differing value ranges and are strongly correlated with one another, which necessitates the elimination of redundancy [52]. Furthermore, at its moderate resolution, the land cover dataset can capture features at any scale greater than 30 m, and the temporal updateability and easy accessibility of the data should promote the application of land cover heterogeneity data in environmental conservation.
Another important advantage of the entropy-based LCCI is that, by using a fusion approach, it extracts more heterogeneity information than any single landscape metric and thus captures information closer to the true heterogeneity of the surface. This is especially important for landscape ecology research. Sampling units with the same land cover configuration may nonetheless differ in their LCCI values, with higher values identifying richer land cover type distributions and more complex arrangements. In such regions, edge effects may lead to unstable habitats. A recent study indicated that heterogeneous land cover mosaics may be represented as separate classes [4]. By measuring heterogeneity with the LCCI, similar land cover patterns can be identified, which offers valuable information to relevant developers.
A thorough evaluation and comparison of all quantification indices is beyond the scope of this study. Instead, the SHDI was chosen as a representative basic composition metric for validating the LCCI because of its computational simplicity and ease of interpretation. The results show that the SHDI does not always express the compositional complexity of the 1 km × 1 km units, because it reflects not only the diversity of land cover types but also the evenness of their distributions. Therefore, units with a small number of evenly distributed classes may have high SHDI values even though they are not actually high-diversity regions. The SHDI nevertheless remains a useful relative indicator for assessing heterogeneity change in the same region across different periods. The diversity of land cover, however, should be explored more fully in the future through explicit classification of mosaic types. For quantifying configurational heterogeneity, the ED index is more suitable than the patch-based metrics because it is easy to compute. However, the ED index overlooks patch information, which results in underestimation of the heterogeneity. It is noteworthy that, although path analysis shows the LCCI to be positively correlated with the ED, PD, SHDI, and SPLIT metrics, its meaning differs from theirs because it quantifies land cover heterogeneity by incorporating composition and configuration simultaneously. Because the LCCI provides comprehensive land cover heterogeneity information and thus more closely captures the actual degree of heterogeneity, we expect that it can resolve fine-grained land cover variations.
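The limitations of the single metrics discussed above can be illustrated with a small sketch. It assumes the standard Shannon form of the SHDI, -Σ p_i ln p_i over class proportions, and a simplified edge density that counts unlike 4-neighbour cell pairs and ignores the unit boundary. The function names, the toy 10 × 10 grids, and the 30 m cell size are illustrative assumptions, not the procedure used in this study.

```python
import math
from collections import Counter


def shdi(grid):
    """Shannon diversity index: -sum(p_i * ln p_i) over class proportions."""
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def edge_density(grid, cell_size=30.0):
    """Simplified edge density in m/ha (assumption: 4-neighbour edges only,
    unit boundary ignored)."""
    rows, cols = len(grid), len(grid[0])
    edges = 0
    for r in range(rows):
        for c in range(cols):
            # Count each unlike neighbour pair once (right and down only).
            if c + 1 < cols and grid[r][c] != grid[r][c + 1]:
                edges += 1
            if r + 1 < rows and grid[r][c] != grid[r + 1][c]:
                edges += 1
    area_ha = rows * cols * cell_size ** 2 / 10_000
    return edges * cell_size / area_ha


# Two classes, evenly split into a left half and a right half.
two_even = [[1] * 5 + [2] * 5 for _ in range(10)]

# Four classes, one dominant (90 cells) and three rare (4, 3, 3 cells).
four_uneven = [[1] * 10 for _ in range(9)] + [[2, 2, 2, 2, 3, 3, 3, 4, 4, 4]]

# Two classes in a checkerboard: same composition as two_even.
checkerboard = [[1 + (r + c) % 2 for c in range(10)] for r in range(10)]

for name, g in [("two_even", two_even),
                ("four_uneven", four_uneven),
                ("checkerboard", checkerboard)]:
    print(f"{name}: SHDI={shdi(g):.3f}, ED={edge_density(g):.1f} m/ha")
```

In this toy case the evenly split two-class grid receives a higher SHDI than the four-class grid, and the checkerboard has exactly the same SHDI as the split grid while its edge density is far larger. This is the kind of behavior that motivates considering composition and configuration together.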
Understanding the key role of spatial scale is essential in geographical analysis [53]. No optimal measurement scale exists because land cover patterns are naturally scale-dependent [13]. The scale should be selected based on at least one principle: it should be large enough to represent one unit of landscape and reflect the heterogeneity features [54]. Previous studies have shown that a scale of 1 km² is useful for studying land cover heterogeneity at country and continental scales [55,56]. At the continental scale, we chose 1 km × 1 km square cells for this study because this unit size is typical for the representation of local landscapes and supports subsequent analysis at the national level by aggregating the available metrics. In addition, the selected unit size (1 km) allows for easy resampling of socioeconomic data (1 km) in future research on heterogeneity change and its driving forces.
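As a rough illustration of how a 30 m classified raster might be partitioned into approximately 1 km × 1 km analysis units, the sketch below tiles a grid into non-overlapping blocks. The block size of 33 cells (990 m) is an assumption, since 1000 m is not an integer multiple of 30 m; the helper names `iter_blocks` and `richness` are hypothetical, and the handling of incomplete edge blocks (dropped here) is a choice made for the example rather than the procedure used in this study.

```python
def iter_blocks(grid, block=33):
    """Yield (block_row, block_col, sub_grid) for non-overlapping square blocks.

    Assumption: 33 cells of 30 m approximate one 1 km unit; incomplete
    blocks at the right and bottom edges are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            sub = [row[c0:c0 + block] for row in grid[r0:r0 + block]]
            yield r0 // block, c0 // block, sub


def richness(sub_grid):
    """Number of distinct land cover classes in one analysis unit."""
    return len({cell for row in sub_grid for cell in row})


# Toy raster: 70 rows x 100 columns, class 1 on the left, class 2 on the right.
grid = [[1 if c < 50 else 2 for c in range(100)] for _ in range(70)]

# Any per-unit metric can be substituted for richness here.
richness_by_block = {(i, j): richness(sub) for i, j, sub in iter_blocks(grid)}
print(richness_by_block)
```

Only the middle column of blocks straddles the class boundary, so only those units contain two classes. Per-unit values computed this way can then be aggregated to coarser reporting units.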
The GlobeLand30 dataset for the year 2010 was used to extract the heterogeneity characteristics of Africa. However, misclassification of land cover data is unavoidable, and the limited accuracy of GlobeLand30 therefore introduces uncertainty into the derived heterogeneity characteristics. Heterogeneity data extracted directly from remote sensing data may be a satisfactory solution for research that requires high-precision heterogeneity data. A recent study clustered the heterogeneity types of land cover and thereby improved the classification accuracy of remote sensing-based land cover mapping [57]. In this study, we extracted heterogeneity information for spatial variation analysis. For environmental monitoring, obtaining the heterogeneity of each class is essential, and its success depends on the classification accuracy.
Overall, the LCCI is a novel indicator that can provide detailed information on land cover heterogeneity to support regional planning and ecological assessment. By integrating the occurrence of land cover differences between neighboring grids with information theory, we (i) propose the LCCI, a consistent scheme for the quantification of land cover heterogeneity, and (ii) build a database of continent-scale land cover heterogeneity that provides elemental data for sustainable development monitoring and geographical analysis. Further, the performance of selected metrics at both regional and continental scales was evaluated, and the LCCI was found to enhance the robustness of land cover pattern characterization and distinction by combining composition and configuration information. Our results also show improved accuracy compared with single-metric approaches. We expect that our work will contribute to large-scale environmental sustainability monitoring and conservation planning by providing more direct data. Future work will attempt to apply our entropy-based index to the extraction of homogeneous land cover regions at multiple scales, which should simplify spatial statistics, increase their efficacy, and make the results more meaningful to analyze.