Using a Similarity Matrix Approach to Evaluate the Accuracy of Rescaled Maps

: Rescaled maps have been extensively utilized to provide data at the appropriate spatial resolution for use in various Earth science models. However, a simple and easy way to evaluate these rescaled maps has not been developed. We propose a similarity matrix approach using a contingency table to compute three measures: overall similarity (OS), omission error (OE), and commission error (CE) to evaluate the rescaled maps. The Majority Rule Based aggregation (MRB) method was employed to produce the upscaled maps to demonstrate this approach. In addition, previously created, coarser resolution land cover maps from other research projects were also available for comparison. The question of which is better, a map initially produced at coarse resolution or a ﬁne resolution map rescaled to a coarse resolution, has not been quantitatively investigated. To address these issues, we selected study sites at three different extent levels. First, we selected twelve regions covering the continental USA, then we selected nine states (from the whole continental USA), and ﬁnally we selected nine Agriculture Statistical Districts (ASDs) (from within the nine selected states) as study sites. Crop/non-crop maps derived from the USDA Crop Data Layer (CDL) at 30 m as base maps were used for the upscaling and existing maps at 250 m and 1 km were utilized for the comparison. The results showed that a similarity matrix can effectively provide the map user with the information needed to assess the rescaling. Additionally, the upscaled maps can provide higher accuracy and better represent landscape pattern compared to the existing coarser maps. Therefore, we strongly recommend that an evaluation of the upscaled map and the existing coarser resolution map using a similarity matrix should be conducted before deciding which dataset to use for the modelling. Overall, extending our understanding on how to perform an evaluation of the rescaled map and investigation of the applicability of the rescaled map compared to the existing land cover map is necessary for users to most effectively use these data in Earth science models.


Introduction
Human activity increasingly impacts the Earth's environment and ecosystems [1,2]. To characterize the Earth's fundamental characteristics and environmental processes, maps have been created for providing information about the distribution and dynamic changes of the land cover. This map information has served as input for a great variety of models to better understand the earth including climate change [3], biogeochemical cycling [4], and carbon cycling [5], to name but a few. Therefore, characterizing the distribution of the land cover is crucial for Earth science and decision making.
To efficiently and effectively characterize the spatial distribution of land cover, remotely sensed imagery and associated technologies are employed for producing land cover maps at different spatial resolutions (i.e., raster/pixel sizes). As Grekousis et al. [6] reported, 23 global and 41 regional land cover products based on remote sensing technologies have been produced and released for use by the Earth science community. For example, the United States Department of Agriculture (USDA), National Agriculture Statistic Service (NASS) annually produces the Cropland Data Layer (CDL) at 30 m resolution for U.S.A. [7]. Other existing maps provide essential information at different spatial resolutions including 1 km, 500 m, 300 m, 250 m, and 30 m [8], and have continued to improve in detail (spatial resolution) over time.
However, in addition to these existing maps at specific spatial resolutions, other spatial resolutions may also be required to obtain optimal results for various Earth science models. Previous research has revealed that model outputs have varied considerably when using land cover maps of different spatial resolutions. For example, Buyantuyev and Wu [9] demonstrated that using land cover maps of different spatial resolutions resulted in variation of estimated land surface temperature. Grafius et al. [10] stated that the outputs of their ecosystem service models varied due to the use of land cover maps with differing spatial resolutions. Many researchers have emphasized that often the existing spatial resolutions cannot meet the requirements needed to obtain the optimal results for various Earth science models. In other words, the modeler would rather rescale a finer resolution map to the desired spatial resolution, if possible, then to use an existing coarse scale map.
Recognizing the need for various spatial resolutions by the Earth science community, rescaling techniques have been developed to fill the resolution gaps for existing land cover maps [11]. These rescaling techniques consist of two possible outcomes: upscaling and downscaling. The upscaling method decreases the spatial resolution while the downscaling method increases the spatial resolution [12]. These two methods have been documented as vitally necessary and efficient ways to obtain maps at different spatial resolutions to support a variety of Earth science research [11,13,14].
Upscaling produces coarser resolution maps from finer resolution maps using two different approaches: numerical aggregation and categorical aggregation. Numerical aggregation first computes the pixel value (e.g., digital number, DN) for the coarser resolution imagery based on some function between the finer resolution pixels and the coarser resolution pixels [15]. Then, the coarser resolution imagery is classified to produce the upscaled map. To conduct numerical aggregation, mean aggregation and central pixel resampling (CPR) are commonly employed. The mean aggregation algorithm calculates the average of the pixel values from the finer resolution imagery as the pixel value for the coarser resolution map [15]. The CPR algorithm selects the finer resolution image's central pixel value corresponding to the coarser resolution pixel as the output pixel value [16].
The other common way to obtain upscaled maps, categorical aggregation, assigns a class label to an upscaled map based on class labels in the associated finer resolution map by using different aggregation logic. The choice of aggregation logic includes: Majority Rule Based (MRB) (e.g., [17]), Random Rule Based (RRB) (e.g., [18]) and Point-centered Distance-Weighted moving window (PDW) [14]. MRB determines the class label for an upscaled map by selecting the most frequently occurring class of the fine resolution map. RRB randomly selects a class from the specific pixels of the fine resolution map. PDW first employs a weighted sampling grid of variable resolution to sample the existing map. Then, the class label for the corresponding location in the upscaled map is randomly selected from the frequency of the class labels derived from the samples.
Downscaling generates finer resolution maps from a coarser resolution map using either a numerical or categorical approach. Similar to the upscaling methods, the numerical approach first predicts finer resolution images that are then classified to obtain the downscaled map. The categorical approach directly transforms the categorical variable, usually representing the land cover class, from the coarser resolution map to the corresponding finer resolution map [13]. One common Remote Sens. 2018, 10, 487 3 of 21 method of downscaling is area-to-point prediction (ATPP). The ATPP approach interpolates the coarser resolution imagery to generate the finer resolution imagery by integrating support information from multiple sources of data. A variety of complex algorithms have been developed to conduct the ATPP such as area-point-kriging (ATPK) [19], area-to-point cokriging (ATPCK) [12], etc. For example, ATPCK incorporates the support information from imagery of different resolutions into a random field model to predict the finer resolution. The correlation within and cross-correlation between the imagery of different resolutions are used to minimize the variance of the error [12]. More details about ATPP can be found in Tang et al. [12] and Atkinson [13].
The categorical method for downscaling is frequently conducted by employing super-resolution mapping (SRM) algorithms [13]. The SRM divides a coarser resolution pixel into several finer resolution pixels and determines the class label for the finer resolution pixels by translating the coarse resolution fractions into a finer resolution land cover map [20,21]. Existing approaches produce a hard-classified land cover map at finer resolution typically choosing one class label based on the maximum a posteriori (MAP) principle. This principle is similar to that of many hard classification algorithms that assign the class label with the highest posteriori probability as the class type for the output classified map [21]. For example, Ling et al. [21] fused the class membership (CMP) of the finer resolution pixel derived by using the local smoothness priority (LSP) and the downscaled coarse fraction (DCF), respectively. The CMP is on the assumption that one pixel belongs to one of the predefined land cover classes, which represents the spatial distribution information of classes. The LSP represents the relationship between neighboring fine resolution pixels, while the DCF considers the fractional values between neighboring coarse resolution pixels. These SRM algorithms have been successfully conducted to generate finer resolution maps.
Since rescaling techniques have been well documented as necessary and effective approaches to produce land cover maps at a wide range of spatial resolutions (e.g., [14,22]), how to evaluate these upscaled or downscaled maps is becoming a critical issue. Although existing mapping projects typically have reference data that were used to assess the accuracy of that map, utilization of these reference data for validating any rescaled maps created by others is problematic. Often, the reference data are not readily available to the map users. In addition, the reference data do not match the spatial resolution of either an upscaled or downscaled map. Therefore, a simple and easy way to evaluate the accuracy of rescaled maps needs to be developed. Additionally, producing rescaled maps draws into question whether the rescaled maps are better than the existing land cover maps for use in Earth science modeling. If so, then methods for evaluating the rescaled maps will be beneficial for the community to make better decisions for selecting the appropriate datasets.
Therefore, the primary objective of this paper is to present a new method using a similarity matrix based on a contingency table to evaluate the accuracy of the rescaled maps. This method will also allow testing to see if existing, coarse resolution maps have a higher or lower accuracy than rescaled maps derived from existing fine resolution maps. Finally, testing if the extent (i.e., size) of the study sites has an influence on the accuracy of rescaled maps has not been quantitatively performed. Thus, the experiments done in this research vary the study area sizes over three different extent levels. All the analysis in this research was conducted using upscaled crop/non-crop maps derived by the MRB.

Materials and Methods
In this section, we describe the selection of data and study areas, the principles behind and the calculation of the similarity matrix, the design of our experiments using study areas of three different spatial extents, and the techniques for producing the upscaled maps. The methods consist of (1) selection of two datasets: the base map and the existing coarser resolution map; (2) selection of the study areas with different landscape patterns and different extents (i.e., size of area); (3) the similarity matrix approach proposed to evaluate either the rescaled maps or the existing, coarser resolution maps compared to the finer resolution maps; and (4) the MRB aggregation method that was used to produce Remote Sens. 2018, 10, 487 4 of 21 the upscaled maps to test the performance of similarity matrix and explore if the upscaled maps are better than the existing coarser resolution maps.

Data Descriptions
Based on the objectives of this paper, two datasets were required. The first dataset, NASS's CDL data were utilized to generate the crop/non-crop base maps that were used to produce the upscaled maps. The crop/non-crop base maps were also used as reference maps for assessing the existing coarser resolution maps (i.e., Global Food Security-Support Analysis Data, GFSAD). The CDL data can be obtained from the NASS's online geospatial application-CropScape (https://nassgeodata.gmu. edu/CropScape/). These data are annually updated, raster-formatted, and geo-referenced land cover maps [23], whose accuracies for major crops range from 85% to 95% [24]. Due to its high accuracy for predicting the information of crops and identifying the field crops, the CDL data are extensively applied in Earth science research [7,25]. Therefore, this paper utilized the CDL data at 30 m spatial resolution for the year 2010 to generate the crop/non-crop maps for each of the different study sites. The crop/non-crop maps were extracted by simplifying the crop type of CDL data based on the metadata provided by NASS (available online: https://www.nass.usda.gov/Research_and_Science/ Cropland/metadata/meta.php).
The other dataset needed in this study was the existing crop/non-crop maps at specific spatial resolutions that could then be compared with the upscaled maps. The GFSAD Crop Dominance Global 1 km and 250 m for nominal year 2010 were obtained from the Land Processes Distributed Active Archive Center (LP DAAC) provided by the National Aeronautics and Space Administration (NASA) (available from: https://lpdaac.usgs.gov/dataset_discovery/measures/measures_products_ table/gfsad1kcd_v001). The GFSAD crop/non-crop (cropland extent) maps were produced as part of a NASA project evaluating global food security in the twenty-first century. More details regarding GFSAD can be found in Teluguntla et al. [26] and Teluguntla et al. [27]. Note that GFSAD data at 30 m resolution were not used to produce the base maps in this study because maps for the year 2010 are not available. To facilitate comparison of the upscaled maps created in this study and the existing GFSAD maps, the GFSAD maps at 1 km and 250 m were resampled to 960 m and 240 m, respectively. ArcGIS version 10.4 was used to resample the GFSAD maps using the majority resampling technique. This technique determines the corresponding 4 by 4 cells in the input space that are closest to the center of the output cell and uses the majority of the 4 by 4 neighbors (http://pro.arcgis.com/en/pro-app/tool-reference/data-management/resample.htm).

Study Areas
To analyze the influence of the size of the area on the upscaling accuracy, this research employed study sites for three different extents: regional level maps (a portion of the USA), US state level maps, and Agricultural Statistical District (ASD) level maps. The boundaries of the regions and the ASDs are defined by the USDA's NASS. The U.S.A. has been divided into twelve regions ( Figure 1). Each ASD is a group of counties based on the geography, climate, and cropping practice within each state [24], and has been widely-used in agricultural research (e.g., [11]).
To evaluate the landscape pattern of each of the study sites, the Patch-Per-Unit area (PPU) (Equation (1)), a measure of the heterogeneity [28], was employed to compute the heterogeneity for each study site. As the landscape becomes heterogeneous, PPU increases [28]. The crop/non-crop base maps derived from NASS's CDL were utilized to compute the PPU for each state study site. The highest PPU for all states was 134.97, while the lowest was 4.72. The PPU (heterogeneity of each state) was then divided into three levels: lower level (0 < PPU ≤ 40), medium level (40 < PPU ≤ 100), and higher level (PPU >100). Three states were then selected for analysis from each of the three heterogeneity levels. where m is the total number of patches, n is the total number of pixels, λ is a scaling constant equal to the area of a pixel. The PPU of each of the ASDs in the nine selected states were then calculated and divided into three different levels: lower level (0 < PPU ≤ 50), medium level (50 < PPU ≤ 120), and higher level (PPU >120). For each level of heterogeneity, three ASDs were randomly selected for analysis in this study. Therefore, we selected a total of twelve regions, nine states, and nine ASDs to be analyzed in this study. The landscape characteristics of these regions, states, and ASDs are shown in Table 1. study. Therefore, we selected a total of twelve regions, nine states, and nine ASDs to be analyzed in this study. The landscape characteristics of these regions, states, and ASDs are shown in Table 1. Figure 1. Study sites. Note that the region-level study sites consist of all twelve regions. The statelevel study sites consist of nine states. The Agriculture Statistic District (ASD)-level study sites consist of nine ASDs. The number in ASD level study area represents the identify number of the ASD for marking different ASDs. For example, 2720 in the Figure 1 represents the study site of ASD2720.  Figure 1. Study sites. Note that the region-level study sites consist of all twelve regions. The state-level study sites consist of nine states. The Agriculture Statistic District (ASD)-level study sites consist of nine ASDs. The number in ASD level study area represents the identify number of the ASD for marking different ASDs. For example, 2720 in the Figure 1 represents the study site of ASD2720.

Description of the Similarity Matrix
The similarity matrix proposed here to evaluate the rescaling is based on a contingency table approach as shown in Table 2. Each row represents the area of each class in the upscaled map, while each column represents the area of each class in the base map. A ik denotes the area that is classified as class i in the base map but as class k in the upscaled map. Note that class k can be the same or different with class i. Entries on the diagonal represent the consistent area (i.e., similarity) for each class between the base map and the upscaled map. Thus, A ii shows how many areas for class i are correctly identified in the upscaled map. A +k sums the area of the class k in the upscaled map (the column total), and A +i sums the area of the class i in the base map (the row total). The omission error (OE) for class i of the upscaled map can be evaluated by Equation (2), which represents the percentage of area for class i that is omitted from the upscaled map. The OE of class i also represents the underestimation of class i in the upscaled map. The commission error (CE) for class i can be evaluated by Equation (3), which represents the percentage of area for class i that is committed from the correct base map. The CE of class i represents the overestimation of the class i in the upscaled map. The overall similarity (OS) can be evaluated by Equation (4), which represents the percentage of the area for all classes that are correctly represented in the upscaled maps.

Calculation of Similarity Matrix
To generate the similarity matrix, five steps should be taken:

1.
Compute the total number of the square windows (N SW ) to cover the base map. First, determine the total number of rows, M, and the total number of columns, N, for the upscaled map based on the desired spatial resolution (pixel size). Then, N SW is computed by N SW = M × N.

2.
Place N SW square windows (corresponding to the upscaled pixel size) over the base map. The square window, W mn , corresponds to the pixel at row m and column n in the upscaled map.

3.
Identify the class type of the upscaled pixel corresponding to W mn , which is denoted as class j (the entire upscaled pixel is class j). Then, identify the class types of the base map pixels within W mn , which is denoted as class i.

4.
Calculate the area of class i of the base map pixels within W mn . This area is denoted as A mn ij . Note that class i can be either the same or different than class j from the upscaled map. The area for class i in the base map that is classified as class j in the upscaled map, A ij , can be computed by ij , which is used to fill in the values in the contingency table to obtain the similarity matrix. 5.
The OS, OE, and CE can be obtained to evaluate the accuracy of the upscaled map according to the principles in the Section 2.3. Figure 2 presents an example of generating the similarity matrix and will make this method much clearer. The base map covers an area of 36 m 2 and has a spatial resolution (pixel size) of 1 m (resulting in 36 pixels in this study area). There are two classes in this map; Class 1 (C1) is the gray color and Class 2 (C2) is the white color in both the base and upscaled maps. The desired spatial resolution of the upscaled map is 3 m.
Step 1: The number of square windows (N SW ) in the upscaled map is 4 (the number of rows is M = 2 and the number of columns is N = 2; 2 × 2 = 4).
Step 2: The computed windows are placed over the base map. The size of each square window is same as the spatial resolution (pixel size = 3 m) of the upscaled map.
Step 4: Compute A mn ij for W 11 . Then, A 11 11 = 5, A 11 21 = 4, A 11 12 = 0, and A 11 Repeat step 3 and step 4 to obtain A ij for W 12 , W 21 , W 22 . For W 12 , A 12 11 = 0, A 12 21 = 0, A 12 12 = 1, Step 5: Table 3 shows the results of similarity matrix for this example. Based on the similarity matrix, OE, CE and OS can be computed. The OS for this example is 75.0%. The CE and OE for this example can be found in Table 3. Table 3. Similarity matrix for the example in Figure 2. OS, OE, and CE represent overall similarity, omission error, and commission error, respectively. Step 5: Table 3 shows the results of similarity matrix for this example. Based on the similarity matrix, OE, CE and OS can be computed. The OS for this example is 75.0%. The CE and OE for this example can be found in Table 3.

Figure 2.
A general scheme for implementation of similarity matrix. Class 1 is gray and Class 2 is white. Table 3. Similarity matrix for the example in Figure 2. OS, OE, and CE represent overall similarity, omission error, and commission error, respectively.

Figure 2.
A general scheme for implementation of similarity matrix. Class 1 is gray and Class 2 is white.

Design of the Experiment
To investigate the performance of the similarity matrix on rescaling techniques, the upscaling method was selected to be used in this research. No analysis was performed for downscaling, but the approach should work the same. The crop/non-crop maps were upscaled based on a widely-used aggregation method (i.e., MRB). The details of the MRB are described in Section 2.6. The similarity matrix was then obtained to illustrate the accuracy of the upscaled maps.
To explore if existing, coarser resolution land cover maps are better than newly created upscaled maps derived from accurate and finer resolution land cover maps, existing land cover maps at two different resolutions (240 m and 960 m) were chosen. Note that we are aware that producing land cover maps at coarse resolutions are vitally important for various research projects. Thus, our work here only aims to develop a tool (i.e., the similarity matrix) for evaluating the upscaled maps. It is not our intent to minimize the contributions of the existing, coarser resolution land cover maps for the Earth science community.

Upscaling Method
The MRB method for upscaling categorical maps has been recommended by [15] for agricultural projects. Therefore, our work utilized this method as an appropriate way to obtain the upscaled maps. To implement the MRB, a predefined square window corresponding to the coarser resolution pixel is constructed. Then, the MRB assigns the class type for the coarser resolution pixel based on selection of the most frequently occurring class in the base map within the predefined square window [17,18]. If there is more than one major class in the square window, the class for the coarser resolution map will be randomly determined from these major classes [15]. In this work, the base map was upscaled to 240 m and 960 m, respectively.

Results
The CDL data were used to generate the crop/non-crop base maps which were then used to create the upscaled maps at 240 m and 960 m spatial resolution for each study site. The similarity matrix, proposed here was then employed to assess the accuracy of these upscaled maps. This assessment method can not only quantify how well the upscaling techniques represent the area of all the land cover types, but also the omitted and committed area information for each map class. The accuracy of the GFSAD maps was estimated using a similarity matrix for comparison with the upscaled maps. After assessing the upscaled maps and the GFSAD maps, there were four issues of concern left to be analyzed: (1) a comparison of the accuracy of the upscaled maps and the GFSAD maps; (2) a comparison of the study site heterogeneity between the upscaled maps and the GFSAD maps; (3) the influence of heterogeneity on the accuracy of upscaled maps and GFSAD maps, and (4) the influence of extent on the accuracy of the upscaled maps.

Accuracy Assessment and Comparisons of Upscaled and GFSAD Maps
There are a total of 30 base maps generated from CDL data for all study sites (12 regions + 9 states + 9 ASDs). For each study site, two upscaled maps (at 240 m and 960 m) were generated using these base maps. Figure 3 shows an example of these base map and the corresponding upscaled maps for the Heartland region study area. The results show that the upscaled maps have different spatial distributions of the land cover from the base map. In addition, the upscaled maps visually show that the difference in the spatial distribution of the land cover became larger as the upscaled map became coarser. The similarity matrix for each upscaled map and each GFSAD map was obtained using the procedure developed in this project. An example of the similarity matrix for the upscaled map at 240 m for the Heartland region study site is shown in Table 4. The results show that the crop was omitted 9.79% and committed 12.30% of the time in the 240 m upscaled map. The OS for this example is 91.82%. Table 5 presents a summary of the OS for all of the upscaled maps (12 regions, 9 states, and 9 ASDs). Table 6 presents a summary of the OE and CE for the crop map class in all of the upscaled maps. At all spatial extents and study sites, the OS decreased and OE and CE increased when the upscaled maps became coarser. For example, the OS for the Heartland study site decreased from 91.82% to 86.88% and OE and CE increased from 9.79% to 17.37%, and from 12.3% to 18.40% when the basemap was upscaled from 240 m to 960 m. Table 7 presents the OS for all GFSAD maps for all of the study sites and Table 8 shows the OE and CE for the crop map class in all of the GFSAD maps. The OS of GFSAD maps at 960 m were, for many of the sites, significantly lower than the OS of the GFSAD maps at 240 m. For example, the OS for the Mountain region at 960 m decreased by 38.63 percentage points compared to the OS at 240 m. Additionally, the OE of the GFSAD maps at 960 m were lower than at 240 m, while CE at 960 m were higher when compared to the 240 m map.
To achieve one of the primary objectives of this paper, the accuracies of the upscaled maps and the GFSAD maps were compared. Tables 5-8 show that the upscaled maps obtained higher accuracy (higher OS, lower OE and CE) when compared to the GFSAD maps at each of the three different extent levels. For example, the OS of the upscaled map at 240 m for the Heartland region study site was higher by 6.39 percentage points compared to the coarse resolution GFSAD map at 240 m (Tables  5 and 7). Also, the OE and CE for the crop class of the upscaled map at 240 m for the Heartland region The similarity matrix for each upscaled map and each GFSAD map was obtained using the procedure developed in this project. An example of the similarity matrix for the upscaled map at 240 m for the Heartland region study site is shown in Table 4. The results show that the crop was omitted 9.79% and committed 12.30% of the time in the 240 m upscaled map. The OS for this example is 91.82%. Table 5 presents a summary of the OS for all of the upscaled maps (12 regions, 9 states, and 9 ASDs). Table 6 presents a summary of the OE and CE for the crop map class in all of the upscaled maps. At all spatial extents and study sites, the OS decreased and OE and CE increased when the upscaled maps became coarser. For example, the OS for the Heartland study site decreased from 91.82% to 86.88% and OE and CE increased from 9.79% to 17.37%, and from 12.3% to 18.40% when the basemap was upscaled from 240 m to 960 m. Table 7 presents the OS for all GFSAD maps for all of the study sites and Table 8 shows the OE and CE for the crop map class in all of the GFSAD maps. The OS of GFSAD maps at 960 m were, for many of the sites, significantly lower than the OS of the GFSAD maps at 240 m. For example, the OS for the Mountain region at 960 m decreased by 38.63 percentage points compared to the OS at 240 m. Additionally, the OE of the GFSAD maps at 960 m were lower than at 240 m, while CE at 960 m were higher when compared to the 240 m map.  To achieve one of the primary objectives of this paper, the accuracies of the upscaled maps and the GFSAD maps were compared. Tables 5-8 show that the upscaled maps obtained higher accuracy (higher OS, lower OE and CE) when compared to the GFSAD maps at each of the three different extent levels. For example, the OS of the upscaled map at 240 m for the Heartland region study site was higher by 6.39 percentage points compared to the coarse resolution GFSAD map at 240 m (Tables 5  and 7). Also, the OE and CE for the crop class of the upscaled map at 240 m for the Heartland region study site were lower by 18.3 percentage points and 1.89 percentage points, respectively, compared to the GFSAD map at 240 m. Figure 4 presents the PPU of the crop/non-crop map at 30 m (the reference data), the upscaled maps and the GFSAD maps for comparison. The results show that the PPU of the upscaled maps are closer to the PPU of the reference data compared to the PPU of the GFSAD maps. For example, for the Northern Plains region study site, the PPU of the reference data, the upscaled map at 240 m, and the GFSAD map at 240 m, were 60.15, 21.05, and 6.85, respectively. Additionally, the PPU of the maps at 240 m (either upscaled maps or GFSAD maps) are closer to the PPU of the reference data compared to the PPU of the coarser resolution maps at 960 m. For example, for the Heartland region study site, the PPU of reference map, the upscaled maps at 240 m and 960 m were 54.66, 17.53, and 1.31, respectively.   Figure 5 shows the relationship between the OS of either the upscaled maps or the GFSAD maps and the heterogeneity of each grouping of study sites by extent (i.e., regional, state, and ASDs). The study sites at each spatial extent were grouped into one of several heterogeneity levels (L1 to Lh) based on their PPU. For example, there are a total of 12 region level study sites, thus, the heterogeneity was defined as 12 levels (Figure 5a,b), which increased corresponding to L1, L2, …, L12. The results show that higher heterogeneity results in lower accuracy of the upscaled maps and the GFSAD maps at each resolution. For example, when the PPU increased from 9.28 to 72.22 (heterogeneity level L1 to L12 in Figure 5), the OSs for the region-level study sites were reduced from 98.67% to 90.00%, and from 96.32% to 83.89%, respectively, for the upscaled map at 240 m and the GFSAD map at 240 m ( Figure 5a). Additionally, either OE, CE, or both increased for the upscaled and the GFSAD maps when the study site became more heterogeneous (Tables 6 and 8). For example, Maryland with PPU of 132.77 was more heterogeneous than the Nevada with PPU of 4.72. For the Maryland, the CE for  Figure 5 shows the relationship between the OS of either the upscaled maps or the GFSAD maps and the heterogeneity of each grouping of study sites by extent (i.e., regional, state, and ASDs). The study sites at each spatial extent were grouped into one of several heterogeneity levels (L1 to Lh) based on their PPU. For example, there are a total of 12 region level study sites, thus, the heterogeneity was defined as 12 levels (Figure 5a,b), which increased corresponding to L1, L2, . . . , L12. The results

Influence of Heterogeneity on Upscaled Maps and GFSAD Maps
show that higher heterogeneity results in lower accuracy of the upscaled maps and the GFSAD maps at each resolution. For example, when the PPU increased from 9.28 to 72.22 (heterogeneity level L1 to L12 in Figure 5), the OSs for the region-level study sites were reduced from 98.67% to 90.00%, and from 96.32% to 83.89%, respectively, for the upscaled map at 240 m and the GFSAD map at 240 m (Figure 5a). Additionally, either OE, CE, or both increased for the upscaled and the GFSAD maps when the study site became more heterogeneous (Tables 6 and 8). For example, Maryland with PPU of 132.77 was more heterogeneous than the Nevada with PPU of 4.72. For the Maryland, the CE for crop of the GFSAD map at 240 m was higher by 6.55 percentage points, comparing to the Nevada (Table 8).

Influence of Extent on the Accuracy of Upscaled Maps
The results show that the extent of the study sites did not impact the accuracy of upscaled maps. First, for some study sites, there is no influence on the accuracy of the upscaled maps when the extent of the study site changes. For example, the state of Arizona and the ASD2720 had similar landscape pattern (landscape heterogeneity, and mean size of the fields (Table 1)). Note that the Arizona covered about 295,386.1 km 2 that is about eight times larger than ASD2720 which is located in Minnesota ( Table 1). The OS of upscaled maps at 960 m for these two study sites were 98.63%, and 98.73%, respectively (Table 5).

Influence of Extent on the Accuracy of Upscaled Maps
The results show that the extent of the study sites did not impact the accuracy of upscaled maps. First, for some study sites, there is no influence on the accuracy of the upscaled maps when the extent of the study site changes. For example, the state of Arizona and the ASD2720 had similar landscape pattern (landscape heterogeneity, and mean size of the fields (Table 1)). Note that the Arizona covered about 295,386.1 km 2 that is about eight times larger than ASD2720 which is located in Minnesota ( Table 1). The OS of upscaled maps at 960 m for these two study sites were 98.63%, and 98.73%, respectively (Table 5).
Second, for some other study sites, the accuracy of the upscaled maps for the regions (i.e., the very large areas) varied such that some were lower and others higher than the small areas. For example, the Northwest region covered about 642,992.8 km 2 which is about 17 times larger than the ASD2720 ( Table 1). The OS of upscaled map at 240 m for the Northwest region was lower by about 1.2 percentage points compared to the ASD2720 (Table 5).

Significance of the Similarity Matrix
Rescaling techniques inevitably lead to errors in land cover area (e.g., [11,14]). These errors in area are not only derived from the omitted areas from the base map but also by the committed areas created in the upscaled map. If the OS is 100% (i.e., the base map and the upscaled map are exactly the same (Figure 6a,d), it is not necessary to compute the OE and CE. However, an OS of 100% rarely occurs (e.g., [11]). Even if the OS of 100% occurs at a particular spatial resolution, it is difficult to know if the OS will also be 100% at another spatial resolution. Figure 6 shows an example where the base map (Figure 6a) is upscaled by the MRB to two different spatial resolutions (Figure 6d,e). Figure 6b,c are predefined windows for producing Figure 6d,e, respectively. The gray color represents class 1, while the white color represents class 2. In some situations, the area information of one class in the upscaled map will be same as its area information in the base map. For example, the base map pixels, P1, P2, P3, and P4, were aggregated to upscaled pixel Pc (Figure 6d). Since class 2 occurred more frequently within the predefined window, W1, the class type of Pc is class 2 according to the principle of MRB. However, the OS cannot be always 100% due to the CE and OE. For example, In Figure 6d, the OS is 100%, while in Figure 6e the OS is 81.25%. Up to now, there has been no standard and simple way to evaluate the accuracy of the rescaled map and to report the uncertainty information of the rescaled map to the users of the land cover map. The similarity matrix proposed here provides such a way. Second, for some other study sites, the accuracy of the upscaled maps for the regions (i.e., the very large areas) varied such that some were lower and others higher than the small areas. For example, the Northwest region covered about 642,992.8 km 2 which is about 17 times larger than the ASD2720 ( Table 1). The OS of upscaled map at 240 m for the Northwest region was lower by about 1.2 percentage points compared to the ASD2720 (Table 5).

Significance of the Similarity Matrix
Rescaling techniques inevitably lead to errors in land cover area (e.g., [11,14]). These errors in area are not only derived from the omitted areas from the base map but also by the committed areas created in the upscaled map. If the OS is 100% (i.e., the base map and the upscaled map are exactly the same (Figure 6a,d), it is not necessary to compute the OE and CE. However, an OS of 100% rarely occurs (e.g., [11]). Even if the OS of 100% occurs at a particular spatial resolution, it is difficult to know if the OS will also be 100% at another spatial resolution. Figure 6 shows an example where the base map (Figure 6a) is upscaled by the MRB to two different spatial resolutions (Figure 6d, e). Figure  6b, c are predefined windows for producing Figure 6d,e, respectively. The gray color represents class 1, while the white color represents class 2. In some situations, the area information of one class in the upscaled map will be same as its area information in the base map. For example, the base map pixels, P1, P2, P3, and P4, were aggregated to upscaled pixel Pc (Figure 6d). Since class 2 occurred more frequently within the predefined window, W1, the class type of Pc is class 2 according to the principle of MRB. However, the OS cannot be always 100% due to the CE and OE. For example, In Figure 6d, the OS is 100%, while in Figure 6e the OS is 81.25%. Up to now, there has been no standard and simple way to evaluate the accuracy of the rescaled map and to report the uncertainty information of the rescaled map to the users of the land cover map. The similarity matrix proposed here provides such a way.

Issues for Generating Upscaled Maps
An important upscaling issue that should be noted here is the problem that can occur around the edges of the study area. If the dimensions of the base map and the rescaled map are not equally divisible, then some partial upscaled pixels will fall outside of the edge of the base map. As

Issues for Generating Upscaled Maps
An important upscaling issue that should be noted here is the problem that can occur around the edges of the study area. If the dimensions of the base map and the rescaled map are not equally divisible, then some partial upscaled pixels will fall outside of the edge of the base map. As Figure 7 demonstrates, two non-overlapping windows (W3 and W6) fall outside the edge of the base map. The upscaled pixels, P3 and P6, are determined based on only the pixels in the base map that fall within the W3 and W6. Thus, for W3 and W6, there are only eight pixels used to determine the class type in the upscaled map.
demonstrates, two non-overlapping windows (W3 and W6) fall outside the edge of the base map. The upscaled pixels, P3 and P6, are determined based on only the pixels in the base map that fall within the W3 and W6. Thus, for W3 and W6, there are only eight pixels used to determine the class type in the upscaled map.
Interactive Data Language (IDL 8.3) was used to conduct MRB rescaling in this project. For this language, the implementation of the MRB rescaling algorithm begins from the upper-left pixel as the first pixel and ends with the lower-right pixel as the last pixel. Hence, the potential error of the determination of the boundary pixel in an upscaled map only occurs at the right edge and/or the bottom edge of the upscaled map. However, the similarity matrix is not impacted because only existing class types are included in the matrix calculation. In addition, the similarity matrix presents the uncertainty information by considering the error derived from the boundary pixels in the upscaled maps. These conditions further strengthen our confidence to recommend the similarity matrix to be extensively used to evaluate the rescaled maps.

Accuracy Analysis between Upscaled Maps and GFSAD Maps
Since one of our objectives was to compare existing coarser resolution maps with the upscaled maps, the GFSAD maps also were assessed using the similarity matrix. One issue that should be noted is that the crop/non-crop maps derived from the CDL data were assumed as the reference data for assessing the GFSAD maps [29]. We are aware that the CDL data are not 100% accurate. However, this dataset is highly accurate and was readily available at finer resolution [7] and therefore, was very appropriate for use in this project.
The comparisons of accuracy between the upscaled maps and GFSAD maps show that the upscaled maps can obtain higher accuracies (Tables 5-8). These results demonstrate that employing upscaled maps will be a better way to further conduct the related earth modeling research compared to using the GFSAD maps. Note that we are not saying here that all the existing coarser resolution maps should be replaced by the upscaled maps, but rather that the similarity matrix should be used to test which is better.
The lower accuracy of the GFSAD map in the USA mainly results from fact that coarser spatial resolution imagery were used for producing the GFSAD maps, while the finer spatial resolution imagery were used for producing the CDL data. The finer resolution imagery used to generate the CDL data include Lansat-5 Thematic Mapper (TM), Landsat-7 Enhanced TM plus (ETM+), Landsat-8 Optical Land Imager (OLI), Resourcesat-1 Advanced Wide Field Sensor (AWiFS), Disaster Monitoring Constellation (DMC) DEIMOS-1, and UK2 sensors. This finer-resolution imagery was used to obtain base maps that can accurately represent the distribution of land cover. However, the GFSAD maps are mainly based on the MODerate resolution Imaging Spectroradiometer (MODIS) imagery at 250 m [29]. As is known, greater amounts of mixed pixels exist in the coarser resolution imagery, which can introduce more errors into the classified maps (i.e., GFSAD map) [30]. Therefore, Interactive Data Language (IDL 8.3) was used to conduct MRB rescaling in this project. For this language, the implementation of the MRB rescaling algorithm begins from the upper-left pixel as the first pixel and ends with the lower-right pixel as the last pixel. Hence, the potential error of the determination of the boundary pixel in an upscaled map only occurs at the right edge and/or the bottom edge of the upscaled map. However, the similarity matrix is not impacted because only existing class types are included in the matrix calculation. In addition, the similarity matrix presents the uncertainty information by considering the error derived from the boundary pixels in the upscaled maps. These conditions further strengthen our confidence to recommend the similarity matrix to be extensively used to evaluate the rescaled maps.

Accuracy Analysis between Upscaled Maps and GFSAD Maps
Since one of our objectives was to compare existing coarser resolution maps with the upscaled maps, the GFSAD maps also were assessed using the similarity matrix. One issue that should be noted is that the crop/non-crop maps derived from the CDL data were assumed as the reference data for assessing the GFSAD maps [29]. We are aware that the CDL data are not 100% accurate. However, this dataset is highly accurate and was readily available at finer resolution [7] and therefore, was very appropriate for use in this project.
The comparisons of accuracy between the upscaled maps and GFSAD maps show that the upscaled maps can obtain higher accuracies (Tables 5-8). These results demonstrate that employing upscaled maps will be a better way to further conduct the related earth modeling research compared to using the GFSAD maps. Note that we are not saying here that all the existing coarser resolution maps should be replaced by the upscaled maps, but rather that the similarity matrix should be used to test which is better.
The lower accuracy of the GFSAD map in the USA mainly results from fact that coarser spatial resolution imagery were used for producing the GFSAD maps, while the finer spatial resolution imagery were used for producing the CDL data. The finer resolution imagery used to generate the CDL data include Lansat-5 Thematic Mapper (TM), Landsat-7 Enhanced TM plus (ETM+), Landsat-8 Optical Land Imager (OLI), Resourcesat-1 Advanced Wide Field Sensor (AWiFS), Disaster Monitoring Constellation (DMC) DEIMOS-1, and UK2 sensors. This finer-resolution imagery was used to obtain base maps that can accurately represent the distribution of land cover. However, the GFSAD maps are mainly based on the MODerate resolution Imaging Spectroradiometer (MODIS) imagery at 250 m [29]. As is known, greater amounts of mixed pixels exist in the coarser resolution imagery, which can introduce more errors into the classified maps (i.e., GFSAD map) [30]. Therefore, compared to the upscaled maps derived from accurate, finer resolution CDL data, GFSAD likely will not obtain higher accuracy of the distribution of land cover.
To visually show the difference between the upscaled map and the GFSAD map, two sub-areas in the region of Upper-Midwest were selected (Figure 8). The upscaled map either at 240 m or at 960 m accurately presents the crop distribution. The GFSAD map at 960 m of the region was mostly classified as crop, while the upscaled map present the original crop distribution as accurate as possible. These results demonstrate that the upscaled map can more accurately represent the distribution of land cover, which further strengthens our confidence in recommending the use of upscaled maps as inputs for earth science models. However, assessing and comparing the performance of the upscaled maps and existing coarser resolution maps are required before making these decisions.
Remote Sens. 2018, 10, x FOR PEER REVIEW 16 of 22 compared to the upscaled maps derived from accurate, finer resolution CDL data, GFSAD likely will not obtain higher accuracy of the distribution of land cover.
To visually show the difference between the upscaled map and the GFSAD map, two sub-areas in the region of Upper-Midwest were selected (Figure 8). The upscaled map either at 240 m or at 960 m accurately presents the crop distribution. The GFSAD map at 960 m of the region was mostly classified as crop, while the upscaled map present the original crop distribution as accurate as possible. These results demonstrate that the upscaled map can more accurately represent the distribution of land cover, which further strengthens our confidence in recommending the use of upscaled maps as inputs for earth science models. However, assessing and comparing the performance of the upscaled maps and existing coarser resolution maps are required before making these decisions.

Differences in Heterogeneity between the Upscaled Maps and the GFSAD Maps
A vitally important issue underlying different land cover maps is the difference in representations of landscape pattern. As acknowledged, several studies have reported that the representation of landscape pattern is biased in the upscaled map (e.g., [11,15]). However,

Differences in Heterogeneity between the Upscaled Maps and the GFSAD Maps
A vitally important issue underlying different land cover maps is the difference in representations of landscape pattern. As acknowledged, several studies have reported that the representation of landscape pattern is biased in the upscaled map (e.g., [11,15]). However, comparisons between the representation of landscape pattern based on the upscaled map and the existing coarser resolution map has not been explored. Therefore, the heterogeneity, evaluated by PPU, was selected to be analyzed in this research. Note that the study sites with higher PPU are more heterogeneous than study sites with lower PPU.
The comparisons of heterogeneity represented by the PPU values ( Figure 4) and by visual observation (Figure 8) between the upscaled maps and the GFSAD maps demonstrated that upscaled maps can obtain a better representation of the landscape patterns when compared to the GFSAD maps. This is because the upscaled maps can more accurately represent the distribution of land cover compared to the GFSAD maps. Furthermore, for users, a good quality land cover map should not only have a better map accuracy, but also a better representation of landscape pattern. Therefore, considering the comparisons of the PPU between two datasets and the users' requirements, the upscaled map is more suitable and should be used in Earth science models.

Considerations with Respect to Heterogeneity
The heterogeneity of the study site has been emphasized as a major factor impacting the accuracy of the upscaled maps (e.g., [11,31]) and the accuracy of classification maps (e.g., [32]), in general. Thus, the influence of heterogeneity on upscaled maps and GFSAD maps should be explored. As expected, the accuracies of both the upscaled maps and the GFSAD maps were considerably impacted by the heterogeneity of the study sites. Lower accuracies for the upscaled and the GFSAD maps, at each resolution, were obtained in the more heterogeneous areas compared to the homogeneous areas ( Figure 5, Tables 5-8). This is because the spectral heterogeneity and spectral complexity of the imagery are higher in heterogeneous areas, which results in lower classification map accuracy (e.g., CDL data, GFSAD map) [32]. If these heterogeneous maps (e.g., CDL data) were used as base maps for upscaling, mapping error would increase in the upscaled map and the accuracy of the upscaled map would be reduced [11]. Therefore, given the lower upscaling accuracy in the heterogeneous areas, users should be cautious when utilizing upscaled maps in heterogeneous landscapes.
Recognizing the negative effect of heterogeneity on the upscaled maps requires not only that caution be taken when using the upscaling techniques in the heterogenous areas, but also demonstrates the need for developing improved upscaling techniques. The improved upscaling techniques should consider the heterogeneity as a factor to improve the upscaling accuracy. Additionally, the analyst that is producing the land cover map should be aware of lower accuracy occurring in the heterogeneous areas. More accurate and improved classification techniques are also recommended to be developed and employed for improving the accuracy of the land cover products, especially in the heterogeneous areas. These accurate land cover products can be further used for producing upscaled maps at different resolution meeting the requirements for various Earth observation models or their related research (e.g., [33]).

Considerations with Respect to Extent of Maps
Landscape pattern has strong influence on the accuracy of the upscaled maps, as demonstrated by previous research (e.g., [31,34]). However, previous research rarely explored if the extent (i.e., size of the area) of the study sites impacted the upscaled maps. Therefore, as part of our research we employed different size study sites from large regions of the US to individual states, to smaller ASDs within a state. Although we did not test different upscaling methods to see if extent has an influence on the upscaling accuracy, it is worthy of discussion based on the results derived from using just the MRB approach.
The results of this study demonstrated that there was no difference between upscaled maps at the same spatial resolution for different extent levels (i.e., region level, state level, and ASD level). As long as the heterogeneity is similar, the upscaling accuracy did not show much difference between study area sizes. Instead, it is evident that heterogeneity and not extent is the main factor impacting the upscaling accuracy.

Implications, Limitations, and Future Work
The results of this work have important implications for decision making. The similarity matrix has been demonstrated as an efficient and effective method to assess the rescaled map. This method can provide detailed information on the accuracy of the rescaled map, which is important for users who wish to use these maps for further applications. In addition, comparisons between the upscaled map and any existing coarser resolution map draw demonstrate that it may be better to use the upscaled map than an existing coarser resolution map. The higher performance of the upscaled maps, in terms of accuracy and landscape pattern, confirms that they can be reliably used for Earth science modelling. Note that, here, we did do not emphasize the upscaled maps can always be better than the existing maps. The results illustrate that the upscaled maps can be one option for map users. Therefore, we strongly recommend that a comparison using the similarity matrix should be conducted to assess any existing coarse resolution maps with potential upscaled maps for use in any modelling.
Further, the role of heterogeneity in producing upscaled maps confirms that landscape pattern is an important consideration for those researchers developing upscaling techniques to obtain more accurate upscaled maps. Moreover, this study provides evidence that it is not necessary to consider the extent when developing a new upscaling technique. Overall, the achievements of this paper are essential for guiding land cover map users to select suitable datasets for their models and encourage them to consider heterogeneity when developing any new upscaling techniques.
Finally, we are aware that our research may have three limitations. The first major limitation is that there was no GFSAD map at 30 m resolution for 2010. Therefore, we used the CDL data at 30 m to generate the crop/non-crop map as the reference data to compare the results between GFSAD maps and the upscaled maps. These comparisons were based on the assumption that the CDL data are 100% correct, which we acknowledge is not the case. This limitation resulted in errors in the final results, which would be extremely difficult to quantitatively assess without an extremely accurate reference data. In order to do this assessment, reference data covering the entire USA would be needed. The cost for producing such a dataset in prohibitive and justifies our use of a very accurate (not 100%) data set (e.g., CDL).
The second limitation is that the results of this paper are based on the simple case of a crop/non-crop map and not a more complex land cover map with many land cover classes. Thus, our analysis should be repeated in future work using upscaled maps and the existing land cover maps that are more complex. Additionally, only the upscaling of maps has been explored in this paper. While these techniques are directly applicable to downscaled maps, the paper does not demonstrate this use and should be the subject of future work.
The third limitation is that only a single upscaling technique (i.e., MRB) was used here. As acknowledged, different upscaling techniques perform differently (e.g., [15]). The goal of this paper was to demonstrate the power of the similarity matrix and not to evaluate upscaling algorithms. Therefore, it would be useful to use the similarity matrix approach to compare different upscaling techniques in some future work.

Conclusions
This paper first proposed a new method, the similarity matrix, for assessing the accuracy of rescaled maps using three measurements: overall similarity (OS), omission error (OE), and commission error (CE). A conventional upscaling technique, Majority Rule Based aggregation (MRB), was used to produce the upscaled maps to demonstrate the performance of the similarity matrix. Additionally, this paper explored if these upscaled maps perform better than existing coarse resolution crop/non-crop maps. To conduct these experiments, two datasets: the Cropland Data Layer (CDL) at 30 m for 2010 and the GFSAD nominal data for 2010 at 250 m and 1 km, were utilized. The CDL data were first used to extract the crop/non-crop maps for (1) providing the base maps for upscaling, and (2) providing reference data to assess the accuracy of the upscaled maps and the GFSAD maps. Further, to determine if the extent of the study site has influence on the performance of the upscaling, three different extent levels were employed. There were twelve regions covering the continental USA, nine states, and nine Agricultural Statistic Districts (ASDs) that were selected to perform the experiments. Several conclusions resulted from this analysis: (1) the similarity matrix can successfully assess the accuracy of the rescaled map and report the uncertainty information for the rescaled map. This information is beneficial for land cover map users for further assessing the uncertainty or accuracy of their models. Also, the similarity matrix can provide information for making decisions about selecting the datasets for further analysis such as use in Earth science modeling; (2) the upscaled maps outperformed the existing coarser resolution GFSAD maps in both accuracy and representation of the landscape pattern. This result demonstrates that evaluating the performance of the upscaled map and the existing coarser resolution map is required before making a decision for selecting the datasets as the input to the users' model; (3) the influence of heterogeneity on the upscaling accuracy does not only reveal that lower accuracy was produced in the heterogeneous area, but also recommends that a new upscaling technique should consider heterogeneity as an important factor to improve the accuracy of the upscaling technique, and (4) the results between different study sites with different extents quantitatively confirms that the performance of upscaling has not been impacted by the extent of study site. In conclusion, extending our understanding on how to assess the rescaled map, determination for selecting the upscaled map or the existing land cover map, and suggestions for developing new upscaling techniques, is vitally helpful to the Earth science community as users of land cover maps.
Future work should focus on four aspects: (1) different downscaling/upscaling techniques should be employed to investigate if the downscaled/upscaled maps can really perform better than the existing land cover maps; (2) comparisons between upscaled maps and downscaled maps should be demonstrated; (3) landscape pattern should be considered for developing new downscaling/upscaling techniques, and (4) the land cover map with more map classes should be employed to explore if the upscaled map and downscaled map still can be a prospective data source for Earth science modelling research.