The Impact of Mapping Error on the Performance of Upscaling Agricultural Maps

Aggregation methods are the most common way of upscaling land cover maps. To analyze the impact of land cover mapping error on upscaling agricultural maps, we utilized the Cropland Data Layer (CDL) data with corresponding confidence level data and simulated eight levels of error using a Monte Carlo simulation for two Agricultural Statistics Districts (ASDs) in the U.S.A. The results of the simulations were used as base maps for subsequent upscaling, utilizing the majority rule-based aggregation method. The results show that increasing error level resulted in higher proportional errors for each crop in both study areas. As a result of increasing error level, landscape characteristics of the base map also changed greatly, resulting in higher proportional error in the upscaled maps. Furthermore, the proportional error is sensitive to the crop area proportion in the base map and decreases as the crop proportion increases. These findings indicate that three factors, the error level of the thematic map, the change in landscape pattern/characteristics of the thematic map, and the objective of the project, should be considered before performing any upscaling. The first two factors can be estimated by using pre-existing land cover maps with relatively high accuracy. The third factor is dependent on the project requirements (e.g., landscape characteristics, proportions of cover types, and use of the upscaled map). Overall, improving our understanding of the impacts of land cover mapping error is necessary for properly designing an upscaling procedure and obtaining the optimal upscaled map.


Introduction
Knowledge about the area and spatial distribution of land cover is critical for geo-information, environmental, and socioeconomic research [1][2][3][4][5][6][7]. Land cover maps are fundamental data for modeling ecosystem services [8], agricultural management [9], climate change [10] and carbon cycles [11]. Various types of research and/or models require large area (continental or global scale) land cover maps, over a range of spatial resolutions [12,13]. Many of these maps are generated from remotely sensed imagery [14,15] and have been widely employed to serve scientific research [16].
However, researchers face issues with the availability of remote sensing data with specific resolutions due to the capability of remote sensors [17][18][19]. Although diverse global or regional land cover maps have been generated, the spatial resolutions of these maps are also limited [20]. For example, the spatial resolutions of Global Land Cover data set 2000 (GLC 2000) [21], Moderate Resolution Imaging Spectroradiometer (MODIS) Land Cover [22] and Fine Resolution Observation and Monitoring of Global Land Cover data (FROM-GLC) [23] are 1 km, 500 m and 30 m, respectively. It may be desirable to rescale these data to a specified resolution in order to fill data gaps or match preexisting project requirements [24,25]. Upscaling and downscaling are two alternative methods for rescaling [26]. The former decreases the spatial resolution, while the latter increases the spatial resolution [5,19,27–29]. The focus of this research is the impact of land cover mapping error on upscaling.
To upscale land cover maps, the categorical aggregation method is widely used to transform the land cover information to coarser scale maps [12,19,30,31]. This categorical aggregation method assigns a class label to a coarse-resolution pixel according to the classes in the associated fine-resolution pixels from the existing thematic map [32]. Three categorical aggregation methods exist: (1) the majority rule-based (MRB) method (e.g., [33]); (2) the random rule-based (RRB) method (e.g., [32]); and (3) the point-centered distance-weighted moving window (PDW) method [24]. MRB determines the class type in the coarse-resolution map by selecting the most frequently occurring class of the fine-resolution map [32,33]. When there is more than one major class, the dominant class is randomly selected from the major classes [20]. MRB makes the dominant class more clumped while the dispersed class shows a less clumped pattern in upscaled maps [19,20]. RRB is based on the random selection of a class from the specified pixels of the fine-resolution map. The corresponding aggregated pixel is then assigned to that class [32]. He et al. [32] reported that RRB maintains spatial patterns better than MRB. PDW conducts three steps to obtain an upscaled land cover map [20,24]. First, the center point, $C_{ij}$, of the pixel on the upscaled map is located. Second, a set of $n$ sampling points (referred to as the sampling net [24]) is placed on the base map with its center at location $C_{ij}$. The distance between two points, $r$, is the resolution of the sampling net. Last, the normalized frequency distribution of land covers, $f$, is computed at each point in the sampling net. The class type is assigned to the upscaled pixel located at $C_{ij}$ by randomly selecting a class from $f$. More details about PDW can be found in Gardner et al. [24]. PDW provides a more consistent method for landscape comparison when maps are derived from multiple sources of classification imagery [24]. Recently, Raj et al. [20] compared these three approaches. Their results showed that MRB can be used for agriculture planning, while PDW is suitable for ecological resource management.
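As a point of comparison with the majority rule, the random rule described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function name `random_rule_aggregate`, the integer `factor` (coarse pixel size divided by fine pixel size) and the use of non-overlapping square windows are hypothetical choices, not details taken from He et al. [32].

```python
import numpy as np

def random_rule_aggregate(labels, factor, rng):
    """Random rule-based (RRB) aggregation sketch.

    Each coarse pixel takes the class of one fine pixel drawn uniformly
    at random from its factor x factor window (assumed non-overlapping).
    Rows and columns that do not fill a whole window are ignored.
    """
    h, w = labels.shape
    n_rows, n_cols = h // factor, w // factor
    # Random offset inside every window, one per coarse pixel
    row_idx = np.arange(n_rows)[:, None] * factor + rng.integers(0, factor, size=(n_rows, n_cols))
    col_idx = np.arange(n_cols)[None, :] * factor + rng.integers(0, factor, size=(n_rows, n_cols))
    return labels[row_idx, col_idx]
```

Because a class is drawn with probability equal to its share of the window, class proportions are preserved in expectation, which is one intuitive reason this rule may retain spatial pattern better than a majority vote.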
The literature on upscaling land cover maps focuses on the evaluation of the effects of upscaling on landscape patterns as a result of the aggregation (e.g., [24,32,34]). For example, Moody and Woodcock [31,34] used statistical analysis to assess the relationship between landscape pattern and proportional errors (PE) in the aggregated maps. Earlier studies imply that PE in the land cover maps was greatly influenced by landscape characteristics (e.g., homogeneity and heterogeneity) [32]. Recently, efforts to reduce PE in aggregation methods have been emphasized [20]. For example, Hlavka and Dungan [35] used a model-based method to correct the underestimation of fragmented areas. The results showed that area correction led to inflated areal estimates. Wu [18] analyzed the effect of changing scale on landscape pattern and reported that scaling relations were more variable at the map class level than at the landscape level. These studies illustrate that the influence of upscaling is locally dependent on landscape patterns.
Fine-resolution land cover maps, as the fundamental data for aggregation, play a critical role in implementing upscaling. These maps, derived from remote sensing, have significant thematic and spatial uncertainties [16,36,37]. For example, Gong et al. [38] reported that the overall accuracy of the global land cover classification derived from Landsat data was 64.9%. Wickham et al. [39] reported the overall accuracies of the 2011 National Land Cover Database (NLCD) at level II and level I were 82% and 89%, respectively. Congalton et al. [16] state that accurately mapping land cover using remotely sensed data is challenging because classification methods and remote sensing technology can be highly influenced by every component of a mapping project.
However, although land cover mapping error is important, the impact of these errors on cover area and landscape characteristics in upscaled maps has not been quantitatively analyzed. Hence, our research investigated this impact using upscaled agricultural maps based on the majority rule-based aggregation method (MRB) within two study areas with differing levels of landscape heterogeneity. A Monte Carlo simulation algorithm was employed to obtain a series of base maps with differing amounts of mapping error that would then be utilized for upscaling. The differences in impacts were evaluated at the various error levels and the results were analyzed.

Materials and Methods
This section describes the study areas, data descriptions, error simulation algorithm, aggregation method (i.e., MRB), assessment of MRB and the landscape metrics used to describe the landscape characteristics. Note that the CDL data were used to extract the initial main crop thematic maps. For ASD1810, the main crop types are corn, sorghum, soybeans, winter wheat and alfalfa. For ASD4550, the main crop types are corn, cotton, sorghum, soybeans and winter wheat. The CDL thematic maps were used as the base map with an assumed error level of 0%. These maps were then used to generate a series of base maps with different error levels ranging from 5 to 40%. After creating these base maps, the majority rule-based aggregation method (MRB) was employed to create upscaled maps at different resolutions ranging from 60 m to 960 m. To quantitatively evaluate the impact of base map error level on the upscaling method, PE [31], overall consistency (OC) [19] and several landscape metrics (i.e., PPU and SqP) were utilized.

Study Areas and Materials
Two Agricultural Statistics Districts (ASDs) with differing landscape heterogeneity were selected to conduct the experiment (Figure 1). These districts are groupings of counties in each State and are defined by geography, climate, and cropping practices [40,41]. Two study sites, ASD1810 and ASD4550, were selected. Both sites have high crop diversity (e.g., wheat, corn, soybeans, cotton, alfalfa, sorghum and fruit trees). The first site, ASD1810, is located in the northwest part of Indiana, U.S.A. The other site, ASD4550, is located in the center of South Carolina, U.S.A. Considering each study site as a whole (crops and non-crop), ASD1810 is more fragmented than ASD4550. Table 1 shows the landscape metrics describing the landscape pattern of each study district. Using a measure of fragmentation, Patch-per-Unit area (PPU), ASD4550 has a PPU of 8.95 compared to 11.45 for ASD1810 (Table 1). The lower PPU in ASD4550 is because the percentage of non-crop area for ASD4550 is greater, 87.8% compared to 37.5% in ASD1810. However, if just analyzing the agricultural fields and not the entire study area, then these two study areas show different landscape patterns for the crops. In ASD1810, the crop area is 6.724 × 10^5 hectares (ha) or approximately 62.5% of the area of ASD1810. The agricultural fields appear more regular in shape. In ASD4550, crop covers 1.529 × 10^5 ha or about 12.2% of the area. The agricultural fields are of smaller size and have more irregular shapes.
The agricultural maps, with their corresponding confidence level maps, were obtained from the National Agricultural Statistics Service (NASS) Cropland Data Layer (CDL). The CDL is an annually updated, raster-formatted, geo-referenced land cover map, which aims to improve the geospatial predictive information of crops covering 48 states [42]. To identify the land cover accurately, the spectral responses derived from a variety of remote sensing imagery (including Landsat-5 Thematic Mapper (TM), Landsat-7 Enhanced TM plus (ETM+), Landsat-8 Operational Land Imager (OLI), Resourcesat-1 Advanced Wide Field Sensor (AWiFS), Disaster Monitoring Constellation (DMC) DEIMOS-1 and UK2 sensors) were employed as training datasets for classification, and as reference data for accuracy assessment for crop classes. The Multi-Resolution Land Characteristics (MRLC) consortium's National Land Cover Database (NLCD) was used for the non-crop classes [42]. As reported by the United States Department of Agriculture (USDA), crop identification accuracies of 90% were obtained for major commodities (e.g., corn, cotton, rice, soybeans and wheat) [42]. Besides the crop distribution information, CDL also provides information on other specific land covers. The CDL products are widely used in various types of research [43] because the field crops have been accurately identified and geo-located [44,45].
Additionally, the confidence layer spatially represents the predicted confidence that is associated with each output pixel in the CDL map [42]. Therefore, pixels with higher confidence will have lower probability of misclassification. These values are the basis for the mapping error simulation portion of this paper. NASS CDL agricultural maps at 30 m for 2016 were downloaded from the NASS online geospatial application CropScape (https://nassgeodata.gmu.edu/CropScape/). These data were used to extract a crop thematic map to be used as the original base map for error simulation. Note that the original base map at 30 m was assumed to be the true data without error. The mapping errors were then added to this true data to quantitatively produce the maps with different error levels for analyzing the impact of land cover mapping error on the upscaling. The National confidence layer for 2016 was downloaded from https://www.nass.usda.gov/Research_and_Science/Cropland/Release/index.php. The projection for all the data is Albers Equal Area.

Error Simulation
During a land cover mapping project, potential errors are introduced at each step in the process (e.g., image acquisition, data processing, and error assessment) [46][47][48]. The accumulation of these errors, across all steps, results in misclassifications or location errors where the wrong land cover is assigned. This study seeks to understand the impacts of these misclassifications on upscaled land cover products. The probability-based Monte Carlo (MC) simulation method [8] was used to generate new crop maps (i.e., base maps for upscaling) with varying levels of error as derived using the CDL data. This numerical experimentation and statistical sampling technique has been widely used in uncertainty analysis for various ecosystem models [49][50][51]. Errors in land cover maps occur most frequently at the borders of different land cover types [52] and thus the assumption was made that these boundary pixels have a higher probability of misclassification. Therefore, for this paper, MC was employed to randomly introduce a specified amount of error into the map by randomly misclassifying pixels at the boundaries of different crop patches. The process is described in more detail below.
Eight new crop maps (all at the original 30 m resolution) with different levels of error, $\xi$, were produced (5%, 10%, 15%, 20%, 25%, 30%, 35% and 40%) as base maps to be upscaled. The error level denotes the percentage of misclassified crop pixels with respect to the original CDL map. Based on Dong et al. [8], we ran a probability-based MC for our error simulations following three steps. Figure 2 shows a general process of the MC simulation. First, boundary pixels were identified and the confidence level, $CL_i$, for each was acquired from the confidence level data associated with the CDL map. Second, the required number of misclassifications, $N_\xi$, was calculated as $N_\xi = N \times \xi$, where $\xi$ is the desired error level and $N$ is the total number of crop pixels. Third, a pair of boundary pixels, $BP_1$ and $BP_2$, with confidence levels of $CL_1$ and $CL_2$, respectively, was chosen randomly. $BP_1$ and $BP_2$ must not have been previously misclassified (referred to as unswapped pixels). Two numbers, $n_1$ and $n_2$, were randomly drawn from a uniform distribution on [0, 1]. If $n_1 > CL_1$ and $n_2 > CL_2$, the two selected pixels were misclassified by exchanging their labels. These pixels were then marked as swapped pixels. If the condition $n_1 > CL_1$ and $n_2 > CL_2$ was not met, a different pair of unswapped pixels was chosen randomly and the test was repeated. After repeating $2N$ times [8], the number of swapped pixels, $N_{sw}$, was computed. If $N_{sw} < N_\xi$, the desired error level $\xi$ had not been reached and the simulation was continued until $N_{sw} = N_\xi$. To obtain reliable simulation results, the simulation process was repeated a total of 100 times to acquire the error maps at the specified error levels. Note that, due to variation in agricultural landscape characteristics, at some error levels, there were not enough boundary pixels to meet the required number of misclassifications. When this occurred, pixels at the center of patches were also selected for misclassification. In addition, the principle of simulating mapping error is based on swapping the class type of the potentially misclassified pixels that meet the requirement for misclassification, which is consistent with Dong et al. [8].
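The swapping procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation, and it rests on several assumptions that are not in the text: every pixel is treated as a crop pixel (so $N$ is the array size), boundary pixels are defined by 4-neighbour adjacency, confidence values are scaled to [0, 1], pairs that share the same label are skipped so that each swap produces two misclassified pixels, and the fallback to non-boundary pixels is omitted. The names `boundary_mask`, `simulate_error` and `max_draws` are hypothetical.

```python
import numpy as np

def boundary_mask(labels):
    """True where a pixel has at least one 4-neighbour with a different label."""
    mask = np.zeros(labels.shape, dtype=bool)
    diff_v = labels[1:, :] != labels[:-1, :]   # vertical neighbours differ
    diff_h = labels[:, 1:] != labels[:, :-1]   # horizontal neighbours differ
    mask[1:, :] |= diff_v
    mask[:-1, :] |= diff_v
    mask[:, 1:] |= diff_h
    mask[:, :-1] |= diff_h
    return mask

def simulate_error(labels, confidence, xi, rng, max_draws=100_000):
    """Swap labels of random boundary-pixel pairs until about xi * N pixels differ.

    Returns the perturbed map and the number of swapped pixels (N_sw).
    """
    out = labels.copy()
    rows, cols = np.nonzero(boundary_mask(labels))
    swapped = np.zeros(labels.shape, dtype=bool)
    target = int(round(xi * labels.size))      # N_xi = N * xi (assumes all pixels are crop)
    n_swapped, draws = 0, 0
    if len(rows) < 2:
        return out, n_swapped
    while n_swapped + 2 <= target and draws < max_draws:
        draws += 1
        i, j = rng.choice(len(rows), size=2, replace=False)
        p1, p2 = (rows[i], cols[i]), (rows[j], cols[j])
        # Only unswapped pixels with different labels are eligible (assumption)
        if swapped[p1] or swapped[p2] or out[p1] == out[p2]:
            continue
        n1, n2 = rng.random(2)
        # A low-confidence pixel is more likely to pass this test
        if n1 > confidence[p1] and n2 > confidence[p2]:
            out[p1], out[p2] = out[p2], out[p1]
            swapped[p1] = swapped[p2] = True
            n_swapped += 2
    return out, n_swapped
```

Because labels are exchanged rather than overwritten, the class proportions of the map are unchanged by the simulation; only the spatial arrangement is perturbed.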

Upscaling Method
Aggregation is a common way to upscale thematic maps. The effect of spatial aggregation varies using different methods, each of which is dependent on the aggregation logic [20]. The selection of an aggregation approach depends upon the objective of the research. Raj et al. [20] reported that MRB is useful for monitoring agriculture at regional or national levels. Therefore, we selected MRB to test the influence of mapping error on the performance of upscaling maps. MRB determines the land cover class for the coarse resolution pixels in the upscaled map by selecting the most frequently occurring class from the finer resolution map contained within each coarse pixel [32,33]. When there is more than one major class, the dominant class is selected at random [20].
The aggregation makes use of a non-overlapping series of windows, placed over the finer resolution base maps produced in Section 2.2. Each window is assigned a class based on the most frequently occurring class within it (corn, cotton, sorghum, soybeans, winter wheat, alfalfa and non-crop). When more than one land cover type occurred with the same frequency in a window, the assignment was chosen randomly from among those types. The size of the windows used was determined by the desired spatial resolution of the output maps. For this study, ten different coarse maps with spatial resolutions ranging from 60 m to 960 m (i.e., window sizes from 60 to 960 m) were produced for each simulated base map (8 total) as well as the original agricultural map (error level is 0%).
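The window-based majority rule with random tie-breaking can be sketched in NumPy as below. This is an illustrative sketch, not the code used in the study: the function name `majority_aggregate` and the integer `factor` are hypothetical, and windows that do not fit entirely inside the map are simply dropped. With a 30 m base map, `factor = 2` would correspond to a 60 m output and `factor = 32` to 960 m.

```python
import numpy as np

def majority_aggregate(labels, factor, rng):
    """Majority rule-based (MRB) aggregation over non-overlapping windows.

    labels: 2-D integer class map; factor: window side in fine pixels.
    Ties between equally frequent classes are broken uniformly at random.
    """
    h, w = labels.shape
    n_rows, n_cols = h // factor, w // factor
    # Gather the fine pixels of each window along the last axis
    blocks = (labels[:n_rows * factor, :n_cols * factor]
              .reshape(n_rows, factor, n_cols, factor)
              .swapaxes(1, 2)
              .reshape(n_rows, n_cols, factor * factor))
    classes = np.unique(labels)
    # Per-window frequency of every class: shape (n_rows, n_cols, n_classes)
    counts = np.stack([(blocks == c).sum(axis=-1) for c in classes], axis=-1)
    # Noise below 0.5 cannot reorder distinct integer counts, so it only
    # decides between classes with exactly the same frequency
    noisy = counts + rng.random(counts.shape) * 0.5
    return classes[noisy.argmax(axis=-1)]
```

Adding bounded noise before `argmax` is a design choice that keeps the tie-break vectorized; an explicit loop over tied windows would give the same distribution of outcomes.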

Assessment of MRB
Proportional Error (PE) for each crop type measures the error produced by upscaling [31] and is calculated by Equation (1):
$$PE = \frac{A_e - A_b}{A_b} \times 100\% \qquad (1)$$
where $A_e$ is the estimated area derived from the coarse resolution agricultural map and $A_b$ is the base area calculated from the original 30 m resolution base map (finest resolution). Overall Consistency (OC) for the land cover in the entire map is defined as the percent area of the land that is labeled into the same cover class in both the coarse resolution and original land cover maps [19]. To measure the overall accuracy of the MRB analysis, OC was used to assess the performance.
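Both assessment measures are simple to compute from a pair of class maps. The sketch below assumes PE is the signed relative area difference, $(A_e - A_b)/A_b$, expressed in percent, which is the form suggested by the variable definitions above, and it assumes OC is evaluated by expanding each coarse pixel back onto the fine grid. The function names and parameters are hypothetical.

```python
import numpy as np

def proportional_error(base, coarse, cls, base_res=30.0, coarse_res=60.0):
    """PE (%) of class `cls`: relative difference between coarse and base area."""
    a_b = (base == cls).sum() * base_res ** 2      # base area, A_b
    a_e = (coarse == cls).sum() * coarse_res ** 2  # estimated area, A_e
    return 100.0 * (a_e - a_b) / a_b

def overall_consistency(base, coarse, factor):
    """OC (%): share of area with the same class in the base and coarse maps."""
    # Expand every coarse pixel to factor x factor fine pixels
    expanded = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
    h, w = expanded.shape
    return 100.0 * (expanded == base[:h, :w]).mean()
```

For instance, a class covering 7 of 16 base pixels that is assigned to 2 of 4 coarse pixels at twice the pixel size has its area overestimated by one seventh, i.e., a PE of about 14.3%.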

Landscape Metrics
Area proportions are impacted by the landscape patterns [19].In this study, the landscape pattern of the base maps varies and thus potentially impacts the upscaling result.Additionally, it has been found that landscape pattern can be influenced by upscaling land cover maps [30,31].Therefore, landscape pattern should be considered when analyzing the results of upscaling.To quantitatively assess the landscape properties and the effect of mapping error on upscaling, landscape metrics were employed.
Various landscape metrics were developed in the past decades but many of them are intercorrelated [20,53]. Two metrics, Patch Per Unit area (PPU) and Square-pixel index (SqP), were used in this study. PPU and SqP are two alternative metrics for the Contagion index (CI) and Fractal dimension index (FDI), respectively. These two metrics were selected as they show more consistent results through different aggregation levels compared to the actual contagion and fractal dimension metrics [54,55]. PPU can quantify the clumping and fragmentation of the landscape [54]. As the landscape becomes fragmented, PPU increases. SqP measures the complexity of the landscape [54]. The SqP considers the perimeter-area relationship for raster data structure and normalizes the ratio of perimeter and area to a value between 0 (for a square) and 1 (maximum perimeter-edge deviation from that of a perfect square). PPU and SqP are estimated by Equations (2) and (3) [54], respectively:
$$PPU = \frac{m}{n \times \lambda} \qquad (2)$$
$$SqP = 1 - \frac{4\sqrt{A}}{P} \qquad (3)$$
where $m$ is the total number of patches, $n$ is the total number of pixels, $\lambda$ is a scaling constant equal to the area of a pixel, $A$ is the total area of all pixels, and $P$ is the total perimeter of all pixels in the study area [54].
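A small pure-NumPy sketch of the two metrics is given below for illustration; the study itself used Fragstats. It assumes PPU is the patch count divided by total map area and that SqP takes the form $1 - 4\sqrt{A}/P$, which equals 0 for a single square as stated above. Two further assumptions are not specified in the text: patches are delineated with 4-neighbour connectivity, and $P$ is taken as the sum of all patch perimeters, so a shared class boundary is counted once for each adjoining patch. All function names are hypothetical.

```python
import numpy as np
from collections import deque

def count_patches(labels):
    """Number of 4-connected patches of equal label (breadth-first flood fill)."""
    h, w = labels.shape
    seen = np.zeros((h, w), dtype=bool)
    patches = 0
    for r0 in range(h):
        for c0 in range(w):
            if seen[r0, c0]:
                continue
            patches += 1
            seen[r0, c0] = True
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < h and 0 <= cc < w and not seen[rr, cc]
                            and labels[rr, cc] == labels[r0, c0]):
                        seen[rr, cc] = True
                        queue.append((rr, cc))
    return patches

def total_perimeter(labels, pixel_size):
    """Sum of patch perimeters: internal class edges count twice, map border once."""
    h, w = labels.shape
    internal = ((labels[1:, :] != labels[:-1, :]).sum()
                + (labels[:, 1:] != labels[:, :-1]).sum())
    return (2 * internal + 2 * (h + w)) * pixel_size

def ppu(labels, pixel_size):
    """Patches per unit area: m / (n * lambda), with lambda the pixel area."""
    return count_patches(labels) / (labels.size * pixel_size ** 2)

def sqp(labels, pixel_size):
    """Square-pixel index: 1 - 4 * sqrt(A) / P."""
    area = labels.size * pixel_size ** 2
    return 1.0 - 4.0 * np.sqrt(area) / total_perimeter(labels, pixel_size)
```

On a 2 × 2 map of a single class this gives SqP = 0, while a 2 × 2 checkerboard gives four patches and SqP = 0.5, matching the intuition that fragmentation raises both metrics.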
Besides exploring the influence of upscaling on the land cover map, the impact of the changes in landscape pattern (CLP) for base maps derived from error simulation should be investigated. As recommended by O'Neill et al. [56], a three-dimensional Euclidean distance (Equation (4)) should be used to evaluate the CLP:
$$CLP = \sqrt{(X_b - X_e)^2 + (Y_b - Y_e)^2 + (Z_b - Z_e)^2} \qquad (4)$$
where $X$ is the Dominance index (DI), $Y$ is CI, $Z$ is FDI [56], $b$ means base map, and $e$ means estimated map. For example, $X_b$ is the DI of the base map. The DI measures the extent to which one or a few classes dominate the landscape [57]. However, Turner et al. [58] reported that the DI is not suitable for assessment of the spatial aggregation effect due to its inconsistent results through different aggregation levels of thematic maps. Therefore, we used the DI only to evaluate changes in the landscape after increasing the base map error level. Additionally, CI and FDI are very dependent on spatial resolution [54]; hence, we used the two alternative metrics, PPU (Equation (2)) and SqP (Equation (3)), to replace the CI and FDI, respectively.
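The change in landscape pattern is then a plain three-dimensional Euclidean distance between two metric triples. A minimal sketch, assuming each map is summarized as a (DI, PPU, SqP) tuple and that no rescaling of the metrics is applied (the text does not mention any); the function name is hypothetical.

```python
import math

def landscape_change(base_metrics, est_metrics):
    """CLP: Euclidean distance between (DI, PPU, SqP) of the base and estimated maps."""
    return math.sqrt(sum((b - e) ** 2 for b, e in zip(base_metrics, est_metrics)))
```

Since the three metrics are on different numeric scales, the metric with the largest range will tend to dominate the distance when no normalization is used.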
To obtain the parameters for evaluation of the impact of upscaling on landscape pattern (i.e., PPU and SqP), and the impact of error simulation on the landscape pattern of the base maps (i.e., CLP), Fragstats version 4.2 [59], a widely-used spatial analysis software package [30,59], was employed.

Results
The primary objective of this study was to quantitatively analyze the impact of mapping error on upscaled maps. To obtain base maps at eight different error levels for each study site, MC was implemented successfully. The upscaled maps were then generated using these base maps and assessed using PE, OC, and landscape metrics. The specific results of this analysis are presented in this section.

Error Simulation
Our error simulation successfully misclassified crop pixels according to their confidence levels and their locations. Figure 3 shows the simulated base maps produced at the eight error levels for both study areas. A sub-region within each study area is marked with a red rectangle and shown in the second row of each panel. With increasing error level, the number of misclassified boundary pixels increases. Pixels beyond the boundary, visible in Figure 3, were also involved in misclassification when the error level ranged from 30 to 40% in ASD1810 and at 40% in ASD4550. In ASD1810, non-boundary pixels made up about 0.8%, 4.6% and 9.5% of the error in the simulated maps at error levels of 30%, 35% and 40%, respectively. In ASD4550, non-boundary pixels produced about 1.1% of the error for the simulated map at an error level of 40%. The landscape pattern of the fields changed with the error level. Figure 4 shows an overall increasing trend in landscape change (Equation (4)) with increasing error level for both study areas. For ASD1810, however, at error levels between 15% and 30%, the landscape change shows a decreasing trend with increasing error level. Additionally, the changes in landscape in ASD4550 are consistently lower than changes in ASD1810 for all error levels.

Impacts of Upscaling and Map Error on PE and OC
Nine base maps, the original base map with error level 0% and the base maps with error ranging from 5 to 40%, were acquired. Each of these base maps was upscaled and assessed with the methodologies elaborated in Sections 2.4 and 2.5 to analyze the influence of mapping error on the upscaling.
The results of upscaling in ASD1810 at selected error levels and spatial resolutions are illustrated in Figure 5. Comparison of upscaling results in sub-regions under different error levels shows that errors in the base map impact the results of the upscaling. Additionally, the PE for each crop increased with increasing base map error and with coarser spatial resolution of the aggregated maps (Figures 6 and 7).
Moreover, the influence of error level on upscaling varied from one study area to another for each crop type (Figures 6 and 7). For example, the PE of corn in ASD4550 is higher than in ASD1810. In addition, the OC decreased with increases in either the error level or the pixel size (Figure 8).
Remote Sens. 2017, 9, 901

Landscape Changes Based on Different Error Level
Compared to the landscape characteristics of the base map for each error level (i.e., the 30 m maps at each level of error), increasing the pixel size resulted in decreasing trends in PPU and SqP (Figure 9); however, beyond a resolution of 240 m, changes in PPU were negligible for both study areas. Similarly, decreasing error level resulted in lower PPU, but again, this relationship no longer holds at resolutions beyond 240 m (Figure 9a,b). SqP exhibited the opposite trend to that of PPU. For ASD1810, when the pixel size was less than 360 m, increasing error level of base maps resulted in increasing SqP, while when the pixel size was larger than 360 m, increasing error level of the base map led to decreasing SqP (Figure 9c). In ASD4550, when the pixel size was larger than 240 m, the SqP was reduced due to an increase in the base map error level (Figure 9d).

Discussion
Higher levels of error in the base map resulted in higher PE in the upscaled maps with further significant impacts on calculated landscape characteristics. Additionally, due to differences in localized landscape characteristics, the effect of the error level on the upscaling results differed between the two study areas. The implications of these findings are analyzed further below.

Error Simulation Issues
During the error simulations, non-boundary pixels were involved when simulated error levels neared the high end of the tested values (30% to 40% for ASD1810 and only 40% for ASD4550). This occurred because the number of boundary pixels in each study area did not meet the number necessary to reach the desired error level. For example, the misclassified non-boundary pixels accounted for 9.5% of the error in ASD1810 at an error level of 40% (Figure 3). These errors derived from non-boundary pixels can be treated as "salt and pepper" noise, a common issue when implementing classification algorithms to obtain crop maps [60,61]. Therefore, errors produced by non-boundary pixels were considered acceptable in obtaining the thematic maps used in this study.
Additionally, the error simulation is based on misclassifying crop pixels.In reality, misclassification can occur between crop and non-crop as well.For example, Shao and Lunetta [62] reported that crop was misclassified as other land cover types, such as forest and urban.However, per Congalton [52,63], error occurs most frequently at the boundaries of agricultural patches, which was the basis of our simulation.In addition, it is difficult to define the probability of misclassification between different types of non-crop and crop types.Future work should consider constructing a general probability of misclassification table for land cover maps based on the derived expertise knowledge database.

Impacts of Upscaling and Map Error on PE and OC
The PE in coarse-scale aggregated maps is affected by the amount of error in the base maps. Base maps with higher levels of error led to greater PE for each crop after upscaling (Figures 6 and 7). For example, at a spatial resolution of 600 m, the PE of sorghum in ASD4550 increases from 77.3 to 83.0% as the error level increases from 0 to 40%. This is due in part to changes in the frequency of each class in the coarse maps. The change in landscape (CLP) is another possible reason for the increase in PE as base map error increases, as suggested by the different performance of the two study areas, which have different landscape characteristics (see Section 4.4).
As would be expected, OC decreases with increasing base map error (Figure 8). For example, in ASD1810, OC decreased from 92.45 to 82.41% when the error level of the base map increased from 0% to 40%. This result is consistent with a previous study [19]. Although the PE of each crop type showed a general increasing trend as the error level increased, the PE of non-crop fluctuated when the resolution was between 60 m and 120 m in ASD1810 (Figure 6). These results demonstrate that one must consider the level of thematic error within the base map prior to implementing MRB. The base map should be as accurate as possible to avoid greater impacts in the rescaled maps.
The PE was also sensitive to the initial proportional area of each crop in the base map. This has been reported in previous studies [19,20,31,32,34,64]. Crops making up a lower proportion of the landscape generally obtained higher PE when the base maps were upscaled. For example, in ASD4550, the proportions of corn, soybeans, cotton, sorghum, and winter wheat were 5.22%, 4.50%, 2.36%, 0.10% and 0.03%, respectively. At an error level of 0%, the PE of corn, soybeans, cotton, sorghum, and winter wheat increased by 49.06%, 61.63%, 65.14%, 65.65% and 85.64%, respectively, when the base map was upscaled to 960 m. ASD1810 demonstrated a similar trend. Thus, in addition to being cognizant of the amount of thematic error, we should be aware of the proportional area of each map class.
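The sensitivity of PE to class proportion can be illustrated with a small, purely synthetic sketch. It is not the authors' implementation: the function names (`majority_aggregate`, `proportional_error`), the tie-breaking rule, the 8 x 8 toy map, and the PE formula used here (absolute change in class proportion relative to the base-map proportion, in percent) are all assumptions made for illustration; the paper's own definition of PE is given in its methods section.

```python
from collections import Counter

def majority_aggregate(grid, factor):
    """Upscale a 2-D class grid: each factor x factor block takes its most frequent class."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        coarse_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            counts = Counter(block)
            top = max(counts.values())
            # Assumption: ties are broken by the smallest class code.
            coarse_row.append(min(k for k, v in counts.items() if v == top))
        coarse.append(coarse_row)
    return coarse

def class_proportions(grid):
    """Return the share of cells occupied by each class."""
    flat = [v for row in grid for v in row]
    return {k: n / len(flat) for k, n in Counter(flat).items()}

def proportional_error(base, coarse):
    """Assumed PE: |p_coarse - p_base| / p_base * 100 for every class in the base map."""
    p_base = class_proportions(base)
    p_coarse = class_proportions(coarse)
    return {k: abs(p_coarse.get(k, 0.0) - p) / p * 100 for k, p in p_base.items()}

# Toy base map: class 0 = non-crop, class 1 = a compact crop field,
# class 2 = a rare crop present only as isolated pixels.
base = [[0] * 8 for _ in range(8)]
for i in range(4):
    for j in range(4):
        base[i][j] = 1
for i, j in [(5, 5), (6, 1), (7, 7)]:
    base[i][j] = 2

coarse = majority_aggregate(base, 2)   # e.g., 30 m -> 60 m
pe = proportional_error(base, coarse)
for cls in sorted(pe):
    print(cls, round(pe[cls], 2))
```

In this toy case the compact crop keeps its proportion after aggregation, while the rare, scattered crop is absorbed by the surrounding non-crop class, which is the qualitative behavior described above for low-proportion crops.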
Based on the results presented, both the level of error in the base map and the proportional area of the crops in the base map have impacts on the PE in the upscaled maps. To obtain a relatively accurate upscaled map, the analyst must consider not only both of these factors, but also the objective of the project. Reducing the classification error typically results in higher costs to the project; however, the benefits may not always outweigh this cost. To accurately obtain the distribution of a crop with low proportional area, reducing the base map error may not be an optimal choice, as it may not contribute much towards decreasing the PE after upscaling. For example, if the distribution of winter wheat in ASD4550 from a map with a resolution of 720 m is required, reducing the base map error would have little effect. The PE for winter wheat across all error levels tested was very similar (Figure 7); thus, reducing base map error is not recommended, as it would not appreciably change the PE. Additionally, PE cannot be greatly reduced when the dominant land cover constituting a very high proportion of the landscape is upscaled to a very coarse resolution. For example, the PE of non-crop (occupying about 87.8% of the landscape area) was reduced by only approximately 1.46% despite reducing the error level from 40 to 0% when the coarse map was created with 960 m grids for ASD4550. In such cases, reducing the base map error is not an efficient way to obtain a lower PE from the coarse maps. Hence, in addition to the base map error level and the proportional area, the decision to implement MRB should also consider the intended application of the coarse map.

Comparison of Landscape Changes Based on Different Error Level
The fragmentation of the landscape increased with increasing base map error (Figure 4) when pixel sizes were less than 240 m (Figure 9a,b). As the pixel size increased, the influence of the error on fragmentation was reduced. When the pixel size was larger than 240 m, there were no discernible differences among the coarse maps. For example, in ASD1810, the PPU was reduced by 6.70% at a resolution of 60 m, while the PPU was reduced by only 1.56% at 120 m. These results demonstrate that if coarser maps at a relatively large pixel size are produced by MRB, considerable care must be taken when these maps are used for analysis concerning the landscape characteristics, since the larger pixel size reduces the fragmentation. In addition, if the coarse maps are used for analysis in models that are not related to landscape characteristics, reducing the base map error should not be considered an important issue in this analysis. Furthermore, landscape characteristics strongly affect the performance of the upscaling, as illustrated in Section 4.4. Therefore, one should be aware that MRB changes the landscape patterns, especially when used to obtain coarse maps at a relatively small pixel size.
Increasing the base map error level did not have a considerable impact on the shape complexity in the resulting coarse maps (Figure 9c,d); however, the trends in shape complexity varied across the two study areas. In ASD1810 (the more homogeneous study site), when the pixel size was less than 240 m, increasing error led to an increase in SqP in the coarse maps, while the opposite was exhibited when the pixel size was larger than 240 m. In ASD4550 (the more heterogeneous study site), the base map error level impacted the shape complexity in the coarser maps more strongly than in the finer resolution maps. For example, increasing the error level resulted in a reduction of about 0.09 in SqP when the pixel size was 960 m, while the reduction was only about 0.01 when the pixel size was 360 m. Although the influence of base map error on SqP does not show an obvious trend, these results demonstrate that the base map error level does result in uncertainty regarding the shape complexity in the coarse maps. This analysis further strengthens our confidence that base map error strongly impacts the coarse maps in terms of PE and landscape characteristics.
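As a rough illustration of why scattered misclassified pixels raise a fragmentation measure such as PPU, the sketch below counts patches on two toy grids. It assumes 4-neighbour connectivity and the commonly cited patch-per-unit form PPU = m / (n · λ), where m is the number of patches, n the number of pixels, and λ the pixel area; these choices, the function names, and the toy grids are assumptions for illustration, and the definitions actually used in the study are those of Section 2.5 and Fragstats.

```python
def count_patches(grid):
    """Count patches (connected regions of equal class) using 4-neighbour connectivity."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            patches += 1
            cls = grid[r][c]
            stack = [(r, c)]
            seen[r][c] = True
            while stack:  # iterative flood fill over same-class neighbours
                i, j = stack.pop()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if (0 <= a < rows and 0 <= b < cols
                            and not seen[a][b] and grid[a][b] == cls):
                        seen[a][b] = True
                        stack.append((a, b))
    return patches

def ppu(grid, pixel_area_km2):
    """Assumed PPU: patches per unit area, m / (n * lambda)."""
    n_pixels = len(grid) * len(grid[0])
    return count_patches(grid) / (n_pixels * pixel_area_km2)

# Two toy maps at 30 m (pixel area 0.0009 km^2): one clean, one with two
# isolated misclassified pixels standing in for simulated mapping error.
clean = [[1, 1, 2, 2],
         [1, 1, 2, 2],
         [1, 1, 2, 2],
         [1, 1, 2, 2]]
noisy = [row[:] for row in clean]
noisy[1][0] = 2
noisy[2][3] = 1

print(count_patches(clean), count_patches(noisy))
print(ppu(clean, 0.0009), ppu(noisy, 0.0009))
```

Each isolated erroneous pixel becomes its own patch, so the patch count and the assumed PPU rise with error at fine pixel sizes; majority aggregation tends to remove such single-pixel patches, which is consistent with the diminishing influence of error at coarser resolutions.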

Comparison of the Performance Based on Different Study Areas
A comparison of the OC in the two study areas shows that increasing the base map error resulted in a greater reduction in OC for ASD1810 than in ASD4550 (Figure 8). The reason for this difference is that changes in the landscape caused by the introduction of error in ASD1810 were greater than in ASD4550 (Figure 4). For example, increasing the error level from 0 to 40% reduced OC by 14.30% in ASD1810, but only 2.57% in ASD4550. The change in landscape (CLP) of 34.33 in ASD1810 was about 2.7 times that of ASD4550. These results demonstrate that the degree of CLP produced by crop mapping error is an additional factor that affects the performance of MRB. Consequently, we suggest that pre-existing land cover maps with relatively higher accuracy should be used to assess the error level as well as the changes in landscape of the current thematic maps at the start of the project. Therefore, to reduce PE of upscaled maps, three factors must be considered: (1) the objective of the project; (2) the level of thematic map error; and (3) the change in landscape as a result of thematic errors compared to the distribution of land cover in the original, unscaled map.

Limitations and Future Work
We are aware of three limitations with this research. First, our error simulation was based on a mathematical method that assumed mapping error occurred randomly for boundary pixels of agricultural patches with lower confidence levels. However, various factors, such as complexity of the spectral response [65], classification algorithm [66], uncertainty of samples [67], and spatial resolution of data [68], may affect the accuracy of thematic maps. As discussed in Section 4.1, error also occurs between crop and non-crop, which may behave differently when upscaling maps. In addition, combining the upscaling method with an error analysis based on the classification of remote sensing imagery is a potential way to reduce the uncertainty/error in the upscaling process. Furthermore, the MC simulation produced error by swapping the class types of selected pixels that meet the misclassification condition according to Dong et al. [8]. Other principles for simulating error could also be applied. For example, a random class type could be assigned to the pixel to be misclassified. Such alternatives should be investigated in future work. Therefore, to acquire relatively realistic error maps, a definition of the probability of error occurring between different cover types should be constructed in any future work.
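The difference between the two simulation principles mentioned above (swapping classes versus assigning a random class) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Dong et al. [8]: the acceptance rule (a uniform draw must exceed the pixel's confidence level, here scaled to [0, 1]), the function names, and the toy data are all hypothetical.

```python
import random
from collections import Counter

def simulate_swap(labels, conf, candidates, n_errors, rng, max_tries=100_000):
    """Introduce error by swapping the classes of accepted candidate pixel pairs."""
    out = list(labels)
    changed = set()
    tries = 0
    while len(changed) < n_errors and tries < max_tries:
        tries += 1
        i, j = rng.sample(candidates, 2)
        if i in changed or j in changed or out[i] == out[j]:
            continue
        # Assumed condition: both uniform draws exceed the pixels' confidence levels.
        if rng.random() > conf[i] and rng.random() > conf[j]:
            out[i], out[j] = out[j], out[i]
            changed.update((i, j))
    return out

def simulate_random(labels, conf, candidates, n_errors, classes, rng, max_tries=100_000):
    """Introduce error by assigning a random different class to accepted candidate pixels."""
    out = list(labels)
    changed = set()
    tries = 0
    while len(changed) < n_errors and tries < max_tries:
        tries += 1
        i = rng.choice(candidates)
        if i in changed:
            continue
        if rng.random() > conf[i]:
            out[i] = rng.choice([c for c in classes if c != labels[i]])
            changed.add(i)
    return out

# Toy data: 100 candidate (boundary) pixels, three crop classes, confidence 0.5 everywhere.
labels = [1] * 50 + [2] * 30 + [3] * 20
conf = [0.5] * len(labels)
candidates = list(range(len(labels)))

swapped = simulate_swap(labels, conf, candidates, 20, random.Random(0))
randomized = simulate_random(labels, conf, candidates, 20, [1, 2, 3], random.Random(0))

print(Counter(labels), Counter(swapped), Counter(randomized))
```

One design consequence is visible in the class counts: swapping leaves the area of every class in the base map unchanged and alters only the spatial arrangement, whereas random assignment may also shift class proportions, which would confound error level with changes in proportional area.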
The second major limitation is that our analysis is based only on MRB. Various aggregation methods (e.g., point-centered and distance-weighted moving window [24]) can be used to upscale maps. Different methods may perform differently depending on their aggregation logic [64]. We selected MRB here as recommended by the literature for analyzing agricultural areas. Future studies could repeat this analysis with different aggregation methods for comparison with our results.
Last, although the study districts employed in the experiments have typical and contrasting landscape patterns with different heterogeneity, numerous other landscape scenarios could be employed to further extend the findings of this paper. Future work should focus on how to simulate different landscape scenarios to further investigate the interactions between landscape pattern and upscaling effects on characterization of land cover distribution.

Conclusions
This study presented an investigation of the impacts of crop mapping error on upscaling based on the majority rule based aggregation method (MRB). We used the Cropland Data Layer (CDL) for 2016 at a 30 m spatial resolution, along with its corresponding confidence layer and Monte Carlo simulations, to generate eight new agricultural base maps, each with a different level of error (5%, 10%, 15%, 20%, 25%, 30%, 35% and 40%), at two different Agriculture Statistic Districts (ASDs). MRB was used to upscale each base map to 10 coarser resolution levels (60 m, 90 m, 120 m, 240 m, 360 m, 480 m, 600 m, 720 m, 840 m and 960 m) for each study site. The results showed that three factors influence the performance of MRB: (1) the base map error level; (2) the proportional area of the crop in the base map; and (3) the change in landscape (CLP) as a result of increased error. The proportional error (PE) for each crop is highly affected by crop mapping error. Using base maps with lower error yields lower PE for each crop. In addition, the uncertainty of shape complexity produced by upscaling strengthens our confidence that reducing the error level is a potential way to reduce the PE when upscaling maps. The proportional area of the crop in the base map significantly impacts the performance of upscaling. Crops with a relatively low proportional area are influenced more than crops with a relatively higher proportion. Greater changes in landscape characteristics produced by the error maps resulted in higher PE in the upscaled maps. Additionally, the landscape characteristics in coarse maps show that increasing error level leads to an increase in the fragmentation of the landscape at pixel sizes below 240 m. Therefore, to obtain an upscaled map with lower PE and fewer changes in landscape characteristics, we recommend that first, pre-existing land cover maps with the highest accuracy possible be employed to assess the error level and the CLP of the thematic maps. Then, three factors should be considered when upscaling maps: (1) the objective of the research or project; (2) the error level of the thematic maps; and (3) the CLP of the thematic maps. The uncertainty/error information from upscaled maps should be used to analyze the influence of error level on the results of modeling performed using these upscaled maps and to guide any decisions to reduce the upscaled map error in these models. Finally, future work should concentrate on constructing a predefined probability of error occurring between different cover types to produce realistic error maps beyond just the agricultural crops used in this study. Various aggregation methods should then be employed to explore the impact of the error level of the thematic map on the performance of upscaling and to examine the applicability of each method for different land cover types. Moreover, the interactions between landscape patterns and upscaling should be further explored by employing numerous landscape scenarios.

Figure 2. General procedure for producing error map using MC simulation method. BP1 and BP2 are boundary pixels randomly selected from the land cover map. CL1 and CL2 are the confidence levels corresponding to BP1 and BP2. n1 and n2 are two numbers randomly generated from a uniform distribution ranging [0, 1].

Figure 3. Error maps of two study areas. Note each column represents the maps with different error levels. The upper panel shows the base maps in ASD1810, while the lower panel shows the base maps in ASD4550. In each panel, the second row presents a sub-region within each base map.

Figure 4. Changes in landscape of the base maps with increasing error. The change in landscape (CLP) is calculated by Equation (4).

Figure 5. Upscaled maps for ASD1810 at selected resolutions and error levels. Each column represents a map with a different resolution. Each row represents the results of upscaling with a different error level.

Figure 6. Changes in proportional error (PE) for each crop type at different resolutions with increasing base map error at ASD1810: (a-f) PEs of the non-crop, corn, sorghum, soybeans, winter wheat, and alfalfa, respectively.

Figure 7. Changes in PE for each crop type at different resolutions with increasing base map error at ASD4550: (a-f) PEs of the non-crop, corn, sorghum, soybeans, winter wheat, and cotton, respectively.

Figure 8. Overall consistency (OC) performance with different error level: (a) the results in ASD1810; and (b) the results in ASD4550.

Figure 9. Changes of PPU and SqP in ASD1810, and ASD4550: (a,b) the changes of PPU and SqP, respectively, in ASD1810; and (c,d) the changes of PPU and SqP, respectively, in ASD4550.

Table 1. Landscape characteristics of the study areas. All landscape metrics were produced by Fragstats version 4.2, a spatial analysis software package for computing landscape metrics. TA means total area, NP means number of patches, LPI means largest patch index, and AI means aggregation index. Dominance measures the extent to which one or a few classes dominate the landscape. PPU is a measure of fragmentation. Square-pixel index (SqP) measures the shape complexity of the landscape. More details about these landscape metrics can be found in Section 2.5.
