Social and environmental scientists are increasingly using highly detailed models of land use and land change (LULC) [1
] with the goal of better understanding human influence on future LULC and ecosystems [3
]. Urbanization is a dominant global and regional driver of LULC with far-reaching impacts on climate through decreased surface albedo and increased emissions of CO₂ and other greenhouse gases [4
]. Many urbanizing regions are also experiencing decreased water quality [6
], urban heat islands [7
] and fragmentation of habitat [8
]. To quantify and project future changes in urbanization, LULC models often employ historical estimates of land cover combined with biophysical and socio-economic information to create estimates of future change [10
]. Model selection, a critical first step, is based on a study’s purpose and on the available data and tools. Yet when multiple models are suitable for a study, this selection is often arbitrary and left to the researcher’s discretion. Thematic emphasis, spatial and temporal scale, and quantitative methodologies of models vary considerably [11
], which causes differences in model behavior and patterns of change. The lack of studies comparing models with identical input data limits our ability to understand the trade-offs in accuracy among models and therefore constitutes a critical research gap. There is considerable research to quantitatively guide researchers in understanding how accurate a model simulation is in terms of its quantity and allocation [12
], but little that evaluates model accuracy in terms of landscape configuration. Configuration refers to the spatial arrangement of landscape types and their shapes.
While there have been several calls for evaluation of the accuracy of LULC models, systematic assessment across models has scarcely been achieved [1
]. A notable exception is the work of Pontius et al. (2008), who compared nine different LULC models. Multiresolution comparison techniques were used to quantify model accuracy in terms of the amount of simulated change and the spatial allocation of categories. This work demonstrated considerable trade-offs among the models in terms of quantity and allocation accuracy. Overlooked was how accurately each model simulated the composition and shapes of different map categories across the landscape, often referred to as configuration. Because the models did not share the same input data, study area, spatial resolution, and number of years simulated, relative inter-model comparisons were of limited utility: it was not possible to separate inaccuracies caused by a model’s inadequate representation of phenomena from inaccuracies resulting from input data differences. Accuracy evaluation across models based on consistent input data, parameterization, and location would better guide the initial selection of a land change model and highlight trade-offs between selecting one model over another.
Existing methods for assessing the accuracy of LULC models are an outgrowth of land cover/use assessments from the remote sensing research community [21
]. Historically, the most common method to assess simulation results was visual inspection by experts [18
]; however, this approach is highly subjective and irreproducible. Several methods have been proposed over the last decade, all of which compare simulated maps with maps of assumed truth [12
]. There is continued debate over accuracy assessment methodologies among land change and remote sensing research foci, with the sole emphasis on quantity and allocation disagreement [13
]. Quantity disagreement is the difference between observed and simulated maps attributable to the difference in proportions of categories [13
]. Allocation disagreement is the difference between observed and simulated maps attributed to differences in matching spatial allocation of categories (e.g., classifying new urban development in a location where urban development is not observed) [13
]. Allocation disagreement can have the same value for multiple, differing model projections and poorly captures where projected changes occur in relation to one another. For example, a simulation that allocates all changes together in one continuous patch could have identical allocation disagreement compared to a simulation where changes are dispersed, forming multiple new patches (Figure 1
). The ecological implications of a landscape with one newly developed, larger patch differ from those of multiple patches that form a heterogeneous, fragmented landscape. Therefore, we propose an additional metric for consideration in accuracy assessments: configuration disagreement. Configuration disagreement quantifies the degree to which the simulated spatial configurations of categories match the observed map irrespective of the specific location of those categories. Numerous ecosystem functions, including biodiversity [27
], nutrient cycles [29
], pollination [30
], as well as water quality [31
] and urban heat islands [32
] are influenced by landscape configuration. All three measurements of accuracy are important, and should be considered when assessing the validity of LULC models. However, to date few studies have answered the call of Pontius et al. (2008) [1
] to evaluate quantity and allocation separately, with none attempting to assess configuration.
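The distinction drawn above can be illustrated with a small sketch. It assumes the cross-tabulation formulation usually associated with Pontius and Millones (2011) [13], in which quantity disagreement is half the summed absolute difference in category proportions and allocation disagreement is the remainder of total disagreement. The function names, the toy 6 × 6 maps, and the use of a 4-connected patch count as a stand-in for configuration are illustrative assumptions, not the procedure used in this study.

```python
import numpy as np

def disagreement(observed, simulated, n_categories=2):
    """Return (quantity, allocation) disagreement as proportions of the map."""
    obs = np.asarray(observed, dtype=int).ravel()
    sim = np.asarray(simulated, dtype=int).ravel()
    # Cross-tabulation of proportions: rows = simulated, columns = observed.
    p = np.zeros((n_categories, n_categories))
    np.add.at(p, (sim, obs), 1.0)
    p /= obs.size
    sim_total = p.sum(axis=1)
    obs_total = p.sum(axis=0)
    agree = np.diag(p)
    quantity = np.abs(sim_total - obs_total).sum() / 2
    allocation = (2 * np.minimum(sim_total - agree, obs_total - agree)).sum() / 2
    return quantity, allocation

def count_patches(grid, category=1):
    """Count 4-connected patches of `category` with an iterative flood fill."""
    grid = np.asarray(grid)
    seen = np.zeros(grid.shape, dtype=bool)
    patches = 0
    for r, c in zip(*np.nonzero(grid == category)):
        if seen[r, c]:
            continue
        patches += 1
        seen[r, c] = True
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < grid.shape[0] and 0 <= nx < grid.shape[1]
                if inside and grid[ny, nx] == category and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
    return patches

# Hypothetical maps: 1 = new urban development, 0 = everything else.
observed = np.zeros((6, 6), dtype=int)
observed[0:2, 0:2] = 1                      # one observed 2 x 2 patch

clustered = np.zeros((6, 6), dtype=int)
clustered[4:6, 4:6] = 1                     # one simulated 2 x 2 patch, wrong place

dispersed = np.zeros((6, 6), dtype=int)
for cell in [(0, 3), (2, 5), (4, 1), (5, 4)]:
    dispersed[cell] = 1                     # four isolated simulated cells

q_clustered, a_clustered = disagreement(observed, clustered)
q_dispersed, a_dispersed = disagreement(observed, dispersed)
print(q_clustered, a_clustered, count_patches(clustered))
print(q_dispersed, a_dispersed, count_patches(dispersed))
```

Both toy simulations place the correct number of cells in the wrong locations, so they share the same quantity and allocation disagreement, yet one forms a single patch and the other four. This is the difference that a configuration measure is meant to capture.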
A systematic review by van Vliet et al. (2016) [33
] of modeling applications published from 2010 to 2014 revealed that 68 percent of applications assessed allocation accuracy, and only 23 percent of studies determined quantity accuracy. The specific methodology for assessing allocation accuracy differed among studies [33
]. LULC model accuracy and proper validation are important for achieving credibility in decision support related to landscape planning [34
]. Therefore, a consistent methodology for assessing model accuracy would offer land change scientists a reproducible means to perform cross model comparisons and elucidate trade-offs among quantity, allocation, and configuration. In this study, we build upon the work of Pontius et al. (2008) by comparing four inductive pattern-based land change models—SLEUTH [36
], GEOMOD [37
], Land Change Modeler (LCM) [38
], and FUTURES [39
]. We hypothesized that differences in quantity, allocation, and configuration accuracy would arise from differences in model characteristics. To isolate these differences in model characteristics, we used the same input data for each model, focusing on a study location in the rapidly expanding metropolitan region of Charlotte, North Carolina. To compare the accuracy of urban development projections produced by each model from 2006 to 2016, we (1) quantified quantity disagreement, (2) quantified allocation disagreement, and (3) provided a methodology for evaluating configuration disagreement. Taken together, when the same input data is used, these three metrics allow trade-offs in model accuracy to be identified that are specifically attributable to differences in each model’s characteristics.
This work demonstrates cross-model comparisons based on a variety of validation metrics using consistent input data. Maintaining the same input data separates the differences in simulation that arise from the model’s function from differences attributable to input data. This allows for quantification of the trade-offs among quantity, allocation, and configuration disagreement attributable to each model’s function. Our results suggest that these four land change models produce representations of urban development with substantial variance, where some models may be better suited depending on which type of accuracy is most important for a specific analysis. For example, quantity accuracy may be most appropriate for macro-scale studies of development, whereas allocation and configuration accuracy are more appropriate for detailed ecological modeling. Taken together, these results demonstrate that urban development can be quantitatively projected to a future time point with high levels of accuracy for quantity, allocation and configuration, but that no one model performs best at simulating all three simultaneously.
Analyses whose primary concern is to project the quantity of change should use LCM, as its quantity disagreement averaged less than 0.5 percent across ten simulations. Allocation accuracy was highest for LCM and SLEUTH, indicating that analyses primarily concerned with simulating change pixels in the correct locations should consider using either of these models. However, trade-offs in allocation disagreement between these two models exist when evaluated along an urban to rural gradient. Results indicate that LCM is likely better suited to simulating new development in regions with urban densities less than 80 percent, whereas SLEUTH performs best in highly urbanized areas (>80 percent). While SLEUTH performs best for spatial allocation in dense urban areas, it comes with the trade-off of overestimating the quantity of new development by an average of 10.5 percent of the landscape. Configuration disagreement results reveal that FUTURES and GEOMOD simulate landscape-level configuration patterns of new development best, with FUTURES performing better in dense urban and mixed suburban environments. GEOMOD simulates new development configurations in rural counties better than the other three models. Collectively, it is critical to understand urban density characteristics of the study extent in relation to the trade-offs associated with each model prior to selection.
Our results indicate that allocation accuracy does not automatically confer configuration accuracy, indicating trade-offs between the two. FUTURES best simulated landscape-level configurations of development, but at the trade-off of lower allocation accuracy. Comparatively, GEOMOD, LCM and SLEUTH tend to agglomerate new growth predominantly adjacent to existing development in a “tree ring” type of pattern (Figure 6
). This creates homogeneous development areas and fails to capture heterogeneous patterns of development, where new patches of spontaneous growth emerge. FUTURES’ patch-growing algorithm maintained a greater degree of landscape heterogeneity and best simulated patch-based configurations of development. Configuration accuracy is indiscriminate with regard to cell-level transitions, but can be used to describe the emergence of macro-level outcomes [33
]. Therefore, while calculating configuration disagreement is not suitable for small areas (e.g., the
km grid analysis), county level analyses demonstrate that FUTURES and GEOMOD are better suited for specific development densities along the urban to rural gradient (Figure 5
). FUTURES simulated landscape configurations of 85 percent or greater LSI in seven of the ten counties when compared to configurations observed in 2016. Furthermore, FUTURES simulations had LSI values below 75 percent for only two predominantly rural counties. These high LSI values suggest that when modeling objectives are focused on accurately representing landscape-level configuration, FUTURES is likely the most suitable model. In general, GEOMOD simulated landscape configurations well, especially in counties trending more rural. However, LSI values for all four models were lower in predominantly rural areas, suggesting further investigation and modeling improvements are needed to better project new urban development in rural contexts.
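This section reports configuration results as LSI percentages without restating how they are calculated. As a rough illustration only, the sketch below assumes that LSI denotes a landscape shape index in a simplified raster form, 0.25 × edge length / √area in cell units, and that agreement is expressed as the ratio of the smaller to the larger index. Both assumptions, along with the function names and toy maps, are hypothetical and may differ from the procedure used in the analysis.

```python
import numpy as np

def shape_index(grid, category=1):
    """Simplified raster shape index: 0.25 * edge_length / sqrt(area), in cell units."""
    mask = np.asarray(grid) == category
    area = mask.sum()
    if area == 0:
        return 0.0
    # Pad with background so cells on the map border contribute edge length.
    padded = np.pad(mask, 1, constant_values=False)
    # Count cell sides where the category meets anything else.
    edges = ((padded[1:, :] != padded[:-1, :]).sum()
             + (padded[:, 1:] != padded[:, :-1]).sum())
    return 0.25 * edges / np.sqrt(area)

def shape_agreement_percent(observed, simulated, category=1):
    """Hypothetical agreement score: smaller index over larger index, as a percent."""
    obs = shape_index(observed, category)
    sim = shape_index(simulated, category)
    if max(obs, sim) == 0:
        return 100.0
    return 100.0 * min(obs, sim) / max(obs, sim)

# Toy maps: 1 = new urban development, 0 = everything else.
observed = np.zeros((6, 6), dtype=int)
observed[0:2, 0:2] = 1                      # one compact 2 x 2 patch

compact_sim = np.zeros((6, 6), dtype=int)
compact_sim[4:6, 4:6] = 1                   # same shape, different location

scattered_sim = np.zeros((6, 6), dtype=int)
for cell in [(0, 3), (2, 5), (4, 1), (5, 4)]:
    scattered_sim[cell] = 1                 # four isolated cells

print(shape_agreement_percent(observed, compact_sim))
print(shape_agreement_percent(observed, scattered_sim))
```

Because the index depends only on edge length and area, a simulation can score well on this kind of measure while placing development in the wrong cells, which is consistent with the observation above that configuration accuracy is indiscriminate with regard to cell-level transitions.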
This methodology presents generalizable techniques for assessing three distinct types of accuracy and indicates that different models should be selected depending on the goals of the specific analysis. Expanding on Pontius and Millones (2011) [13
], we recommend that quantity, allocation, and configuration accuracy each be evaluated in robust accuracy assessments of land change studies. Each of these metrics provides insight into a different component of model accuracy, and evaluating them together gives a more complete understanding of it. Some analyses may place greater emphasis on the accuracy of one metric over the others. Ultimately, the determination of whether a model is “valid enough” remains largely at the user’s discretion and is guided by the study’s purpose [14
]. Expert-based validation, while subjective, may provide a valuable supplement to the three metrics recommended, provided that it can be used to improve model accuracy.
Apart from being a research and educational tool, land change models can play an important role in policy and decision making [74
]. Land change models are now being used to simulate alternative future scenarios [1
]. The four models assessed are capable of scenario modeling, some with greater complexity than others. Each model contains the ability to exclude areas from development—useful for simulating conservation planning. Additionally, SLEUTH offers functionality for exploring different economic boom and bust cycles of growth [76
]. FUTURES, GEOMOD, and LCM allow for alternative scenarios depicting different quantity of change estimates. For example, scenarios depicting higher or lower projections of per capita land consumption can simulate dense new development or increased land consumption [39
]. FUTURES uniquely offers the ability to simulate management policies related to the importance of placing new development near existing urban areas (e.g., sprawl vs. infill). Furthermore, FUTURES’ ability to manipulate patch size distributions and related characteristics may be helpful in understanding landscape-level pattern dynamics [58].
While quantitatively assessing the accuracy of scenarios depicting future events is not possible, this analysis contributes to understanding these models’ ability to realistically simulate future quantity, allocation and configurations from historical data. How realistic a “future scenario” will be depends on many factors such as how the demand for urban development may change or how dense new urban areas may be. We focused on maintaining status quo, linear extrapolations of past trends with results indicating that accurate simulations can be achieved over a decade. Understanding the trade-offs of selecting a stochastic or deterministic model when projecting scenarios is important. By routinely running numerous simulations, stochastic models facilitate exploration of a range of outcomes possibly attributable to the complexity of coupled human-natural systems [41
]. In an urban context, development can emerge in disjunct patches, sometimes referred to as “leapfrogging”. Deterministic models tend to miss leapfrog development by only simulating change adjacent to previously developed areas. The ability of stochastic models to include random, chance events of new development away from existing urban areas may better simulate observed development. Model accuracy is important but, depending on the research objectives, selecting a stochastic model that better represents the heterogeneity in human decision making may be equally significant.
The multilevel modeling structure developed in Meentemeyer et al. (2013) [39
] and implemented in this analysis for models requiring a probability surface allows relationships between drivers of change to vary spatially rather than assuming stationarity across the entire study region [64
]. This is likely the most critical component of the land change modeling process. Accounting for sub-regional level change at the county-scale enables heterogeneous socioeconomic and policy factors to drive simulation results [39
]. A challenge to land use and land cover change analyses is to spatially depict landscape indicators and policy-driven decision-making [74
]. The multilevel modeling structure is an initial resolution to this, as indicated by the increased modeling accuracy compared to previous analyses [1
]. Challenges remain, however, such as incorporating human values and goals in the modeling of land systems for design and planning [78
]. Integrating agent-based and land change models may better simulate the complexities of socio-ecological systems [41].
Unlike previous model comparison exercises [1
], we evaluated model performance using a consistent set of input data and a common starting map. Doing so allowed for more informative cross-model comparisons. While input data was uniform across the four models, some differences in modeling methodologies were unavoidable. SLEUTH’s rigid data requirements and inability to allow for a site suitability surface contributed to the overestimation of the total quantity of change. Like SLEUTH, LCM does not allow for a user-defined quantity of change. In this analysis, LCM performed better than the models with a user-defined quantity of change; however, in other study systems this may lead to less accurate results. More comparative exercises are necessary to determine the optimal method for deriving quantity of change estimates. We also considered a single study location, so these results may be unique to the particular development context of the ten-county region (e.g., zoning, development pressure). Model performance may differ when other regions with different development constraints and pressures are considered. Regional-scale development patterns are the physical manifestation of interacting socio-political decisions [81
], environmental driving factors [82
], and agent-based decision making [79
]. Despite considerable progress, these processes have proven difficult to distill into a computer algorithm. The next generation of land change models should continue to focus on improving computational algorithms that realistically represent land change, but also should improve and standardize model evaluation approaches to allow for greater cross-model comparisons.