Calibrating and Validating a Simulation Model to Identify Drivers of Urban Land Cover Change in the Baltimore , MD Metropolitan Region

We build upon much of the accumulated knowledge of the widely used SLEUTH urban land change model and offer advances. First, we use SLEUTH’s exclusion/attraction layer to identify and test different urban land cover change drivers; second, we leverage SLEUTH’s self-modification capability to incorporate a demographic model; and third, we develop a validation procedure to quantify the influence of land cover change drivers and assess uncertainty. We found that, contrary to our a priori expectations, new development is not attracted to areas serviced by existing or planned water and sewer infrastructure. However, information about where population and employment growth is likely to occur did improve model performance. These findings point to the dominant role of centrifugal forces in post-industrial cities like Baltimore, MD. We successfully developed a demographic model that allowed us to constrain the SLEUTH model forecasts and address uncertainty related to the dynamic relationship between changes in population and employment and urban land use. Finally, we emphasize the importance of model validation. In this work the validation procedure played a key role in rigorously assessing the impacts of different exclusion/attraction layers and in assessing uncertainty related to population and employment forecasts. OPEN ACCESS


Introduction
More than a decade ago, Gardner and Urban [1] pointed out that despite an extensive literature on model validation it was not widely practiced.Further, it was "common to find that the problems and pitfalls of validation and testing are poorly understood, inadequately executed, or entirely ignored" (p.184).While these comments were made in the context of ecosystem models, the same can be said today regarding many land use change models, many of which are rooted in ecosystem science either theoretically or practically.
The cellular automata-based SLEUTH (an acronym based on the data inputs of Slope, Land use, Exclusion, Urban land, Transportation, Hillshade) land use modeling system [2,3] is one of the more widely used land use/land cover change models, with dozens of applications worldwide, many in the peer review literature [4].SLEUTH has been used to explore questions related to urban theory [5][6][7], visualize and evaluate impacts of land use policy [3,8,9], develop calibration techniques [10], and explore questions related to geocomputation [3,11].Model sensitivity to spatial scale [12], temporal scale [13], and input data [14] have also been explored.Examples of SLEUTH's incorporation into coupled modeling systems have also been developed [15,16].
This work, spanning well over a decade, has served to increase user confidence in the SLEUTH model, what Gardner and Urban [1] call "face validity" (p.186).However, even with a model as well understood and widely used as SLEUTH, there are few examples of the model being validated.In this case, we consider validation to mean the comparison of model performance (i.e.simulated land cover change) to objective and independent observed system behavior (i.e., actual land cover change) [1].While SLEUTH's calibration process requires the use of historic land cover data, to our knowledge there is only one example [13] of a calibrated SLEUTH model being used to generate simulations of land cover change that are then compared to an independent data set, data that were not used to calibrate the model.Indeed, the process of validation within land use/land cover change models in general is rare.While many methods have been developed to evaluate land use/land cover change model performance (e.g., [17][18][19]), data limitations are often cited as the main reason why validation is avoided.Therefore, SLEUTH's implementation steps are often given as an incomplete five step process: software compilation, preparation of input data, calibration, prediction, and analysis of results [20,21].We argue that including the step of validation is necessary to improve the rigor of land use/land cover change modeling.In many regions, data limitations should no longer pose a barrier given the wide availability of historic remote sensing data and land use classification techniques.Further, validation provides a powerful technique to better understand land use systems that are being considered during a modeling exercise.
In this research, we build upon the work of Jantz et al. [3], and much of the accumulated knowledge of SLEUTH (its "face validity"), and offer some additional advances.First, we leverage SLEUTH's exclusion/attraction layer to identify and test different urban land cover change drivers; second, we utilize SLEUTH's self-modification capability to couple the urban land cover change model with a demographic model; and third, we include the procedure of validation to both quantify the influence of land cover change drivers and assess uncertainty.We also generate forecasts to 2030 based on our findings.We use the Baltimore region as our test area.

The SLEUTH Model
Although full descriptions of the SLEUTH model can be readily found in the literature (especially [2,3,5]) and on-line at the SLEUTH website, Project Gigalopolis [22], we present a brief, simplified overview here.
As a cell-based model, urban and land use systems are represented within SLEUTH as data layers of regular gridded arrays.SLEUTH consists of two linked components, an urban change model, which represents its primary function, and a land change model, the deltatron model.In this work, we use only the urban change component, where changes in state from non-urban to urban are determined via the interaction of five growth rules: spontaneous growth, new spreading center growth, edge growth, road-influenced growth, and slope resistance.The relative operative strength of each of these rules is in turn determined by parameterizing coefficient values (0-100, where 0 indicates no influence and 100 indicates maximum influence) for each of the five controlling parameters: the dispersion parameter influences spontaneous growth and road-influenced growth, the breed parameter controls new spreading center growth and road-influenced growth, the spread parameter influences edge growth, the road growth parameter influences road-influenced growth, and the slope parameter influences all growth types to capture the effects of slope resistance.Appropriate coefficient values are identified by the user via a calibration (or parameterization) process, where various combinations of coefficient values are tested, usually through a brute force method, and the best performing coefficient set is selected.Performance is usually measured through quantitative comparisons of simulated data to observed data.For urban change modeling, SLEUTH's minimum data inputs include a time series of historic urban land cover change data, a transportation system, percent slope, and a data layer that represents land that is excluded from development.
SLEUTH also has a functionality referred to as "self-modification," which allows the model to simulate different rates of growth over time.When the rate of growth exceeds a user-specified critical threshold SLEUTH initiates a self-reinforcing growth "boom" cycle, and when the rate of development falls below a user-specified critical threshold the model initiates a growth "bust" cycle.Without self-modification SLEUTH will simulate a linear growth rate until the availability of developable land declines and fewer and fewer urbanization opportunities are available.While self-modification adds an important dynamic component to the SLEUTH modeling system, it also introduces another layer of complexity from the perspective of the user.There is no calibration procedure in place to determine the critical growth rate thresholds and little research that focuses on this issue, so most users either opt to disable the function or use the default values.Self-modification has, however, been used to build scenarios for forecasting [3,23].
In this work, we use the SLEUTH-3r model [3], which includes several key innovations to both the modeling software and implementation methods.In terms of computational advances, SLEUTH-3r utilizes memory more efficiently, reducing required RAM by approximately 65%, and also doubles processing speed through modifications to the road search algorithm.SLEUTH-3r also calculates ratio comparisons of the metrics used to evaluate SLEUTH's performance during calibration.Previous versions of SLEUTH relied on least squares regression scores (r 2 ) calculated by the software, which required at least four points in time to parameterize the model and also could result in the selection of sub-optimal coefficient values.With ratio measures, users can identify combinations of coefficient values that both match the trend and numerical values for fit metrics that compare simulated and observed urban change.This also allows for as few as two points in time for calibration.
SLEUTH-3r also addresses issues related to scale sensitivity.When simulating spontaneous development, SLEUTH utilizes a dispersion value to determine the number of spontaneous urbanization attempts.In previous versions of SLEUTH, this value was embedded in the software code and was based on the value of the dispersion parameter, the number of cells in the image diagonal, and a constant dispersion value multiplier equal to 0.005.In SLEUTH-3r, users can test different values for the dispersion value multiplier in order to capture the appropriate level of spontaneous growth within the urban system under study.
Finally, Jantz et al. [3] introduced a new way to use the exclusion layer.In the original conception of this input data set, values within the exclusion layer would range from 0 to 100, where 0 indicates lands that are theoretically open for development and 100 indicates lands that are completely excluded from development [2].Jantz et al. [3] use a value of 50 to indicate lands that neither attract nor repel development, values from 51 to 100 indicate increasing levels of repulsion and values from 49 to 0 indicate increasing values of attraction.This convention allows the exclusion layer to be used more broadly as a suitability layer for new urban development, allowing users to capture factors that both attract and exclude urbanization.We refer to this layer now as the exclusion/attraction layer.This offers a great deal of potential in terms of scenario building, some of which has already appeared in the SLEUTH literature [9,21,23].

Study Area
The modeled Baltimore region (Figure 1) consists of Baltimore City and five surrounding counties (Baltimore, Howard, Harford, Carroll, and Anne Arundel), which represent urban, suburban, and rural landscapes.This region definition is the same definition used by the Baltimore Metropolitan Council and its many planning partners.In recent decades the study area has seen significant suburban development, reflecting a common trend in post-industrial cities such as Baltimore [24,25].Between 1950 and 2004, the City of Baltimore experienced a decrease from nearly one million residents to roughly 600,000.During that same period the neighboring suburban Baltimore County experienced a population increase of more than 179%, as the number of residents grew from a quarter to three quarters of a million [25].Not only were people relocating from the city to its surrounding suburbs, but the population of the metropolitan area grew as a whole.The population of the Baltimore Metropolitan Statistical Area, a statistical region that includes the modeled region and across-the-bay Queen Anne's County, grew by 23% during the fifty-year span, adding nearly half a million residents [24].Figure 2 highlights population trends in the City compared to the modeled region over the 20th century.
Suburbanization has had implications for overall land cover patterns and land use policies.Many counties within the study area were once dominated by agricultural lands, usually greater than 50% of total land coverage at the middle of the last century.By the turn of the 21st century however, most had lost more than half of these lands to urban development [25].In the late 1990's the state of Maryland, realizing the threat that exurban sprawl posed to valuable resource lands, passed legislation limiting low density development.The most notable measures of the statewide smart growth policy act aimed at curbing sprawl were Priority Funding Areas and Rural Legacy Areas [26].These policies provide funding incentives to guide development to existing communities while protecting farmland and open space by purchasing development rights [24].

Data and Methods
In general, we used the SLEUTH model to read and explore a series of historic snapshots that represent the geospatial distribution of non-urban and urban states in the Baltimore metropolitan region.We derived our snapshots from the Chesapeake Bay Land Cover Data series (CBLCD) [27], which cover our study area and represent the geospatial distribution of land cover states for the years 1984, 1992, 2001, and 2006.Because we have multiple historic data points available, we calibrated SLEUTH using the 1984-2001 time period and withheld the 2006 data point for validation.In addition, we used the calibration/validation process as an opportunity to test different suspected drivers of urbanization, and incorporate our findings to generate a series of forecasts from 2006 to 2030.
These tests were implemented by manipulating the exclusion/attraction layer following the methods outlined in Jantz et al. [8].Onsted and Chowdhury [28] noted that such layers are often built with arbitrarily chosen resistance scores and that there is a "dearth of research devoted to improving excluded layer construction" (p. 6).In turn, we tested three different exclusion/attraction layers via three separate calibration and validation runs of the SLEUTH model, two of which are informed by expert information: These tests were implemented by manipulating the exclusion/attraction layer following the methods outlined in Jantz et al. [3].Onsted and Chowdhury [28] noted that such layers are often built with arbitrarily chosen resistance scores and that there is a "dearth of research devoted to improving excluded layer construction" (p. 6).In turn, we tested three different exclusion/attraction layers via three separate calibration and validation runs of the SLEUTH model, two of which are informed by expert information: • Exclusion/attraction layer 1 represents a map of protected lands and areas that are off-limits for new development.Thus, this run essentially represents the performance of the model when only SLEUTH's internal growth rules are used to simulate urban dynamics.
• Exclusion/attraction layer 2 tests model performance when it is provided with a key planning layer, water and sewer service areas, in addition to the prior mentioned map of protected lands.This run is a test of the broadly held assumption that, all other things being equal, areas serviced by water and sewer utilities will attract development.
• Exclusion/attraction layer 3 represents model performance when it is provided with information about changes in population and employment at the scale of regional planning districts (RPDs), which are sub-county enumeration units that will be described in more detail below.This run will test the effectiveness of these factors as drivers of growth in the SLEUTH modeling environment.A location quotient (Q) analysis was used to derive weights for the spatial allocation of growth.
During our initial calibration and validation runs, we disabled SLEUTH's self-modification function so urban land cover from 1984 to 2001 (for calibration) and 2001 to 2006 (for validation) would be based on a linear growth rate.This assumption was based on the fact that, without a priori knowledge about the future growth rate, a growth rate based on the recent past was the best estimate.Based on this analysis, we identified the exclusion/attraction layer that resulted in the best model performance and performed an additional validation where the amount of urban land cover change was constrained based on expected population and employment changes.In the latter case, population and employment data were used to estimate a range of expected urban land cover change for both the validation and forecasting procedures and SLEUTH's forecasts were constrained by invoking the boom or bust cycles that are part of the self-modification functionality.

Land Cover, Transportation, and Slope Inputs for SLEUTH
As mentioned earlier, the main datasets required for SLEUTH to perform its operations include a time series of land use and urban data, a digital elevation model (DEM), a transportation layer, and the exclusion/attraction layer.For this study the land cover and urban datasets were acquired from the U.S. Geological Survey's (USGS) CBLCD [27].Based on imagery collected circa 1984, 1992, 2001, and 2006, these Landsat-derived datasets provide the most accurate and up-to-date land use data for the study area.The USGS made a particular effort to capture low density development in its "developed open space" category.Low density development is often overlooked or misclassified in medium-resolution satellite-derived land use/land cover data [29] and while the CBLCD likely still underestimates this land use category, it represents low density development in the exurban areas better than other comparable datasets, making an important improvement for monitoring and modeling purposes.The CBLCD represents the spectrum of land cover with 16 different land cover classes.Since SLEUTH requires a binary representation of urban land cover, we considered all developed land classes, including developed open space, to represent urban land cover; all other classes represent non-urban land cover.
The transportation layer represents U.S. and Maryland State highways, interstate highways, and important primary and secondary county roads.The network of compiled road features was then rasterized to make an input surface that is recognizable by SLEUTH.The percent slope layer was derived from the USGS National Elevation Dataset (NED).

Exclusion/Attraction Layers for SLEUTH
The first exclusion/attraction layer represents only areas that are fully protected from urbanization.Protected lands were derived from the Maryland Department of Natural Resources (DNR) Protected Lands dataset [30] and include forest and agricultural land easements, state-owned parks and forest, county parks, and federal lands; wetlands in Maryland DNR's wetlands inventory were included; and water bodies as identified in the CBLCD.Transportation systems were also off limits to urbanization, so roads, clover leaf intersections, airports and railroads were excluded.Finally, we excluded areas that may have possessed extensive open areas that SLEUTH would otherwise consider available for development, but which are subject to different development processes, such as military bases.All locations in excluded areas were assigned a value of 100 in the exclusion/attraction layer; all other areas were assigned a value of 50 to indicate neutral weighting for development.We note that the excluded lands identified in this layer were included in the other two exclusion/attraction layers developed next.
The second exclusion/attraction layer was created to test the assumption that urbanization would be more likely in areas serviced by public water and sewer infrastructure.Data for this layer were compiled by collecting geospatial service area data from the counties within the BMR.In this layer, locations within water and sewer service areas would be given a value less than 50 to indicate areas that would attract growth.We tested several attraction values (10, 20, 30, and 40) to identify an appropriate weight.Areas outside water and sewer services areas and not otherwise designated as excluded were given the value of 50 to designate a neutral weight.The county service area data include date fields indicating when each existing area came into service and when planned service areas will come on-line.We were, therefore, able to tailor exclusion/attraction layer 2 to match temporally with both the calibration time period (1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001), the validation time period (2001)(2002)(2003)(2004)(2005)(2006), and the forecasting time period .
In addition to the excluded lands data, the third exclusion/attraction layer incorporates the influence of changes in all day human intensity, which is a metric described below.

Incorporating Population and Employment Data into Exclusion/Attraction Layer 3
Cheng and Masser [31] argued (persuasively) that "local knowledge is an essential source of qualitative information" (p.193) that should be used during the urban modeling process.We recognized the value of obtaining information and data from the Baltimore Metropolitan Council's Cooperative Forecasting Group (BMC-CFG).The BMC-CFG consists of economists, demographers and planners at the State of Maryland and local planning offices that develop population, household and employment forecasts for the Baltimore region in conjunction with other jurisdictions from the adjacent Washington, DC, region.They are regional experts and their forecasts are informed by U.S. Census Bureau reports, regional surveys, analyses of changing market conditions, and local and state policies.The Cooperative Forecasting Group is a subcommittee of the Baltimore Regional Transportation Board, which is a member of the Baltimore Metropolitan Council (BMC); the BMC distributes BMC-CFG data for multiple enumeration units, including Regional Planning Districts (RPDs).A Regional Planning District is a geographic unit of analysis that has been used since the 1970s by the BMC to serve its planning purposes.Each RPD comprises one or more census tracts and has an estimated population of at least 20,000 residents.We used Round-7c BMC-CFG forecasts, which includes counts for census year 2000, estimates for the years 2005 and 2010, and forecasts for the period 2015 to 2030 (at five-year intervals).We obtained copies of the Round-7c forecasts and copies of the historical Round-5, Round-6, and Round-7b forecasts to assemble a complete time series for the period 1985 to 2030, at five-year intervals.
In order to develop the exclusion/attraction layer 3 informed by the BMC data, we first extrapolated population and employment values for year 1984 and interpolated values for all intermediate years between 1985 and 2030, which yielded a complete annual time series that contains four snapshots comparable with the CBLCD (1984, 1992, 2001, and 2006).To interpolate population values for year 2001, for example, we estimated the compound annual growth rate (Equation (1)) for each RPD for the bracketing 2000-2005 period, for which we have data.Next, we used the known year 2000 population count and the calculated compound annual rate of population growth to estimate the intermediate population value Equation (2).We used the same general workflow to find intermediate employment values.
Equation ( 1).Let G be the compound annual population growth rate to be calculated for regional planning district, r, between successive population estimates, P, which are given at five-year intervals, and .To find compound annual employment growth rates, employment estimates, E, were substituted for population estimates, P.
Equation (2).Let be the intermediate population estimate to be calculated for regional planning district, r, between successive population values, P, which are given at five-year intervals, and ; G is the compound annual population growth rate described in Equation ( 1); and indicates the intermediate year within the 5-year bracket.To find intermediate employment estimates, employment values and employment growth rates were substituted.The amount of urban growth that occurs in any area is influenced by population growth (new places of residence) and employment growth (new places of work).During the industrial era, people tended to live near the place where they worked, so population counts and changes alone could be used to indicate urban growth because places of residence and places of work were proximate.Today, however, people tend to commute between their place of residence and distant places of work.According to 2009-vintage American Community Survey results [32], the Baltimore Metropolitan Statistical Area (MSA) has the tenth longest average commute (29.7 min) among all MSAs.Consequently, information about employment and places of work are now also necessary to reveal the "all day" picture.Accordingly, we conceptualized urban growth in each RPD as the aggregate result of population growth and job growth.Thus, for each RPD, we summed the reported population and employment values to approximate the total number of humans putting land use pressure on the RPD, and divided each sum by the amount of land available.We refer to this metric Equation (3) as "all day human intensity" (ADHI).Intensity values, unlike counts, allow more effective comparisons to be made between differently sized enumeration units.
Equation ( 3).Let H be the "all day human intensity" value for regional planning district, r, at time t; and are the population and employment estimates obtained via equation 2; and A is the amount of non-excluded land area (km 2 ).
Next, we identified RPDs that were expected to capture larger or lesser shares of total regional growth during the period of interest by using the Location Quotient technique Equation (4).
Equation ( 4).Let Q be the quotient to be calculated for regional planning district, r, within growing benchmark region, B; and H represent ADHI at two points in time, t 1 and t 2 .Although the B term did not vary in our experiment, it served as a reminder that Q values are not absolute; rather, they reflect local conditions and the definition of the benchmark region.The technique described above was useful for identifying RPDs that are expected to attract proportional or disproportional shares of total regional growth.We used that information to build an exclusion/attraction layer that, in turn, is used to inform SLEUTH's allocation of growth.A mathematical scaling problem exists, however, because Q can possibly take any value between zero and infinity (even though values greater than 100 are uncommon in practice).Therefore, the exponential domain of possible Q values {0, …, ∞} must be transformed and rescaled to the linear domain and limits required by SLEUTH {0, …, 100}.We used a logarithmic transformation to accomplish the task Equation ( 5).Q values that approach unity, which indicates proportional growth, are transformed to neutral exclusion scores approaching 50.Q values approaching 0 will, when transformed, asymptotically approach the exclusion limit 100 and, so, will be used to discourage SLEUTH from initiating a change there.Q values approaching infinity will, when transformed, asymptotically approach the exclusion limit 0 and encourage SLEUTH to initiate a change there.Overall, this method proved meaningful and useful, but readers are cautioned that it is not suitable for use with benchmark regions that exhibit overall negative growth.Also, the non-linear transformation method will preserve order but not necessarily distances among the transformed Q scores.Onsted and Chowdhury [28] do offer an alternate method that is based on transforming zoning district information into exclusion scores, but the method seems highly dependent upon how zoning categories are defined and/or aggregated.Their method seem most suitable for small areas where zoning data are temporally consistent, non-conforming land uses are distributed evenly, and exemption and variance policies are applied uniformly across the area of interest.
Equation (5).Let E be the SLEUTH exclusion score to be calculated for regional planning district, r, within the BMR, B, between two points in time, t 1 and t 2 ; Q represents the location quotient described in equation 4. For positive values of Q, the transformation is anchored at the value 50, which is the neutral exclusion score.Consequently, a Q value of 0.5, which indicates a local change rate that is half the base change rate, gets transformed to an exclusion score, E, of 64.A Q value of 2.0, which indicates a local change rate that is twice the base change rate, is transformed to 36.And for the rare instances when Q r takes a value ≤ 0, which reflects negative growth, E is assigned the strong resistance value 99.

SLEUTH Calibration
Prior to running SLEUTH, all data inputs were rasterized to the same 30 m × 30 m grid system and extent, and converted into the required Graphics Interchange Format (GIF).Calibration was performed for each exclusion/attraction layer following the workflow outlined in Jantz et al. [3].Prior to initiating a calibration run, an appropriate value for the dispersion value multiplier was identified.Then, we proceeded with a brute force calibration procedure, the most common method for SLEUTH calibration.A range from 0 to 100 was set for each of the five growth parameters and the model stepped through the parameter ranges at increments of 25 (i.e., 0, 25, 50, 75, 100), resulting in 3125 unique parameter combinations and tests.As SLEUTH is probabilistic, the performance results for each parameter combination were averaged over 10 Monte Carlo trials.SLEUTH automatically calculates thirteen study area wide performance measures (or fit statistics) that compare simulated growth to observed growth for each year for which observational data exist.In our case, the temporal domain for calibration was 1984-2001 with an intermediate time step in 1992.
There is no consensus on which of the thirteen fit metrics should be used to evaluate SLEUTH's performance.In this case, we follow the recommendations in [3,12] and focus on the clusters, edges, and area metrics and also rely primarily on the fractional difference measures generated by SLEUTH-3r.Thus, the clusters metric measures the fractional difference between the simulated and observed number of urban clusters, edges measures the fractional difference between the simulated and observed urban edge pixels, and the area metric represents the fractional difference between the simulated area of urban land as represented by the number of urban pixels.To be considered a good fit, a parameter combination had to result in a simulation that matched all three of these fit statistics within ±5% of observed for year 2001.In order to achieve this level of accuracy, we occasionally had to run additional calibration procedures to test a narrower range of parameter values for a particular coefficient at a finer step (i.e., 0-25 at increments of 5).
We then selected three to five parameter sets that performed the best, and re-ran the model in calibration mode over 100 Monte Carlo trials to ensure robust averaging.In addition to relying on the study area wide statistics calculated by SLEUTH, we also evaluated the results for the selection of best performing parameter values spatially.To accomplish this, we first ran SLEUTH in predict mode so that it would generate maps of simulated urban land cover in 2001 (calibrate mode outputs tabular data only).We then imported these results into a geographic information system in order to compare them against the observational data.Because SLEUTH generates maps showing the probability of urban land cover occurring at any cell location, in this case averaged over 100 Monte Carlo trials, a cell by cell comparison is impractical.We therefore aggregated simulated and observed urban land cover to a coarser resolution, a regular array of 480 m × 480 m cells.This coarser-scale array aligned with the 30 m × 30 m array of the full resolution data so that each 480 m × 480 m cell contained 256, 30 m × 30 m pixels.(The coarser-scale array was designed specifically for later coupling with a three-dimensional process-based hydrologic model-a discussion of which is beyond the scope of the present report and will be described in a subsequent report.)For each cell, we calculated both the simulated and observed proportion of urban land cover and then subtracted the observed from simulated proportions.This allows for visual and quantitative assessments of over-and underestimation by the SLEUTH model at the local scale.Coupled with the regional-scale tabular results, this assessment allowed us to identify which of the parameter sets being tested performed the best.A single parameter set for each excluded layer was thus selected to move on to the validation phase.

SLEUTH Validation
The validation time period was 2001 to 2006 and unconstrained validation procedures were run for each of the exclusion/attraction layers described above.The model was initialized using the parameter sets derived during calibration and the 2001 urban land cover data set, then run to forecast urban land cover change to 2006.The transportation and percent slope layers remained constant, as did exclusion/attraction layer 1. Exclusion/attraction layer 2 was updated to reflect any new water and sewer service areas that came on-line between 2001 and 2006, and the RPD weighting scheme in exclusion/attraction layer 3 was updated to reflect new Q scores for the 2001-2006 period.For the unconstrained run, we assumed no a priori knowledge of how much urbanization might take place between 2001 and 2006.SLEUTH was therefore run with the self-modification function disabled and it simulated a linear growth rate (based on the 1984-2001 trend) for all three exclusion/attraction layers.
Based on the results of these validation runs, we identified the exclusion/attraction layer that resulted in the best model and then performed an additional validation to constrain the amount of growth that SLEUTH would forecast given expected population and employment growth.Under the constrained scenario, we used BMC population and employment data to estimate a range of expected growth for both the validation and forecasting procedures and, thus, constrain SLEUTH's forecasts accordingly.
To estimate the range of expected growth, we built a digital spreadsheet containing data that represents the observed period (1984, 1992, 2001, and 2006) and the prediction period (2001-2006 for validation, 2006 to 2030 for forecasting).We tallied the numbers of people and jobs in the region and the amount of land as urban for each year in the observed period, and calculated ADHI values.Next, we calculated the compound annual growth rate (expressed as a percentage per year) of non-urban to urban transitions for each pair of successive dates in the observation period.The rate of urban growth decreased from 0.91%/yr during the 1984 to 1992 interval, to 0.49% during the 1992 to 2001 interval, and to 0.18%/yr during the 2001 to 2006 interval.Next, we employed an "adjustment exponent" to help us simulate smooth and continuous changes to future compound annual growth rates.An adjustment exponent value of 1, when propagated forward (e.g., 1.8E-03 1.00 ), allows the simulated urban growth rate to remain constant throughout the prediction period.An adjustment exponent value less than 1, when propagated (e.g., 1.8E-03 0.96 ), allows the simulated growth rate to decrease gradually over time.And, an adjustment exponent value greater than one (e.g., 1.8E-03 1.04 ) allows the growth rate to increase gradually over time.
We started by developing the pro forma "status quo" scenario, whereby we estimated the compound annual rate of land cover change during the most recent historical period and employed it as a constant throughout the prediction period (2001-2006 for validation, 2006-2030 for forecasting).The adjustment exponent in this case takes the value 1 and the urban footprint for the target forecast year is estimated.Next, we identified the appropriate multiplier in SLEUTH's self-modification function to achieve the estimated target urban footprint under the "status quo" scenario.This scenario has value in as much as one expects the past to determine the future, and serves as a baseline for comparison with other scenarios.
We developed an "accelerating growth" scenario wherein the urban growth rate was allowed to increase gradually during the prediction period.This scenario has value because it provided us with an opportunity to examine outcomes associated with a development boom.To avoid simulating another era of rampant over-building, however, we used brute force to find an adjustment exponent value (1.032) that would allow future rates of non-urban to urban transitions to increase continuously, yet constrain them so that ADHI would not decrease during the prediction period.Next, we identified the appropriate multiplier in SLEUTH's self-modification function to achieve the estimated target urban footprint under a "boom" to reflect accelerated growth.This scenario serves as an upper limit on the range of possible future outcomes.
Next, we developed a "decelerating growth" scenario wherein the rate of non-urban to urban transitions decreases gradually over time.This scenario is worth considering because it coerces new places of residence and work to be located more densely, but not exclusively, in existing urban areas.We wanted to avoid simulating negative growth rates, which we consider impractical in the Baltimore case, so we used brute force to find an adjustment exponent value (0.96) that allowed at least 1 km 2 of non-urban land to transition to an urban state in every year.Next, we identified the appropriate multiplier in SLEUTH's self-modification function to achieve the estimated target urban footprint under future "bust" and reflect a lower growth rate.This scenario has value because it allows future non-urban to urban transitions to occur, but it increasingly favors a smart-growth strategy of developing existing urban communities for more intensive uses.
To assess the accuracy of each validation run, we used techniques similar to those described for calibration accuracy assessment.In addition to the regional tabular statistics calculated by SLEUTH (clusters, edges, and area fractional differences), simulated and observational data were aggregated to the 480 m array to allow for local scale comparisons.

Forecasting to 2030
In order to demonstrate the capability of generating forecasts, we generated status quo, bust, and boom scenario estimates for new urban growth in 2030, using BMC forecasts of population and employment, as described above.We ran SLEUTH in forecast mode for each scenario, constraining the forecasts accordingly, and initializing the model with 2006 urban land cover and the same transportation network and DEM used previously.We generated forecasts using only exclusion/attraction layer 3, which was updated with new ADHI Q scores to reflect the 2006-2030 BMC information.

SLEUTH's Exclusion/Attraction Layers
Exclusion/attraction layer 1 is shown in Figure 3.As noted previously, this layer includes all lands that are off-limits to development and is included in exclusion/attraction layers 2 (Figure 4) and 3 (Figure 5).We note that the water and sewer service areas in Figure 4 show both the data used for calibration (1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001) and validation (2001)(2002)(2003)(2004)(2005)(2006).As noted above, we tested several attraction values (10, 20, 30, and 40) to identify the best weight to use in the water and sewer service areas in exclusion/attraction layer 2 and found that the value of 30 resulted in the best model performance.The ADHI Q weights in Figure 5 are shown for calibration, validation, and forecasting.It is interesting to note the patterns of population and employment change as illustrated by the exclusion/attraction scores in Figure 5.For example, during the 1984-2001 time period, the Baltimore region demonstrates that RPDs in suburban and exurban areas attracted more new residents and jobs relative to the regional average, especially much of Howard County, the city of Westminster in central Carroll County, southern Harford County, and central Anne Arundel County.Meanwhile, the city and inner suburbs were less competitive, a pattern associated with the decentralization or "sprawl" of post-industrial cities [24,25].From 2001 to 2006, there was an apparent phase transition, as nearly all areas excepting the city and inner suburbs have a more or less equal level of competitiveness in terms of attracting population and employment.We note that this era captures the real estate boom that much of the United States was experiencing.Finally, the 2006-2030 time period shows a return to and an enhancement of the pattern demonstrated in the 1984-2001 time period.This is a signal that, despite Maryland's relatively strong smart growth policies and programs, regional experts expect a continuation of urban decentralization in the Baltimore region.

SLEUTH Calibration
Best fit parameters and study area-wide fit statistics for each of the exclusion/attraction layers are shown Table 1.For calibration, model simulations are driven by and compared against the 1984-2001 observational data.We note that there is little variation across the parameter values selected for each exclusion/attraction layer.This convergence indicates that at the scale of the study area SLEUTH-3r does not demonstrate a strong sensitivity to the differences in the exclusion/attraction layers.This is also reflected in the values for the fit statistics, which indicate relatively good model performance across all three exclusion/attraction layers.The fit statistics also indicate consistent model overestimation for overall urban area (0.036-0.046, or 3.6%-4.6%overestimation), urban edge pixels (0.018-0.047), and urban clusters (0.061-0.080).Left unconstrained, SLEUTH will develop land based on the empirical constant growth rate, which produced overestimations of both urban area and urban edge.In the case of urban clusters, the observed number of urban clusters declined over the 1984-2001 time period, indicating coalescence.However, no combination of parameters was able to capture this process for the study area, resulting in an overestimation of urban clusters.This study area-wide assessment fails to reveal the response of the SLEUTH model to the different exclusion layers.Results of a finer-scale comparison for 480 m × 480 m cells are shown in Figure 6.Although generally the spatial patterns of over-and underestimation across the BMR are similar, there are some significant differences that show the response of the model to the different exclusion/attraction layers.First, we observe that exclusion/attraction layer 1, which includes only lands that are excluded from development, results in overestimation around existing urban clusters, reflecting the well documented behavior of the edge growth rule [12].Areas of underestimation coincide with places that experienced rapid and unexpected growth, a limitation that has also been previously documented for SLEUTH [8].In other words, the performance of SLEUTH with exclusion/attraction layer 1 is more or less as expected.
For exclusion/attraction layer 2, we were testing the assumption that areas of existing or planned water and sewer service areas would act to attract development.However, our calibration results show that when these data layers are provided to SLEUTH the overestimation in and around the urban core of Baltimore is greater than what is observed otherwise.This reveals something about the urban system-that despite policy objectives, centrifugal forces prevailed during the 1984-2001 time period in the Baltimore region-and also highlights the historical inertia apparent in Baltimore's current morphology; for most of Baltimore's history, it was a monocentric city.The post-industrial sprawl driving recent growth patterns is a new phase that is only a few decades old.This latter observation is reinforced by the results for exclusion/attraction layer 3, where the exclusion/attraction weights reflected population and employment growth that occurred primarily in suburban RPDs away from the urban core.In this case, SLEUTH still overestimates the amount of growth taking place near the urban core.We do note, however, that overestimation errors are lower when compared to exclusion/attraction layers 1 and 2.

SLEUTH Validation
For the validation process, we compare simulated and observed urban area and patterns for 2006, a data point that was withheld from calibration.Study area-wide results (Table 2) again indicate good model performance across all three exclusion/attraction layers, although this time the performance of the model with exclusion/attraction layer 3 is improved, showing an order of magnitude higher fit scores for area and clusters compared to exclusion/attraction layers 1 and 2. The calibration and validation results discussed thus far are based on SLEUTH model runs that are unconstrained.We also wanted to pursue the question of whether or not the model would perform better when informed by exogenous forecasts that would estimate the demand for land (see Section 4.5).Results for the minimum, maximum, and status quo growth scenarios, run with exclusion/attraction layer 3, are shown in Table 3 and in Figure 7.Of the range of growth levels forecasted, the minimum estimate is the closest to the observed amount of development in 2006 resulting in an improved score for the area fit statistic, although fit statistics for edge and clusters decline somewhat (Table 3).However, when the simulated and observed maps of 2006 are compared at the 480 m scale, it is apparent that the minimum growth scenario performs the best.What is important to note, however, is that the amount of observed urban development is ultimately captured within the range of growth levels we estimated.Table 3. Fit statistics, reported as fractional differences between simulated and observed values, for the constrained validation runs.These runs were performed using exclusion/attraction layer 3.

SLEUTH Forecasts
Through the process of calibration and validation, we were able to show that the model's spatial allocation of new urban development would be positively influenced by including a weighted exclusion/attraction layer that includes information about where population and employment growth are likely to occur.We also effectively incorporated a method to estimate a range of new urban land cover that would occur given forecasted levels of population and employment.We therefore propagated these methods forward to generate a range of forecasts for new urban land cover in 2030 (Figures 8 and 9).These forecasts were based on BMC estimates of population and employment growth, which informed the spatial allocation of growth via the 2006-2030 exclusion/attraction layer (refer to Figure 5) and the overall amount of growth via the methods presented in Section 4.5.As noted in Section 2, SLEUTH's exclusion/attraction layer is now being used more broadly as a suitability layer for new urban development.In this work, we build on this innovation by using the exclusion/attraction layer to ask questions about what drivers are most important in influencing the spatial allocation of new urban land.We emphasize two key findings.First, we found that new development is not necessarily attracted to areas serviced by existing or planned water and sewer infrastructure.In fact, including water and sewer service areas as an attraction for development arguably lowered the performance of the model considerably, which was already overestimating Actual the amount of development in the urban core.Our second and related key finding is that incorporating information about where population and employment growth is likely to occur, even at the relatively course scale of regional planning districts, improved model performance.Taken together, these findings point to the still dominant role of centrifugal forces in post-industrial cities like Baltimore, MD.In addition, the approach presented here offers a promising approach for using simulation modeling to test hypotheses regarding land use change drivers, similar to the work recently presented by Chaudhuri and Clarke [33].
Few advances have been made with respect to SLEUTH's self-modification functions-most users choose to use the default settings or disable these functions entirely.We know of only one example [16] that relies on the self-modification functions for coupled modeling.In this work, we build on Ciavola et al.'s [16] work and use self-modification as a mechanism to couple SLEUTH with a separate demographic model.In this case, the BMC population and employment database was critical to our work as it represents a reliable observational record and broadly vetted forecasts.However, the relationship between changes in population and employment and urban land use is spatially and temporally variable.To address this uncertainty, we developed a range of urban land use change estimates based on observed and forecasted changes in population and employment.Our validation results emphasize the importance of this approach, since our model estimates captured the actual amount of development that occurred in 2006.Ultimately, the incorporation of the BMC database and the explicit treatment of uncertainty lends validity to our forecasts of 2030 urban land cover change.
This work also demonstrates the importance of model validation.As we pointed out in the introduction, few land use change models are subjected to a true validation, where a calibrated model is compared against new data.We argue that the increasingly wide-spread availability of land use change data should now make validation a given in land use change modeling.In our case, the validation procedure played a key role in rigorously assessing the impacts of different exclusion/attraction layers and in assessing uncertainty related to population and employment forecasts.These approaches will contribute to land use change models being used more widely in local and global policy development.

Figure 1 .
Figure 1.The modeled Baltimore Metropolitan Region (BMR) of adjacent county and county-equivalent areas, which does not include across-the-bay Queen Anne's County (QAC).The highlighted region indicates the extent of urban land covers in 1984 and 2006.

Figure 2 .
Figure 2. Census population counts for Baltimore city and the entire metropolitan region, including the city, between 1810 and 2010.
For calibration, the 1984-2001 time period was used, for validation 2001-2006 was used, and for forecasting 2006-2030 was used.A Location Quotient (Q) is a measure of the relative significance of a phenomenon (in this case, a change in ADHI) in a small region (RPD) compared with its significance in a larger benchmark region (the Baltimore region).Q values greater than one indicate the proportion of change in the RPD was greater than the proportion of change across the benchmark region, while Q values less than one indicate the proportion of change in the small region was less than the proportion of change across the benchmark region.Unity values indicate equal proportions; the small region changed at the same relative rate as the benchmark region.

Figure 3 .
Figure 3.The exclusion/attraction layer used to guide the initial instance of the BMR SLEUTH model.

Figure 4 .
Figure 4.The Exclusion/Attraction layer used to calibrate the "infrastructure" instance of the BMR SLEUTH model.Locations in sewer and water service areas were assigned a mild attraction score (30).

Figure 6 .
Figure 6.Upper panel (A-C) shows maps of difference in % urban per 480 m cell (modeled-observed); Lower panel (D,E) compares the mapped results from the three exclusion/attraction layers.

FitFigure 7 .
Figure 7. Maps showing constrained validation results for (A) the MIN, (B) Status Quo and (C) MAX scenarios run with the RPD exclusion/attraction layer.

Figure 8 .
Figure 8. Maps showing constrained forecast results for (A) SQ, (B) MIN and (C) MAX and for RPD exclusion/attraction layer.

Figure 9 .
Figure 9. Graph showing constrained forecast results for SQ, MIN, and MAX scenarios run with the RPD exclusion/attraction layer.

Table 1 .
Parameter values and fit statistics for the best performing calibration runs.Fit statistics are reported as fractional differences between simulated and observed values.Positive values indicate model-overestimation, negative values indicate model underestimation.Values closer to zero indicate higher fits.

Table 2 .
Fit statistics, reported as fractional differences between simulated and observed values, for the unconstrained validation runs.