Simple Urban Simulation Atop Complicated Models: Multi-Scale Equation-Free Computing of Sprawl Using Geographic Automata

Reconciling competing desires to build urban models that can be simple and complicated is something of a grand challenge for urban simulation. It also prompts difficulties in many urban policy situations, such as urban sprawl, where simple, actionable ideas may need to be considered in the context of the messily complex and complicated urban processes and phenomena that work within cities. In this paper, we present a novel architecture for achieving both simple and complicated realizations of urban sprawl in simulation. Fine-scale simulations of sprawl geography are run using geographic automata to represent the geographical drivers of sprawl in intricate detail and over fine resolutions of space and time. We use Equation-Free computing to deploy population as a coarse observable of sprawl, which can be leveraged to run automata-based models as short-burst experiments within a meta-simulation framework.


Introduction
"All these little strings keep holding us together" [1].In this paper, we present a method for building simple simulations on top of complicated models, as an avenue for accommodating both vantage points in exploration of urban sprawl.Sprawl is a significant urban phenomenon, but its exploration often produces controversy and generates wide variation of opinion, in part because sprawl is approached with both simple and complicated explanations, diagnostic properties, drivers, and consequences [2].Agent-based models have recently been suggested as a computational vehicle for investigating the intricacies of urban sprawl because of their faculties in representing complex systems [3][4][5][6].An assumption of generative ability is central to much of the motivation for using agent-based models in urban simulation: the idea that simple agents can be specified at the microcosm of urban elements and run interactively over large swaths of space and time to produce complex formations such as sprawl at the macrocosm of the urban system.The idea is eloquent and perhaps even romantic, but the reality is that most agent-based models of urban growth or urbanization remain simple in their specification [7], despite decades (maybe even centuries) of theory and observational evidence that suggest that urban processes are actually incredibly more complicated [8].
Without a reliable ability to trace indelible (or at least plausible) paths between the micro-scale of cities and their urban components and macro-scale phenomenological outcomes such as sprawl, urban models are perhaps always open to the criticism that they function as black boxes [9]: stuff goes into the model, stuff comes out, and little understanding of the trajectory between the two is garnered in the process. Beyond the discussion of urban simulation of sprawl, this tension between simplicity and complication is characteristic of a wider critique of complexity studies more generally [10], and of the sufficiency of the notion of emergence as used in vanguard complexity methods such as agent-based modeling [11]. Resolving the validity or usefulness of simple or complicated models in urban simulation is difficult, as both vantages have utility in urban studies and urban policy, and in many cases, we simply do not know how to reconcile the 'mesocosm' between fine-scale urban dynamics and regional-scale urban phenomena. Indeed, we may be building the model to try to identify plausible paths for more detailed theoretical and observational inquiry. Complicating matters further, the urban modeling community's ability to treat city-systems with the sorts of intricacy afforded by approaches such as agent-based modeling, and to test them with big data, is very recent in urban studies [12]. The methods are in a stage of relative infancy and have not matured to the level enjoyed in related simulation fields such as climate modeling, where reconciliation of elaborate paths from micro-scale to macro-scale, featuring all of the messy intricacies in between, is more robustly handled in simulation, and more commonplace in models [13].
It would be tremendously useful, in urban simulation, to have schemes that can handle urban models at both complicated and simple scales of formulation, so that simulations can focus on the interstitial dynamics between the two [14]. Such a scheme would have particular value in studying suburban sprawl, where public policy and management schemes that can be brought to bear on individual urban agents (land parcels, households, developers, infrastructure projects) need to be assessed in terms of their up-scale consequences (smart growth, urban containment, externalities and spillovers) [15][16][17][18][19].
In this paper, we present a scheme for using population as a coarse and diagnostically useful observable of urban sprawl to run many diverse geographical process models of urbanization and urban growth at fine-scale in simulation, as short-burst experiments within an evolving simulation. To facilitate this, we use coarse-graining [20,21] to enable a model-user to enjoy both simple and complicated vantages of sprawl via a meta-simulation. In the examples explored in [20,21], explicit coarse-grained models of a stochastic lattice model are constructed and simulated: both deterministic (mesoscopic) and coarse-grained stochastic meta-models. In our case, we circumvent the explicit derivation of such meta-models; we perform the meta-simulation via the Equation-Free approach, by wrapping techniques such as Coarse Projective Integration (CPI) [22][23][24] atop geography in simulation. For example, in the case of [20,21] the fine-scale variable is population (of chemical species adsorbed on a lattice of catalytic sites); in our example, by analogy, the fine-scale variable is again population, but of persons, on a fine urban grid. Similarly, the coarse-scale variable in [20,21] is again population (but now on a much coarser lattice), or a continuous population density profile at the mesoscopic level, and the analogy persists in our case. The point here is that the same nominal variable (population) can be simulated and observed at a very fine scale (where its evolution may appear strongly random and fluctuating) or at a much coarser (averaged) level, where its expected evolution, averaged over several stochastic realizations, can appear much smoother and more predictable over longer time intervals. It is precisely this "recovery of smoothness" at the coarser level that underpins the existence of useful reduced, meso/macroscopic descriptions.
In what follows, we will show how the Equation-Free approach can be applied to coarse projective integration on population atop geography in urban models. We will discuss the value of population as a coarse observable of sprawl and as a base ingredient of other sprawl indicators. We will also present experiments to demonstrate that the Equation-Free approach can usefully ally simple observables of sprawl with detailed and complicated models of sprawl process dynamics, with the advantage that efficiencies in simulation may be attained, while preserving a faithful realization of the original, fine-scale model. While preliminary, this suggests several avenues for future simulation that could enable simple, policy- and theory-relevant global norms to be used as levers to control massively interactive and complicated underlying models in a decision-support context. In short, we think that the approach can allow urban simulations to be simple while also being complicated.

Background
In modeling cities, there is often a debate about whether simple or complicated representations of urban processes and phenomena are most appropriate for simulation [7,9]. Simplified models have long had a place in urban simulation. Some of the most popular, longstanding, and elegant urban models are simple [25]. Considered practically, it is also often desirable to reduce model results to a few degrees of control, i.e., outcomes or recommendations that can guide urban planners and managers in their decision-making [26].
The desire for simplicity in urban simulation from earlier traditions seems to have persisted into a recent era of agent-based modeling (ABM). Ironically, ABM was pioneered as a new approach to urban modeling because of its ability to support detailed and massively interactive simulation [7]. However, the much-cited advantage of ABM as a generative approach to modeling [27] often begins from an assertion that agent-based models should be designed simply at the microscopic level and that complexity will emerge, from the bottom-up, as they are run through their paces [28]. This has led to some strong criticism that such agent-based models are black-box simulation schemes, in which the mechanisms for generative complexity are not explicitly represented or presented in simulation [29]. If we think about this practically, this situation mirrors the current state of urban studies: we can observe lots of things on the ground in terrific detail, and the city- or regional-level conditions of many cities are also reasonably well-understood, but how we get from one to the other (up-scale, or down-scale) is not always clear. Indeed, this missing link in the middle, so to speak, is beginning to be recognized in urban policy discussions [30,31]. Thinking purely about simulation, proponents of simple agent-based models rightfully point to several benefits of reductionism as a desired design goal for models [32]: there may be philosophical merits [33], simplicity has pedagogical value when models are deployed as tools to think with [34], simple models may be easier to code and use [35], and complicated models may be more difficult to control [29]. Yet simple urban models are often at odds with what our theoretical and observational assumptions about cities suggest [36]: that while some properties of cities conform to simple (usually statistical-mechanical) regularity [37,38], more of their processes and phenomena are anything but simple in their functioning [39].
Realistically, simplicity is just a vantage point on urban systems. Of course cities are rich in complexity and detail, but there are many instances in which a simple viewpoint of urban complexity is desirable. This is often true for urban policy, and examination of the phenomenon of urban sprawl is a good example of this. Debate about what exactly sprawl is has developed for some time [2,40-42]. This debate has carried forward into representation of sprawl in urban models [3]. A resolution to the controversy surrounding sprawl is somewhat elusive because the phenomenon has many causes, drivers, and implications [43] of relevance to a diverse set of stakeholders. Indeed, depending on how one measures sprawl, it may be found to appear or be absent in the same city [16].
Two central axes of controversy in defining, observing, and measuring sprawl seem to be (1) the variable used to diagnose sprawl, and (2) the detail with which one investigates variables [2,16]. Amid the many properties of sprawl, two recurring, common benchmarks do appear in the literature (i.e., they feature in most examinations of sprawl): population density [44] and geography [45]. Sprawl manifests with relatively low densities, and this is one of the main sources of complications (and confusion) with sprawl regarding accessibility [46], impervious surface [47], aesthetics [48], and so on. Less controversial is the idea that sprawl is a form of peri-urbanization, which appears on the urban fringe, arguably with related potential complications of political fragmentation [18], encroachment on agricultural land [49,50], inefficiencies in resource use [51,52], challenges for centralized transportation and transit infrastructure [53], and potential disruption of peripheral ecosystems [54].
In the approach that follows, we will use both mechanisms as control schemes in simulation: population as an ingredient to agent-based geographic processes, and as a coarse observable for numerical simulation. We will treat population as a coarse-scale observable of sprawl (which, incidentally, we could ally to further compound variables or indices such as service provision, activity, land-use intensity, and so on [15,44,55-62]). Population is then used in a meta-simulation framework to control fine-scale modeling of geographical processes for urbanization, urban growth, and urban change using geographic automata specified to the level of small units of space and time within a massively interactive simulated urban space. The meta-simulation framework is flexible, in that different models or observables could be swapped or compared within the same scheme.
We make use of an existing sprawl model [4], although a different model could conceivably take its place in simulation. Elsewhere, we have reported on the accelerated computing scheme underlying this approach [63]. Our intention, in this paper, is to focus on its utility for modeling sprawl and as a scheme for enveloping simple and complicated vantages of an urban phenomenon within a unified simulation platform.

Methods
In the following section, we will present a method for performing Equation-Free numerical simulation of population using a detailed (agent-based) geographic automaton model of sprawl process dynamics at fine scales of space and time. The geographic model is designed to treat space-time drivers of sprawl as spatial processes. The geography of these processes may have particular meaning in environmental, engineering, policy, sociological, and economic settings, but here we focus on their geographic form, as we regard population and geography as being sufficient indicators of sprawl, as discussed above. In dealing with geography, we deploy geographic automata as a polyspatial vehicle for sprawl, which allows the process drivers of the phenomenon to adapt to spatial context and scale [64].

Automata-Based Data Structures, Geographic Automata, and Polyspatial Functionality
We begin by introducing a resourceful computational apparatus for simulation at fine-scale, with the realization that the often bewildering array of component phenomena and entities responsible for urban dynamics requires an extensible medium for simulation. At its essence, the apparatus is fashioned around the automaton because of its ability to be agilely and diversely configured to suit many model scenarios. While our requirement for polyspatial functionality in the model is not a particular representation challenge for the automata framework, the geographical mechanics of the models will be significant levers of dynamic control in the simulations that we wish to build, and so we also present specific, innate parameters for them in the formulation of the automata.
We consider geographic automata (GA) as a specific class of automata, and we seek to integrate GA building blocks as geographic automata systems (GAS) in simulation. The automaton structure has been well known since its introduction by John von Neumann [65] and Stanislaw Ulam [66], following Alan Turing's ideas for computing [67,68] and thinking machines [69], but it is worth revisiting so that the geographic functionality that we add can be contextualized. We consider a basic automaton information-processing structure, $A$, which we then wrap with additional geographic functionality to derive a geographic automaton, $G$. The geography of $G$ is parsimonious in its facilitation of geographic abilities and characteristics, but it is sufficient for most applications (we have tested this sufficiency in application in [70] and in scale in [64]):

$$A \sim (S, T); \quad T: (S_t, I_t) \rightarrow S_{t+1} \tag{1}$$

$$G \sim (K, S, T, L, N, M, R) \tag{2}$$

In Equations (1) and (2) above, we consider an automaton $A$ as derived from states $S$ and transition rules $T$ that determine dynamic shifts in those states based on input $I$ between two finite time-points $t$ and $t+1$ (with input possibly gleaned from other automata, but not necessarily). We may modify this basic data structure to form geographic automata $G$ by expanding the definition of $S$, $T$, and $I$ to accommodate sets of geographic operators for inertia $K$, location $L$, neighborhoods of influence $N$, movement rules $M$, and interaction rules $R$. We will leave $S$ and $T$ to their generic automata interpretations so that state transition can interact more generally with location and interaction transitions supplied by $M$ and $R$. The inertia parameter $K$ enables $G$ to be fixed or mobile in its environment, which may be used to define $G$ as a cellular automaton (CA) or not. If $K$ invokes fixture, $G$'s location $L$ remains wedded to the space within which it is registered. $G$ still retains the ability to become unfixed if needed and it can still draw input to its transition sequences dynamically using $N$. The expansion of the basic functions of the automata with geographic-specific considerations is important, as it allows for the definition, interpretation, and contextualizing of key components of the automaton's information-processing to be expressed in different places, spaces, times, events, processes, and company [70]. Significantly, in the context of this paper, we may also express $G$ based on its scale or scaling, such that the components of $G$ act polyspatially, i.e., that they are capable of adapting to the particular setting that a scale affords. At the most obvious, polyspatial functionality is enabled by $K$, which, in some senses, allows us to render the automaton into a free agent-automaton [71] or a cellular automaton (CA). If a CA is defined, information can still move (through diffusion, contagion, influence, and so on), but the automaton itself (and its transition-processing functionality) remains within the cell. This is a small semantic distinction at first glance, but it can be important. When automata move, their ability to interact and react in the system can become much more dynamic than it might otherwise be under conditions of inertia. Realistically, agents in sprawling cities (whether houses or householders) engage in both geographies, and processes governed by movement and diffusion may intertwine seamlessly. We need the flexibility to handle both, and to treat both in the same model and even in the same automata system. The malleability implied in the interaction rule $R$ also supports polyspatial reach for the automata, by allowing the geographic automaton to express its states and transitions up-scale and down-scale. $N$ is left open to specification, so that geographic automata (or the reach of their influence) can flexibly traverse many spaces and times, even simultaneously, when an automaton requires input from many different scales to resolve transition calculations. Similarly, movement rules $M$ allow the geographic automata to work locally, globally, or anywhere in between. Operationally, these components can also be invoked using a variety of spatial processing [72], spatial analysis [73], or spatial data access schemes [74] from Geographic Information Science [75], geocomputing [76,77], or geosimulation [39], so geographic automata may also be considered as polyspatial in their ability to ally with other information technologies.
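As a rough illustration of how the components described above might be held together in software, the following Python sketch bundles states, a transition rule, inertia, location, a neighborhood of influence, a movement rule, and an interaction rule into one object. It is a minimal sketch under stated assumptions, not the implementation used in the model: the class name `GeographicAutomaton`, its field names, the `step` method, and the `von_neumann` helper are all hypothetical, and the state is reduced to a single number for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) on a hypothetical urban grid


@dataclass
class GeographicAutomaton:
    """Illustrative container; every field name is an assumption of this sketch."""
    state: float                                 # state, e.g., population held
    location: Cell                               # location
    fixed: bool                                  # inertia: True keeps it in its cell
    neighbors: Callable[[Cell], List[Cell]]      # neighborhood of influence
    interact: Callable[[List[float]], float]     # interaction rule: neighbor states -> input
    transition: Callable[[float, float], float]  # transition rule: (state, input) -> state
    move: Callable[[Cell], Cell]                 # movement rule

    def step(self, world: Dict[Cell, float]) -> None:
        """Advance one time-step against a snapshot of other cells' states."""
        seen = [world.get(c, 0.0) for c in self.neighbors(self.location)]
        self.state = self.transition(self.state, self.interact(seen))
        if not self.fixed:  # only mobile automata relocate
            self.location = self.move(self.location)


def von_neumann(cell: Cell) -> List[Cell]:
    """Four edge-adjacent cells (one possible neighborhood choice)."""
    i, j = cell
    return [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
```

With `fixed=True` the object behaves like a CA cell: information arrives through the neighborhood, but the automaton stays put. Flipping `fixed` to `False` lets the same object act as a mobile agent, which is the dual use that the inertia parameter is meant to support.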

Multi-Scale Equation-Free Computing on Population
We employ two coupled scaling mechanisms in our approach: (1) polyspatiality in the model, and (2) multi-scale computing in simulation. In the automata model, as described above, automata (themselves) have a great deal of extensibility, with respect to space and time, in their information-processing abilities. Importantly, they can adapt flexibly to the special spatial context of their situation. In multi-scale simulation (where we regard simulation as the running of the model for particular scenarios or over particular infrastructures or data-sets), we take advantage of mathematics-assisted computation to create multi-scale efficiencies in simulation and resolution of a model's computing. This could make use of automata, although it does not have to; the scheme is open in this regard. This is multi-scale, rather than polyspatial, because the computational process for resolving the model in simulation usually remains the same, even when crossing scale barriers, and does not particularly transform or adapt to the scale at hand. The mathematics of the Equation-Free approach is used to resolve the simulation as an experiment (rather than as a model) in computing.
In this example, multi-scale computing is achieved using the Equation-Free computational framework developed by Kevrekidis [78], which facilitates the design of simple experiments (which we can describe and interpret at macroscopic, systems-level) atop complicated models (which we can define at microscopic scales of space and time, and even stochastically if needed) [78]. (Other approaches beyond Equation-Free could be used instead.) This is accomplished by building the macroscopic experiment for relatively large swaths of space and time, using short-burst calls to the microscopic simulator for small bundles of space and time. Moreover, the macroscopic experiment can be built using traditional continuum numerical analysis, while the microscopic model can be specified as automata, such that the two styles of modeling can be reconciled within a unified experimental computational infrastructure.
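The short-burst idea can be sketched in a few lines of Python. What follows is a toy illustration of coarse projective forward Euler, not the sprawl model or the scheme reported in [63]: the fine-scale simulator `micro_step` (noisy logistic growth per cell), the even-spread lifting in `lift`, the use of total population as the coarse observable in `restrict`, and all parameter names are assumptions made purely for this example. Each outer iteration lifts the coarse state to an ensemble of fine-scale copies, runs them for a short burst, estimates the coarse rate of change from the last two ensemble-averaged values, and then jumps forward without calling the micro-model.

```python
import random
from typing import List


def micro_step(cells: List[float], rate: float, cap: float,
               noise: float, rng: random.Random) -> List[float]:
    """Toy fine-scale model (an assumption): noisy logistic growth per cell."""
    out = []
    for p in cells:
        growth = rate * p * (1.0 - p / cap)
        shock = rng.gauss(0.0, noise * max(p, 0.0) ** 0.5)
        out.append(max(p + growth + shock, 0.0))
    return out


def restrict(cells: List[float]) -> float:
    """Fine -> coarse: total population is the coarse observable."""
    return sum(cells)


def lift(total: float, n_cells: int) -> List[float]:
    """Coarse -> fine: spread the total evenly (the simplest possible lifting)."""
    return [total / n_cells] * n_cells


def direct(total: float, n_cells: int, n_steps: int, rate: float,
           cap: float, noise: float, seed: int = 0) -> float:
    """Reference run: advance the micro-model for every single step."""
    rng = random.Random(seed)
    cells = lift(total, n_cells)
    for _ in range(n_steps):
        cells = micro_step(cells, rate, cap, noise, rng)
    return restrict(cells)


def cpi(total: float, n_cells: int, n_outer: int, burst: int, jump: int,
        n_copies: int, rate: float, cap: float, noise: float,
        seed: int = 0) -> List[float]:
    """Coarse projective forward Euler on total population (burst >= 1)."""
    rng = random.Random(seed)
    history = [total]
    for _ in range(n_outer):
        mean_path = [0.0] * (burst + 1)
        for _ in range(n_copies):              # ensemble of short bursts
            cells = lift(total, n_cells)
            mean_path[0] += restrict(cells) / n_copies
            for k in range(1, burst + 1):
                cells = micro_step(cells, rate, cap, noise, rng)
                mean_path[k] += restrict(cells) / n_copies
        slope = mean_path[-1] - mean_path[-2]  # coarse change per micro step
        total = mean_path[-1] + jump * slope   # projective jump, no micro calls
        history.append(total)
    return history
```

In this sketch, each outer iteration covers `burst + jump` fine-scale time-steps while paying for only `burst` of them per ensemble copy. The jump length that remains faithful to the fine-scale model depends on how smooth the coarse observable is, so `jump` should be treated as a tuning parameter rather than a fixed choice.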
The essence of the Equation-Free (EF) approach is to address the challenge of moving from microscopic details of a system to holistic, macroscopic outcomes at the system level, without explicit formulae for how to trace that path between scales. In many ways, this echoes the tension that often presents between arguments for 'simple' and 'complicated' models of cities. Simple descriptions of the city may present as well-identified spatial structures or compositions or priorities for urban policy, such as suburban sprawl. These are the types of descriptors of urban dynamics that we inherited from the early quantification of urban studies and the big-picture (city-encompassing and conceptual) models of Ullman, Harris [79], and Burgess [80]. In many ways, regional science [81,82] and the land-use and transport models of the Twentieth Century [83][84][85][86] were pioneered to handle cities at this scale of analysis. However, we might also consider, on the other end of the spectrum, a set of complicated models that endeavor to explain, from the bottom-up and from the experiences of the intricacies of the components that make up cities, where, how, why, and when those big-picture phenomena come from (or emerge from). This sort of scholarship is more traditionally sourced in the observational work of urban sociologists, and more conventionally available through urban sensor networks (transit systems, smart highways, instrumented buildings) [87] and volunteered geographic information from citizens' mobile devices [88] and transactions [89]. Moreover, this microcosm of urban dynamics is more commonly modeled using agent-based models [90] that are more concerned with the actions, interactions, and reactions of individual people, cars, and properties than they are with the city as a holistic entity. At both ends of this spectrum, quite a lot is known and quite sophisticated (and scale-appropriate) models can be built, but building connections between the two is a challenge. Realistically, it is a grand challenge, as a bewildering array of micro-conditions of a city or a city-dweller could possibly effect dynamics in the urban system that might manifest at system level; cities are exemplar complex adaptive systems [71,90,91].
The standard approach in evolving microscopic urban simulations using cellular automata or agent-based models, for example, is to specify the model with plausible, experimental, or known (from remotely-sensed data, usually) parameters and then advance the model in discrete time-steps until a city or urban pattern is produced. In other variants, some global-level controls may be set as targets for the evolution of the model in fine, discrete time-steps, as normative urban policies or goals, for example [4]. In either case, the scientific procedure really involves setting the model, winding it up, and letting it go. This can be something of a scattershot approach, and a series of averaging procedures have been developed to, in essence, automate huge volumes of blind trial-and-error runs (see [92] for an overview). The often-mentioned intent of this experimental approach is to engage in 'generative' science [93], such that simulated cities are built from the bottom-up. Realistically, the models are specified as ingredients at the 'bottom' scale of agents of urbanization, but very little building actually takes place in moving from the 'bottom-up', as the dynamics between the micro-scale and macro-scale are usually ceded to Monte Carlo averaging. Perhaps a tell-tale sign that little is being grown in simulation is the common practice of building models in exquisite detail, but reporting results as generalized, broad-strokes statistical indices or global spatial compositions [94].
As a practical concern, these sorts of massively-averaged generative runs of urban simulations are also computationally exhausting. Adding more processing cores to the procedure is unlikely to offer efficiencies, as the volume of data that we have on hand to parameterize models at the micro-scale, and the sophistication of the rule-sets used to describe agents in the models, is ever-increasing. Agents are, like finite-state machines, thorny beasts for high-performance computing [95]. Indeed, perhaps one of the limiting factors for current automata-based urban simulation may be the constraints on computing. This has been raised in the context of load-balancing, for example, in the TRANSIMS traffic model, which is the most massively agent-based urban model (in terms of the triple of numbers of agents, sophistication of rule-sets, and fineness of spatio-temporal granularity) that we are aware of [96], and in implementations of the SLEUTH urban growth model [97]. Using the EF approach to run well-parameterized bursts of the micro-model, instead of evolving every single component of the model to a macro-condition, we can accelerate the computing of the simulation [63].
Our approach in simulation, then, is to treat the micro-scale (automata-based) model as a node-member of a supposed coarser mesh, which represents a small bundle of space-time embedded within a larger ensemble (field) of space-time that holds the macro-scale realization of the city-system. These short-burst experiments are resolved at the micro-scale and then interpolated to generate the evolution of the system-level realization of the city. A side advantage is that if the microscopic model is wrong in some way (in producing city-level phenomena), this will be detected sooner than with full simulation. Different approaches or controllers can be used at the level of the macro-scale to issue and resolve the short-burst experimental simulations at the micro-scale. In the experiments reported in this paper, we used a coarse time-stepper [98], which facilitates coarse stability and bifurcation analysis of the automata model at a macro-scale of the system.
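For orientation, a coarse time-stepper is commonly written in the Equation-Free literature along the following lines. The notation here is ours and is offered only as a sketch: $P$ denotes the coarse population observable, $\mu$ a lifting operator from coarse to fine-scale states, $\varphi_\tau$ the evolution of the automata model over a short burst of duration $\tau$, and $\mathcal{M}$ a restriction operator back to the coarse observable.

```latex
% Coarse time-stepper: lift, evolve the fine-scale model for a burst, restrict
\Phi_\tau(P) \;=\; \mathcal{M}\bigl(\varphi_\tau(\mu(P))\bigr)

% Coarse steady states are fixed points of the time-stepper
P^{*} \;=\; \Phi_\tau(P^{*})

% Coarse stability: a scalar fixed point is stable when
\left| \frac{\partial \Phi_\tau}{\partial P}\bigg|_{P^{*}} \right| \;<\; 1
```

Because $\Phi_\tau$ is available only through short bursts of the simulator, the derivative in the last expression would in practice be estimated numerically, for example by differencing $\Phi_\tau$ at nearby values of $P$. Tracking how $P^{*}$ and this derivative change as a model parameter is varied is what coarse bifurcation analysis refers to.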

Modeling Sprawl
Our main targeted application of the infrastructure is modeling suburban sprawl in American city-spaces. We regard this as a good test case for our approach for a number of reasons. First, sprawl is polyspatial in its determining agency and processes. We will introduce a series of diverse geographical processes to account for this in simulation, but the panoply of urban geographies that govern sprawl dynamics has been well-discussed in the literature [2,18,19,40-45,48,99-116]. Second, sprawl is also a multi-scale phenomenon, experienced at the level of individual neighborhoods [117,118] through to megalopolitan regions [17,45], with plenty of processes that course among the interstitial scales. There is also a strong sense that sprawl emerges from the bottom-up; indeed, popular arguments contend that small-scale politics may be responsible for fostering sprawl with metropolitan-wide implications [18]. Third, sprawl, as an umbrella phenomenon, maintains fast and slow processes [119] within its collective dynamics [120]. The value platform of urban sub-markets, for example, may take decades to unfold [121], while new home-builders can construct properties on the scale of months in reaction to urbanization trends [15], and a household may take only days to engage in a new home search [122]. Fourth, sprawl is widely recognized to be a complex adaptive system, sharing many of the same characteristics of other complex systems in social and physical realms. In particular, traits of emergence, self-organization, positive and negative feedback, path-dependency, bifurcation, homeostasis, and allometry must be negotiated in simulating sprawl in a systems context [16,37,123,124]. Fifth, sprawl is also a relatively recent phenomenon: realistically, we do not fully understand it as a generative system [43]. So, it is likely that our micro-formulation of proposed drivers may change over time. Sixth, sprawl manifests (and emerges) differently in different places and times [108]. One model is not likely to suffice given the number of sprawling cities around the world and the array of context-specific considerations that need to be embraced in modeling them. Realistically, what is needed is an extensible modeling infrastructure with the flexibility to mix and match different micro-models within an overarching meta-simulation, experiments-of-experiments superstructure that can support meaningful cross-comparisons between models.
Many of the characteristic outcomes (but not necessarily the drivers) of sprawl are observable at a systems level. This is particularly true of population density, which manifests in some of the distinct architectural, design, transport, lifestyle, social, and environmental aspects of the phenomenon [15,44,56,57,62,115,125-127]. Population density also serves as one of the most significant control levers for urban planning and management policies. These policy interventions are not without controversy in the public debate, and exploring their potential dynamics in a computational laboratory is useful [2,41,42]. Moreover, population density is an indicator of several other sprawl-related problems (some would argue them as opportunities [100]), as we have already described.

Automata-Based Model Design for Sprawl Processes
We make use of an existing sprawl model developed by Torrens [4] to specify GA-based agents as mobile urban development processes and dynamic land parcels. In essence, this provides our complicated model; the EF scheme then operates atop this, dipping into the GA model as needed for short-burst experimentation. The results of these experimental bursts are integrated using the coarse population (population density) state from the original GA model, because of the significance of this variable for sprawl, as mentioned above. The original sprawl model developed in [4] ran by direct computation, and we can therefore compare direct and Coarse Projective Integration (CPI) runs of the simulation.
Sprawl processes are treated in the model as geographic drivers, i.e., sprawl is modeled using space-time mechanisms that mimic the path of sprawl over the urban/land surface as an urbanization, settlement, and development process. That households, policies, property developers, markets, sub-markets, decision-making, and so on are the very bottom-up drivers underlying these geographic processes is implied, but not explicitly treated. Elsewhere, Torrens and colleagues have demonstrated the necessary couplings between the two [64], as have other researchers [3,5,6].
Five geographical processes are introduced to represent interacting paths of urbanization for sprawl. The first process is immediate growth, which is used to introduce incremental development through simple extension and contagion at a (very) localized scale. The localization is constrained within input filters [an adjacent consideration of the neighborhood of influence $N$ in Equation (2)] that are limited to the immediate neighbors for a given GA. The second process is represented as nearby growth, which operates at the neighborhood level (several properties in close vicinity; i.e., a larger consideration of $N$). Third, leap-frog growth is used to produce development through settlement on the fringe of the urbanizing system at a distance from its main core or several polycentric cores [15,18,50]. Fourth, irregular growth accounts for development corridors that might form through some happenstance path-dependence in the anchoring of their settlement or, more likely, form under the influence of irregular natural features, for example, along rivers or rough terrain [128]. Fifth and finally, road-induced growth is used to produce urban growth around transportation nodes and links [129]. The five processes may run in parallel or as compound sequences (such as leap-frogging followed by nearby growth).
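The difference in reach between the immediate and nearby processes can be made concrete with a small helper that enumerates a square window around a cell. This is a minimal sketch assuming Moore-type windows of a given band size on a bounded grid; the function name `moore_band` and its arguments are hypothetical and are not taken from the model in [4].

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column)


def moore_band(cell: Cell, band: int, rows: int, cols: int) -> List[Cell]:
    """Cells within `band` steps of `cell` (Moore window), excluding the seed
    cell itself and anything that falls outside the rows x cols grid."""
    i, j = cell
    window = []
    for di in range(-band, band + 1):
        for dj in range(-band, band + 1):
            if di == 0 and dj == 0:
                continue  # the seed cell is not its own neighbor
            k, l = i + di, j + dj
            if 0 <= k < rows and 0 <= l < cols:
                window.append((k, l))
    return window
```

Away from the grid edge, a band of one yields eight neighbors (a localized, immediate reach) and a band of two yields twenty-four (a neighborhood-level reach); cells near the boundary simply see fewer neighbors under this clipping assumption.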
While the five geographic processes just mentioned produce patterns of growth in the model, population serves as the engine of growth. In other words, the five processes provide the spatial distribution of population geography over the evolving city-system in simulation. It is also important to note that each of these rules 'sees' only population density (and sees it through the lens of geography); they do not see related variables such as wealth or overcrowding. So, population is, in this sense, both a "good" coarse observable and an honest parameter of the model.
We treat population dynamics separately, but in a way that allows population totals (and, by consequence, density) to couple to the five process mechanisms. Again, population is one of the key ingredients and outcomes of our model. To enable simulated city-systems to be connected to other city-systems not treated in simulation, we allow the possibility of exogenous growth, which we assume comes from in-migration of population to the system from without. Exogenous population growth is assumed to enter the system through specified gateway cities, from which it passes into the rest of the simulation processes. The input of population, in this way, is read into the simulation continuously throughout a simulation run. We also treat endogenous population change within the system, to account for crude (in the demographic sense of the word, as growth and decline due to factors other than movement) population dynamics. Once these units of population are produced in the system, they may also enter into the model's mechanics, allowing them to move, settle, produce new population, and so on.
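The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in [4]: the function name `step_population`, the dictionary-based grid, the even split of in-migration across gateway cells, and the ordering (exogenous input first, then crude endogenous growth) are all hypothetical choices made for the example.

```python
def step_population(pop, gateways, inflow, growth_rate):
    """Return a new {cell: population} map after one time step.

    pop         -- dict mapping (row, col) -> population (hypothetical layout)
    gateways    -- list of (row, col) gateway cells that receive in-migration
    inflow      -- total exogenous population arriving during this step
    growth_rate -- crude endogenous rate per step (growth net of decline)
    """
    new_pop = dict(pop)
    # Exogenous change: split the in-migration evenly over the gateway cells
    # (the even split is an assumption of this sketch).
    share = inflow / len(gateways) if gateways else 0.0
    for cell in gateways:
        new_pop[cell] = new_pop.get(cell, 0.0) + share
    # Endogenous change: crude growth applied to whatever population is present.
    return {cell: value * (1.0 + growth_rate) for cell, value in new_pop.items()}
```

The resulting totals would then be handed to the five geographic processes, which decide where that population settles.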
The role of the geographic automata mechanisms, in simulation, is to determine the spatial distribution of exogenous/endogenous population change. For nearby and immediate growth, development probabilities and resulting population densities are obtained in a straightforward manner and the rules operate in a fashion familiar in CA transition. For a given cell $(i, j)$, the population at time $t + 1$ is determined by a trade-off of exogenous change, endogenous change, move-in (immigration) potential from each neighboring cell $(k, l)$ within the neighborhood, and move-out (emigration) potential.
For the nearby and immediate rules, simple Moore CA neighborhoods are used as windows for diffusion of population. For the immediate growth rule, the band size is 1, which denotes a fixed Moore sweep of one cell (Figure 1). The move-in term for cell $(i, j)$ is set to 0, because from the perspective of this rule at a particular cell $(i, j)$ and time $t$, immigration is not considered. Cell $(i, j)$ can, however, receive population diffused to it from other cells $(k, l)$ via the nearby rule (which from their perspective would be out-migration). From the particular space-time vantage of cell $(i, j)$ at time $t$, the move-out conditional probability toward each neighboring cell $(k, l)$, with $(k, l) \neq (i, j)$, is $1/8$. (We consider $1/8$ simply because it reflects a fixed Moore neighborhood of nine cells, not counting seed cell $(i, j)$, cast within a Moore bandwidth of one.) For the nearby growth rule, we relax the neighborhood to permit a two-cell band Moore neighborhood (Figure 1), such that from the vantage of $(i, j)$ at $t$, and for neighboring cells $(k, l) \neq (i, j)$, move-out conditional probabilities are calculated by Equation (4).
The other three rules, for leapfrog growth, road growth, and irregular growth, require agent-based movement to achieve in simulation. Under these schemes, a sub-transition is initiated (usually subject to Monte Carlo simulation) within the transition $t \to t + 1$. During this bundle of space and time, an agent is created on $(i, j)$, endowed with the population for that cell, and traverses through a set of cells $(k, l)$, based on one of the three processes. Either during the traversal or at the end of the traversal, the agent will modify the states of the cells traversed, which will then be available for further transition at $t + 1$. The leapfrog growth rule creates agents on the fringe of the urbanized mass, which then execute either a nearby growth transition rule or an immediate growth transition rule, as dictated. The road growth heuristic selects a set of cells (the yellow portions of Figure 2b) within an existing urbanized mass and designates them as hubs. It builds a polyline (red in Figure 2b) between the hubs and initiates a nearby growth transition rule or an immediate growth transition rule, as dictated, along that polyline to create linear corridors of growth. The irregular growth rule is initiated as a correlated random walk across the simulated space (Figure 3).
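To make the Moore-window diffusion used by the immediate and nearby rules concrete, the following Python sketch moves a fraction of each cell's population into its Moore neighborhood of a given band size. It is only an illustration: the names `moore_neighbors` and `diffuse`, the list-of-lists grid, the single `move_out_fraction` parameter, and the equal shares sent to every in-bounds neighbor (including for the two-cell band) are assumptions of the sketch rather than the model's actual transition probabilities.

```python
def moore_neighbors(i, j, band, rows, cols):
    """Cells within `band` of (i, j) in a Moore window, excluding (i, j)."""
    cells = []
    for di in range(-band, band + 1):
        for dj in range(-band, band + 1):
            k, l = i + di, j + dj
            if (di, dj) != (0, 0) and 0 <= k < rows and 0 <= l < cols:
                cells.append((k, l))
    return cells


def diffuse(grid, move_out_fraction, band=1):
    """One synchronous diffusion step over a rectangular grid.

    Each cell sends `move_out_fraction` of its population outward, split
    equally over its in-bounds Moore neighbors (equal shares are an
    assumption of this sketch). Total population is conserved.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            movers = grid[i][j] * move_out_fraction
            neighbors = moore_neighbors(i, j, band, rows, cols)
            if movers == 0 or not neighbors:
                continue
            new_grid[i][j] -= movers  # move-out from the seed cell
            for k, l in neighbors:
                new_grid[k][l] += movers / len(neighbors)  # move-in elsewhere
    return new_grid
```

With `band=1` an interior cell has eight receiving neighbors; with `band=2` it has twenty-four, matching the 9-cell and 25-cell windows once the seed cell is excluded.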

Time-Stepping and Coarse Projective Integration of Population as a Macroscopic Observable of Urban Sprawl
We use population density, which is encoded as a dynamic state of the GA structure, to guide the coarse projective integration in the EF simulation of the model. In essence, this provides a computational exploration of the behavior-space of the model in simulation. That exploration is guided by the premise that while the states of automata in simulation have many degrees of freedom in their transition, a simpler systems-level description may also be observable in coarse form. We consider population as the smooth, coarse-level, macroscopic variable of the system, which is appropriate, as population is the variable that we mobilize and develop in the model. A time-stepper function is used on population to probe the dynamic evolution of the model, to estimate numerical temporal derivatives of population, and to temporally project values of the variable ahead using the estimates. This is achieved as follows (Figure 4).
First, we must identify an appropriate coarse-grained variable, which should sufficiently describe system dynamics in some way that is useful to the processes and patterns that unfold in simulation. As already justified above, we use population. As discussed in the Introduction, this is "coarse population", a smooth population density profile over geography; it is the expected population density, averaged over several random realizations of the fine-scale, stochastic, agent-based model. In practice, we consider a discretization of such a smooth profile over a coarse grid (with interpolation between the grid values).
Second, we use a lifting operator to map the macroscopic description (population) to one or more consistent microscopic descriptions. This is achieved by producing an ensemble of realizations of this coarse smooth profile through integer-numbered assignments of agents in each cell of a fine-scale grid. These integer-valued instantiations on the fine grid are consistent, on average, with the real-valued population density on the coarse grid. Technical details on the sampling of such integer-valued profiles using the cumulative distribution function (CDF) in one dimension, as well as marginal and conditional distributions in dimensions higher than one, can be found in [24,130].
Third, with the ensemble as a lifted, microscopic realization at an initial condition, we perform fine-level simulation using automata to generate later-stage conditions. In essence, the automata realize versions of the fine-scale dynamics. Importantly, this may be repeated for ensembles of realizations consistent with the same macroscopic initial condition, to reduce variance if required, and to generate proper ensemble averages.
Fourth, a restriction operation is performed at subsequent steps of time within the burst, up to its final time. The restriction operator maps the simulated microscopic states to the macroscopic description. This is achieved by calculating a profile for each realization as a count over each cell. These counts are used to garner ensemble-averaged and space-averaged population profiles at each coarse time step.
Fifth, the population profiles are used for extrapolation. We numerically estimate the temporal derivative of the population profile using, for example, least-squares fitting of the last few population profiles. (In general, we could use other estimation methods, such as maximum likelihood [78].) In our simulations here the fine and coarse meshes coincide; the averaging is performed over several realizations of the detailed dynamics, so that the expected population behavior is evolved in time. We extrapolate this expected, coarse population over a projective time interval using the estimated derivative, and iterate the process again and again.
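The lift, simulate, restrict, estimate, and project cycle can be sketched as a single coarse time-stepper in Python. The sketch rests on several assumptions that are not taken from the paper: lifting is done by stochastic rounding rather than the CDF-based sampling of [24,130]; the derivative is a least-squares slope over every restricted profile in the burst; the projection is a forward-Euler jump; and `micro_step` stands in for the automata model as a hypothetical callable that advances one realization by one fine time step.

```python
import random


def lift(profile, n_realizations, rng):
    """Lifting: integer agent counts whose ensemble mean matches `profile`."""
    ensemble = []
    for _ in range(n_realizations):
        counts = []
        for density in profile:
            base = int(density)
            # Round up with probability equal to the fractional part.
            counts.append(base + (1 if rng.random() < density - base else 0))
        ensemble.append(counts)
    return ensemble


def restrict(ensemble):
    """Restriction: per-cell average of agent counts over the ensemble."""
    n = len(ensemble)
    return [sum(cells) / n for cells in zip(*ensemble)]


def slope(times, values):
    """Least-squares slope of `values` regressed on `times`."""
    t_mean = sum(times) / len(times)
    v_mean = sum(values) / len(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def cpi_step(profile, micro_step, burst, jump, n_realizations, rng):
    """One coarse projective step: run a short burst, then jump ahead.

    Advances the coarse profile by `burst + jump` fine time steps while
    only simulating `burst` of them. Requires burst >= 1.
    """
    ensemble = lift(profile, n_realizations, rng)
    history = [restrict(ensemble)]
    for _ in range(burst):
        ensemble = [micro_step(cells, rng) for cells in ensemble]
        history.append(restrict(ensemble))
    times = list(range(burst + 1))
    projected = []
    for cell in range(len(profile)):
        series = [snapshot[cell] for snapshot in history]
        rate = slope(times, series)
        # Forward-Euler projection; populations are clipped at zero.
        projected.append(max(0.0, series[-1] + jump * rate))
    return projected
```

Calling `cpi_step` repeatedly, feeding each projected profile back in as the next coarse initial condition, gives the iterated scheme; the ratio of `jump` to `burst` controls how much fine-scale simulation is skipped.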

Simulating Sprawl in the American Midwest
In order to build a realistic test-bed for the model and equation-free computing scheme, we simulated the urbanization and urban growth of the system of cities, towns, and suburbs that form the Midwestern Megalopolis [17,131] around the southern rim of Lake Michigan in the United States. The simulation was performed to represent two hundred years of urbanization, from 1800 to 2000. The seed sites for urbanization over this space are known from the historical record. We used the larger sites in simulation: Madison, WI; Milwaukee, WI; Chicago, IL; Gary, IN; South Bend, IN; Grand Rapids, MI; and Lansing, MI. The population counts for these cities are also known from historical census records, at decadal snapshots of space and time (Figure 5). So, we are able to use the seed cities to introduce geographical path-dependence/inertia to the system, and actual census data to produce rates of population change for the system. Portions of census-derived population are released to the seed cities over the simulation run at rates commensurate with historical records. This represents in-migration to the system [132]. Endogenous change is also possible, as already explained, and this population has the ability to mobilize within the simulated city-system. The task of the simulation, then, is to actionably simulate space-time urban geography from these parameters, using the rules described in Section 4.1, with the assistance of the meta-simulation architecture described in Section 4.2.
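One simple way to turn decadal census snapshots into a per-step release of population at a seed city is sketched below in Python. This is an assumption-laden illustration, not the procedure used in the simulation: the function name `inflow_schedule`, the even spreading of each decade's gain over its steps, the release of the whole gain (rather than a portion), and the clipping of decadal losses to zero are all choices made for the example.

```python
def inflow_schedule(census, steps_per_decade):
    """Spread each decade's census gain evenly over that decade's steps.

    census           -- list of (year, population) pairs at decadal snapshots
    steps_per_decade -- number of simulation steps per decade (assumed)
    Returns a list of per-step population increments for one seed city.
    """
    schedule = []
    for (_, p0), (_, p1) in zip(census, census[1:]):
        gain = max(0.0, p1 - p0)  # decadal losses release nothing
        schedule.extend([gain / steps_per_decade] * steps_per_decade)
    return schedule
```

If one time step is taken to represent one year, `steps_per_decade` would be 10, and a 200-year record of 21 snapshots would yield a schedule of 200 increments.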

Parallel Computing
We ran direct simulation and coarse projective integration (CPI) simulation on a parallel computing cluster to achieve resolution of the simulation scenarios in an efficient manner. We ran simulations on Princeton University's TIGRESS cluster, using 25 of its 3.2 GHz Xeon processors with 5, 3, and 5. There was also a secondary motive for employing high-performance computing, as automata are, at their core, finite state machines. They are the thirteenth dwarf of parallel (or even extreme) computing. Dwarfs are patterns of computing and communication, collections of which may be usefully distributed spatially and/or temporally over distributed computing grids. A review by Asanovic and colleagues reached the conclusion that in alleviating computational burden for the thirteenth dwarf, "nothing helps!" [95] (p. 45). At issue, here, is that while the idea of automata (particularly agent-automata) hypothetically scales for relatively straightforward (simple) automata models, the reality in many urban simulations is a set of automata specified with detailed state-variables and with multiple (and multiply-involved) transition rules, properties which are thorny to handle in computation. In our model, for example, automata may run one of many (or combinations of many) transition rules, heterogeneously, in run-time. Such compound behaviors and processes are not easily abstracted into continuum aggregates, as are commonly used in particle models [133], or into master equations as commonly used in agent-based computational economics [134]. In the particle physics and economics examples, the aggregation of transition functions allows the model to focus on the exchange of information (states) between automata entities in simulation. However, in our models, because the rules are diverse, multiplicative, and often sensitive to the spacing and timing of when and where they are executed in simulation, we have to handle dynamic states, dynamic transition, and dynamic neighborhood filters. Moreover, our automata are mobile, which increases the potential for interaction for all of these components [135]. When the agents are allowed to be polyspatial, we introduce further interactive capability (and computing) between scales. No matter where one looks to expand or deepen the model, computational burdens increase. Computing increases with the number of automata, and the same is true as the number of states, the number of neighborhood filters, the amount of movement, the number of rules, and the number of runs increase.
We expect that our EF scheme can accelerate the computing required to extract useful information from the model. The outstanding question is whether it can do so while also preserving the fidelity of the simulation dynamics.

Results
The spatial patterns and space-time dynamics of urbanization are shown for snapshots of the simulation in Figure 6. Characteristic sprawl patterns are generated, with concentrated population in central cities, and a distance-decay in the population profile [59] toward the urban fringe, where a large band of orbital sprawl forms (largely due to leapfrogging urbanization [136]), supported by (and supporting) the central core. Elsewhere, we describe, in detail, various metrics for benchmarking the composition and configuration of sprawl as a dynamic geographic phenomenon [16]. We will abstract from that discussion in this paper, where our focus is on the simulation architecture for achieving relatively straightforward simulation atop relatively complicated models. In the remaining discussion, we concentrate on the plausibility and efficiency of the EF scheme in supporting sprawl simulation.

Plausibility of Sprawl
To gauge the plausibility of the CPI scheme for coarse simulation on population, we compared differences in population estimation, per cell, in the simulated space (Figures 7 and 8). We also compared population counts near the simulation's seed sites for exogenous population input at major cities (Figures 7 to 9). (Please note that these population counts are for cells near the sites, not for the entire metropolitan areas reported by the Census Bureau. They differ from those illustrated in Figure 5 for that reason.) The difference maps shown in Figure 7 are roughly equivalent to standard kappa-statistics used in remote sensing registration [137] and in many urban CA model validation tests for per-pixel comparisons [92]. We have explored other registration and comparison schemes, which are discussed in [16]. The errors introduced by the extrapolation operator in coarse projective integration (CPI) between time-steps 100 and 150 were relatively large (~50% for brief periods of simulation) in some places in the model. This was particularly true close to seed sites (Figures 7 and 9), which makes sense as they have the greatest population turnover of the simulated space and thus a higher chance for disagreement between simulation schemes. However, the errors between simulations quickly heal, i.e., they return relatively quickly to a stable trend with relative errors no more than ~30% at worst. This is evident in the relative spikes in Figure 9 around the 150 time-step. One possible explanation for the dramatic shift in registration between direct simulation and CPI at this point is the merging of several of the cities' hinterlands at this time, as the independent cities in the Midwestern Megalopolis begin to form as a federated urban complex.
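A per-cell comparison of this kind can be sketched in Python as follows. The definition used here, (CPI minus direct) divided by direct, is one plausible reading of "relative difference" and is an assumption of the sketch, as are the function names and the choice to report `None` for cells that are empty under direct simulation.

```python
def relative_difference(direct, cpi):
    """Per-cell (cpi - direct) / direct for two equally sized grids.

    Cells with zero population under direct simulation have no defined
    relative difference and are reported as None.
    """
    result = []
    for row_d, row_c in zip(direct, cpi):
        out_row = []
        for d, c in zip(row_d, row_c):
            out_row.append((c - d) / d if d != 0 else None)
        result.append(out_row)
    return result


def worst_relative_error(diff):
    """Largest absolute relative difference over all defined cells."""
    values = [abs(v) for row in diff for v in row if v is not None]
    return max(values) if values else 0.0
```

Evaluating `worst_relative_error` at each time step for the cells around a seed site would give a time series comparable in spirit to the spikes described above.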

Efficiency of the Meta-Simulation Architecture
A second test for the meta-model (perhaps even a supra-model) architecture was computational efficiency. As referenced earlier in the paper, automata models are beholden to the thirteenth dwarf of parallel computing: parallelizing one's way out of computational burdens in simulation is not always straightforward. It is therefore desirable to have some options in the simulation scheme (as well as the information processing) to circumnavigate these difficulties. By simplifying the simulation architecture atop the underlying model, to allow the EF scheme to run the automata components in short-burst experiments, and integrating the resulting population dynamics as a coarse observable, we were able to (1) accelerate the resolution time of the model, improving its efficiency, while (2) holding as faithfully as possible to the space-time dynamics of the represented processes (sprawl in this case). Direct simulation of the Midwestern scenario took 806 seconds, while the CPI-accelerated simulation resolved in 470 seconds, for a ~42% time saving. (This is discussed in more detail for a variety of simulation scenarios beyond the Midwestern example in [63].)
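The quoted saving follows directly from the two wall-clock times, and can equivalently be expressed as a speed-up factor:

```latex
\[
\frac{806 - 470}{806} = \frac{336}{806} \approx 0.417 \;(\approx 42\%),
\qquad
\frac{806}{470} \approx 1.71 .
\]
```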

Conclusions
A near-standard argument in agent-based modeling is that models should be simple. This argument is particularly pervasive in computational social science and elements of the mantra are also beginning to appear in the burgeoning science of cities [12], as supported by urban simulation. Certainly, models should be simpler than the realities that they seek to represent in many cases and there is often pedagogical value in simple models used in the classroom, or in effectively communicating complicated ideas and complex systems. But there is also a need for detailed models, and one of the often-advertised advantages of the automata approach is its ability to map to the specific, independent detail of the agent (in urban contexts, individual people, households, properties, and land parcels) with the ability to represent the autonomous actions of these agents as they interplay in massively dynamic systems contexts. Some urban processes and phenomena are simple, but most are not. For other urban processes, there may be some key feature of the system that produces its most dramatic phase shifts (Thomas Schelling's tipping point is a classic example [138] and the emerging obsession with rank-size rules is a recent archetype [37]), but even in these cases there are many other pieces of the system that must be accounted for before that single feature can be isolated, and realistically those features are usually artificial indices of much more involved underlying processes [10,11].
In many cases, urban models are (very, very much) simpler than the observational and theoretical scaffolds that support them; in other instances they are simpler than the data available to feed them. In most cases models are interpreted with broad-stroke descriptive statistics that simplify them even further. So, why do we bother to build urban models at all? In the geographical sciences, a main motivation is to energize our theories, observations, hypotheses, and data over space and time, and many models are beginning to focus on explicitly spatial processes (as well as spatialized economic, social, environmental, and other processes) to achieve this [139]. In part, the field of geosimulation [39] and tool-kits such as geographic automata have been developed and advanced in response to this impulse [140][141][142][143][144].
However, urban simulation, particularly in the geographical sciences, is in some ways limited by several decades of devotion to simple models [9,145]. Ironically, the trend toward simplification of models has reappeared in agent-automata schemes, despite the potential that automata computing structures offer to build intricately detailed models that scale from the microcosm to the macrocosm of urban phenomena. The availability of easy-to-use modeling packages [146,147] has perhaps emphasized a simple approach at the expense of supporting the meticulous work that needs to be done to reconcile perhaps hundreds of years of practical and theoretical experimentation with simulation-assisted urban studies and the now burgeoning silos of big data that we have pulsing in from decades of qualitative survey, remote observation, urban sensor grids, and passive citizen sensing. Curiously, these trends in urban simulation are quite at odds with the state-of-the-art in the hard science modeling endeavors that the idea of a 'science of cities' [12] seeks to emulate. Consider urban climate modeling, for example, where community models [13,148,149] grow more and more detailed and powerful each year and calls to keep such models as simple as possible are relatively quiet. Indeed, increased sophistication in urban community modeling in the climate sciences has allowed for large unified models to be developed, advanced, and honed [150].
We need a way to reconcile complicated models with simple simulation if we are to successfully and meaningfully develop a simulation-assisted science of cities. Simple models and tools to think with are certainly a component of this science, but the backdrop must be formed from the best available science, data, and exploration, and this would seem to suggest detailed, complicated models that delve deeply into the thorny complexities of urban systems. We also need to future-proof urban simulation so that it is extensible to: (1) the fact that cities are appearing, developing, changing, and even disappearing in new ways; (2) the range of questions that we might wish to pose in simulation, which are going to advance, deepen, and shift in focus as our science develops and modeling capabilities advance; (3) the sorts and volumes of data that may well become available in the near-term and future; (4) the sorts of model approaches that are deployed or developed to explore urban processes; and (5) the evolving community of models and meta-models that might emerge under the umbrella of a science of cities. Furthermore, we need to acknowledge the reality that urban simulation is now inextricably bound to computing, and has been so for many decades. We need to embrace extreme computing (whether mobile, parallel, threaded, GPGPU-assisted (general purpose graphics processing units), or high-performance) as a fundamental component of our urban simulation pipelines and methods, not just as an afterthought once a model has been specified. It would benefit the community to develop dedicated computational architectures for running and experimenting with urban models. In part, this is the argument proposed by Asanovic and colleagues in introducing the thirteen dwarfs [95]. Urban simulation could be a specific dwarf that needs to be carefully addressed at all stages in model development and processing.
We believe that the approach that we have demonstrated here provides a template to support advances beyond the limitations that we have just discussed. It has already proven useful in many 'hard science' applications, in biology, chemistry, and physics, for example [98,151-154]. It has the potential to work with other multi-variable control schemes employed in other sciences [155]. Here, we presented just one variable (population), but hopefully the reader can imagine how others could be usefully engaged in the same scheme. Its broader utility in urban simulation, where complex socio-spatial and socio-economic processes comingle with physical phenomena, is only beginning to be explored [63]. Our work as presented here illustrates that it is potentially quite valuable in moving urban modeling beyond the current state-of-the-art.

Figure 1. (a) The immediate rule diffuses development into the nine-cell Moore neighborhood (in blue, also including yellow cell) of a given cell (in yellow alone). (b) The nearby rule diffuses to the 25-cell Moore neighborhood. The red box in (b) denotes a corner from Equation (4).

Figure 2. (a) The leapfrog rule casts development (blue) at a distance from a seed cell (yellow) within a neighborhood window, to the periphery of that window.(b) The road rule forms connective polylines (red) of development (blue) between seeded node cells (yellow).

Figure 3. The irregular rule is used to establish development (blue) that takes place along a linear trajectory (red) abutting assumed physical obstacles, in this case adverse terrain and a water feature.

Figure 4. The scheme for coarse projective integration.

Figure 5. Historical population totals for seed sites in the simulation, derived from census data.

Figure 6. The Midwestern sprawl coarse projective integration simulation at three time steps (darker yellow indicates higher relative population density than lighter yellow). (Because we are averaging over many possible (heterogeneous) growth trajectories/geographies, the spatial configurations above look relatively smooth.)

Figure 7. Relative agreement of population counts, per-cell, between direct ensemble simulation and simulation by coarse projective integration at time-step (a) 50, (b) 100, and (c) 200. Insets show seed city-sites for the model, with magnification expressed. The y-axis unity is 100%.

Figure 8. Relative differences in population percentage (unity is 100%) between direct ensemble simulation and simulation by coarse projective integration. (In these figures we use just one shade of yellow, and displacement in the y-axis indicates relative population change in positive or negative terms.)