Urban Expansion Simulated by Integrated Cellular Automata and Agent-Based Models: An Example of Tallinn, Estonia

Abstract: From 1990 to 2018, built-up areas in Tallinn, Estonia's capital city, increased by 25.03%, while its population decreased by 10.19%. Investigating the factors affecting urban expansion and modeling it are critical steps to detect future expansion trends and plan for a more sustainable environment. Different models have been used to investigate, predict, and simulate urban expansion in recent years. In this paper, we coupled the cellular automata, agent-based, and Markov models (CA–Agent model) in a novel manner to address the complexity of the dynamic simulation, generate heterogeneity in space, define more complicated rules, and employ suitability analysis. In the CA–Agent model, cells are dynamic agents, and the model's outcome emerges from the cellular agents' interactions over time, governed by rules of behavior, their decisions concerning adjacent neighboring cells, and the probabilities of spatial change. We ran the CA–Agent model twice, for 2018 and for 2030. The first simulation was used to validate the performance of the model. The Kappa coefficient was 0.86, indicating a relatively high model fit, so we conducted the second 12-year run up to the year 2030. The results illustrated that, using these model parameters, the overall built-up area will reach 175.24 sq. km, an increase of 30.25% in total from 1990 to 2030. Thus, implementing the CA–Agent model in the study area illustrated the temporal changes of land conversion and showed that the present spatial planning outcomes require regulation of urban expansion encroaching on agricultural and forest lands.


Introduction
Urban expansion is the process of conversion of lands to urban [1]. Shifts from agricultural lands [2][3][4], forest lands [5,6], water [7][8][9][10], and wetlands [6,11,12] are among the most critical transitions to urban lands. This causes adverse impacts on the physical environment and is one of the leading causes of natural ecosystem degradation [13]. It causes loss of the agriculture and croplands, habitat fragmentation, heat island effects, and reduction of surface watercourses [14][15][16]. It also causes changes in landscapes [17][18][19][20] and urban transportation needs by increasing travel distance and commuting trips between the city and suburbs, demand for private cars, and fuel consumption [21,22].
Land use/land cover (LULC) maps are products of the classification of satellite remote-sensing images [23]. They enable quantitative spatiotemporal analysis across geographic regions [24] and provide helpful information about the transformation of lands over time [25,26]. Ultimately, extracting urban areas from LULC maps and modeling them provides a critical investigation into the driving mechanisms of urban expansion, its future trends, and LULC transitions [27].
Since urban expansion is a dynamic and complex process, many spatial models have been constructed to investigate, predict, and simulate it. Simulation models can develop scenarios for future-oriented decision-making [28] by preparing a projection of land-use changes and anticipating future urban land demands and their spatial distribution [29]. Among others, these models include the cellular automata (CA) model [30][31][32], CA-Markov model [33][34][35], logistic regression model [36][37][38], agent-based model [39,40], and multi-agent-based model [41,42]. Of these models, CA has been the most popular among researchers since its conceptualization by Ulam and Von Neumann in the 1940s [43,44]. Numerous studies have applied CA to modeling urban expansion. For example, Ma et al. [45] used CA to simulate urban expansion in China. Using seed data and control layers, they explored the parameters of their model and obtained highly accurate simulation results. White et al. [46] employed a constrained CA model based on development intensity or dynamic constraints in space and time to predict urban land-use dynamics. Mozaffaree Pour and Oja [47] modeled urban expansion in Harju County, Estonia, to investigate its driving forces and predict the future trend in a regular cell space.
Integrating CA with other models increased the popularity, efficiency, and quantity of the simulation [48] in the field of urban expansion. The integrated CA-Markov model has been implemented in research under the assumption that the trend and pace of changes in urban expansion are similar in the past and the future. Li et al. [34] explored the capabilities of CA-Markov to simulate urban expansion in China. Jafari et al. [49] predicted urban expansion in Iran using the CA-Markov model and revealed expansion on the periphery of population centers with encroachment on forest and agricultural lands. The integrated logistic-CA model has been applied in some research [50][51][52] to explore the spatial features of urban expansion and define transition rules. Mustafa et al. [50] took advantage of multinomial logistic regression and CA to assess the probabilities of causative factors and neighborhood effects in the urban expansion of Belgium. To explore the relationships between land conversion and the driving factors of urban expansion, Arsanjani et al. [52] employed integrated logistic-CA-Markov models to simulate land-use maps containing built-up lands and to calculate the quantity of land-use change using a transition area matrix in Tehran, Iran.
Furthermore, several studies enhanced the CA model with agent-based models to explore the drivers of urban expansion, strengthen the behavioral rules by defining dynamic agents, and determine more realistic neighborhood effects when simulating urban expansion. Mustafa et al. [29] combined CA-Logit and agent-based modeling to capture the dynamics of neighborhood interactions and static drivers of urban expansion using three levels of agents with homogeneous characteristics and behaviors. Liu et al. [53] constructed a land-use simulation and decision-support system (LandSDS) that integrates agent-based and CA modeling in China for two homogeneous agents. However, urban expansion simulation models vary in terms of data requirements, mechanisms, and application scales [48], and each has limitations. New integrated modeling approaches are therefore required to address the complexity of urban expansion, including spatial heterogeneity, local interactions, and neighborhood effects.
To this aim, we coupled the CA model with the agent-based model (CA-Agent model) to address the complexity of the dynamic simulation, generate heterogeneity in space, define more complicated rules, and employ the suitability analysis in a novel way.
Cellular automata, as a bottom-up model, are based on regular or irregular cell space depending on the complexity of the representation of reality [54] and consider the interactions between cells and their neighbors, implemented through transition rules. Defining transition rules is the critical step in CA models. While transition rules are invariant through time [32], accounting for spatial heterogeneity among the cells and uneven development in the simulation requires improved transition rules [55]. Applying appropriate thresholds is one way to address spatial heterogeneity [56]. One of the key benefits of our CA-Agent model is that it implements different transition rules, controlled by suitability thresholds, for cellular agents.
Moreover, a heterogeneous neighborhood affects spatial heterogeneity [57] and the model simulation results. Taking advantage of suitability analysis, we enhanced the CA-Agent model's ability to identify the most suitable areas for new expansion [39]. Finally, the CA model is flexible, with an adaptable structure capable of integrating with other models [58,59].
Additionally, agent-based models as generative simulation modeling [60] employ the behavioral factors of agents and their interactions with the environment and with one another to simulate the urban evolution [61] and track the dynamic changes from one agent to the whole area [62]. Agents could be defined as human or physical entities [63]. We defined cells as the dynamic agents in our model. The model's outcome emerges from agents' interactions over time [60] and the probability and randomness of the agent's behavior [64]. The CA-Agent model's significant advancement is exploiting the Markov model to enhance the allocation of probabilities to the model. The Markov model as a stochastic model describes the state of the cells regarding their previous states by specifying a series of random values to each cell, and the results represent the probability of transition [65][66][67]. The Markov model prepares an estimation of the quantities of LULC changes [68] appropriate for describing the complex structure of urban systems.
Coupling these models in a GIS environment is a novel approach that leads to a better understanding of the dynamics of the driving forces of urban expansion. The main distinction of the CA-Agent model compared to previous models is that it implements dynamic factors [61] and enhances the spatial behavioral rules [41,42,61,69–73] of the autonomous cellular agents, their decisions concerning the neighboring cells, and the probabilities of spatial changes. Because agents acting in a cellular space can change their behaviors over time, it is possible to understand the evolution of spatial patterns [32]. The CA-Agent model can also highlight the spatiotemporal dynamics of urban expansion at the local level.
As the process of expanding the cities mainly occurs in the immediate neighboring lands from main cities [61], in line with the observations of Mozaffaree Pour and Oja [74] on the expansion of Tallinn in Harju County, we considered the buffer of 15 km from the center of Tallinn. The buffer of 15 km was chosen as this is where most of the new development occurred between 2000 and 2018, and the nearest cities and surrounding municipalities started expanding.
It is important to note that until 1990 (during the Soviet period), the use of land adjacent to Tallinn was relatively strictly regulated, and conversion of agricultural land into settlements was almost excluded. Urbanization-related land-use changes did not occur around Tallinn because of state-established limiting regulations. After Estonia regained independence in 1991, land reform and the revision of planning principles on the one hand, and economic growth and increasing personal wealth on the other, shifted the location of new construction considerably toward a scattered form in Tallinn's neighboring suburbs. At the same time, wealthy people moved to the suburbs to improve their living conditions in detached houses [75][76][77]. Thus, the consequences of urban expansion require effective decision making that implements such models in spatial planning.
To monitor the footprints of urban expansion, we used SCP (Semiautomatic Classification Plugin) in the open-source software QGIS 3.10 (Free Software Foundation, Boston, MA, USA). SCP makes it possible to download satellite data directly, process the data, perform supervised and unsupervised classification of remote sensing images, and post-process the data [78]. To run the CA-Agent model and prepare the simulation, we employed the Repast platform and the AgentAnalyst extension in ArcMap 10.6 (Esri, California, USA). Further, to analyze the Markov model and validate the simulation result, we used IDRISI TerrSet software (Clark Labs, Worcester, MA, USA).
This paper is structured as follows. Section 2 presents the study area in Tallinn and its 15 km buffer zone, the data collection, processing, and analysis, and the framework of the CA-Agent model. Section 3 shows the results of model implementation in the study area and its validation. Section 4 discusses the output and innovation of the proposed approach, and finally, the paper concludes with an overview of the whole process.

Study Area
In this paper, our focus was on built-up areas in Tallinn and its 15 km buffer zone to monitor the process of urban expansion and simulate its future trend (Figure 1). Tallinn is the capital city of Estonia, located in the northern part of the country on the Gulf of Finland. The latitude and longitude of the Tallinn region are 59.43° N and 24.75° E. Tallinn covers an area of 159.37 sq. km. Including the 15 km buffer zone, the study area is about 506 sq. km; the sea area within the buffer has been excluded. Between 2000 and 2017, 138 sq. km of new built-up areas appeared in Estonia, most of them around Tallinn. New dwelling areas mostly replaced previous agricultural lands [79]. Population changes in the surrounding Harju County follow a similar pattern: the population decreased by around 13% from 1990 to 2000 and has increased since then, returning to the same level as in 1990. The dominant migration has been from Tallinn into surrounding municipalities (feeding the new settlement areas) and from other regions of Estonia to Tallinn. The particularity of the Tallinn region is that the urban area expanded without significant population growth.

Data Collection
In this study, the built-up areas of 1990, 2006, and 2018 were extracted from LULC images (Table 1) resulting from the classification of Landsat images using the SCP plugin in QGIS 3.10. The SCP plugin has an option to download the data from the United States Geological Survey (USGS) with radiometric and geometric corrections and the spatial record (level 1C). The processed and georeferenced data were projected to Lambert Conformal Conic (Estonian national grid of 1997, EPSG 3301). The other spatial layers, namely road networks (main and local roads, railways), waterbodies (watercourses and lakes), and the administrative boundary of Tallinn, were downloaded from the Estonian Topographic Database (ETAK) and the geoportal of the Land Board of Estonia. Polygon data for the airport and wetlands were extracted from the CORINE Land Cover database. The reference year of these data was 2018, and they were resampled to 30 m resolution to be consistent with the LULC data and applicable for modeling purposes.

Image Processing
In this paper, we applied a supervised classification technique using the SCP plugin for QGIS. Training samples bounded by polygons, named Regions of Interest (ROIs), were obtained for four land-use classes: built-up, water, vegetation, and other. Different numbers of ROIs were collected for each class. We applied the maximum likelihood (ML) classifier to obtain better results, as many authors have used this classifier in their research [38,80–82]. To assess the accuracy of the maps, Google Earth [24] and the same Landsat images were used as the ground truth images. The classification was validated using the "Kappa coefficient", which takes into account the agreement between observed correct and expected correct in the classified map and ground truth map [83], and the "overall accuracy", which determines the percentage of pixels correctly classified according to the reference pixel samples [84,85]. The Kappa coefficient is given by Equation (1):
$$\kappa = \frac{P_0 - P_e}{1 - P_e} \qquad (1)$$
where $P_0$ indicates the observed agreement and $P_e$ is the expected agreement [63]. Overall accuracy is calculated using Equation (2):
$$\text{Overall accuracy} = \frac{\sum_{k=1}^{N} A_{kk}}{n} \qquad (2)$$
where $N$ is the number of classes in the LULC classification, $n$ indicates the total number of collected sample units, and $A_{kk}$, the entries on the major diagonal of the error matrix, represent the number of samples correctly identified [78]. The accuracy assessment results in Table 2 revealed excellent classification, with accuracies of 98.20%, 97.80%, and 97.00% for the LULC maps of 1990, 2006, and 2018, respectively.
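As a minimal sketch of how Equations (1) and (2) can be evaluated, the following Python function computes the overall accuracy and the Kappa coefficient from an error matrix. The function name `accuracy_metrics` and the 4-class matrix are hypothetical and chosen only for illustration; they are not the study's data. The expected agreement $P_e$ is taken here from the row and column marginals, which is the usual convention and is assumed rather than stated in the text.

```python
import numpy as np

def accuracy_metrics(error_matrix):
    """Return (overall_accuracy, kappa) for a square error matrix."""
    m = np.asarray(error_matrix, dtype=float)
    n = m.sum()                       # total number of sample units
    p0 = np.trace(m) / n              # observed agreement, Equation (2)
    # Expected chance agreement from the row and column marginals
    pe = (m.sum(axis=0) * m.sum(axis=1)).sum() / n**2
    kappa = (p0 - pe) / (1.0 - pe)    # Equation (1)
    return p0, kappa

# Hypothetical error matrix (rows: classified, columns: reference)
example = [
    [48, 1, 1, 0],
    [1, 47, 2, 0],
    [0, 2, 48, 0],
    [0, 0, 1, 49],
]
oa, kappa = accuracy_metrics(example)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.4f}")
```

Because Kappa discounts the agreement expected by chance, it is always at most equal to the overall accuracy for the same error matrix.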

CA-Agent Model Framework
In our research, the framework's base model is an agent-based urban growth model developed by Li [86], a scenario-based model for a city with heavy population growth. The base model operated at the level of cadastral parcels, with ownership information in a base year, and presented urban growth scenarios for 20 years. To implement our CA-Agent model, it was necessary to improve the application of the base model's parameters, input data, procedures, and behavioral rules. To this end, the enhanced methodology and adjustment rules are defined as follows:
(a). Cellular agent: In the CA-Agent model, a discrete cell space describes an agent's quantity or degree of development. We unified two approaches to upgrade the performance of the model. The built-up areas were dispersed in irregular polygons with different areas; the undeveloped cells, however, were resized to square polygon grids of 90 × 90 m using a fishnet tool in the GIS environment. Although the processing took a long time, breaking the very large cells into smaller ones with the fishnet tool increases the accuracy of the simulation results and covers the cell space homogeneously. This unified vector space can provide a more realistic representation [54] for the CA-Agent model. While each cell's status value is identical and related to the previous state, the adjacent neighboring cells and behavioral rules of the simulation model are not uniform, such that, based on different restrictions on lands, the degree of suitability of cells changes.
(b). Agent state: the state of a cellular agent is based on its temporal changes in built-up expansion. The temporal expansion of built-up areas allows agents to be examined as developed in the past or present and to be assigned appropriate behavioral rules and probabilities. While the state of the agent is defined by its status and neighbors, its development status is affected by several factors: the cell size; the buildable-area factor and the minimum area needed for building a dwelling; the probability of being developed depending on the location of the cellular agent (whether it is in a previously developed or an undeveloped area, and whether it is in or near Tallinn or a suburban area); and the adjacent neighboring cells.
(c). Neighborhood status: the cellular agent uses the cell's neighbor information through a list of adjacent neighbors. Using the ArcGIS polygon neighbor tool, we explored the developed and undeveloped neighbors and each cellular agent's total number of neighbors. Table 3 presents an example of cellular agents and their adjacent neighbors after applying the neighbor tool. In each time step, the model extracts the neighbor information to decide on the cellular agent's development and the percentage of its change, checking whether a neighbor has a direct or indirect neighboring state as well as accessibility to local roads, which could allow a cell to be accessed through its neighbors. Therefore, the dynamic changes in a neighbor's status are a factor in a cell's suitability. The neighboring cells influence the cells' development status through defined information that is updated in each step. To delineate the heterogeneity of neighborhood effects, we used the adjacent neighbors implemented in the base model [86] instead of the Moore neighborhood, which is the most frequently used neighborhood [54], or a specified-distance neighborhood, which depends on the distance of neighbors from a kernel cell [44].
(d). Markovian transition probability: Adding probability and randomness to the agent's behavior is critical in the simulation [64]. In the base model, the probability was determined by a random term. In contrast, the CA-Agent model uses the Markovian transition probability to add complexity to the agent's behavior and to calculate the transition probability to built-up for two time periods, from 1990 to 2006 and from 2006 to 2018.
(e). Suitability factors and constraints: Suitability analysis, which acts as a set of environmental rules, was carried out using different accessibility factors that shape a cellular agent's suitability decision, considering the dynamics of the study area's environment, neighborhood, and spatial characteristics. Constraints mainly refer to the factors preventing expansion [87]. The distance analysis was based on the Euclidean distance tool, and the constraints were analyzed using the buffer tool and fuzzy overlay analysis in the GIS environment. As in the base model, but with different selected suitability factors and constraints, the constraints in the CA-Agent model are permanent during the run and are precalculated to extract the number of unbuildable cells ahead of time. The suitability factors range between 0 and 100. During the model run, cellular agents normalized the suitability rates to reach the actual value for each factor. The constraints and factors were converted to raster, and all cell operations run in the raster environment.
(f). Thresholds role in the agent's development status: Expansion of a city may face different physical limitations known as development thresholds [87]. Defining a threshold for the factors is necessary to give the complexity and reality to an agent's model and development status and boost the spatial heterogeneity. The CA-Agent model employs the exact terminology of allocating the thresholds to the agents. Thresholds are calculated based on the integration of constraints and factors during the initialization of the model.
(g). Behavioral rules: The behavioral rules form the development status of agents. These rules depend on different factors: the status of a cell, its neighbors, suitability criteria, constraints, and accessibility factors. Among other things, "if the cellular agent's ratio of unbuildable area to total area exceeds the threshold, then the agent will not be developed"; "if it falls in a constraint area, then it exhibited change"; and "if more neighbors are built up, then it is likely to develop" are the most critical behavioral rules. Consequently, the CA-Agent model can consider many rules that reflect the actual urban expansion process over time and space. Therefore, the number of cells changing their states is entirely defined by behavioral rules.
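The behavioral rules in (g) can be illustrated with a small Python sketch. It is an assumption-laden toy, not the AgentAnalyst/Repast implementation used in the study: the class `CellAgent`, the function `step`, the default `ratio_threshold`, and the way the transition probability is scaled by the share of built-up neighbors are all hypothetical. In particular, a cell inside a constraint area is treated here as not developable, following the statement in (e) that constraints prevent expansion.

```python
import random
from dataclasses import dataclass

@dataclass
class CellAgent:
    unbuildable_ratio: float   # unbuildable area / total cell area
    in_constraint: bool        # True if the cell lies in a constraint zone
    built_neighbors: int       # adjacent neighbors already built up
    total_neighbors: int       # all adjacent neighbors
    transition_prob: float     # e.g. taken from a Markov transition matrix
    developed: bool = False

def step(agent, ratio_threshold=0.5, rng=random):
    """Apply one time step of the (hypothetical) rules and return the new status."""
    if agent.developed:
        return True
    # Rule 1: too much unbuildable area blocks development
    if agent.unbuildable_ratio > ratio_threshold:
        return False
    # Rule 2 (assumption): constraint cells are never developed
    if agent.in_constraint:
        return False
    # Rule 3: more built-up neighbors make development more likely
    if agent.total_neighbors:
        share = agent.built_neighbors / agent.total_neighbors
    else:
        share = 0.0
    if rng.random() < agent.transition_prob * share:
        agent.developed = True
    return agent.developed

# Hypothetical agents
blocked = CellAgent(0.9, False, 4, 4, 1.0)
constrained = CellAgent(0.1, True, 4, 4, 1.0)
isolated = CellAgent(0.1, False, 0, 4, 1.0)
surrounded = CellAgent(0.1, False, 4, 4, 1.0)
results = [step(a) for a in (blocked, constrained, isolated, surrounded)]
print(results)
```

Keeping each rule as a separate early return makes it easy to add or reorder rules, which mirrors the point in the text that the number of cells changing state is determined by the rule set.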

Spatiotemporal Patterns of Urban Expansion from 1990 to 2018
According to the LULC classification in this study, we adopted four classes: water (including rivers, ponds, and lakes), built-up (consisting of residential, industrial, and other impervious surfaces), vegetation (including evergreen forest, agriculture, and areas mostly of grass), and other (covering bare earth, rocks, and wetland areas). As shown in Figure 3

As shown in Table 4, urban areas increased by 40.80 sq. km (+25.03%) in Tallinn and its 15 km buffer zone over the 28 years. Urban expansion during 1990-2006 was 27.10 sq. km (an 18.15% increase), faster than between 2006 and 2018, when urban areas expanded by 13.70 sq. km (an 8.40% increase).

Application of CA-Agent Model in Tallinn and Its Buffer Zone
In our model, a cell is represented by an irregular polygon. The cells are the dynamic agents. As shown in Figure 4, cellular agents have three different values: Value 1 is allocated to cells that became built up between two timestamps, Value 3 denotes previously built-up development, and Value 2 denotes undeveloped cells. The irregular spatial unit provides a representation closer to reality. The size of undeveloped cells (Value 2) ranges from 127 to 8100 sq. m, while agents with Value 3 show the initial development of Tallinn and its surroundings. During the model run, Value 2 agents are less likely to be developed in the neighboring areas unless they have a high probability of being located in Tallinn or of being a large cell. For cells neighboring Value 1 agents, which developed later than Value 3 agents, the probabilities of being large, isolated, in Tallinn, or in the buffer zone are evaluated.
We took advantage of the Markovian transition probability results to allocate the probabilities in our model. These are expressed through several factors, such as the probability of a cellular agent being developed depending on its size and on its location, that is, whether it is in a previously developed area, within the built-up area, or suburban (Table 5).
Additionally, the suitability factors consist of "distance to Tallinn", "distance to main roads", "distance to local roads", and "neighborhood status". Six constraints were defined to limit the buildable lands, combined through a fuzzy overlay: "50 m buffer of main lakes", "30 m buffer of railways", "25 m buffer zone of watercourses", "50 m buffer of main roads", "50 m buffer of the airport", and "50 m buffer of wetlands". The constraints, regardless of their LULC classes, play leading roles in determining the buildability of the agents. We applied different restrictions on constraints based on the guidelines of Estonian legislation on new construction published in "Riigi Teataja". This building code was prepared to promote sustainable development and ensure the safety and purposeful performance of constructions in Estonia. It differentiates the construction restrictions by the type of protected zone, whether roads, railways, water, or other zones. Figure 5 presents the results of the index analysis.
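The Markovian transition probabilities referred to in this section are, in principle, obtained by cross-tabulating two classified maps of different dates. The Python sketch below shows that idea under simplifying assumptions; it does not reproduce the IDRISI TerrSet Markov module used in the study. The function name `transition_matrix` and the two tiny maps (0 = non-built-up, 1 = built-up) are hypothetical.

```python
import numpy as np

def transition_matrix(earlier, later, n_classes):
    """Row-normalized transition probabilities between two categorical maps."""
    earlier = np.asarray(earlier).ravel()
    later = np.asarray(later).ravel()
    counts = np.zeros((n_classes, n_classes), dtype=float)
    # counts[i, j] = number of cells in class i earlier and class j later
    np.add.at(counts, (earlier, later), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Classes absent from the earlier map keep an all-zero row
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)

# Hypothetical maps: 0 = non-built-up, 1 = built-up
map_t1 = [0, 0, 0, 0, 1, 1]
map_t2 = [0, 0, 0, 1, 1, 1]
P = transition_matrix(map_t1, map_t2, n_classes=2)

# Projecting the later class counts one more period ahead
counts_t2 = np.bincount(map_t2, minlength=2).astype(float)
projected = counts_t2 @ P
print(P)
print(projected)
```

Each row of `P` sums to one, so multiplying a vector of class counts by `P` preserves the total number of cells while redistributing them among classes.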

Simulation of Urban Expansion by 2018
In this step, each cell has one value. Based on the built-up states, if a cell was developed before 1990, its value is set to 3. If a cell was developed between 1990 and 2006, its value is 1, and if it is undeveloped, its status value is 2. The cellular agent can change its status in each time step. At each iteration, the agent re-evaluates its current conditions based on the several probabilities of change and the behavioral rules and decides whether or not to change its status. The simulated map of urban expansion for 2018 is presented in Figure 6.

Simulation Validation Results
The first two built-up maps of 1990 and 2006 were used to validate the simulation model. Validation aims to explore how well the model output fits the actual data [82,88]. The statistical analysis of the Validate option in IDRISI reports the agreement/disagreement in the quantity of cells and the agreement/disagreement in the location of cells.
The first simulation, for the year 2018, was validated against the actual map of built-up areas in 2018. Table 7 shows the statistical results of the validation. "Kstandard" is the Kappa coefficient; "Klocation" is the kappa for grid-cell level location, which monitors how well the grid cells are located on the landscape; "MediumGrid (m)" is the agreement between the reference map of 2018 and the simulation map in terms of proportion correct [89]; "Qdisagreement", proposed by Pontius and Millones [90] as an alternative to Kappa, is the amount of disagreement due to failure to specify the correct quantity of each category in the comparison map relative to the reference map; and "Adisagreement" counts the error in matching spatial allocation due to differences in the location of categories between the comparison and reference maps [89][90][91]. The higher the Kappa value, the better the accuracy, and Kappa > 0.8 indicates very high accuracy [34]. The comparison between the simulated map of 2018 and the LULC map showed that Kstandard (0.86), Klocation (0.89), and MediumGrid (m) (0.91) demonstrated high accuracy of the simulated map, while Qdisagreement (0.02) and Adisagreement (0.07) indicated only minor cell mismatch in the simulation result. Therefore, the CA-Agent model runs reached an acceptable prediction, and it can be concluded that the simulated parameters fit reality.
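To make the two disagreement measures concrete, the sketch below computes quantity and allocation disagreement from a cross-tabulation of a simulated map against a reference map, following the general definitions of Pontius and Millones. It is an illustrative approximation and not the IDRISI Validate routine; the function name `disagreement` and the 2 × 2 table are hypothetical. Proportion correct, quantity disagreement, and allocation disagreement sum to one, which is consistent with the values reported above (0.91 + 0.02 + 0.07 = 1.00).

```python
import numpy as np

def disagreement(crosstab):
    """Return (quantity, allocation, agreement) from a square cross-tabulation.

    Rows are categories of the comparison (simulated) map,
    columns are categories of the reference map.
    """
    p = np.asarray(crosstab, dtype=float)
    p = p / p.sum()                      # convert counts to proportions
    diag = np.diag(p)
    comparison = p.sum(axis=1)           # category proportions, simulated map
    reference = p.sum(axis=0)            # category proportions, reference map
    quantity = np.abs(comparison - reference).sum() / 2
    allocation = (2 * np.minimum(comparison - diag, reference - diag)).sum() / 2
    agreement = diag.sum()               # proportion correct
    return quantity, allocation, agreement

# Hypothetical cell counts: [built-up, non-built-up]
table = [
    [30, 5],
    [7, 58],
]
q, a, c = disagreement(table)
print(f"quantity = {q:.2f}, allocation = {a:.2f}, agreement = {c:.2f}")
```

Separating the two components shows whether a simulation errs mainly in how much change it predicts or in where it places that change.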

Simulation of Urban Expansion by 2030
The comparative analysis provides evidence that the CA-Agent model's result closely resembles the actual built-up data. Simulation of the built-up area for 2030 can therefore be attempted by applying this algorithm. Accordingly, we performed the second run of 12 time steps to simulate urban expansion by 2030; the simulated results are shown in Figure 7. Table 8 presents the statistics of the model run. It is evident from the table that, of the total of 69,624 cells, 58,236 were undeveloped cells. A total of 2342 cells were determined to be unbuildable, and the threshold of the suitable lands reached 94%. After the 12 time steps, a total of 2881 cells were developed, adding 12.22 sq. km to the built-up areas. The simulation results revealed that the built-up areas would expand by 6.97% from 2018 to 2030.

Discussion
Simulating the complexity of urban expansion in cities and suburbs requires decisions about which factors affect the phenomenon. While relying on a single model is not adequate for the simulation [92], hybrid modeling leads to better results. Coupling the cellular automata, Markov, and agent-based models in a GIS environment in this study produced a simulation of the urban expansion in Tallinn and its buffer zone. The components of the CA-Agent model implemented in this paper are discussed as follows:

Interactions of Cellular Agents
Integrating the cellular automata and agent-based models in our research allowed us to synthesize the probabilities and rules into one model. This capability allowed the cellular agents to decide the probability of spatial transition and the rules of behavior when interacting with the other cellular agents and their environment. In line with Wahyudi et al. [72], we applied interactions between "cellular agents to cellular agents" through making a query and extracting information from other cellular agents and "cellular agents to the environment" through influencing it by modifying the cell status from non-urban to urban. In other words, we defined the interactions in several steps. First, a cellular agent searched the lands and collected the information on the probable development status depending on the cell's size, the minimum area needed for building a dwelling, the location, and adjacent neighboring cells. Then, it assessed the situation based on the behavioral rules, suitability, and probability of development. Reaching the thresholds allowed the cellular agent to take the development action, while constraints and limitations of transition determined whether the cellular agent could be developed during the model run. Thus, the most crucial step in the model was defining cellular agents' interactions to reach the development decision.
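The decision steps described above can be sketched as follows. This is an illustrative simplification, not the implementation used in this study: the class `CellularAgent`, the functions `should_develop` and `step`, and all threshold values are hypothetical. In particular, the sketch treats a developed adjacent neighbor as a hard prerequisite, which is only one way to express the neighborhood rule.

```python
from dataclasses import dataclass, field


@dataclass
class CellularAgent:
    cell_id: int
    area_m2: float
    suitability: float             # assumed to be scaled to 0..1
    transition_probability: float  # assumed to be scaled to 0..1
    constrained: bool = False      # e.g. lake, wetland, airport
    developed: bool = False
    neighbors: list = field(default_factory=list)  # ids of adjacent cells


def should_develop(agent, agents, min_area_m2=600.0,
                   suitability_threshold=0.5, probability_threshold=0.5):
    """Assess one agent against the (hypothetical) rules and thresholds."""
    if agent.developed or agent.constrained:
        return False
    if agent.area_m2 < min_area_m2:
        return False
    # Agent-to-agent interaction: query the status of adjacent agents.
    has_developed_neighbor = any(agents[n].developed for n in agent.neighbors)
    return (has_developed_neighbor
            and agent.suitability >= suitability_threshold
            and agent.transition_probability >= probability_threshold)


def step(agents, **thresholds):
    """Run one time step; return the ids of the cells developed in it."""
    # Decide on the current state first, then apply all changes at once,
    # so the order of evaluation does not affect the outcome.
    to_develop = [a.cell_id for a in agents.values()
                  if should_develop(a, agents, **thresholds)]
    for cell_id in to_develop:
        # Agent-to-environment interaction: change status to built-up.
        agents[cell_id].developed = True
    return to_develop
```

Running `step` repeatedly lets development spread outward from already built-up cells one ring of neighbors per time step, while constrained cells never change status.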

Applying the Adjacent Neighborhood
We have verified that the neighbor's status is an initial element of behavioral rules and a prerequisite for a cellular agent's development action. Research on neighborhood effects has mostly applied a kernel cell influenced by its neighbors, such as a Moore neighborhood [93,94] or a distance decay neighborhood [44]. We did not replicate the previous research; instead, we applied the adjacent cell's neighborhood based on the polygon neighbor list, which could cover all the possibilities of accessibility by neighbors. The adjacent neighborhood is based on the spatial and quantitative influence of neighbors [95]. It can implicitly address spatial heterogeneity: through the adjacent polygon neighbor list, the cellular agent assessed the number and proportion of neighbors that had developed at each time step and made its development decision. If no immediate neighbor or neighbor's neighbor had been developed, the cellular agent's probability of developing was small.
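As an illustration of the adjacent neighborhood, the following sketch assumes that the polygon neighbor list is available as a dictionary mapping each cell id to the ids of the cells that share a boundary with it. The function names and the toy data are hypothetical.

```python
def developed_neighbor_share(cell_id, neighbor_list, developed):
    """Return (count, proportion) of adjacent cells that are developed."""
    neighbors = neighbor_list.get(cell_id, [])
    if not neighbors:
        return 0, 0.0
    count = sum(1 for n in neighbors if n in developed)
    return count, count / len(neighbors)


def second_order_neighbors(cell_id, neighbor_list):
    """Return the neighbors of the neighbors, excluding the cell itself
    and its immediate neighbors."""
    first = set(neighbor_list.get(cell_id, []))
    second = set()
    for n in first:
        second.update(neighbor_list.get(n, []))
    return second - first - {cell_id}


# Toy neighbor list for five irregular cells (illustrative only).
neighbor_list = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 5], 5: [4]}
developed = {2, 5}

count, share = developed_neighbor_share(1, neighbor_list, developed)
ring_two = second_order_neighbors(1, neighbor_list)
```

Because the list is built from shared boundaries rather than a fixed kernel, each cell can have a different number of neighbors, which is what allows irregular cells to be handled.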

State of the Cellular Agent
Many studies have employed a lattice of square cells or a raster data surface as the state of their agents [72,94,96]. In contrast, other researchers have tried to develop irregular space states for their simulations [73,97]. Regular cells rarely represent the basic unit of land use, so adopting a conventional regular cell-based CA model could affect the performance of the simulation. Instead, applying an irregular structure of cells with different sizes can better represent the reality [98] of urban expansion, which takes place in a non-uniform space. Therefore, in our model, we integrated the regular environment and irregular cells in the data model and operations, which performed more realistically in capturing the geometric details. We showed that the cellular agent's development status depends on different factors such as accessibility and suitability. Accordingly, a cellular agent's final state changed to built-up when the cell passed all the assessment tests. This is consistent with the findings of Dahal and Chow [73] at the level of irregular cadastral land parcels, Pinto and Antunes [99] at the level of irregular cells of census blocks, and Chen et al. [98] at the level of patch-based simulation.

Coupling Markovian Transition Probability with CA-Agent
Integrating the Markov model with cellular automata is a widely used technique to describe the likelihood of state conversion between two points in time. It estimates the transition probability [38][100][101][102] as the potential changes of the future. Berberoglu et al. [103] have shown that conditional probabilities of the Markov model are reliable for allocating the to-be-changed state of cells. A similar application of the Markov model was reported by Aburas et al. [68] in predicting the quantity of urban and non-urban areas. However, in line with the idea of Xu et al. [69], we agree that overcoming the limitations of the Markov model requires integrating its results with other models. In this case, using the Markov transition probability values in a CA-Agent model is a new approach that we employed in our model. Instead of randomly assigning the probability values to the change factors, we used the Markovian transition probability, which captures the spatial conversion over time, resulting in a better projection of the development status of agent cells.
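The way a Markovian transition probability is estimated from two land-use maps can be sketched as follows. This is a generic first-order Markov estimate under the assumption that both maps list the class of the same cells in the same order; the function names and the toy maps are hypothetical and do not reproduce the data of this study.

```python
def transition_matrix(earlier, later, classes):
    """Row-normalized transition probabilities between two class maps.

    matrix[a][b] is the share of cells in class a on the earlier map
    that are in class b on the later map.
    """
    counts = {a: {b: 0 for b in classes} for a in classes}
    for a, b in zip(earlier, later):
        counts[a][b] += 1
    matrix = {}
    for a in classes:
        total = sum(counts[a].values())
        # A class absent from the earlier map gets an all-zero row.
        matrix[a] = {b: (counts[a][b] / total if total else 0.0)
                     for b in classes}
    return matrix


def project(quantities, matrix):
    """Apply one Markov step to a dict of per-class quantities."""
    classes = list(matrix)
    return {b: sum(quantities[a] * matrix[a][b] for a in classes)
            for b in classes}


# Toy maps with six cells (illustrative only).
classes = ["non-urban", "urban"]
earlier = ["non-urban"] * 4 + ["urban"] * 2
later = ["non-urban"] * 3 + ["urban"] * 3

matrix = transition_matrix(earlier, later, classes)
projected = project({"non-urban": 3, "urban": 3}, matrix)
```

The matrix gives only the expected quantity of change per class; it carries no information about where the change occurs, which is why it is combined here with the neighborhood and suitability rules of the cellular agents.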

Configuration of Suitability Factors
Suitability factors play essential roles in the model configuration, indicating the likelihood that an agent changes from a non-developed to a developed state. In our model, the cellular agents made their decisions based on the influence of four factors: "distance to Tallinn", "distance to main roads", "distance to local roads", and "neighborhood status". The choice of variables mainly depends on the actual driving forces of urban expansion in the study area and changes when the extent is modified. We arrived at the selected factors in line with the study performed by Mozaffaree Pour and Oja [47] on the same area. Additionally, Tan et al. [93] demonstrated that distance to the city center, major roads, and the main river negatively influences urban expansion. Similarly, Mustafa et al. [50] used the distance factors to roads, towns, and railways, and Liu et al. [94] applied spatial variables of distance to the town center, roads, and water as the suitability factors.
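One common way to combine such factors into a single suitability value is a weighted linear combination, sketched below. This is an assumption for illustration only: the combination rule, the weights, and the maximum distances are hypothetical and are not the values configured in this study.

```python
def distance_score(distance_m, max_distance_m):
    """Map a distance to 0..1 so that closer locations score higher."""
    clipped = min(max(distance_m, 0.0), max_distance_m)
    return 1.0 - clipped / max_distance_m


def suitability(cell, weights, max_distances):
    """Weighted mean of the distance scores and the neighborhood status.

    cell["neighborhood_status"] is assumed to be already scaled to 0..1,
    for example the proportion of developed adjacent cells.
    """
    score = weights["neighborhood_status"] * cell["neighborhood_status"]
    for factor, max_distance_m in max_distances.items():
        score += weights[factor] * distance_score(cell[factor], max_distance_m)
    return score / sum(weights.values())


# Hypothetical weights and distance limits (not from the study).
weights = {
    "distance_to_tallinn": 1.0,
    "distance_to_main_roads": 1.0,
    "distance_to_local_roads": 1.0,
    "neighborhood_status": 1.0,
}
max_distances = {
    "distance_to_tallinn": 15000.0,
    "distance_to_main_roads": 5000.0,
    "distance_to_local_roads": 1000.0,
}
cell = {
    "distance_to_tallinn": 7500.0,
    "distance_to_main_roads": 2500.0,
    "distance_to_local_roads": 500.0,
    "neighborhood_status": 0.5,
}
value = suitability(cell, weights, max_distances)
```

The decreasing distance score reflects the negative influence of distance on urban expansion reported in the studies cited above.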

Urban Expansion Simulation
During the 28 years from 1990 to 2018, built-up areas increased by 25.03% in Tallinn and its 15 km buffer zone. The increase of urban areas despite the decrease in population reflects changes in the way of life. Urban areas expanded by 18.15% during 1990-2006, which was faster than the 8.40% increase between 2006 and 2018. We performed the model run two times, for 2018 and 2030. In the first 12-year run, 1736 cells converted from undeveloped to developed status. Then, we evaluated the performance of the model using the Kappa Index of Agreement. It yielded a value of 0.86, indicating a relatively high degree of correctness for the first simulated map. Similar trends have been observed in actual data in the research by Oja [79] in Estonia.
Further, we performed the second simulation for the year 2030. After the 12 runs of the model, 2881 cells were developed, adding 12.22 square kilometers to the built-up areas. Therefore, the total built-up area will reach 175.24 sq. km in 2030, an increase of 30.25% in total from 1990 to 2030. The result is in good agreement with the actual changes.
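The reported figures can be cross-checked with a short calculation. The sketch uses only the values stated in the text (12.22 sq. km added, 175.24 sq. km in 2030, 2881 developed cells); the 2018 area is derived here by subtraction and is an assumption, not a reported value.

```python
added_km2 = 12.22         # reported area added between 2018 and 2030
total_2030_km2 = 175.24   # reported built-up area in 2030
developed_cells = 2881    # reported number of newly developed cells

# Derived, not reported: built-up area in 2018 implied by subtraction.
implied_2018_km2 = total_2030_km2 - added_km2

# Added area as a share of the 2030 total (about 6.97%).
share_of_2030_total = 100 * added_km2 / total_2030_km2

# Added area relative to the implied 2018 area (about 7.50%).
growth_over_2018 = 100 * added_km2 / implied_2018_km2

# Mean area of a newly developed cell in square meters.
mean_cell_area_m2 = added_km2 * 1_000_000 / developed_cells
```

Under these assumptions, the 6.97% expansion reported for 2018 to 2030 appears to correspond to the added area expressed as a share of the 2030 total.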

Research Limitations
Despite the many advantages of the CA-Agent model in simulating the cellular agents' interactions over time, this research has some limitations. Because of the lack of time-series spatial data at the cadastral level in the study area, we decided to investigate the model at the cell level, although implementing the CA-Agent model with precise spatiotemporal cadastral data would generate a simulation closer to reality. Another limitation is inherent in the model itself, as each model has its limitations. The CA-Agent model, like other simulation models, suffers from the extrapolation of past trends. It is a constraints-driven model rather than a preference-led model and lacks the dimension of human decision. Furthermore, despite significant advancement in computing and modeling, there is still the issue of complexity in urban systems, which requires mediating knowledge with the reality of environmental changes.

Conclusions
Trade-offs between agriculture, forest, and built-up lands need simulation modeling approaches to link expansion orientation and quantify spatial planning activities. In this paper, we employed the CA-Agent model to monitor the process of urban expansion and simulate urban expansion in Tallinn and its 15 km buffer zone. Application of the CA-Agent model for spatial planning activities let us consider how implementing different driving factors, probability of changes, and modifying thresholds will affect the outputs of the model and the distribution of the built-up over time. Correspondingly, spatially explicit consequences of urban expansion support the effective decision making across the cellular agent's characteristics and rules of behavior in spatial planning.
It is essential to highlight that even though the built-up area in the study area increased by 25.03%, the population of Tallinn decreased by 10.19% between 1990 and 2018. The dominating migration has been from Tallinn into surrounding municipalities (feeding the new settlement areas) and from other regions of Estonia to Tallinn.
In this regard, modeling the factors affecting urban expansion in the Tallinn area is critical to detect future expansion trends. The integrated CA-Agent model in our research allowed us to synthesize the probabilities and the rules into one model. This capability allowed the cellular agents to decide the probability of spatial transition and the rules of behavior when interacting with the other cellular agents and their environment. The main conclusion is that taking advantage of cellular-based modeling, adjacent neighborhood information, and Markovian transition probability produced the simulated map of urban expansion in 2018 with a Kappa value of 0.86, which confirms a relatively high accuracy of the implemented model components. Consequently, the model was considered "approved" for simulating the urban expansion in 2030. Thus, implementing the CA-Agent model in the study area illustrated the temporal changes of land conversion and represented the present spatial planning results requiring regulation of urban expansion encroachment on agricultural and forest lands.
It follows from this conclusion that urban expansion is a dynamic spatial process affected by different physical drivers. We considered the suitability factors of "distance to Tallinn", "distance to main roads", "distance to local roads", and "neighborhood status" and six constraints, namely, "main lakes", "railways", "watercourses", "main roads", "airport", and "wetlands". The simulation results would benefit from considering social and economic factors, which could be addressed in future studies. Altogether, the modeling shows urban expansion to be a result of unplanned sprawl.