Dynamic Simulation of Urban Expansion Based on Cellular Automata and Logistic Regression Model: Case Study of the Hyrcanian Region of Iran

The hypothesis addressed in this article is to determine the extent of selected land use categories with respect to their effect on urban expansion. A model that combines a logistic regression model, Markov chain, together with cellular automata based modeling, is introduced here to simulate future urban growth and development in the Gilan Province, Iran. The model is calibrated based on data beginning in 1989 and ending in 2013 and is applied in making predictions for the years 2025 and 2037, across 12 urban development criteria. The relative operating characteristic (ROC) is validated with a very high rate of urban development. The analyzed results indicate that the area of urban land has increased by more than 1.7% that is, from 36,012.5 ha in 1989 to 59,754.8 ha in 2013 and the area of the Caspian Hyrcanian forestland has reduced by 31,628 ha. The simulation results, with respect to prediction, indicate an alarming increase in the rate of urban development in the province by 2025 and 2037 that is, 0.82% and 1.3%, respectively. The development pattern is expected to be uneven and scattered, without following any particular direction. The development will occur close to the existing or newly-formed urban infrastructure and around major roads and commercial areas. If not controlled, this development trend will lead to the loss of 25,101 ha of Hyrcanian forest and, if continued, 21,774 ha of barren and open lands are expected to be destroyed by the year 2037. These results demonstrate the capacity of the integrated model in establishing comparisons with urban plans and their utility to explain both the volume and constraints of urban growth. It is beneficial to apply the integrated approach in urban dynamic assessment through land use modeling with respect to spatio-temporal representation in distinct urban development formats.


Introduction
As stated at the beginning of the abstract, globalization could not have occurred if the world community had not advanced to the point where it is now.This fact is accompanied by population growth, specifically in developing countries which leads to land use expansion, which is the focus of this study.Urban sprawl is a term used in describing low-density contemporary settlement patterns with slight comprehensive public planning that ends with emerging economically inefficient, socially inequitable, environmentally damaging and non-aesthetic cities [1].Increasing awareness of the negative aspects of urban sprawl is a major topic of debate [2].Urban change should be managed in a manner where the environmental and social sustainability of an ever-growing urban population of the world is ensured [3] which by 2050 is expected to account for 60% of the world population according 10 UN predictions [4].
According to the importance of the issue, knowledge-based management of urban growth is the focus of relevant literature on urban planning [5].Since the 1960s, this modeling has been used in an extensive manner [6] in describing the previous, current and future patterns in urban development [7].The logistic regression-based Cellular Automata (CA) model was first introduced by Wu [8].This combined model, which was applied by [9,10].By comparison, the logistic regression-based analysis is one of the most frequently applied in the past two decades for land use modeling prediction where many inductive modeling are applied [11][12][13]; but it should be noted that this model lacks quantification of change and temporal analysis [14][15][16].On the contrary, the MC models can predict the volume of land use change and identify the structural utility, since MCs only compute the probabilities of land use transitions and the amount of change are spatially non-explicit [17].Obviously the error problems in CA models are not without uncertainty.Some of the errors in these models are innate due to human error and incomplete technologies and like any computerized model CA models, may not fit the real state given that the inputs are error-free, because these models are only an approximation to reality.It can be deduced that due to this fact each CA model can be defined as serving a specific purpose.In order to achieve accurate modeling the cellular automata must be calibrated.This essential procedure has been ignored for some time but it was revived when it became a necessity to develop cellular automata as a reliable procedure for urban development simulation.Temporal calibration is made through DEM and a sequence of remote sensing images in order to extract land use information of different periods per year.These models required a greater amount of data input variables which are not without drawbacks.Many uncertainties become evident during the simulation process and the results obtained from available studies indicate that the use of images and DEM reduced the need for input data mass, that is a reduction in stochastic uncertainty.Temporal calibration is subject to multi-temporal which allows the transition rule values to become changed over time in order to correspond with the variable urban patterns [18].In this article an attempt is made to prepare and calibrate multi-temporal images and geo-referencing image data through DEM and 1/25,000 topographic layers.Here in order to obtain high accuracy and to reach low uncertainty rates in CA modeling, all the extracted layers from the images and DEM are calibrated in a uniform manner where a cell size of 30 m is defined for all layers.The spatial CA models overcome this limitation of MC based on predefined site-specific rules which mimic land use transitions [19], while the CA models are not able to determine the actual amount of change.Since every modeling technique has its unique limitations [9], proposed is a hybrid approach based on logistic regression and CA transition rules, which resulted in an improved model quality; nevertheless, this model was not able to quantify the amount of land use change.Puertas and Meza [16] came up with some promising results after integrating the logistic regression, MC and CA models in simulating urban expansion but the hybrid model failed to consider a spatial auto-regression in determining the driving forces of LUCCs and was missing in this.CA-based models are simple and allow for dynamic spatial simulation [20,21].Rienow and Goetzke [22] combined CA SLEUTH with support vector machines (SVM) approaches for modeling dynamic urban simulation of different types of urban growth along with the analyses of various geophysical and socioeconomic forces that determine local urbanization suitability.In this study for simulating urban growth dynamics we used integrated techniques CA, MC and logistic regression modeling.
CA, when combined with logistic regression models as done by Wu [8] and Puertas and Meza [16], can predict a complex urban reality, thus the potential in being applied in this study.This hybrid 3 of 18 framework can form the basic structure of this presented model but when such models are applied in the context of Gilan Province, several considerations will have to be made.The first important consideration is to adopt the CA framework which would suit the realities in the city in a sense that the outputs of the model would provide information at an appropriate scale for urban policy making.The second consideration is to insert the designated variables into the logistic regression models.Given the availability of data, it is important to identify the methods that can be adopted in quantifying the inputs fed into the logistic regression model.The third consideration is to build a link with the policy framework, that is, the CA-based framework and to implement the model by using inputs from the logistic regression model and urban policies in a manner that the model would able to establish forward and backward linkages with urban development and control policies.This consideration should ultimately be reflected in the simulation of urban development and growth.Therefore this study furthers the work conducted on modeling the urban development and growth of Gilan Province Jat et al. [23], which are mostly examples of modeling urban sprawl but not able to model the development of different land uses.How could some recent applications introduced in the empirical literature on urban growth modeling be adopted in this study?
Due to its geo-logistic stance, Gilan experienced significant growth with respect to urban land use areas in recent decades.Annual rural influx expands urban population and promotes the urbanization culture.Unsustainable growth of the city has caused many socioeconomic and environmental problems like reshaping the landscape, emissions, heavy traffic jam, deforestation, and changes in land use.All in all, due to the pleasant climate and picturesque landscape of the Hyrcanian forests and the Caspian Sea have made this part of country one of the most densely populated areas.It is usually the first choice of tourism, domestic and Middle Eastern with low accommodation costs.This fact can pave the way to forecasting sustainable growth in future development by applying the necessary administrative-planned measures and avoiding uncontrolled construction and shapeless growth.Accordingly, this study is run with the objective of presenting a dynamic hybrid model based on LR, MC and CA to forecast the auto-spreading orientation by the years 2025 and 2037 and to answer the question of which land use class would lead to the most negative effects of urban development.The outputs here can equip the decision makers and relevant authorities with future orientations of the restricted urban growth.The main questions for this research are: Which one of the efficiency classes would have the most effect on assessing future plans regarding urban development with respect to its quantitative aspect, orientation and the relevant drivers in a given study zone?; does the combination of Markov and LR in CA method yield an acceptable pattern as an output in urban development for a 12 year period (2025 and 2037) with respect to environmental and socio-economical features of a given zone?
This article is structured as follows: The materials and method with all their relevant subdivisions are presented in Section 2; the results and discussion with all their relevant subdivisions are presented in Section 3; the conclusion is presented in Section 4.

Methods and Materials
The method adopted in this study is survey analytic, where the available reliable procedures are applied to satisfy this topic.The applications of these materials follow the sequence according to the presented flowchart, where the LR variables are considered as drivers.The process and the reason of applying the applications are described in due course.The analyses conducted through these applications each led to acceptable outcomes allowing the authors to integrate them in CA.

Study Area
Gilan is a tourist province in the north of Iran located over an area of 14,042 km 2 within the latitudes of 36  1).This area consists of two mountainous and plain features.The major urban area covers the plain with 32% of the total area with a gradient less than 5%.As for the climate, annual precipitation is 1506 mm, 38% of which is confined to the autumn season, October in specific.This region has predominant naturally resources, like fertile and productive soil types and rich Hyrcanian forests and agricultural lands, including 52 cities, 13 of which are coastal line and 2916 villages [24].Domestic immigration and tourism are the two important factors in significant population growth recorded in the recent decades, that is, any increase from 2,081,037 in 1986 to 2,480,874 in 2011 [24].This is followed by urban sprawl and extensive land degradation in Gilan.One of the most striking aspects of land degradation here is the loss of great parts of the Hyrcanian forests and rangeland areas.

Study Area
Gilan is a tourist province in the north of Iran located over an area of 14,042 km 2 within the latitudes of 36°34′N -38°27′N and the longitudes of 48°53′E -50°34′E covering Eastern third of the Caspian Sea coast with MSL of −27 m on the plain to 3897 m on the mountain top (Figure 1).This area consists of two mountainous and plain features.The major urban area covers the plain with 32% of the total area with a gradient less than 5%.As for the climate, annual precipitation is 1506 mm, 38% of which is confined to the autumn season, October in specific.This region has predominant naturally resources, like fertile and productive soil types and rich Hyrcanian forests and agricultural lands, including 52 cities, 13 of which are coastal line and 2916 villages [24].Domestic immigration and tourism are the two important factors in significant population growth recorded in the recent decades, that is, any increase from 2,081,037 in 1986 to 2,480,874 in 2011 [24].This is followed by urban sprawl and extensive land degradation in Gilan.One of the most striking aspects of land degradation here is the loss of great parts of the Hyrcanian forests and rangeland areas.

Research Procedure
The urban growth dynamics of the study area are found where techniques are integrated CA, MC and logistic regression modeling.This model is designed through the data from multiple time scales, obtained from remote sensing and census data.Calibration of the model is based on LULC data compiled from 1989 to 2013, while simulations are conducted for the years 2025 and 2037.The data and their implementation processes are presented in Figure 2.

Research Procedure
The urban growth dynamics of the study area are found where techniques are integrated CA, MC and logistic regression modeling.This model is designed through the data from multiple time scales, obtained from remote sensing and census data.Calibration of the model is based on LULC data compiled from 1989 to 2013, while simulations are conducted for the years 2025 and 2037.The data and their implementation processes are presented in Figure 2.  The geo-referenced images are corrected radio-metrically and geometrically.In order to enhance the contrast of the features and facilitate classification of different land uses, Optimum Index Factor (OIF) is used in finding the best color composite.After pre-processing the images, they are classified through the Maximum Likelihood supervised classification method.Accordingly, the land use maps of the study area are prepared in five land use classes (Table 1), including: built-up areas, agricultural lands, water bodies, forest coverage and open land (rangeland and bare land).Finally, the changes in different land use types were detected by cross tabulation matrix in IDRISI Selva software (Clarklabs, Worcester MA, USA).The geo-referenced images are corrected radio-metrically and geometrically.In order to enhance the contrast of the features and facilitate classification of different land uses, Optimum Index Factor (OIF) is used in finding the best color composite.After pre-processing the images, they are classified through the Maximum Likelihood supervised classification method.Accordingly, the land use maps of the study area are prepared in five land use classes (Table 1), including: built-up areas, agricultural lands, water bodies, forest coverage and open land (rangeland and bare land).Finally, the changes in different land use types were detected by cross tabulation matrix in IDRISI Selva software (Clarklabs, Worcester MA, USA).

The Integrated Land Use Change Model
The integrated model of logistic regression, Markov chain and cellular automata begins with the logistic regression of the two datasets of urban and non-urban explanatory factors which determine the potential for change from one class of land use to another, in each cell.Next, using MC and reference maps of land use, the third step is estimated; that is, the proportion of the area that will likely change from one land use to another.Subsequently, the location of this estimated area of conversion is calculated through the CA.This allows the researchers to assign the land use class for a given year of prediction to each pixel.In order to calibrate and validate the adjustment of the model, a reference map from 2013 is compared to this estimated area map of land use/cover for the same year.Finally, the calibrated model was used to simulate future land covers for the target years 2025 and 2037.

Simulation of Future Land Use Change by LR
In this study, the future land use maps of the target years 2025 and 2037 are simulated once again by logistic regression.This is done to achieve a more accurate estimate of future land use change.According to relevant literature [15,25] and expert knowledge, a total of 12 independent variables are recognized as the factors which determine the urban sprawl: the distance to and from airport, distance to and from the central business district (CBD), distance from major roads, distance to the nearest cities, distance to the seashore, Digital Elevation Model (DEM), distance to the rivers, slope (in percent), forest areas, northing and easting coordination's and agricultural lands.These variables were separately introduced to the model as the input data so as to predict any changes in the extent of the built-up areas over the periods of 12 years: 1989-2001 and 2001-2013.The dependent variable in this model is the developed urban areas cells (i.e., conversion of other land use types into the built-up areas) as a binary raster map layer, where, the cells with 0 value represent areas without land use change while the cells with the value of 1 indicate that the areas underwent the land use change over a given time period (e.g., [1989][1990][1991][1992][1993][1994][1995][1996][1997][1998][1999][2000][2001]. It is worth mentioning that the cell size of all the map layers is 30 m and the analysis is run in IDRISI Selva software (Clarklabs, Worcester, MA, USA).The logistic regression was estimated for the 12 variables in a manner to making achieving the maximum relative operating characteristic (ROC) possible and adjust the odd ratio values.These values represent the dependent and independent nature of the variables between two binary data as degree of significance level.Regression is one of the well-known statistical approaches applied when the dependent variable is dichotomous [26].This approach model can reveal the correlation between a binary dependent and several independent categorical and continuous variables [27].Systematic sampling and stratified random sampling are the two approved approaches applied in logistic regression model; the first one is applied to reduce spatial dependency when homogeneous nature of the population is not and the second is applicable for running regression [15].Here, the second method is selected and applied to conduct regression [28].As a basic assumption in LR, where the probability of a dependent variable taking the value of 1 is based on the logistic curve, the probability value is calculated through the following equation: where P is the probability of forest loss in the cell, E(Y) is the expected value of the binary dependent variable β 0 is a constant value and β 1 is the coefficient of the independent variable x i The logistic function of Equation ( 1) is transformed into a linear response as follows: Furthermore, the binary dependent variable value is either 1 or 0 and the standard linear regression model is expressed as: where P is the probability of a grid cell for the occurrence of land use type i. x is the related driving factors; β represents the regression coefficients between land use types and driving factors.

Simulation of Future Land Use Change through MC
In this study, the Markov chain model is applied to simulate the spatio-temporal changes of land uses indifferent time future series.A finite set of states as discrete values are after specified in Markov Chain.The possibility in making a transition in the states of this specified value in the system is recorded in the form of transition matrix.In simple terms, the transition matrix records the probability of moving from one state (land use class) to another (land use class) [29].In this study the probability of changing each land use class to another is determined by Markov transition probability matrices for the periods between 1989 and 2013.In order to verify the CA-Markov model for the target years of 2025 and 2037, it is assumed that the target year is 2013; accordingly, based on the transition probability matrix of 1989-2001, a land use map is simulated for the year 2013.The model is validated through the Kappa coefficient by comparing the simulated land use map of 2013 with the actual land use map, derived from the satellite image of 2013.MC model is a discrete random process with the cellular lattice where the transition probability matrix based on neighborhood effects is applied in a spatial influence logarithm [30].MC models is lately being applied in a big scale in modeling LULC changes in future predictions including both urban and non-urban areas [31,32].This model is key-descriptive and is implemented in an appropriate manner [16].This method is developed into an essential prediction approach in geographic research and spatio-temporal simulations, due to its features of descriptive power, simple trend projection of land use and land cover transition [33,34].A chain is defined as a stochastic process having the property where the value of the process at times t, and X t , depending only on its value in time t − 1, X t−1 , not on the sequence of values X t−2 , X t−3 , . . ., X 0 that the process passed through in arriving at X t−1 [35].It can be expressed as: If a Markov sequence of random variant X n takes the a n discrete value a 1 , . . ., then: The Chi-square (X 2 -test) is run to test the change matrix to justify the Markov chain, which is expressed as follows: where, X 2 = r = number of row, c = number of columns, Qij = Actual number in cell ij, Eij = Expected number in cell ij and (r − 1) × (c − 1) = the degrees of freedom  [20].This is due to the fact that they have demonstrated the capacity to represent complex spatial processes, such as the representation of a variety of local behaviors together with global patterns [15,16].
The (CA) component in the Land use/cover model determines the assignment of land use in each cell.CA models define a new state of land use for each cell for a particular period and as the previously stated function of the cell and cell neighbors it is, in agreement with a group of defined transition rules [36].In this article, model "CA-Markov" is applied in the IDRISI software for modeling LULC.In this model, the Markov Chain controls the temporal dynamics (gains and losses), while the spatial dynamics is controlled by cellular automata, according to the characteristics of a spatial filter and the potential of land use change [37].

LULC Analysis
The LULC maps of this study area in the two 12 year periods are prepared through Landsat TM, ETM+ and OLI satellite images.processes, such as the representation of a variety of local behaviors together with global patterns [15,16].The (CA) component in the Land use/cover model determines the assignment of land use in each cell.CA models define a new state of land use for each cell for a particular period and as the previously stated function of the cell and cell neighbors it is, in agreement with a group of defined transition rules [36].In this article, model "CA-Markov" is applied in the IDRISI software for modeling LULC.In this model, the Markov Chain controls the temporal dynamics (gains and losses), while the spatial dynamics is controlled by cellular automata, according to the characteristics of a spatial filter and the potential of land use change [37].

LULC Analysis
The LULC maps of this study area in the two 12 year periods are prepared through Landsat TM, ETM+ and OLI satellite images.Five major land uses are determined in the study area as built-up areas, agricultural lands, water bodies, forest coverage and open land (rangeland and bare land).The overall accuracy of the generated maps is estimated through the error matrices.The Kappa coefficients of 89.23%, 90.84%, and 91.08% confirm the accuracy of the classified maps of each one of the periods, respectively.The LULC maps of the study area in different periods under study are shown in Figure 3. Land use map of the province in 2013 shows that the built-up area are spread mainly in the central parts, over an area of 59,754.8ha.The built-up areas are surrounded by fertile agricultural lands, mostly paddy fields covering 318,887 ha and a strip of Hyrcanian forest in the south, in an area of 751,242 ha.The position of the built-up areas reveals the potential loss of fertile agricultural lands and the unique Caspian Hyrcanian forest ecosystems in its vicinity.In addition, barren/open lands making 266,963 ha as a relatively wide strip in the south is narrowing down as the scattered spots in the West and Northwest increase.

LULC Change Detection
After checking the accuracy of the land use maps, they were cross-tabulated to quantify the variations in different land use classes over time.The results tabulated in Table 2 in the three periods of 1989-2001, 2001-2013 and 1989-2013.As observed in this Table, changes in built-up areas follow an upward trend over the study period from 1989 to 2013.The expansion is mainly due to conversion of 8644.41 ha of barren lands and 12,623.01ha of agricultural lands to the urban land use (Figure 4).In addition, deforestation of 2076.3 ha of the Hyrcanian forests is another reason for urban sprawl in the province.Conversion of 12,637.46ha of forest coverage and 6744 ha of agricultural lands into the barren land could also be a clear indicator for extensive land degradation in the study area.

LULC Change Detection
After checking the accuracy of the land use maps, they were cross-tabulated to quantify the variations in different land use classes over time.The results tabulated in Table 2 in the three periods of 1989-2001, 2001-2013 and 1989-2013.As observed in this Table, changes in built-up areas follow an upward trend over the study period from 1989 to 2013.The expansion is mainly due to conversion of 8644.41 ha of barren lands and 12,623.01ha of agricultural lands to the urban land use (Figure 4).In addition, deforestation of 2076.3 ha of the Hyrcanian forests is another reason for urban sprawl in the province.Conversion of 12,637.46ha of forest coverage and 6744 ha of agricultural lands into the barren land could also be a clear indicator for extensive land degradation in the study area.

CA-Markov Projection Model
The transition area matrix is a table which records the amount of pixels anticipated to change from one particular land use category to another.The transitional probability matrix of the CA-Markov model in Table 3 reveals the probability of changing from one land use to another making it possible to predict changes in future different land use types in a quantitative sense.Before running the forecast model for the target years of 2025 and 2037, the model validity was pre-tested for 2013 based on the changes over the period of 1989-2001, the probability of the land use changes predicted by the year 2013.According to the obtained results, the greatest land use changes could be observed in the conversion of barren lands into urban areas (probability rate = 0.1367) followed by agricultural lands (probability rate = 0.1147).This was verified by the results of the change detection analysis.
As mentioned earlier, the area of barren lands increased to 52,023 ha between 1989 and 2013, confirming the fact that barren lands were subject to the greatest changes within the given period.By comparing the simulated and real land use changes for the year 2013, it can be deduced that this developed model is capable of accurately estimating future land use changes.
The obtained Kappa coefficient of 0.89 indicates a good fit between the actual and the simulated maps.This fit indicates the accuracy of the simulation.Therefore, it was run to simulate changes by the years 2025 and 2037.According to the probability matrix, the probability of the barren lands to be converted into urban areas by 2025 is very high.The probability rate is estimated at 0.1235.
The agricultural and forest areas with probability rates of 0.0521 and 0.0431, respectively, are two further land uses with a high potential to change into urban areas by 2025.As expected, the conversion of water bodies into built-up areas is almost impossible.The probability rate is zero, simple terms; water bodies are the only land use that is likely to remain unchanged in the known future.The remarkable point is the high probability for conversion of forests to barren lands (probability rate is 0.1289).This indicates accelerated land degradation in the study area, which is apparent in the form of loss of forest areas possibly due to mismanagement of the Hyrcanian Mixed Forests.Likewise, the agricultural and barren lands are highly subject to be converted to urban areas with probability rates of 0.0521 and 0.1235 respectively.This clearly clarifies the need for reform in the current agricultural practices.The transitional probability matrix for the target year 2037 predicts the same trend of conversion to different land use.Among the various land use classes, barren lands account for the greatest conversion into the urban areas (0.1449) followed by agricultural lands (0.1323).Furthermore, forest areas are highly prone to change to barren lands (0.1023).Further details are tabulated in Table 3.The findings in Table 4 reveal that the built-up areas will be expanded up to 71,263 ha (5.08%) by the end of the first target year 2025.The urban land use increasing trend in urban development will continue up to 78,073 ha (5.56%) until the second target year 2037.The greatest change is expected to occur in the forest coverage area so that 17,951 ha of the Hyrcanian forests will have disappeared by 2025 and 25,101 ha by 2037.Changes in the area of agricultural land will also be significant, that is an increase of 16,509 ha (1.18%) by 2025 and 28,461 ha (2.03%) by 2037.It will also be a likely loss of 9931 ha (0.7%) and 21,774 ha (1.55%) of the barren lands by 2025 and 2037, respectively.The future expansion of urban areas by the target years 2025 and 2037 are shown Figure 5.As observed, urban development would go through its gradual evolution primarily on the periphery of the current population centers, with an orientation towards the surrounding agricultural lands and forested areas.Most of the established built-up areas will be observed with a central concentration surrounded with scattered networks of low-density areas.

CA-Logistic Regression Model
The logistic regression model is developed by 12 independent variables determined by literature reviews and expert opinion (Figure 6).This model is implemented separately for each of these variables as the input data for the period 1989 to 2013.The dependent variable in this implementation is the developed cells, that is, changes from no built-up areas to built-up areas which is prepared as a binary raster map where the value 1 indicates the changed and zero the unchanged areas during the study period.Figure 6 represents the pattern of the dependent variable.The output of the regression model determines the positive or negative role of any variable on the probability of any change regarding land use to be converted to urban.The influence of the independent variables in urban growth development throughout the study area in the 1989 to and 2013 period is presented in Table 5.According to this table, agricultural and forest lands, with the coefficient values of 17.901376 and 8.363241 respectively, have the highest potential to be converted into urban areas.
The variables of distance to nearest cities (with a coefficient of −0.000498), northing coordinates (with a coefficient of −0.000835), easting coordinates (with a coefficient of −0.000068) and slope (with a coefficient of −0.000319) have negative impact on development of cities in the province.Here, it is assumed that the extent and direction of urban development are dependent on the Northing and Easting geographical coordinates.Since the coefficients of the northern and eastern aspects in LR model are negative, there exists the possibility that land use in the southern and western sections will be converted into built-up areas.All in all, the intensity of urban development in the south is more than the west and this fact is supported by the higher coefficient value of the north direction in relation to that of the east (−0.000068).The area map subject to urban sprawl of the study area is based on the logistic regression model (Figure 7), where the faded areas indicate higher probabilities for urban expansion.

CA-Logistic Regression Model
The logistic regression model is developed by 12 independent variables determined by literature reviews and expert opinion (Figure 6).This model is implemented separately for each of these variables as the input data for the period 1989 to 2013.The dependent variable in this implementation is the developed cells, that is, changes from no built-up areas to built-up areas which is prepared as a binary raster map where the value 1 indicates the changed and zero the unchanged areas during the study period.Figure 6 represents the pattern of the dependent variable.The output of the regression model determines the positive or negative role of any variable on the probability of any change regarding land use to be converted to urban.The influence of the independent variables in urban growth development throughout the study area in the 1989 to and 2013 period is presented in Table 5.According to this table, agricultural and forest lands, with the coefficient values of 17.901376 and 8.363241 respectively, have the highest potential to be converted into urban areas.
The variables of distance to nearest cities (with a coefficient of −0.000498), northing coordinates (with a coefficient of −0.000835), easting coordinates (with a coefficient of −0.000068) and slope (with a coefficient of −0.000319) have negative impact on development of cities in the province.Here, it is assumed that the extent and direction of urban development are dependent on the Northing and Easting geographical coordinates.Since the coefficients of the northern and eastern aspects in LR model are negative, there exists the possibility that land use in the southern and western sections will be converted into built-up areas.All in all, the intensity of urban development in the south is more than the west and this fact is supported by the higher coefficient value of the north direction in relation to that of the east (−0.000068).The area map subject to urban sprawl of the study area is based on the logistic regression model (Figure 7), where the faded areas indicate higher probabilities for urban expansion.The expanding pattern for the future is distinct and clear in this map, indicating that areas near major roads and existing urban clusters have the highest potential in becoming converted into urban areas in the future.The sensitivity analysis is run through the ROC method in order to identify the factors with the greatest impact on growth and expansion of the built-up areas.To do this, the LR model was run 12 times in a deductive manner, that is, the next run contains −1 independent variable.Comparing the ROC values of the whole model with the individual ROCs obtained in each run, the effect of each of the independent variables is determined in Table 6, where ROC value of 1 shows an excellent spatial adaption between the simulated and actual urban development maps and ROC value of 0.5 indicates the existence of a chance of agreement between the projected probability and actual map assigned to random locations.The best independent variables are determined by the ROC values and the adjusted odds' ratios for each set of the variables.The validity and fitness of the selected collection is confirmed through the adjusted odds' ratios.
As observed in Table 6, the ROC of the LR model was run by excluding the variable of distance from residential areas is 0.7351.It can deduced that distance to the nearest cities will have the greatest impact on future urban development, followed by distance to agricultural lands, slope and distance to main roads.The expanding pattern for the future is distinct and clear in this map, indicating that areas near major roads and existing urban clusters have the highest potential in becoming converted into urban areas in the future.The sensitivity analysis is run through the ROC method in order to identify the factors with the greatest impact on growth and expansion of the built-up areas.To do this, the LR model was run 12 times in a deductive manner, that is, the next run contains −1 independent variable.Comparing the ROC values of the whole model with the individual ROCs obtained in each run, the effect of each of the independent variables is determined in Table 6, where ROC value of 1 shows an excellent spatial adaption between the simulated and actual urban development maps and ROC value of 0.5 indicates the existence of a chance of agreement between the projected probability and actual map assigned to random locations.The best independent variables are determined by the ROC values and the adjusted odds' ratios for each set of the variables.The validity and fitness of the selected collection is confirmed through the adjusted odds' ratios.
As observed in Table 6, the ROC of the LR model was run by excluding the variable of distance from residential areas is 0.7351.It can deduced that distance to the nearest cities will have the greatest impact on future urban development, followed by distance to agricultural lands, slope and distance to main roads.As observed in Figure 7, the pixels closer to the major roads have a higher potential to become converted into urban areas are of low probability of change due to the cells with a high elevation (DEM).Moreover, the contribution of other input data such as distance to industrial centers, distance to educational centers and barren/open lands is assessed as being negligible due to their low ROC values, (Table 6).In general, the obtained results on the capability of the Markov chain model and cellular automata for simulation of future changes in urban land are in accordance with those of Alsharif and Pradhan [27] and Deep and Saklani [38].

Conclusions
Understanding the dynamics of urban expansion would assist managers and urban planners to make rational decisions when the control of urban sprawl and its consequences for the future are of concern.Simulation and forecast of urban expansion is a must for urban planners and land conservationists in formulating sustainable development strategy(s).
In this assessment, the LR model is combined with MC and CA models in order to develop an efficient hybrid geospatial explicit approach.These three techniques are combined: (1) to develop a probability surface and to determine the most probable sites through the regression model; (2) to retrieve the quantity of change due to an inconsistent urban development pattern; (3) to apply the CA model as a significant tool to allocate probable changes under predefined conditional rules.In this technique, integrating environmental and socio-economic factors are of concern; moreover this integrated model is able to accept any spatial factor for measuring its impact on urban sprawl.It should be noted that this technique was tested and verified during its development, the calibration process and when compared to real land use of 2013.The logistic regression model explores the correlation between land conversion and the causative factors in a quantitative sense, which enables the researchers to distinguish the essential driving forces.The customary LR models in spatial studies have the restrictions of their own like: mere focus on natural factors and disregarding the spatial-interacted impact on key industries; accordingly, the integrated approach is introduced and implemented to compensate the existing constraints.The MC model is a stochastic progression that describes the probability of conversion of one state to another.However, the CA model allocated the amount of change, beginning with the cells of highest probability.Temporal land use and socio-economic and geophysical data were acquired and processed using a GIS spatial analysis tool.How the unsustainable urban development delivered its adverse effect on the natural landscape, at the cost of major loss of the Hyrcanian forests and rangelands, is revealed in this article.It is predicted that this deteriorating trend will continue into the future.With respect to the efficiency classes and their quantitative aspect and the related most effective drivers discussed in result and discussion, the following results have been obtained: The most destruction has occurred in the barren/open lands and forest coverage respectively, where the relevant drivers with the most affect, are involved, for example, the agricultural lands, distance to CBD and major roads out of 12.According to close estimates by the CA projections, the built-up areas will be developed to 71,263 ha by 2025 and 78,073 ha by 2037.According to the estimates, the urbanization in Gilan Province follows an upward trend, that is, an expansion of about 5.65% over the period between 1989 and 2037.Moreover, the results of land use change detection demonstrated a continuous decrease in barren/open lands and forest lands and an increase in agricultural lands in the study area for the same period.The situated land use maps of 2025 and 2037 reveal an alarming descending acceleration in the non-urban areas.Based on the predicted results, the expansion of the urban areas would follow a scattered pattern and indicate an extensive loss of forest and natural rangelands, which would in turn become an obstacle for sustainable development.This fact justifies the need for adopting comprehensive planning approaches in order to develop an up-dated urban sprawl management system to control expansion of the cities in Gilan Province.

Figure 1 .
Figure 1.Situation of Gilan Province in Iran.

Figure 1 .
Figure 1.Situation of Gilan Province in Iran.

Figure 2 .
Figure 2. Schematic presentation of the logistic and CA-Markov methods.

2. 3 .
Land Use/Land Cover (LULC): Change Detection To begin with, the changes in different land use are detected in order to quantify variations in the city boundaries over the years 1989-2013.The land use maps are extracted from Landsat TM, ETM+, and OLI satellite images within two 12-year time intervals of 1989, 2001 and 2013.The coordinate system of the images is set as Universal Transverse Mercator (UTM), at Zone 39 and North WGS 84.The images are geo-referenced by Nearest Neighborhood algorithm, where using a total of 220 training points are well-distributed throughout the study area.

Figure 2 .
Figure 2. Schematic presentation of the logistic and CA-Markov methods.

2. 3 .
Land Use/Land Cover (LULC): Change Detection To begin with, the changes in different land use are detected in order to quantify variations in the city boundaries over the years 1989-2013.The land use maps are extracted from Landsat TM, ETM+, and OLI satellite images within two 12-year time intervals of 1989, 2001 and 2013.The coordinate system of the images is set as Universal Transverse Mercator (UTM), at Zone 39 and North WGS 84.The images are geo-referenced by Nearest Neighborhood algorithm, where using a total of 220 training points are well-distributed throughout the study area.
Five major land uses are determined in the study area as built-up areas, agricultural lands, water bodies, forest coverage and open land (rangeland and bare land).The overall accuracy of the generated maps is estimated through the error matrices.The Kappa coefficients of 89.23%, 90.84%, and 91.08% confirm the accuracy of the classified maps of each one of the periods, respectively.The LULC maps of the study area in different periods under study are shown in Figure 3. Land use map of the province in 2013 shows that the built-up area are spread mainly in the central parts, over an area of 59,754.8ha.The built-up areas are surrounded by fertile agricultural lands, mostly paddy fields covering 318,887 ha and a strip of Hyrcanian forest in the south, in an area of 751,242 ha.The position of the built-up areas reveals the potential loss of fertile agricultural lands and the unique Caspian Hyrcanian forest ecosystems in its vicinity.In addition, barren/open lands making 266,963 ha as a relatively wide strip in the south is narrowing down as the scattered spots in the West and Northwest increase.Sustainability 2016, 8, 810 8 of 17

Figure 4 .
Figure 4. Changes in urban land use in 1989, 2001 and 2013 periods.

Figure 4 .
Figure 4. Changes in urban land use in 1989, 2001 and 2013 periods.

Figure 5 .
Figure 5. Simulated urban expansion based on CA-Markov model by the target years 2025 and 2037.

Figure 5 .
Figure 5. Simulated urban expansion based on CA-Markov model by the target years 2025 and 2037.

Figure 6 .
Figure 6.Raster layers of independent variables as the continuous and binary values.

Figure 6 .
Figure 6.Raster layers of independent variables as the continuous and binary values.

Figure 7 .
Figure 7. Probable areas of urban sprawl in Gilan Province based on logistic regression model.

Figure 7 .
Figure 7. Probable areas of urban sprawl in Gilan Province based on logistic regression model.
• 34 N -38 • 27 N and the longitudes of 48 • 53 E -50 • 34 E covering Eastern third of the Caspian Sea coast with MSL of −27 m on the plain to 3897 m on the mountain top (Figure

Table 1 .
Temporal Land Use Areas in Gilan Province.
Simulation of Future Land Use Change by CA Models based on cellular automata (CA) are in extensive use in simulating urban expansion

Table 2 .
Change Detection Results Based on Cross Tabulation of Gilan Province in Two 12 year Periods.

Table 4 .
Predicted Land Use Changes in Gilan Province, available: 2013 and predicted: 2025 and 2037.

Table 5 .
Influence of the Independent Variables in Urban Development in the 1989 to 2013 period.

Table 5 .
Influence of the Independent Variables in Urban Development in the 1989 to 2013 period.

Table 6 .
Estimated Statistical Values for the Logistic Regression Model Consisting of 12 Independent Variables.