Application of an Improved ABC Algorithm in Urban Land Use Prediction

Scientifically and rationally analyzing the characteristics of land use evolution and exploring future trends in land use changes can provide the scientific reference basis for the rational development and utilization of regional land resources and sustainable economic development. In this paper, an improved hybrid artificial bee colony (ABC) algorithm based on the mutation of inferior solutions (MHABC) is introduced to combine with the cellular automata (CA) model to implement a new CA rule mining algorithm (MHABC-CA). To verify the capabilities of this algorithm, remote sensing data of three stages, 2005, 2010, and 2015, are adopted to dynamically simulate urban development of Dengzhou city in Henan province, China, using the MHABC-CA algorithm. The comprehensive validation and analysis of the simulation results are performed by two aspects of comparison, the visual features of urban land use types and the quantification analysis of simulation accuracy. Compared with a cellular automata model based on a particle swarm optimization (PSO-CA) algorithm, the experimental results demonstrate the effectiveness of the MHABC-CA algorithm in the prediction field of urban land use changes.


Introduction
One of the most significant manifestations of human social and economic development is the change in the type of land use.Thus, the type of changes of land use have always been an important research field for experts and scholars [1,2].Land use changes not only have complex effects on land use patterns, but can also directly lead to changes in the ecological environment and generate complex social problems.Therefore, scientific studies on the development of land-use types and prediction of changes in land-use types can provide important help and support for the rational use of urban land resources, for the protection of the ecological environment, and for sustainable development [3][4][5].Studies in this area include cellular automata (CA) models [6], Markov models [7], CA-Markov models [8], conversion of land use and its effects (CLUE) framework [9], and so on.Although many experts and scholars have done a lot of research works in the field of land use, there are still problems, such as the factors biased toward the ecological field, directly taking the conditional probability of images as the conversion rule, the lack of consideration of social factors, and so on.
The swarm intelligence optimization algorithm is a kind of stochastic optimization algorithm [10] that can solve the problems that the traditional optimization technology cannot solve, so it is favored by many experts and scholars.In the swarm intelligence optimization algorithms, the artificial bee colony (ABC) algorithm is a popular algorithm that mainly simulates the behavior mechanism of bees swarming in nature to obtain the optimal solution for the problem [11].Because the ABC algorithm has the characteristics of less parameter settings, simple calculation, and good parallelism, it has a good optimization effect when dealing with optimization problems [12,13].However, it also has certain problems, such as premature convergence, slow convergence, poor local search ability, low accuracy, insufficient theoretical analysis, and so on [14].Therefore, there is still much room for improving the efficiency and accuracy of the ABC algorithm in the optimization processes.
To take advantage of the intelligent searching ability of the ABC algorithm, this paper introduces an improved hybrid ABC algorithm based on the mutation of inferior solutions (MHABC), and combines it with the cellular automata (CA) model to establish a new CA rule mining algorithm (MHABC-CA).Then, the urban land use change in Dengzhou city of Henan province, China, is taken as a case study to verify the effectiveness of MHABC-CA model.In summary, prediction of the changes of urban land use types based on the MHABC-CA model can provide decision support for the rational use planning of urban land, and has certain practical significance.

ABC Algorithm
Turkish scholar Karaboga proposed the basic model of the ABC algorithm [11] in 2005, and carried out extensive applications on function optimization problems, and proved its superiority in numerical optimization.Since then, the basic ABC algorithm has been widely used by experts and scholars in numerical optimization of unconstrained functions and constrained functions, neural network training, digital filter design, and so on [15].The effectiveness of ABC algorithms in solving optimization problems has been fully demonstrated in these studies.At present, the main purposes of studies on the ABC algorithm are to overcome the deficiencies and improve the performance of algorithm, and apply it in various fields to solve the practical problems.The studies on the ABC algorithm are mainly carried out in four aspects: firstly, to study the characteristics of the ABC algorithm and improve its performance; secondly, to combine the ABC algorithm with other intelligent algorithms to achieve the improvements; thirdly, to study the parallelization of the algorithm; fourthly, to research the theoretical proof of the stability and convergence of the ABC algorithm.
Because of the advantages of ABC algorithm, it has been applied in various research fields.For example, Halime et al. adopted the ABC algorithm to train the weights of a neural network, and monitor and predict the sludge volume index using image analysis and the trained neural network at a full-scale activated sludge plant [16].Santhi et al. introduced the crossover operation into the ABC algorithm for more efficient job scheduling [17].Also, the global artificial bee colony (GABC) algorithm enhanced by the global optimal solution is proposed by Zhu et al. [18].These studies led to the development of the ABC algorithm, and also promoted the application of the algorithm to other fields.However, deep research into the abilities of exploration and exploitation is still needed to improve the performance of the ABC algorithm [19].

Land Use Prediction
At present, the prediction models of land use change mainly include driving force models, Cellular Automata models, Markov models, CA-Markov models, CLUE models, and so on.Many scholars have conducted a lot of in-depth and meticulous studies in this field.For instance, Niu et al. proposed an activity-based land use/transport interaction (LUTI) model to predict the impact of land use policies for urban activities [20].Valerio Amici et al. used the MaxEnt algorithm in studies of land cover classification and land use change [21].Zhao et al. extended the LandSys I model to introduce artificial neural networks (ANNs) into the framework of CA, multi-agents, and geographic information systems (GIS) to predict land use changes [22].Huang et al. used a Markov model to simulate land use prediction based on the Markov model [23].Anputhas et al. used the CLUE-S model to predict land-use conversions and their impact on small areas [24].
The studies of land use change in China are mostly based on CA-Markov models, neural networks, GIS, and logistic models.For example, Li et al. used the CA-Markov model to predict land use changes in Harbin [25].Huang et al. used the CA-Markov model to study land use changes in Qingjiang [26].Zhang et al. simulated and predicted the land use evolution based on MCE-CA-Markov in the Three Gorges reservoir area [27].Doumetana et al. analyzed land use changes in the Cullen County based on GIS and CA-Markov model [28].The dynamic changes of land use in the Balikun Lake Basin were analyzed and forecasted based on the CA-Markov model by Lune et al. [29].Wang et al. presented a genetic Back Propagation (BP) neural network model to predict land use changes [30].Zhang et al. carried out the predictive analysis of land use changes in Ganzhou District based on CA-Markov model [31].Yu et al. studied and applied land use structural changes based on the logistic-Markov method [32].Many experts and scholars have done a lot of research works in the field of land use.However, in most of these studies, conditional probability or a single ecological factor is used as the rules for the conversion of cellular automata, which results in a conversion rule that is too simple, and that lacks consideration for the impact of socio-economic factors on land use changes.

Standard ABC Algorithm
The standard ABC algorithm classifies artificial bee swarm into three categories by simulating the honey collecting mechanism of actual bees: employed bees, onlookers, and scouts.Each position of the food source represents a possible solution for the optimized problem, and the quality or fitness of the solution is represented as the nectar amount.The goal of the entire colony is to find the food source that has the maximum nectar.In the standard ABC algorithm, the employed bees use the previous nectar source information to find a new nectar source and share the information with the onlookers; onlookers wait in the hive and find new honey sources based on the information shared by the employed bees; the task of the scout is to update the stagnant food source, and find a new and valuable source of nectar.Through collaboration of these three categories of bees, the ABC algorithm can gradually converge to an optimal or near-optimal solution.Details of the ABC algorithm can be found in the literature [33].

Mutation Method of Inferior Solutions
In the optimization process, if an employed bee cannot be improved in the specific evolution times (Limit value), it will be abandoned.A mutation mechanism is adopted to direct the inferior solution to perform the mutation operation in the worse position [34].This mutation method can increase the breadth of search, which can improve the search ability of the algorithm.The specific implementation is described as follows: in the onlooker stage, calculate the follow probability P i of onlooker, and generate a random number rand to compare with the P i.If rand < P i , then perform the exploitation operation of the onlooker.For the inferior solution that does not meet the above condition, follow Equation (1) to perform the mutation operation.
where v ij denotes the j-th element of V i , and j is a random index; M is the mutation coefficient, which is in the range of [−1, 10]; X global is the global optimal value so far and it is also the j-th optimal value.

Binary Crossover Operation
The binary crossover operation is to take the crossover probability cr in the range of [0, 1], and generate a random number rand in the range from 0 to 1.If rand ≤ cr, let the employed bee perform the exploitation operation and crossover operation with the global optimal value.Otherwise, the crossover operation will not be performed.The binary crossover operation is shown in Equation (2), which can improve the exploitation ability of the algorithm and the abilities of exploration and exploitation can be adjusted or balanced through the adjustment of cr.A greater cr means the stronger exploitation ability of the algorithm, and vice versa, the stronger exploration ability [33].
Then, the two mechanisms of the mutation method of inferior solutions and the binary crossover operation were integrated into the ABC algorithm to construct the MHABC algorithm.Through the assistance of the two strategies described above, the MHABC algorithm can improve the breadth search and the depth search in the optimization process [34].

Basic Model of CA
CA is a grid dynamics model with discrete states, discrete spaces, and discrete times.The CA uses a series of transformation rules constructed by model, which can be used to drive globally complex behaviors and realize the simulation of complex geospatial processes.From a formal point of view, the CA can be described as a four-tuple [35].The specific description can be expressed in Equation (3).
where V represents the cell space, and the geographical space is generally divided by rules, each cell after division represents a cell; d represents the dimension of the cell space; S represents the collection of cell states, which is a listable, countable, finite set of discrete values, and each element in the set represents a cell state; Ω represents the neighbors of the cell, apparently Ω ⊂ V; and R represents the conversion rule or rules set, which determines transition of the cell state.The transition of cell state can be expressed as the mapping f that the cell state at time t map to the state at time t + 1 under the rule, that is, f = S t R → S t+1 .The basic model of CA consists of four components: Cell, Cell State, Neighbor, and Rules.In other words, the CA is composed of the cell space and the transformation rules defined on it.In essence, it is a method that can change the unordered, irregular, and unbalanced state to the orderly, regular, and balanced state [35,36].
Cells are the most basic components of CA.When using the CA to solve the problem, the research object is usually abstracted as a combination of a series of units, and one unit is a cell.Different cells have no difference besides different properties, that is, the position of each cell in the CA model is equal.The state change of cell is related to the surrounding neighbors, and the change rule is universal.For example, when studying the urban development, the entire urban space is divided into regular grids with the same shape and size, and then each grid point is a cell.
In any CA model, at a specific time, each cell has a uniquely determined Cell State (State).The set of all cell states is a finite, countable discrete set.The state of the cell can be expressed in binary form, such as (0, 1), (birth, death), (black, white), or can take the values within a finite set of integers S. For example, in the CA model of urban land use, the set of cell states can be set to {0, 1}, where 0 indicates that the current cell state is not a city, and 1 indicates that the current cell state is a city.
Neighbor is an important attribute of the CA model that distinguishes it from other models.The state of each cell in the CA model is usually determined by its neighbors.The neighbors of a cell can be understood as a set of cells within the cell space that is less than a certain distance threshold from the cell, or it can be a set of cells that satisfy other conditions.For example, in the CA model of urban land use, the neighbors of a cell refers to an analysis window whose distance to the cell is less than a certain threshold, such as 3 × 3, 5 × 5, and so on.
The conversion rule (Rule) refers to the definitive description of the mapping relationship between the current state of the cell and its neighbors' attributes and the cell state of next time, or the conversion rule refers to a functional relationship.The formal expression can be expressed in Equation (4).
where S t+1 i represents the state of cell i at time t + 1; S t i represents the state of cell i at time t; N t i represents the neighbors' attributes of the cell i at time t; and f represents the conversion rule.

Construction of Interval Model of CA
The schematic diagram of the interval model based on CA is shown in Figure 1.The optimized transition rules can be discovered by the CA model, and a transition rule is constructed by a cell state value and attributes' intervals, each of which is defined by a lower and an upper threshold value for each attribute.The specific construction steps of the CA interval model are described as follows: (1) Select a value interval (continuous attribute value) or a single attribute node (discrete attribute value) with definite upper and lower bounds on each attribute of the cell.
(2) Logically join and connect a set of selected intervals and nodes, and generate a set of conditions.
(3) Generate a cell state value for a set of conditions combination, and form a complete rule by combining the conditions and the cell state value.
Information 2018, 9, x 5 of 22 The conversion rule (Rule) refers to the definitive description of the mapping relationship between the current state of the cell and its neighbors' attributes and the cell state of next time, or the conversion rule refers to a functional relationship.The formal expression can be expressed in Equation (4).
where   +1 represents the state of cell i at time t + 1;    represents the state of cell i at time t;    represents the neighbors' attributes of the cell i at time t; and f represents the conversion rule.

Construction of Interval Model of CA
The schematic diagram of the interval model based on CA is shown in Figure 1.The optimized transition rules can be discovered by the CA model, and a transition rule is constructed by a cell state value and attributes' intervals, each of which is defined by a lower and an upper threshold value for each attribute.The specific construction steps of the CA interval model are described as follows: (1) Select a value interval (continuous attribute value) or a single attribute node (discrete attribute value) with definite upper and lower bounds on each attribute of the cell.
(2) Logically join and connect a set of selected intervals and nodes, and generate a set of conditions.
(3) Generate a cell state value for a set of conditions combination, and form a complete rule by combining the conditions and the cell state value.

Normalization Processing
Because the ranges and types of attribute values of different geographic features are not the same, if mining the transformation rules using the model on the original attributes' values of the geographic features, it will result in the wrong rules or the model will not work.Therefore, it is necessary to normalize the values of feature attributes.The specific normalized formula is shown in Equation ( 5) [35].
where   ,   are the original value of feature attribute and the new value after normalization, respectively; and   ,   are the minimum value and maximum value of interval of feature attribute, respectively.In addition, the normalization equation normalizes the value interval of the attribute to [0, 1] for easy comparison and calculation.As shown in Figure 2, after the normalization of the values of discrete and continuous attributes by Equation ( 5), the model can be used to process all attribute values or variables in the same manner.

Normalization Processing
Because the ranges and types of attribute values of different geographic features are not the same, if mining the transformation rules using the model on the original attributes' values of the geographic features, it will result in the wrong rules or the model will not work.Therefore, it is necessary to normalize the values of feature attributes.The specific normalized formula is shown in Equation ( 5) [35].
where X old , X new are the original value of feature attribute and the new value after normalization, respectively; and X min , X max are the minimum value and maximum value of interval of feature attribute, respectively.In addition, the normalization equation normalizes the value interval of the attribute to [0, 1] for easy comparison and calculation.As shown in Figure 2, after the normalization of the values of discrete and continuous attributes by Equation ( 5), the model can be used to process all attribute values or variables in the same manner.

Conversion of Interval Model based on CA
After the processing of the CA model, a set of the transition rules for the problem can be mined.To facilitate understanding and processing, the transition rule that is mined by the CA model could be represented by the following reasoning [37].where n is the number of attributes; the Lower_k and Upper_k are the best lower and the best upper threshold value of the interval on the k-th attribute, respectively,  ∈ {1,2, … , }; Ci refers to a cell status value  ∈ {1,2, … , }; and m is the number of cell state.Thus, the CA model can be divided into two parts: attribute conditions and cell state.The rule optimization by the optimization algorithm is only conducted on the attribute conditions.To process the model in the computer program, the attribute conditions is expressed as a one-dimensional vector, that is, cell feature vector that is described in Equation (6).

IF
where n is the number of attributes; and  2−1 ,  2 are the best lower threshold value and the best upper threshold value of the interval on the kth attribute, respectively.Then, the process of mining conversion rules turns to the process of optimizing the corresponding cell feature vector X for a specific dominant cell state.

CA Rule Mining Algorithm based on MHABC Algorithm
In this section, the MHABC algorithm is introduced into the CA model to construct an improved rule mining algorithm MHABC-CA, which can discover optimized transition rules by simulating the behaviors of a swarm of bees.As we mentioned above, a transition rule is constructed by a cell state value and its intervals defined by a lower and an upper threshold value for each attribute.According to the CA transformation model, the process of mining transformation rules is a process of optimizing the corresponding cell feature vector X for a specific advantage cell state.Therefore, the MHABC algorithm can help the CA model find the best lower and upper threshold values of the intervals in each attribute, and mine a set of transition rules.Therefore, the

Conversion of Interval Model based on CA
After the processing of the CA model, a set of the transition rules for the problem can be mined.To facilitate understanding and processing, the transition rule that is mined by the CA model could be represented by the following reasoning [37].where n is the number of attributes; the Lower_k and Upper_k are the best lower and the best upper threshold value of the interval on the k-th attribute, respectively, k ∈ {1, 2, . . . ,n}; C i refers to a cell status value i ∈ {1, 2, . . . ,m}; and m is the number of cell state.

IF
Thus, the CA model can be divided into two parts: attribute conditions and cell state.The rule optimization by the optimization algorithm is only conducted on the attribute conditions.To process the model in the computer program, the attribute conditions is expressed as a one-dimensional vector, that is, cell feature vector that is described in Equation ( 6).
where n is the number of attributes; and X 2k−1 , X 2k are the best lower threshold value and the best upper threshold value of the interval on the kth attribute, respectively.Then, the process of mining conversion rules turns to the process of optimizing the corresponding cell feature vector X for a specific dominant cell state.

CA Rule Mining Algorithm based on MHABC Algorithm
In this section, the MHABC algorithm is introduced into the CA model to construct an improved rule mining algorithm MHABC-CA, which can discover optimized transition rules by simulating the behaviors of a swarm of bees.As we mentioned above, a transition rule is constructed by a cell state value and its intervals defined by a lower and an upper threshold value for each attribute.According to the CA transformation model, the process of mining transformation rules is a process of optimizing the corresponding cell feature vector X for a specific advantage cell state.Therefore, the MHABC algorithm can help the CA model find the best lower and upper threshold values of the intervals in each attribute, and mine a set of transition rules.Therefore, the data mining for a transition rule can be regarded as a process of searching for optimal food sources in the MHABC-CA model.
The main steps of MHABC-CA model can be drawn as follows: firstly, it obtains and preprocess the remote sensing images and land-use type data, and converts them into cellular feature vectors by the CA model.The feature vector corresponding to the current dominant cell is calculated.Secondly, the optimization process invokes the improved MHABC algorithm to optimize the feature vector and obtain a conversion rule.Finally, pruning is done on the complete conversion rule to improve its quality.The current dominant cell denotes the optimization solution, which is optimized by the MHABC optimization algorithm, and is chosen according to the fitness function values of the solutions.
The flow chart of MHABC-CA algorithm is shown in Figure 3.The algorithm is mainly divided into three parts.The first part is data processing, which is mainly to preprocess the remote sensing data and to normalize data of different types.The second part is the main algorithm, which is to calculate the states of the dominant cells, and to call the optimization algorithm to operate and perform the rule pruning.The third part is the optimization algorithm, which is to optimize the feature vector X corresponding to a dominant cell based on the MHABC algorithm.
Information 2018, 9, x 7 of 22 data mining for a transition rule can be regarded as a process of searching for optimal food sources in the MHABC-CA model.
The main steps of MHABC-CA model can be drawn as follows: firstly, it obtains and preprocess the remote sensing images and land-use type data, and converts them into cellular feature vectors by the CA model.The feature vector corresponding to the current dominant cell is calculated.Secondly, the optimization process invokes the improved MHABC algorithm to optimize the feature vector and obtain a conversion rule.Finally, pruning is done on the complete conversion rule to improve its quality.The current dominant cell denotes the optimization solution, which is optimized by the MHABC optimization algorithm, and is chosen according to the fitness function values of the solutions.
The flow chart of MHABC-CA algorithm is shown in Figure 3.The algorithm is mainly divided into three parts.The first part is data processing, which is mainly to preprocess the remote sensing data and to normalize data of different types.The second part is the main algorithm, which is to calculate the states of the dominant cells, and to call the optimization algorithm to operate and perform the rule pruning.The third part is the optimization algorithm, which is to optimize the feature vector X corresponding to a dominant cell based on the MHABC algorithm.To describe the algorithm more clearly, the pseudocode of MHABC-CA is given in Algorithm 1.

Data Processing of MHABC-CA Algorithm
For the CA model in the application of geography, because of the complexity and irregularity of the geographical area, the study area needs to be divided into grids.This paper samples the high-resolution remote sensing images and urban land use type maps of Dengzhou city, and extracts the feature values of attributes and the corresponding cells' values.However, the inconsistent data types of the sampling data leads to the need for normalization processing of data to ensure the smooth progress of mining.

Optimization Process of MHABC-CA Algorithm
The optimization process of the MHABC-CA algorithm is to optimize the feature vector X corresponding to a dominant cell state based on the MHABC optimization algorithm.Therefore, in the algorithm processing, optimization is the core step of the algorithm.The special symbols involved in the optimization algorithm are briefly described in Table 1.
Table 1.The specific meaning of mathematical symbols in the optimization algorithm.

SN
The number of food sources The number of Employed bees SN o The number of onlookers n Feature dimension of Cell X Bee populations of Employed bees, X = (X 1 , X 2 , . . . ,X SN e ) In the optimization process of the algorithm, the feature vector X is analogous to the position of nectar source in the ABC algorithm.The optimization process of the feature vector is analogous to the honey-collecting process in the ABC algorithm.Therefore, the optimization process of the feature vector of attributes is as follows.
The algorithm calls the operation of the optimization algorithm.The SN feature vectors are generated by the random number function, and the dimension is set to 2*n.Simultaneously, the feature vector of each cell is initialized.The specific generation method is described in Equation (7).
where v 2k−1 i , v 2k i are the lower bound value and the upper bound of the interval on the k-th feature of the i-th cell feature vector, respectively; Rand(0, 1) is a random number function that generates a random number between 0 and 1.When i , exchange the values of both.After the feature vectors are initialized, the quality of the cell feature vectors, that is, the quality of the conversion rules generated by the current cell states, is calculated.The initial positions of nectar of the employed bees are the feature vectors that ranked in the top n e .
The Gini index [38] is taken as the quality evaluation method of a transition rule in this paper.Thus, instead of measuring the nectar amount in original ABC algorithm, the fitness function is used for the classification, which is defined in Equation (8).
where TP (true positives) is the number of records covered by the rule that have the class predicted by the rule; FN (false negatives) is the number of records not covered by the rule, but they have the class predicted by the rule; FP (false positives) is the number of records covered by the rule, but their class is not predicted by the rule; TN (True negatives) is the number of records not covered by the rule and that do not have the class predicted by the rule.A higher value of fitness indicates higher quality of the transition rule.
(2) Employed bees stage: Generate the crossover probability cr in the range of [0, 1], and generate a random number rand in the range from 0 to 1.If rand ≤ cr, the current solution performs the crossover operation with the feature vector X corresponding to the dominant cell state according to Equation (9).Otherwise, the solutions are updated according to the search formula in the basic ABC algorithm.
where new_v 2k−1 i , new_v 2k i are the lower bound value and the upper bound of the interval on the new honey position searched for in the k-th feature near the i-th honey source position (i.e., cell feature vector), respectively; β is a random number between [0, 1]; X global is the global optimal value so far and also is the optimal value of the j-th iteration; and j is the iteration number of the optimization algorithm.
After completing the solutions updating, if i , then exchange their values.(3) Onlookers stage.After completion of the optimization process of employed bees, the following probabilities P of Onlookers are calculated based on the profitability of the employed bees' positions according to Equation (10).
where fitness is the profitability function of the nectar position of solution X; P(X i ) is the following probability that the i-th employed bee is selected.If the following probability P is greater than the generated random number, the Onlooker searches for a new nectar position near the current nectar position, and the specific search process is similar with the employed bee.Otherwise, the mutation method of inferior solutions is applied for the current nectar position according to Equation (1), and the new mutation position performs its search task according to Equation ( 9).After searching for a new position, the qualities of cell feature vectors in this iteration are compared to select the feature vector with the best quality as the optimal solution for this iteration optimization.Then, the optimal solution is compared with the one of the previous iterative optimization, and the one with the higher quality is updated as the new optimal solution.The algorithm continues the optimization process of next iteration.
(4) Scouts stage.According to the basic principle of the ABC algorithm, the abandonment counters of iterations of all employed bees are compared, and the largest number of non-improvement iterations is denoted as abs.Then, the abs is checked with a predefined Limit.If abs > Limit, the related employed bee attached to the nectar position is converted into a Scout bee.The initialization operation is called to generate a new random nectar position and the abandonment counter is reset.The scout bee then becomes employed bee.
(5) Recording the optimal solution.The algorithm records the optimal solution in the iteration.If the iteration number is greater than the maximum iteration number of the algorithm, the optimization algorithm ends the run.

Pruning Rule and Updating Sample Set
(1) Pruning of conversion rules Rule pruning is to remove the unnecessary condition items in the conversion rules to improve its quality.Therefore, a rule pruning method is constructed in this paper, and its specific steps are as follows: removes each condition item in the conversion rule in turn, and calculates the rule's quality.If the quality is reduced, then returns the removed condition item.On the contrary, removes the condition item.Repeat the rule pruning until removing any of the condition items will reduce the quality of the rule.
(2) Updating of the sample set To avoid the repeated mining for a dominant sample, a sample updating method in the literature [39] is introduced to remove the sample data from the dataset that matches the current rule.This method can improve the efficiency of the mining process, and improve the quality of the sample set.

Study Area
Dengzhou City is a famous historical and cultural city in China.It is the central city of the Danjiangkou Reservoir area, and is also the canal head city of the south-to-north Water Transfer Project designated by the State Council.As shown in Figure 4, from the geographical point of view, Dengzhou is located near the large triangle cities of Wuhan-Xi'an-Zhengzhou and the small triangle cities of Nanyang-Xiangyang-Shiyan.Thus, it plays a "bridge" role in these cities.Dengzhou has a total area of 2369 square kilometers, with 28 townships (streets and districts).The total population is 1,771,200 (2016), and the permanent population is 1,434,700, of which the urban population is more than 400,000 and the urbanization rate is 36.62%.The area marked by the blue short broken lines in the Figure 4    Erenhot-Guangzhou G55 and the highway from Neixiang to Dengzhou in Henan Province have been built and passed through.The Luoyang-Zhanjiang railway, the Mengxi-Huazhong railway, and the Zhengzhou-Chongqing high-speed railway both transit and set up stations.National Road 207 passes by the Dengzhou, and the National Road G328 and Provincial Highways S231, S335, S249, S248, and S244 all transit through the city.As show in Figure 5, these traffic lines form a three-dimensional traffic network in Dengzhou.As regular grids are the working basis of the cellular automata model, it is necessary to rasterize the image data by the ArcGIS software in the paper.Taking into account the actual area and building density of Dengzhou city, the image data is rasterized with 50 m × 50 m resolution.In the case study, the prediction of urban land use type was studied to verify the effectiveness of the MHABC-CA algorithm and the distance variables of traffic road were taken as the main factors which affecting urban development.

Data Pre-Processing
Considering the purpose of the experiment and the availability of experimental data, this study selected the traffic distance data of Dengzhou city and used ArcGIS software to extract the required spatial distance variable data.Thus, six spatial distance variables including railways, highways, national roads, provincial roads, urban loops, and the number of cells were selected for this study.The descriptions, ranges, and standardized range of these distance variables are summarized in Table 2.For the CA model, each cell attribute feature is represented by a spatial distance variable.
The spatial analyst module in ArcGIS software was used, and the [Distance][Straight Line] tool in the module was called to calculate the distances from the cell center to the traffic roads, and to obtain the values of the six different spatial distance variables of traffic roads.Before the specific mining experiment, data need to be sampled for regulatory mining.To ensure the rationality of the experimental data, a stratified random sampling method was used in this paper, and 12,044 sample points were extracted from the image of 2005 and 2010 for rule mining experiments.The distance from a grid unit to the highway 0-56,100 (m) 0-1 DIST_GD The distance from a grid unit to the national road 0-42,100 (m) 0-1 DIST_SD The distance from a grid unit to the provincial road 0-14,400 (m) 0-1 DIST_HL The distance from a grid unit to the city loop 0-35,400 (m) 0-1 NEIGHBOR The number of city cells in the 7 × 7 window 0-49 0-1 In this paper, the sampling tool [Sample] in ArcGIS software is used to extract the corresponding sample of these spatial distance variables.Some extracted sample data is shown in Table 3.A complete sample data consists of cell attribute's features (i.e., the six types of spatial distance variables in Table 2) and cell state value.The value set of cell state is {0, 1}.If a cell transforms into an urban cell, its development status (denoted as DEV _STATUS) will be identified as 1, otherwise it will be identified as 0.

Data Pre-Processing
Considering the purpose of the experiment and the availability of experimental data, this study selected the traffic distance data of Dengzhou city and used ArcGIS software to extract the required spatial distance variable data.Thus, six spatial distance variables including railways, highways, national roads, provincial roads, urban loops, and the number of cells were selected for this study.The descriptions, ranges, and standardized range of these distance variables are summarized in Table 2.For the CA model, each cell attribute feature is represented by a spatial distance variable.
The spatial analyst module in ArcGIS software was used, and the [Distance][Straight Line] tool in the module was called to calculate the distances from the cell center to the traffic roads, and to obtain the values of the six different spatial distance variables of traffic roads.Before the specific mining experiment, data need to be sampled for regulatory mining.To ensure the rationality of the experimental data, a stratified random sampling method was used in this paper, and 12,044 sample points were extracted from the image of 2005 and 2010 for rule mining experiments.The distance from a grid unit to the highway 0-56,100 (m) 0-1 DIST_GD The distance from a grid unit to the national road 0-42,100 (m) 0-1 DIST_SD The distance from a grid unit to the provincial road 0-14,400 (m) 0-1 DIST_HL The distance from a grid unit to the city loop 0-35,400 (m) 0-1 NEIGHBOR The number of city cells in the 7 × 7 window 0-49 0-1 In this paper, the sampling tool [Sample] in ArcGIS software is used to extract the corresponding sample of these spatial distance variables.Some extracted sample data is shown in Table 3.A complete sample data consists of cell attribute's features (i.e., the six types of spatial distance variables in Table 2) and cell state value.The value set of cell state is {0, 1}.If a cell transforms into an urban cell, its development status (denoted as DEV _STATUS) will be identified as 1, otherwise it will be identified as 0. The above sample data was adopted as the experimental data, and then MHABC-CA algorithm was used to mine the rules of urban land use change in Dengzhou city to verify the effectiveness of the MHABC-CA algorithm.

Prediction Experiment of Urban Land Use Type
The selection of algorithm parameters can greatly affect the execution performance, thus, before taking the MHABC-CA algorithm to mine the rules for prediction of urban land-use type, the algorithm parameters should be identified.The parameters settings of MHABC-CA algorithm, which adopted the recommended parameter from the related researches and our previous experiments results, are shown in Table 4.In the ABC algorithm, the Limit parameter is one of the important parameters, and it is recommended to be performed for tuning or control in literature [40].However, an extensive parameter tuning is too expensive and also is not the focus of this paper, so we adopted the recommended formula Limit = SN × D [41], where SN is the number of food sources and D is the dimension of a problem, that is, the dimension of the cell feature vector 2 × n.In this paper, we selected six features to mine the rules for urban development.Thus, n is 6, = 20 × 2 × 6 = 240.To facilitate the calculation and evaluation, we set the Limit parameter value to 250 in the experiments.

Food sources number (SN) 20
The number of food sources which is equal to the number of employed or onlooker bees.

Individual search limit (Limit) 250
If the fitness value of a honey source has not been improved in the number of Limit iterations, the honey source will be discarded.

Number of iterations 2500
The maximum number of iteration of the algorithm Rule convergence threshold 500 The maximum number of convergence rules.
Minimum fitting ratio 0.01 When the ratio of remaining samples to total number of samples reaches this parameter, the mining algorithm will be stopped.

Effective coverage 0.005
The ratio of minimum samples to total number of samples that matches a rule which is to control the quality of the rule.
The MHABC-CA algorithm reads the sample data file, and calculates the state of current dominant cell.Then, according to the parameter setting of the algorithm, the MHABC optimization algorithm will optimize the mining process for conversion rule.As the corresponding conversion rule is generated from the sample data, the pruning method is used to trim the rule, and the sample data that matches the current rule will be removed.In the study, 23 conversion rules are mined using the MHABC-CA algorithm, and partial conversion rules are shown in the following reasoning.For the above conversion rule, whether the cell meets one of the rules determines whether it will switch its cell state at the next stage.The form of conversion rules in upper reasoning is not easily understood.Thus, we take the first rule as an example to describe its meanings as follows; if the distance from the cell to the railway is less than 6955.7 m, the distance from the cell to the highway is less than 1674.4m, the distance from the cell to the national road is less than 280 m, the distance from the cell to the provincial road is less than 6771.5 m, the distance from the cell to the urban loop is less than 3183.1 m, and the number of urban cells in the 7 × 7 neighborhood window is less than 21.5, then the state value of the cell is 1, which indicates that the cell will be converted to a urban state at the next stage.On the contrary, for the second rule, if the distance from the cell to the railway is less than 344.55 m, the distance from the cell to the highway is less than 8705.35m, the distance from the cell to the national road is less than 80.25 m, the distance from the cell to the provincial road is less than 1771.5 m, the distance from the cell to urban loop is less than is less than 383.1 m, and the number of urban cells in the 7 × 7 neighborhood window is less than 30.2, then the state value of the cell is 0, which indicates that the cell will be converted to non-urban areas in the next time.With continuous iteration simulation, the urban land use conditions of Dengzhou city in 2015 can be simulated by the MHABC-CA algorithm.

Analysis of Simulation Results
The mining quality of the MHABC-CA algorithm can be tested by conducting comparative tests and analysis for the simulated results and actual situation of urban land use of Dengzhou city in 2015.Therefore, two analysis methods, visual characteristics of urban land use types and quantitative simulation accuracy, were done in the paper.

Comparison of Visual Features
Visual features are people's intuitive visual perception to images.Therefore, comparing the visual features of simulated results images with actual images is the most intuitive and simple analytical method for testing simulation results [42].The comparison of visual features allows us to conduct a more comprehensive verification and analysis of the simulation results of the MHABC-CA algorithm.Although there is a certain degree of subjectivity, the analysis method has certain significance on the experimental results.
As shown in Figure 8, the (a) and (b) are the actual and the simulated land use maps of Dengzhou city for 2015, respectively.The urban area of Dengzhou city is divided into eight regions, A, B, C, D, E, F, G, and H, in this paper.The detailed descriptions are as follows: Area A is the northeast corner of the city.The simulation results of MHABC-CA algorithm are very close to the actual situation.There is no case of over-fitting and insufficient simulation.
Area B is the true north side of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but the simulation is insufficient.
Area C is the northwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is serious insufficient simulation.
Area D is the west side of the city.The simulation results of MHABC-CA algorithm are consistent with the actual situation.
Area E is the core urban area of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but the simulation results are over-fitted.
Area F is the east side of the city.The simulation results of MHABC-CA algorithm are basically inconsistent with the actual situation, and there is a certain problem of over-fitting.
Area G is the southwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is a certain degree of over-fitting.
Area H is the southwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is a certain degree of insufficient simulation.Area B is the true north side of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but the simulation is insufficient.
Area C is the northwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is serious insufficient simulation.
Area D is the west side of the city.The simulation results of MHABC-CA algorithm are consistent with the actual situation.
Area E is the core urban area of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but the simulation results are over-fitted.
Area F is the east side of the city.The simulation results of MHABC-CA algorithm are basically inconsistent with the actual situation, and there is a certain problem of over-fitting.
Area G is the southwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is a certain degree of over-fitting.
Area H is the southwest corner of the city.The simulation results of MHABC-CA algorithm are basically same as the actual situation, but there is a certain degree of insufficient simulation.In summary, although there is a certain degree of over-fitting simulation and insufficient simulation, the simulated urban land use situation predicted by the MHABC-CA algorithm is basically consistent with the actual situation of Dengzhou city, and the simulation can achieve the predicted effect from the view of visual features method.

Quantitative Analysis of Simulation Accuracy
The subjective influence of the comparison method of visual features is relatively large, so we want to adopt a quantitative analysis method to analyze and compare the changes of urban land use types.The error matrix is a standard method that can represent the data accuracy, which is represented by a matrix of n rows and n columns [43].The error matrix can be applied to the MHABC-CA model to analyze its simulation accuracy.In the matrix, the number of cells of urban land use in the actual situation and the simulated results are both recorded as a; the number of cells in the simulation result is urban land, but not urban land in the actual situation, is recorded as b; the number of cells that neither in the simulation results nor in the actual situation are urban land is denoted as c; and the number of cells that are not the urban land in the simulation results, but are the urban land in the actual situation, is denoted as d.Then, the error matrix can be used to calculate the quantitative simulation accuracy of the MHABC-CA algorithm, and the specific calculation method is shown in the Table 5.In summary, although there is a certain degree of over-fitting simulation and insufficient simulation, the simulated urban land use situation predicted by the MHABC-CA algorithm is basically consistent with the actual situation of Dengzhou city, and the simulation can achieve the predicted effect from the view of visual features method.

Quantitative Analysis of Simulation Accuracy
The subjective influence of the comparison method of visual features is relatively large, so we want to adopt a quantitative analysis method to analyze and compare the changes of urban land use types.The error matrix is a standard method that can represent the data accuracy, which is represented by a matrix of n rows and n columns [43].The error matrix can be applied to the MHABC-CA model to analyze its simulation accuracy.In the matrix, the number of cells of urban land use in the actual situation and the simulated results are both recorded as a; the number of cells in the simulation result is urban land, but not urban land in the actual situation, is recorded as b; the number of cells that neither in the simulation results nor in the actual situation are urban land is denoted as c; and the number of cells that are not the urban land in the simulation results, but are the urban land in the actual situation, is denoted as d.Then, the error matrix can be used to calculate the quantitative simulation accuracy of the MHABC-CA algorithm, and the specific calculation method is shown in the Table 5.To compare the simulation accuracy of the MHABC-CA algorithm with other algorithm, we introduced a cellular automata model based on particle swarm optimization (PSO-CA) algorithm.Similar to MHABC-CA rule mining algorithm, the method of mining a rule based on PSO-CA rule mining algorithm is also to find the optimal cell feature vector for a specific cell state.The cell feature vector is the position of the particle of PSO algorithm in the search space.Similar algorithms can also be found in the literature, for example, Feng et al. presented an improved cellular automata (CA) model of urban growth based on particle swarm optimization (PSO) approach with inertia weight, and the model was applied in Fengxian District of Shanghai Municipality, eastern China, to simulate the spatial-temporal process of urban growth from 1992 to 2008 at 30 m spatial resolution [44]; and Rabbani et al. presented a hybrid cellular automata (HCA) model with PSO algorithm, and applied it to landsat satellite imagery of the city of Tehran, Iran with 28.5-m spatial resolution to simulate the urban growth from 1988 to 2010 [45].
In the comparison experiments, an assumption that fitness evaluation is time-consuming and much bigger than algorithm internal calculations has been made, thus, the usual manner is to compare them on equal number of fitness evaluations consumed [46,47].For the PSO algorithm, the population size (SN 2 ) is 50, and c1 = c2 = 1.108, and w = 0.721 [48].The maximum number of iterations (MFE 1 ) of MHABC-CA is 2500, and the maximum number of iterations (MFE 2 ) of PSO-CA is 2050.Therefore, the maximum number of fitness evaluations of MHABC-CA algorithm is (SN + SN + 1) × MFE 1 + SN = 102,520, and the maximum number of fitness evaluations of PSO-CA algorithm is SN 2 × MFE 2 + SN 2 = 102,550.The two algorithms have a similar number of fitness evaluations.Then, each of the algorithms was repeated 30 times independently to analyze the experiment results.
To compare the execution performance of MHABC-CA algorithm and PSO-CA algorithm, the execution time (unit: Second) was statistically calculated in Tables 6 and 7. Max, Min, Mean, and SD represent the maximum value, minimum value, average value, and the standard deviation of the experiments results, respectively.The results showed the MHABC-CA algorithm can have the similar performance in execution time under the same operating conditions.For the validation phase, the validation results should exclude the areas that are resistant or excluded to development at the initial time, because those areas are not candidates for simulated change [49,50].The excluded areas from development consisted of water, wetlands, government, and local parks, among others [51].Then, evaluation was performed by excluding the excluded areas

Figure 3 .
Figure 3. Flow chart of the improved hybrid artificial bee colony algorithm based on the mutation of inferior solutions (MHABC)-CA algorithm.

Figure 3 .
Figure 3. Flow chart of the improved hybrid artificial bee colony algorithm based on the mutation of inferior solutions (MHABC)-CA algorithm.
is Dengzhou City.The geomorphology of Dengzhou city is characterized by few mountains and wide plains.The terrain is flat, the traffic is well developed, and the urban development is mainly influenced by traffic factors.Taking full account of the actual situations, the experimental verification of the MHABC-CA algorithm is mainly carried out to forecast the land use change from 2010 to 2015 in Dengzhou city.Information 2018, 9, x 12 of 22 total area of 2369 square kilometers, with 28 townships (streets and districts).The total population is 1,771,200 (2016), and the permanent population is 1,434,700, of which the urban population is more than 400,000 and the urbanization rate is 36.62%.The area marked by the blue short broken lines in the Figure 4 is Dengzhou City.The geomorphology of Dengzhou city is characterized by few mountains and wide plains.The terrain is flat, the traffic is well developed, and the urban development is mainly influenced by traffic factors.Taking full account of the actual situations, the experimental verification of the MHABC-CA algorithm is mainly carried out to forecast the land use change from 2010 to 2015 in Dengzhou city.

Figure 4 .
Figure 4. Geographic location map of Dengzhou city.

Figure 4 .
Figure 4. Geographic location map of Dengzhou city.

Figure 6 .
Figure 6.Remote sensing image of Dengzhou city in 2015.

Figure 6 .
Figure 6.Remote sensing image of Dengzhou city in 2015.

Figure 8 .
Figure 8.The actual and simulated land use maps of Dengzhou city for 2015.

Figure 8 .
Figure 8.The actual and simulated land use maps of Dengzhou city for 2015.

Table 5 .
Simulation accuracy calculation method in the error matrix.c)/(a + b + c + d) × 100% Attribute 1 ≥Lower_1 and Attribute 1 ≤ Upper_1 And Attribute 2 ≥Lower_2 and Attribute 2 ≤Upper_2 Number of bee swarm.FoodNumber: Number of foods sources.FoodNumber = SwarmNumber/2.Trial: Stagnation number of a solution.Limit: Maximum number of trial for the scouts to abandon a food source, and re-initiate a new solution.MFE: Maximum number of iterations.As the number of iterations reaches this value, the iteration process of algorithm will be stopped.MSN: Minimum number of samples.As the number of samples lower than this value, the mining process will be stopped.GenConvRule(); // Generate a conversion rule.PruneRule(best_rule); // prune the unnecessary conditions from the current rule.UpdateDataSet(data_set, best_rule); // remove the sample data that matches the current rule.UpdateRuleSet(RulesSet, best_rule); // update the current rule into the rules set.

Table 2 .
The value range of the six spatial distance variables.

Table 2 .
The value range of the six spatial distance variables.

Table 4 .
Parameter settings of improved hybrid artificial bee colony algorithm based on the mutation of inferior solutions (MHABC)-cellular automata (CA) algorithm.

Table 6 .
Execution performance comparison of MHABC-CA algorithm and particle swarm optimization (PSO)-CA algorithm.