K-Means Clustering Algorithm Based on Chaotic Adaptive Artiﬁcial Bee Colony

Abstract: K-means clustering is a popular technique in data analysis and data mining. To remedy its dependence on initialization and its tendency to converge to local minima, a chaotic adaptive artificial bee colony (CAABC) clustering algorithm is presented in this study to optimally partition objects into K clusters. The algorithm adopts the max-min distance product method for initialization, and a new fitness function is adapted to the K-means clustering (KMC) algorithm. The iteration follows an adaptive search strategy, and Fuch chaotic disturbance is added to avoid convergence to a local optimum; the step length adjusts adaptively during the iteration. To overcome the shortcomings of the classic ABC algorithm, the simulated annealing criterion is introduced into the CAABC. Finally, the combined algorithm is compared with other stochastic heuristic algorithms on 20 standard test functions and 11 datasets. The results demonstrate that CAABC-K-means has an advantage in speed and accuracy of convergence over some conventional algorithms for solving clustering problems.


Introduction
Clustering [1][2][3] is a process that divides a set of objects into clusters according to predefined criteria, such that objects in the same cluster are more similar to each other than to objects in different clusters. Clustering is often used to solve parts of complicated tasks in pattern recognition [4], image analysis [5], and other data-processing fields [6]. An excellent clustering algorithm has a higher interference-free capability and lower time complexity than traditional algorithms when processing large amounts of data. Clustering algorithms can be subdivided into two categories: hierarchical clustering and partitional clustering. A hierarchical clustering algorithm divides the pattern into smaller structures continuously and is usually described by a tree structure. Partitional clustering is the division of a set of objects into K non-intersecting subsets with high internal similarity. Center-based clustering algorithms are the most popular partitional clustering methods.
K-means is simple and efficient, which makes it one of the most popular center-based clustering methods [3]. However, dependence on the initialization of the K centers and convergence towards local minima are significant shortcomings of K-means. To overcome these problems, many other methodologies have been applied to the algorithm. A clustering algorithm based on a genetic algorithm was proposed by Maulik and Bandyopadhyay, and its effectiveness was proved on real-life datasets. A Simulated Annealing (SA) approach was proposed to solve the clustering problem by Selim and Al-Sultan (1991). Beyond that, many heuristic algorithms, such as Particle Swarm Optimization (PSO) [7], Differential Evolution (DE), and ABC, have also been successively adapted to improve clustering optimization.
ABC is a swarm intelligence algorithm derived from the foraging behavior of honeybee colonies. After the Turkish scholar Dervis Karaboga proposed it in 2005 [8,9], it was widely applied to function optimization problems owing to its few parameters, simple structure, and ease of implementation [10]. Compared with PSO and the Genetic Algorithm (GA), ABC's advantage is demonstrated further in Section 4. As with other swarm intelligence algorithms, the performance of ABC mainly depends on its search strategy. Due to the randomness of the search mechanism, the algorithm easily gets stuck at local optima and converges slowly. ABC has been optimized gradually since its proposal and extended to various fields. Inspired by the PSO algorithm, the global optimal solution is used to guide the search formula (GABC) in [11]. Inspired by the DE algorithm, another improved algorithm, MABC, was proposed; compared with the original ABC algorithm, MABC modifies the employed bee stage and the onlooker bee stage, which improves efficiency [10], and the algorithm was then applied to solve a loudspeaker design problem using FEM. The CGABC, based on crossover, is proposed in [12]: the crossover operator of the genetic algorithm, which transfers good genes from the parents of a population to its offspring, is introduced into the globally guided ABC algorithm. The ABC_elite algorithm is proposed in [13]; to better balance the tradeoff between exploration and exploitation, it proposes a depth-first search (DFS) framework and introduces two novel solution search equations that incorporate the information of elite solutions and can be applied in the employed bee phase. Furthermore, many studies increase search efficiency by changing the greedy search mechanism. Sharma [14] changes the search path of the scout bee, proposing two new mechanisms for its movement: in the first, the scout bee follows a non-linear interpolated path, while in the second it follows a Gaussian movement. Yang and Gao [15,16] improve the greedy search and adapt it to more optimization problems using adaptive methods; in addition, to enhance global convergence, both chaotic systems and an opposition-based learning method are employed when producing the initial population and scout bees. Xiang [8] proposes a depth-first search framework. Gao [17] increases information sharing among individuals. In addition, many scholars [18,19] combine the ABC algorithm with other familiar algorithms: each bee selects whether to adopt the greedy strategy based on its fitness value in each generation, and great progress in solving complex numerical optimization problems has been achieved in [19,20]. With continuous improvement and optimization, ABC has been applied in more fields, such as workshop scheduling [21][22][23][24][25], software aging prediction [26], machine learning [27], multi-objective optimization [28], and dynamic optimization [29,30].
ABC has unique advantages for data optimization problems. In this work, the improved ABC is extended to clustering. A crossover operation and an adaptive threshold are integrated into the improved ABC algorithm, and the simulated annealing technique and Fuch chaotic perturbation are drawn into the algorithm. Furthermore, the initialization equation and the fitness function are reformed according to the shortcomings of K-means. CAABC-K-means is shown to be superior in speed and accuracy of convergence over some conventional algorithms for solving clustering problems.
The remainder of this work is organized as follows: Section 2 discusses the ABC algorithm and clustering analysis problems. The chaotic adaptive ABC (CAABC) algorithm adapted for solving K-means clustering problems is introduced in Section 3. Section 4 shows through experimental studies that our method outperforms some other methods. Section 5 concludes and summarizes the proposed method.

Artificial Bee Colony Algorithm
ABC is a swarm intelligence algorithm that imitates the division of labor and the search mode of bees finding the maximum amount of nectar [8]. In the classic ABC algorithm, the artificial bee colony is divided into three categories according to behavior: employed bees, onlooker bees, and scout bees. In the beginning, the number of employed bees equals the number of onlooker bees, and the third kind of bee appears gradually. The employed bee uses information about the initial honey source to find new honey sources and shares the information with the onlooker bees. The onlooker bee waits in the hive and chooses a better source according to the greedy selection mechanism. However, if the honey source information has not been updated for a long time, the corresponding employed bee is transformed into a scout bee. The task of the scout bee is to search for honey sources randomly around the hive and eventually find a new valuable one. Self-organization, self-adaptation, social division of labor, and collaboration are significant features of the entire colony. ABC simulates the foraging behavior as the process of searching for the optimal solution, defines an individual's adaptability to the environment as the objective function of the problem to be solved, and takes the greedy selection method as the basis for eliminating inferior solutions. This process is iterated until the optimal solution is reached and the function gradually converges. The steps can be described as follows.

Initialization Stage
In the ABC algorithm, the fitness value represents the quality of nectar sources, and candidate solutions correspond to food sources. It is assumed that the population size is N. The initial solution is obtained through Equation (1):
x_ij = x_j^min + rand(0, 1)(x_j^max − x_j^min), (1)
where x_ij is the j-th component of the i-th vector, i = 1, 2, ..., N, j = 1, 2, ..., D; x_j^max and x_j^min are the upper and lower bounds of the j-th component; and rand(0, 1) is a random number in [0, 1]. The algorithm executes a random global search for food sources and derives the revenue value.
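The random initialization of Equation (1) can be sketched as follows (a minimal Python illustration; the helper name `init_population` is our own):

```python
import random

def init_population(N, D, x_min, x_max):
    """Random initialization of N food sources, Eq. (1):
    x_ij = x_min_j + rand(0,1) * (x_max_j - x_min_j)."""
    return [[x_min[j] + random.random() * (x_max[j] - x_min[j])
             for j in range(D)]
            for _ in range(N)]
```

Each of the N candidate solutions is a D-dimensional vector drawn uniformly between the per-component bounds.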

Employed Bee Stage
After initialization, the ABC algorithm starts the employed bee stage. Employed bees search randomly around the current region according to Equation (2) and share the information with the onlookers; thus, a new set of honey sources V_i = (v_i1, v_i2, ..., v_iD) is generated, where
v_ij = x_ij + φ_ij(x_ij − x_kj), (2)
j ∈ {1, 2, ..., D} is a randomly chosen index, φ_ij is a random number in [−1, 1], and k ≠ i is required to reduce duplication of effort. If the fitness value of the new source V_i is improved, the old source is superseded by the new one.
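The neighborhood search of Equation (2) can be sketched as follows (assuming the standard ABC form, where only one randomly chosen component is perturbed; `employed_bee_search` is our own helper name):

```python
import random

def employed_bee_search(pop, i):
    """Generate candidate V_i from X_i (standard form of Eq. (2)):
    v_ij = x_ij + phi * (x_ij - x_kj), with phi in [-1, 1] and k != i."""
    D = len(pop[i])
    j = random.randrange(D)                                    # random component
    k = random.choice([m for m in range(len(pop)) if m != i])  # random partner
    phi = random.uniform(-1.0, 1.0)
    v = list(pop[i])
    v[j] = pop[i][j] + phi * (pop[i][j] - pop[k][j])
    return v
```

The greedy selection step then keeps V_i only if its fitness improves on X_i.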

The Probability of Selecting the New Food Source
The fitness value, an evaluation criterion of nectar source quality, is calculated by Equation (3):
fit_i = 1/(1 + f_i) if f_i ≥ 0, and fit_i = 1 + |f_i| otherwise, (3)
where i = 1, 2, ..., N, fit_i denotes the fitness value of x_i, and f_i is the objective function value of the i-th nectar source. A larger fit_i value means a higher-quality honey source.
When the fitness values have been calculated, they are used to compute the probability P_i of selecting the i-th honey source, which serves as the basis for onlooker bees to select honey sources:
P_i = fit_i / Σ_{n=1}^{N} fit_n, (4)
where fit_i denotes the fitness value of x_i.
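The fitness transform and roulette-wheel probabilities can be sketched together (assuming the standard ABC forms of Equations (3) and (4); both helper names are our own):

```python
def fitness(f):
    """Standard ABC fitness transform (assumed form of Eq. (3))."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def selection_probabilities(objective_values):
    """Roulette-wheel probabilities P_i = fit_i / sum(fit), Eq. (4)."""
    fits = [fitness(f) for f in objective_values]
    total = sum(fits)
    return [ft / total for ft in fits]
```

Lower objective values map to higher fitness, so the onlooker bees favor better sources.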

Scout Bee Stage
If the number of search steps reaches a certain threshold but no better position is found, the position of this employed bee is re-initialized randomly according to Equation (1).
Each food source is verified by employed bees and/or onlooker bees for its potential inclusion as a candidate position. The ABC algorithm uses employed bees, onlooker bees, and scout bees to iteratively search the solution space until the maximum number of iterations is reached. If a food source is not improved within limit trials, it is deemed an exhausted source and abandoned. Under different conditions, the three kinds of bees transform into each other, and the result gradually approaches the optimal solution. The transformation diagram is shown in Figure 1.

K-MEANS Cluster Algorithm
Clustering methods are suitable for finding internally homogeneous groups in data. The K-means algorithm is one of the oldest clustering techniques [1] and is constructed on an iterative hill-climbing process. The main idea is to gather the original data into K clusters according to similar attributes. The main processing procedure is as follows. Firstly, K samples are randomly selected from the original data, and each sample is taken as the center of one of the K clusters. Then the distances between the K center samples and the remaining samples are calculated, and each sample is classified into the nearest cluster. The iterative process is repeated until the clusters no longer change. The traditional K-means clustering is expressed as follows:
where k is the number of clusters.
The K-means criterion function is expressed as
J = Σ_{j=1}^{k} Σ_{x_i ∈ C_j} d(x_i, C_j),
where d(x_i, C_j) represents the distance between data point x_i and its clustering center C_j, and J represents the sum of the internal distances.
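The procedure above can be sketched as a compact K-means loop together with the criterion J (an illustrative Python sketch; function names are our own):

```python
import math
import random

def kmeans(data, k, max_iter=100):
    """Plain K-means: assign each point to its nearest center, recompute
    centers as cluster means, and stop when assignments no longer change."""
    centers = random.sample(data, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in data]
        if new_labels == labels:        # assignments stable -> converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:                 # keep the old center if a cluster empties
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, labels

def criterion(data, centers, labels):
    """J = sum of distances from each sample to its cluster center."""
    return sum(math.dist(p, centers[l]) for p, l in zip(data, labels))
```

The random choice of initial centers is exactly the weakness that the max-min distance product initialization of Section 3 addresses.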

Chaotic Adaptive Artificial Bee Colony (CAABC) for Clustering
During the iterative optimization process of the classic ABC algorithm, both the employed bee and the onlooker bee follow a completely random search strategy. As a result, the ABC algorithm has strong global search capabilities. However, the algorithm selects the honey source blindly at the employed bee stage. Only the random number φ_ij ∈ [−1, 1] can be used to control the search region of the neighborhoods, and the search direction of the onlooker bee is guided by the honey source information from the employed bee. Throughout the iteration, this blindness and randomness make the algorithm more complex and sacrifice the accuracy of its results. To balance exploration and exploitation, ABC has been optimized from different perspectives. To improve convergence speed, GABC combines ABC with the PSO algorithm: the global optimal solution is used to modify the random neighborhood search of classic ABC, which enhances the exploitation performance. However, simply adding the global optimal solution not only enhances convergence speed but also increases the risk of premature convergence. Therefore, inspired by GABC, in this paper the information carried by ordinary individuals is effectively utilized and the search space is adjusted adaptively to avoid premature convergence. To overcome the disadvantages of the K-means clustering algorithm, such as over-dependence on the initial clustering centers and easily falling into local optima, as well as the premature and slow convergence of the artificial bee colony algorithm caused by the limitations of its search strategies, a hybrid clustering method combining the improved artificial bee colony algorithm and the K-means algorithm is proposed, which makes full use of the characteristics of both algorithms.

MAX-MIN Distance Product Algorithm
Initialization affects the global convergence and performance of the algorithm, so it is particularly important in evolutionary algorithms. The K-means clustering algorithm is highly sensitive to the initial stage. Based on [31,32], we propose a max-min distance product algorithm for initialization. The initialization process reduces both the randomness of colony initialization and the sensitivity of K-means clustering to the initial points. In [33], the max-min distance method is used to search for the optimal initial clustering centers, significantly improving the convergence speed and accuracy of the algorithm; however, it may lead to clustering conflicts when the initial clustering centers are excessively dense. The maximum distance method proposed in [34] reduces the number of iterations effectively, but it suffers from initial point deviation: the products of two distances may be the same while the densities of the corresponding points differ considerably.
To improve efficiency, we propose a max-min distance product algorithm. We obtain T_m from T according to Equation (6), where T_m represents the product of the maximum and minimum values in od:
T_m = max(od) * min(od), (6)
where k points are randomly selected as the initial cluster centers from the original data set; od is a data set used to store the distances between the other points in the data set and the cluster centers; T is an array that stores the products of elements in od; and T_m is the product of the maximum center distance and the minimum center distance. The points corresponding to T_m are selected as cluster centers instead of the initial points. The distribution of the initial points can be dispersed by the max-min distance product. Moreover, it avoids the situation in which two distance products are equal but the point densities of their regions differ considerably, and it magnifies the differences between points, making the enhanced selection method better.
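One reading of this initialization can be sketched as follows. This is an illustrative interpretation, not the authors' exact procedure: starting from a random sample, each further center is the point whose product of maximum and minimum distances to the centers chosen so far is largest, which spreads the initial centers apart.

```python
import math
import random

def max_min_distance_product_init(data, k):
    """Sketch of a max-min distance product initialization (assumed reading):
    greedily add the point maximizing T_m = max(od) * min(od), where od holds
    the distances from a candidate point to the already-chosen centers."""
    centers = [random.choice(data)]
    while len(centers) < k:
        best, best_score = None, -1.0
        for p in data:
            if p in centers:
                continue
            od = [math.dist(p, c) for c in centers]  # distances to current centers
            score = max(od) * min(od)                # T_m for this candidate
            if score > best_score:
                best, best_score = p, score
        centers.append(best)
    return centers
```

Unlike a pure max-min rule, the product penalizes candidates that are far from one center but very close to another.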

New Fitness Function
The fitness function is the crux of population evolution and directly determines solution quality; it is the key factor affecting the stability and convergence of the algorithm. Based on the characteristics of ABC iteration and the basic idea of the K-means algorithm, a new fitness function is adopted in this work, where CN_i represents the number of samples in the i-th cluster.
J_i denotes the sum of the distances between the sample points and the centers. The new fitness value balances the influence of the number of samples and the distances. The phenomenon of inaccurate judgment caused by equal values of CN_i or J_i is avoided, which improves the adaptability of the function and makes the iterative process more accurate.

Aggregate-Disperse Algorithm
To improve convergence speed and accuracy, an aggregate-disperse algorithm is introduced here. On the basis of the simplex method, we propose an "aggregate-disperse operation" as a guiding strategy for the iteration. According to the relationship among the global optimal solution, the elite solution, and an ordinary individual, the search range and step length change adaptively.

•
The Simplex Method The simplex method is a traditional optimization method that uses iterative transformations of the vertices of a geometric figure to approach the optimal value gradually. Take a function of two variables as an example. Take three points X_1, X_2, X_3, which are not collinear, as the vertices of a triangle. Calculate the function values f(X_1), f(X_2), f(X_3) and compare them with each other.
(1) f(X_1) > f(X_2) > f(X_3) means X_1 is the worst solution and X_3 is the best one.
The algorithm should search in the opposite direction to find the minimum. X_4 = (X_2 + X_3)/2 is the midpoint of X_2X_3, and X_5, on the extension of X_1X_4, is called the reflection point of X_1 with respect to X_4: X_5 = X_4 + α(X_4 − X_1), where α is the reflection coefficient, which usually equals 1. The geometric relationship is shown in Figure 2.
Algorithms 2021, 14, x FOR PEER REVIEW

(2) f(X_5) ≤ f(X_3) denotes that the direction of searching is correct and the algorithm should keep going in this direction: the reflection is expanded by taking α = 1.5, giving X_6 = X_4 + 1.5(X_4 − X_1). If f(X_6) ≤ f(X_5), X_5 is replaced by X_6 to form a new simplex; otherwise X_6 is dropped.
(3) f(X_3) < f(X_5) < f(X_2) means that the search is going in the right direction but does not need to expand.
(4) f(X_2) ≤ f(X_5) ≤ f(X_1) demonstrates that X_5 has gone too far and needs to be retracted.
(5) If f(X_5) > f(X_1), X_1 and X_2 need to be retracted toward X_3.

•
Aggregate and Disperse Operator The purpose of the aggregate operation is to provide guidance for the population to gather in a potential direction during the iteration. With this operation, the algorithm can increase the convergence speed in the initial stage and strengthen the local search ability in the later stage. Given the three solutions x_global, x_ebest, and X_k, the worst individual is moved toward the others. To accelerate convergence, the elite solution x_ebest and the global optimal solution x_global are used to guide the search process.
The parameter is set to α ∈ (0, 2) [31] to ensure that the new solution maintains convergence while generally moving in a better direction. α can be considered the punishment parameter for the poor individual, and −α is the encouragement parameter.
α ∈ (0, 1) denotes that a better solution has not been found, so the influence of the original strategy should be moderately weakened.
α ∈ (1, 2) denotes that the original strategy should be strengthened. The parameters are transformed to form a convex combination to avoid negative weights in the later stage.
The simplex method and ABC are fused in the two-dimensional coordinate space in Figure 3, where β ∈ (0, 2) and ϕ ∈ (0, 1). The vectors OA, OB, and OC represent the global optimal solution, the elite solution, and an ordinary individual, respectively. If OE = βOA + (1 − β)OB, then according to convex combination analysis the point E lies on the line segment AB, and OF ends up in the triangle ∆A′B′C′. When the search area is extended to n-dimensional space, the point F falls into a region with the line segment AB as its median line, which is the potential space limited by the elite solutions and the globally optimal solutions. Finally, multiple planes intersect at the global optimum, and the results converge to the globally optimal solution with high probability. However, if the three solutions are collinear, the search is trapped in a local optimum because the search space is too narrow. Therefore, the disperse operation is used to expand the search space.
The vectors ζ and γ are random numbers between 0 and 1. After the dispersion operation, the search area is extended to the triangle ∆A′B′C′. In multidimensional space, multiple planes eventually intersect at the global optimum with high probability. Figure 4 demonstrates the disperse operation in the two-dimensional plane.


Adaptive Adjustment
In the iterative process of the ABC algorithm, the neighborhood search range is controlled by a random parameter, and the neighborhood search is performed randomly and aimlessly. The effectiveness of the algorithm is visibly influenced by this blindness and randomness. To remedy these defects, an adaptive parameter is used to adjust the algorithm's search step length. Furthermore, to achieve stronger adaptive performance, we replace the fixed-size parameter with a variable one, s(iter), during the iteration:
s(iter) = −2 exp(−q^1.9) + 2, (13)
where q = iter/max_cycle, and iter and max_cycle are the current iteration number and the maximum number of iterations, respectively. As shown in Equation (13), the step length factor s(iter) adjusts adaptively with the iteration process. In the initial stage, the global search is executed efficiently with a large step length, and the step length varies in the later process to achieve a detailed local search.
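The step-length factor can be computed directly from Equation (13); the sketch below transcribes the formula as printed (the helper name `step_length` is our own):

```python
import math

def step_length(iter_num, max_cycle):
    """Adaptive step-length factor of Eq. (13), as transcribed:
    s(iter) = -2 * exp(-q^1.9) + 2, with q = iter / max_cycle."""
    q = iter_num / max_cycle
    return -2.0 * math.exp(-(q ** 1.9)) + 2.0
```

The factor changes smoothly with the iteration ratio q, so the neighborhood search scale is tied to the progress of the run rather than fixed.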

Genetic Crossover
The randomness of the search method limits the optimization ability and affects the convergence rate of canonical ABC. To balance the performance of the algorithm, a crossover operation is carried out to intersect with the global optimal solution on the basis of unbiased adaptive optimization. Drawing on the main idea of the GA algorithm, the diversity of the population and the overall optimization ability are further increased by crossing with the excellent parent generation. Crossover operations are performed to find more valuable individuals in the searchable space. The larger the size of the intersection, the more combinations of allogeneic genes are exchanged and the wider the searchable range. However, as the size of the intersection expands, the growth of the searchable scope shrinks: a larger scope of the crossover operation means a smaller probability that any particular individual in the space is searched. Therefore, the probability of excellent vertices being searched is affected by the scope of the intersection.
CR is the local search coefficient, used to control the activity of individuals during the local search. A smaller value means more active individual behavior.
The improved algorithm in this paper has good local search ability because of the ergodicity of the chaotic search. Combining the characteristics of gene crossover and the ergodicity of the chaotic disturbance, we conducted comparative tests for different CR values from 0 to 1 and finally concluded that the algorithm achieves better performance when CR = 0.6.
Combining the improvements above, we obtain a new position updating formula; the calculation process is shown in Equation (14).
If the three solutions lie on the same line, the position updating criterion is changed to Equation (15) on the basis of the disperse operation. In the iteration, the cross factor cr = 0.6, x_global represents the global optimal solution, x_k,j is an ordinary individual selected randomly from {1, 2, ..., N}, and x_ebest is the elite solution. After sorting the solutions, x_ebest is selected randomly from the top R * N solutions, where N is the population size and R = 0.1.
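Equations (14) and (15) themselves are not reproduced in the extracted text, so the sketch below only illustrates the surrounding description: an elite solution drawn from the top R * N ranked solutions, and a component-wise crossover with cr = 0.6 mixing the global best and the elite solution. The exact update formulas in the paper may differ; both function names are our own.

```python
import random

def select_elite(population, objective, R=0.1):
    """Pick x_ebest uniformly from the top R*N solutions after sorting
    by objective value (smaller is better in this sketch)."""
    ranked = sorted(population, key=objective)
    top = max(1, int(R * len(population)))
    return random.choice(ranked[:top])

def crossover_update(x_k, x_global, x_ebest, cr=0.6):
    """Assumed binomial-style crossover: with probability cr keep the
    ordinary individual's component, otherwise take a convex combination
    of the global-best and elite components."""
    beta = random.random()
    return [xk if random.random() < cr
            else beta * xg + (1.0 - beta) * xe
            for xk, xg, xe in zip(x_k, x_global, x_ebest)]
```

The convex combination keeps the mixed components inside the segment spanned by the global-best and elite values, mirroring the convex-combination analysis of the aggregate operation.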
To verify the effectiveness of the improved method in this part, the improved ABC algorithm whose position updating follows the "aggregate-disperse operation" and the crossover operation is temporarily named CAABC-2. As a component of CAABC, its effectiveness is proved in Section 4.

New Chaotic Disturbance
Chaos is a unique movement pattern of a nonlinear system, with the particular features of sensitivity to the initial value, randomness, and ergodicity. A chaotic search is generated by iterating a chaos sequence in a particular format and extending the numerical range of the chaos variables to the value range of the optimization variables through a carrier wave. Fuch chaos [32], a new type of discrete mapping, has unique advantages over logistic chaotic mapping, with better chaotic performance and fewer iterations. It has been proved that the chaotic map has no rational fixed point; the mapping relation is used to establish a chaotic model for solving the Lyapunov exponent, and the sensitivity of the chaotic map to initial values is investigated under large and small variations of the initial starting points. The chaotic map is then used to establish a chaotic generator to replace the finite-collapse map and to improve the dynamic performance of chaotic optimization. The method improves search efficiency by continuously reducing the search space of the variables and enhancing search precision. It is more ergodic and does not fall into local optima under incorrect initial value settings. The expression yields the iteration sequence x_{n+1} from x_n, where n = 1, 2, 3, ..., N. The Lyapunov exponent of Fuch chaos is solved in [32], and the results show that Fuch chaos has a stronger chaotic property and a more homogeneous ergodic property than Logistic chaos and Tent chaos.
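The mapping expression is not reproduced in the extracted text; the Fuch map is commonly given as x_{n+1} = cos(1 / x_n^2), which the sketch below assumes (with a nonzero initial value x0; `fuch_sequence` is our own helper name):

```python
import math

def fuch_sequence(x0, n):
    """Generate n values of the Fuch chaotic map, assumed form
    x_{n+1} = cos(1 / x_n^2); x0 must be nonzero."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(math.cos(1.0 / xs[-1] ** 2))
    return xs
```

The resulting sequence stays in [−1, 1] and wanders ergodically, which is what makes it useful as a perturbation source.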
In this work, the fitness value is computed based on the novel function, and chaotic disturbance is applied to the 15% of individuals with the poorest performance and to the elite solution in order to update the historical optimal fitness value. If the new solution is superior, the new position replaces the original one. The chaotic mechanism effectively avoids convergence on a local optimum and attains higher precision.
To verify the effectiveness of Fuch chaos in the CAABC, chaotic disturbance is added on the basis of CAABC-2; the resulting variant is temporarily named CAABC-1.
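The perturbation described above can be sketched as follows. The Fuch map is commonly written x_(n+1) = cos(1 / x_n^2); since the paper's formula is not reproduced in this excerpt, that exact form is an assumption, as is the linear carrier-wave mapping onto the variable range.

```python
import numpy as np

def fuch_map(x):
    # One step of the Fuch map, commonly written x_{n+1} = cos(1 / x_n^2);
    # treat this exact form as an assumption, since the paper's formula
    # is not reproduced in this excerpt.
    return np.cos(1.0 / (x * x))

def chaotic_perturb(position, lower, upper, seed=0.3):
    # Carrier-wave perturbation: generate one chaos value per dimension
    # and map it from [-1, 1] onto the variable range [lower, upper].
    z, new = seed, []
    for _ in position:
        z = fuch_map(z)
        new.append(lower + (z + 1.0) / 2.0 * (upper - lower))
    return np.array(new)
```

In CAABC the perturbed position replaces the original only if its fitness is better (greedy selection), consistent with the description above.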

New Probability of Selecting Based on SA
The Simulated Annealing (SA) algorithm is a heuristic Monte Carlo inversion method [33]. A temperature attenuation function controls the temperature decline during the simulated annealing of solid-state systems. In this work, the Metropolis criterion is integrated into ABC: when the fitness value of a new honey source is lower than that of the current one, the new source may still be accepted with a certain probability, determined by the annealing temperature T.
For the simulated annealing nonlinear inversion, the cooling function is: where T(t) is the temperature at iteration t, T_0 is the initial temperature, and σ is the cooling coefficient, generally between 0.9 and 1. In this work, single-variable experiments were carried out several times within the standardized threshold, and the algorithm achieved its best performance when σ was set to 0.95. The difference ∆F between the new fitness value and the current fitness value F is then computed: if ∆F < 0, the new food source is selected; otherwise, the selection is conducted according to the Metropolis criterion.
Since an inferior solution may be accepted according to the Metropolis rule, points escape from local optima more easily, and the prematurity of the ABC algorithm is largely curbed.
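The acceptance rule above can be sketched directly: an improving source (∆F < 0) is always kept, and a worse source survives with probability exp(-∆F / T). The geometric cooling step uses σ = 0.95 as chosen in the text; the exact form of the paper's cooling function is otherwise not shown, so the update below is the standard geometric schedule, taken as an assumption.

```python
import math
import random

def metropolis_accept(delta_F, T):
    # Always accept an improving source (delta_F < 0); otherwise accept
    # with probability exp(-delta_F / T), so worse sources survive more
    # often while the temperature is still high.
    if delta_F < 0:
        return True
    return random.random() < math.exp(-delta_F / T)

def cool(T, sigma=0.95):
    # Geometric cooling T(t+1) = sigma * T(t), with sigma = 0.95 as in
    # the text (assumed schedule form).
    return sigma * T
```

As T shrinks over iterations, exp(-∆F / T) decays toward zero for any fixed ∆F > 0, so the search gradually stops accepting inferior sources and settles into exploitation.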

The Procedures of CAABC-K-means
The novel clustering algorithm integrates the chaotic adaptive artificial bee colony (CAABC) with the K-means clustering (KMC) algorithm. The new locations obtained by CAABC are used as the initial points of the KMC iteration, and the new center points obtained after the calculation are applied to update the swarm. To match KMC, the max-min distance product algorithm and a novel fitness function are introduced into the ABC algorithm. In the search space, the step length is reduced adaptively as the search approaches the optimal solution. Moreover, the cross operation increases population diversity during position updating. Furthermore, the ergodic Fuch chaotic perturbation is applied to the elite solution and to infeasible solutions, while inferior solutions are accepted with a certain probability according to the Metropolis rule. Hence, the points jump out of local optima more easily, and the prematurity of the ABC algorithm is largely curbed. An employed bee turns into a scout when its food source has been exhausted; if a scout discovers a valuable food source, it becomes employed again.
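The max-min distance product initialization mentioned above can be sketched as follows. The choice of the first center and the exact scoring rule are assumptions for illustration: here each subsequent center is the point whose product of distances to the already-chosen centers is largest, which spreads the initial centers across the data.

```python
import numpy as np

def maxmin_product_init(X, K, first=0):
    # Sketch of a max-min distance product initialization. The first
    # center is the point at index `first` (an assumption; the paper
    # does not fix this choice here). Each subsequent center maximizes
    # the product of distances to the already-chosen centers.
    centers = [X[first]]
    for _ in range(1, K):
        # distances of every point to each chosen center: shape (n, len(centers))
        d = np.stack([np.linalg.norm(X - c, axis=1) for c in centers], axis=1)
        scores = d.prod(axis=1)          # product of distances per point
        centers.append(X[int(scores.argmax())])
    return np.array(centers)
```

A convenient property of the product score is that it is exactly zero at every already-selected point, so no center is ever chosen twice.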
In this way, the CAABC algorithm and K-means clustering are performed alternately until the algorithm terminates. The flow chart of the algorithm is shown in Figure 5. The main steps can be described as follows:

1.
The initial parameters are set as follows: N is the population size, D is the space vector dimension, max cycle is the maximum number of iterations, the cross parameter cr = 0.6, limit is the threshold of maximum optimization times, and the annealing coefficient σ = 0.95. The initial population is obtained according to the max-min distance product algorithm.

2.
The fitness value is obtained according to Equation (3), and the solution approaches the global optimum. At the same time, chaotic perturbations are added to the elite solution, which is selected randomly from the preponderant solution set, and to the infeasible solutions in the bottom 15%, according to Equation (16).
The position is updated according to Equation (14) or Equation (15), and the location of the honey source is extended to the D-dimensional space. Whether the new solution is accepted depends on the Metropolis criterion.

3.
The onlooker bees execute the employed-bee option, and neighborhood searching is performed under the same criteria.

4.
The updated location information obtained after all the onlooker bees have completed their search is used as the clustering centers; a K-means iterative clustering is performed on the dataset, and the clustering center of each class is refreshed with the resulting partition.

5.
The employed bee determines whether the number of updates has reached the limit for abandonment. If so, the food source is considered exhausted, the employed bee turns into a scout, and a new round of honey source searching begins.

6.
If the number of iterations has reached the maximum "max cycle", the optimal solution is output; otherwise, the algorithm returns to step 2.

7.
The K-means algorithm is executed to obtain the final results.
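The alternation in steps 1-7 can be summarized by the following minimal sketch. The bee-phase update here is a plain random perturbation with greedy selection, an intentional simplification standing in for the full CAABC update (crossover, chaos, and Metropolis acceptance); what the sketch does reproduce is the feedback loop in which the best candidate center set is refined by one K-means step and written back into the swarm.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans_step(X, centers):
    # One K-means iteration: assign points to the nearest center, then
    # recompute each center as the mean of its assigned points.
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                    else centers[k] for k in range(len(centers))])
    return new, labels

def sse(X, centers, labels):
    # Within-cluster sum of squared errors (the quantity K-means minimizes).
    return sum(((X[labels == k] - c) ** 2).sum()
               for k, c in enumerate(centers))

def caabc_kmeans_sketch(X, K, n_bees=8, max_cycle=20):
    n = len(X)
    # each bee holds a candidate set of K centers sampled from the data
    swarm = np.stack([X[rng.choice(n, K, replace=False)] for _ in range(n_bees)])

    def cost(c):
        lab = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2).argmin(axis=1)
        return sse(X, c, lab)

    for _ in range(max_cycle):
        for i in range(n_bees):
            # simplified bee move with greedy selection (stand-in for
            # the CAABC position update of Equations (14)/(15))
            trial = swarm[i] + rng.normal(0.0, 0.05, swarm[i].shape)
            if cost(trial) < cost(swarm[i]):
                swarm[i] = trial
        # K-means phase: refine the best candidate and feed it back
        b = min(range(n_bees), key=lambda i: cost(swarm[i]))
        refined, _ = kmeans_step(X, swarm[b])
        swarm[b] = refined
    b = min(range(n_bees), key=lambda i: cost(swarm[i]))
    return swarm[b]
```

The population supplies global exploration while the embedded K-means step supplies fast local refinement, which is the division of labor the combined algorithm relies on.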

Numerical Experiments
To verify the effectiveness of CAABC, we design an optimization performance test on 20 benchmark functions. For a fair comparison, the parameter settings follow [35]. The algorithm is compared with the classic ABC, the Hybrid Artificial Bee Colony with memory mechanisms (HABC) [34], the Improved Artificial Bee Colony that employs permutations to represent the solutions (IABC) [36], and the DFSABC algorithm. To verify the effectiveness of each component of the algorithm, CAABC-1 and CAABC-2 are also compared. The details of the benchmark functions are listed in Table 1. In addition, three standard evaluation indices are used in this part to evaluate the clustering performance of the CAABC-K-means algorithm and the other algorithms.

The simulation experiments are coded in MATLAB® R2019a, running on a system with a 2.5 GHz Core i5 CPU, 4 GB RAM, and the Windows 10 operating system.

Test Environment and Parameter Settings
The experimental parameters are set as follows: the dimensions D are 30 and 60, and the maximum number of iterations max cycle is set to 15e4 and 30e4, respectively. The population size N is set to 20 and limit is set to D * N/2. Under each dimension setting, we run each benchmark function 20 times independently.

CAABC Performance Analysis
To demonstrate the superiority and effectiveness of CAABC, the algorithm is compared with other well-known algorithms on twenty benchmark problems. The population parameter settings are the same as in [36]: N = 25, the maximum number of evaluations max cycle = 10,000 * D, and the other function parameter settings are shown in Table 2. Tables 3 and 4 present the comparisons under the 30-dimensional and 60-dimensional settings, respectively, with the best results shown in bold. All algorithms are executed in the same machine environment, and each result is recorded over 20 independent trials. As can be clearly seen in Tables 3 and 4, most results of CAABC are remarkable in convergence accuracy, while the other algorithms require longer CPU time, which demonstrates the superiority of the CAABC algorithm. We select five representative test functions for comparison: Sphere (unimodal separable, US), Rosenbrock (unimodal nonseparable, UN), Rastrigin, Alpine (multimodal separable, MS), and Ackley (multimodal nonseparable, MN). The results are visualized from different views in Figures 6 and 7, where the abscissa represents the number of iterations and the ordinate represents the value of the optimization function. According to Tables 3 and 4, CAABC is superior or at least equal to the other algorithms on all benchmark functions except a few cases: in Table 3 this happens only on f_9, f_11, and f_14, and on these functions the difference between the improved algorithm and the others is less than 5%. In the 60-dimensional comparison of Table 4, the improved algorithm achieves good results on f_15. At the same time, Figures 6 and 7 clearly show that CAABC also has a better convergence rate. Based on the above experimental results, the superiority of the algorithm is proved.
With the increase of dimension, the results of CAABC come even closer to the ideal results, indicating that the optimization effect of CAABC is better than that of the canonical ABC and the other algorithms mentioned. The best value, the worst value, the average value, and the standard deviation are more ideal while the running time is basically the same. The comparison with CAABC-1 and CAABC-2 also verifies the effectiveness of each improved component. Overall, compared with the classic ABC and other improved ABC algorithms, the accuracy and efficiency of convergence are enhanced, and the exploration and exploitation performance are productively balanced.

CAABC-K-means Performance Analysis
The CAABC clustering algorithm is tested on four standard evaluation indices and compared with other well-known algorithms to evaluate its clustering performance. In addition to the comparison algorithms mentioned above, experimental comparisons with the PA [35] and GPAM [37] clustering algorithms are also included. The general parameter settings are shown in Table 2, and the maximum number of iterations max cycle is set to 100. The eleven datasets are Iris (7 January 1988), Balance-scale (22 April 1994), Wine (7 January 1991), E. coli (1 September 1996), Glass (9 January 1987), Abalone (12 January 1995), Musk (9 December 1994), Pendigits (7 January 1998), Skin Seg (17 July 2012), CMC (7 July 1997), and Cancer (3 March 2017), all downloaded from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/). They have been widely used by many authors to study and evaluate the performance of algorithms, and their details are summarized in Table 5. The optimal results are shown in bold in Tables 6-9. We use the standard evaluation indices Normalized Mutual Information (NMI), Accuracy (ACC), and F-score [38] to evaluate the clustering performance of the CAABC-K-means algorithm and the other algorithms. The corresponding results and the running times of the algorithms are analyzed in Tables 6-9.
The NMI is defined as follows: in the function, I is the mutual information between the sample assignments and the labels, and H is the entropy.
In addition, the Accuracy (ACC) can be described as follows: where N_S is the number of samples, and N_C is the number of correctly clustered samples. In this paper, the F-score is used to measure the accuracy of the clustering results. The performance comparisons among all the models are reported above and visualized in Table 9.

The improvements also increase population diversity. Moreover, the simulated annealing criterion is integrated into the probability selection to achieve better precision. By selecting appropriate functions, the group-optimization characteristics of ABC are retained, and local optima can be avoided effectively. In addition, CAABC-K-means inherits the global search ability of CAABC, which reduces the number of K-means iterations; the poor global search ability of the K-means algorithm is remedied by the combination of the two algorithms. Furthermore, according to the characteristics of the clustering algorithm, the impacts of the sample numbers and of the distance between the sample centers are taken into account in the fitness selection, which reduces the possibility of excessively clustered sample distributions. To evaluate the performance of the confluent algorithm, it is compared with other stochastic heuristic algorithms on several benchmark functions and real datasets. The primary experimental results, which are very promising in terms of the accuracy of the solutions found and the processing efficiency, show that the CAABC-K-means clustering algorithm achieves better results.
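The NMI and ACC indices defined above can be computed from contingency counts as follows. Two details are assumptions, since the text does not reproduce the exact formulas: NMI is normalized here by the average of the two entropies, and ACC counts a sample as correct when it falls in the majority true label of its predicted cluster (a common convention for N_C).

```python
from collections import Counter
import math

def nmi(labels_true, labels_pred):
    # I(T;P) normalized by the average of the two entropies (the
    # normalization choice is an assumption; the paper's formula is
    # not reproduced in this excerpt).
    n = len(labels_true)
    ct, cp = Counter(labels_true), Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))
    I = sum((c / n) * math.log((c / n) / ((ct[t] / n) * (cp[p] / n)))
            for (t, p), c in joint.items())
    H = lambda counts: -sum((c / n) * math.log(c / n) for c in counts.values())
    denom = (H(ct) + H(cp)) / 2
    return I / denom if denom > 0 else 1.0

def acc(labels_true, labels_pred):
    # ACC = N_C / N_S, mapping each predicted cluster to its majority
    # true label (assumed convention for counting correct samples).
    n = len(labels_true)
    correct = 0
    for p in set(labels_pred):
        members = [t for t, q in zip(labels_true, labels_pred) if q == p]
        correct += Counter(members).most_common(1)[0][1]
    return correct / n
```

Both indices reach 1.0 for a perfect clustering even when the cluster indices are permuted relative to the labels, which is why they are preferred to raw label agreement.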
Although the efficiency and accuracy of the algorithm have been improved, its time complexity cannot be reduced effectively because the location update formula is still guided by the global optimal solution. How to preserve the advantages of the existing algorithm while reducing its time complexity is our next research direction. Applying the proposed algorithm to other optimization problems and further improving the performance of the clustering algorithm will also be considered in future work.

Figure 1. Structure Diagram of the Artificial Bee Colony Algorithm.


Figure 6. Comparison with other improved Artificial Bee Colony algorithms in 30 dimensions.

Figure 7. Comparison with other improved Artificial Bee Colony algorithms in 60 dimensions.


Table 3. Comparison with other improved Artificial Bee Colony algorithms in 30 dimensions.

Table 4. Comparison with other improved Artificial Bee Colony algorithms in 60 dimensions.

Table 5. The datasets downloaded from the UCI Machine Learning Repository.

Table 6. The Normalized Mutual Information for classifying the eleven training datasets.

Table 7. The Accuracy for classifying the eleven training datasets.

Table 8. The average running time (sec.) for classifying the eleven training datasets.