Intelligent Dendritic Neural Model for Classiﬁcation Problems

Abstract: In recent years, the dendritic neural model (DNM) has been widely employed in various fields because of its simple structure and inexpensive cost. Traditional numerical optimization is ineffective for the parameter optimization problem of the dendritic neural model: it easily falls into local optima during the optimization process, resulting in poor performance of the model. This paper first proposes an intelligent dendritic neural model, which uses an intelligent optimization algorithm instead of the traditional backpropagation algorithm to optimize the model. The experiments compare the performance of ten representative intelligent optimization algorithms on six classification datasets: GA, DE and BBO based on evolutionary ideas, PBIL based on mathematical statistics, ABC, ACO, PSO and WOA based on swarm intelligence, and the new algorithms HHO and ChOA. Taguchi's method is used to systematically evaluate the optimal combination of the four user-defined parameters of the DNM. The experiments compare and analyze the effectiveness, convergence speed and classification accuracy of the algorithms. The results show that the performance of the intelligent dendritic neural model is significantly better than that of the traditional dendritic neural model: it has small classification errors and high accuracy, which provides an effective approach for the application of the dendritic neural model in engineering classification problems.
At the same time, the comparison of intelligent optimization algorithms shows that the biogeography-based optimization (BBO) algorithm has excellent performance: its robustness, accuracy and convergence speed are the best, and it can quickly obtain high-quality solutions. The intelligent dendritic neural model (DNM-BBO) established in this study is a powerful tool for solving classification problems and provides more choices in practical engineering applications.


Introduction
Since the dawn of the big data era, every corner of human society has accumulated a large amount of data. There is an urgent need for computer algorithms that can properly evaluate and utilize these data, and machine learning meets exactly this need. Classification problems, as a hot topic in machine learning, have widespread applications in reality [1], such as spam recognition, speech recognition, tumor recognition, bank credit loan business and so on [2,3]. To solve various classification problems, many machine learning techniques have been proposed, such as decision trees [4], naïve Bayesian classifiers [5], support vector machines [6], artificial neural networks [7], k-nearest neighbors [8], ensemble learning [9] and so on.
Due to the high-dimensional characteristics of complex nonlinear problems, traditional methods cannot solve them effectively. Among modern approaches, the artificial neural network simulates the information processing mechanism of biological neural networks; it fits nonlinear problems well and has been successfully used in text classification, pattern recognition and speech recognition. In 1943, the first artificial neural network was proposed by McCulloch and Pitts [10]. This model is relatively simple but significant. In 1957, Rosenblatt came up with the perceptron model, which was based on the M-P model [11]. The perceptron embodies fundamental principles of modern neural networks, and its structure is consistent with real biological neurons. However, the linear perceptron has limited capability and cannot even solve the simple XOR problem. In 1986, building on research into multilayer neural networks, Rumelhart put forward the backpropagation (BP) algorithm for the weight correction of a multilayer neural network [12]. It solved the learning problem of multilayer feedforward neural networks and proved that a multilayer neural network has a strong learning capacity, which can complete many learning tasks and solve a large number of practical problems.
With the deepening of research, engineering problems become more and more complex. The field of machine learning is also quickly evolving, and many scholars have proposed many different neural networks, such as a cyclic neural network [13], convolutional neural network [14], feedforward neural network [15] and dendritic neural model [16][17][18].
The artificial neural network was originally constructed in the form of a perceptron [19], and almost all modern artificial neural networks use this kind of structure, but experiments in the field of physiology have found that biological neurons are far more complex than the above model. Studies have shown that dendritic structures contain a large number of active ion channels, and a synaptic input may have a nonlinear effect on its adjacent synaptic inputs [20]. In addition, many studies have shown that biological plasticity mechanisms also play a local role in dendrites [21]. These characteristics greatly promote the role of local nonlinear components in neuron output and endow neural networks with higher information processing capabilities [22]. Different from other neural models, a dendritic neural model considers the nonlinearity of the synapse and simulates the process of information transmission in a neuron [17]. Because of its easy interpretation and simple implementation, it has been used by many scholars to solve various complex problems, for example, tourism economic forecasting [23], bankruptcy prediction [24], breast cancer classification [25], liver disorders [26] and so on [27][28][29][30][31][32][33]. References [25,26] use traditional dendritic neural models with the backpropagation algorithm to optimize weights and thresholds. Backpropagation is gradient descent in essence; it has poor robustness and very easily falls into local optima [34]. References [27,35], respectively, proposed using particle swarm optimization and a states of matter search algorithm as the optimization algorithm. However, their comparative experiments are limited, and there is a lack of systematic and complete research on the application of intelligent optimization algorithms in dendritic neural models.
With the continuous innovation of evolutionary computation, intelligent algorithms have developed rapidly and found a wide range of practical applications in various fields, such as model symmetry/asymmetry, model architecture and hyper-parameters, clustering and prediction, becoming a novel way to solve traditional optimization problems in machine learning.
Intelligent optimization algorithms form a cluster of algorithms. With continuous research and development, this cluster keeps growing, and a variety of algorithms have emerged, most of them inspired by biological evolution in nature. An example is the genetic algorithm (GA), which maintains and improves multiple candidate solutions based on a population method and uses population characteristics to guide the search [36,37]; another typical algorithm is the differential evolution algorithm (DE), a population-based heuristic search [38,39]. Moreover, inspired by biogeography theory, the biogeography-based optimization algorithm (BBO) was proposed [40,41], which is based on the study of mathematical models of biological species migration. As another type of intelligent optimization algorithm, based on mathematical statistics theory, the estimation of distribution algorithm realizes population search and evolution by constructing probability models; population-based incremental learning (PBIL) is a classical estimation of distribution algorithm [42,43]. Yet another kind of intelligent optimization algorithm is the swarm intelligence algorithms inspired by natural phenomena, such as particle swarm optimization (PSO) [44,45], ant colony optimization (ACO) [46,47], the artificial bee colony algorithm (ABC) [48,49] and the whale optimization algorithm (WOA) [50,51]. In recent years, the Harris hawks optimization algorithm (HHO), inspired by the group cooperation behavior of the Harris hawk, and the chimp optimization algorithm (ChOA), inspired by chimpanzee hunting behavior in groups, have appeared as new swarm intelligence optimization algorithms [52,53].
The contribution of this paper is as follows:
1. This paper first proposes an intelligent dendritic neural model, which uses an intelligent optimization algorithm to optimize the model instead of the traditional backpropagation algorithm. The experimental results show that the performance of the intelligent dendritic neural model is superior to that of the traditional dendritic neural model, which provides an effective approach for the application of the dendritic neural model in engineering classification problems.
2. Through the comparison of scientific experiments, an effective intelligent learning algorithm of the dendritic neural model for classification problems is determined. By comparing different types of intelligent optimization algorithms, the results show that the biogeography-based optimization algorithm has excellent performance: it can quickly obtain high-quality solutions and has an excellent convergence speed.

Dendritic Neural Model
There exist four layers in the dendritic neural model: the synaptic layer, dendritic layer, membrane layer and soma layer. Each layer has corresponding functions and characteristics. The detailed structure of the whole model is shown in Figure 1, where X = {x 1 , x 2 , x 3 , . . . , x n } is the input data of the model, m branches represent m dendritic layers and O is the actual output of the model. There are weights and thresholds in each synapse. In the training process of the model, the algorithm will continuously adjust the weights and thresholds in each synapse to optimize the performance of the model. The specific functions of each layer are introduced in detail below.

Synaptic Layer
Input vector X = {x 1 , x 2 , x 3 , . . . , x n } inputs data from synapses, and the output of synapse layer is obtained by the activation of the sigmoid function. The nodes in the synaptic layer contain the weights and thresholds of the dendritic neural network. In the training process of the model, the intelligent optimization algorithm needs to optimize all the weights and thresholds in the synaptic layer. The expression of the synaptic layer is (1).
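In the form standard across the DNM literature, the synaptic expression (1) is the sigmoid of the weighted, thresholded input:

```latex
Y_{ij} = \frac{1}{1 + e^{-k\,(\omega_{ij} x_i - \theta_{ij})}} \quad (1)
```

where the connection from input $x_i$ to the $j$th dendritic branch is parameterized by the weight $\omega_{ij}$ and threshold $\theta_{ij}$.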
where k is a user-defined constant, and ω ij and θ ij are the corresponding weight and threshold. According to the different values of ω ij and θ ij , there are six situations: ① 0 < θ ij < ω ij , ② 0 < ω ij < θ ij , ③ ω ij < θ ij < 0, ④ θ ij < ω ij < 0, ⑤ θ ij < 0 < ω ij and ⑥ ω ij < 0 < θ ij . In case ①, the output Y ij is proportional to the input x i , which is called the Excitatory state; in case ③, the output Y ij is inversely proportional to the input x i , which is called the Inhibitory state; in cases ② and ⑥, the value of the output Y ij is always close to 0, which is called the Constant-0 state; and in cases ④ and ⑤, the value of the output Y ij is always close to 1, which is called the Constant-1 state. Figure 2 shows the details of the four states corresponding to the six situations.

Dendritic Layer
The dendritic layer multiplies the results of the n synaptic nodes connected to each branch. The whole process can be expressed as (2).

Membrane Layer
The membrane layer is connected to m dendritic branches; the function of the membrane is to perform the sum operation over the results of all branches. The whole process can be expressed as (3).

Soma Layer
Finally, the output of the soma layer is obtained by the sigmoid function, which can be expressed as (4).
where k soma and θ soma are self-defined parameters.
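Taken together, the four layers form a single forward pass. The sketch below is a minimal pure-Python illustration of Equations (1)-(4) in the form standard in the DNM literature; the function name and the default values of k, k soma and θ soma are illustrative assumptions, not values from this paper.

```python
import math

def dnm_forward(x, w, theta, k=5.0, k_soma=5.0, theta_soma=0.5):
    """Forward pass of a dendritic neural model.

    x        : input vector of length n
    w, theta : m x n weight and threshold matrices (m dendritic branches)
    """
    v = 0.0
    for w_j, theta_j in zip(w, theta):        # membrane layer: sum over branches (Eq. 3)
        z = 1.0
        for x_i, w_ij, t_ij in zip(x, w_j, theta_j):
            # synaptic layer: sigmoid of w_ij * x_i - theta_ij (Eq. 1)
            y = 1.0 / (1.0 + math.exp(-k * (w_ij * x_i - t_ij)))
            z *= y                            # dendritic layer: product over synapses (Eq. 2)
        v += z
    # soma layer: final sigmoid squashing to (0, 1) (Eq. 4)
    return 1.0 / (1.0 + math.exp(-k_soma * (v - theta_soma)))
```

For a binary classification dataset, the soma output can be thresholded at 0.5 to obtain a class label.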

Backpropagation Algorithm
The traditional dendritic neural model uses the backpropagation algorithm to update weights and thresholds of the model, which adjusts the weight by backpropagation of the training error layer by layer through the chain rule. Firstly, the least squared error between the actual output O p and the desired output T p of the target is obtained as follows in (5). The synaptic parameters ω ij and θ ij are updated along the negative gradient direction; the whole process can be described as (6) and (7).
where η represents the learning rate; it is usually set to 0.1. According to the structure of the dendritic neural model in Section 2, the whole process of partial differential derivation can be described as (8) and (9).
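The loss and update rules (5)-(7) referenced above take the standard gradient-descent form (the per-sample squared-error convention is assumed here):

```latex
E = \frac{1}{2}\,(T_p - O_p)^2 \quad (5)

\omega_{ij}(t+1) = \omega_{ij}(t) - \eta\,\frac{\partial E}{\partial \omega_{ij}} \quad (6)

\theta_{ij}(t+1) = \theta_{ij}(t) - \eta\,\frac{\partial E}{\partial \theta_{ij}} \quad (7)
```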
This paper introduces the intelligent optimization algorithm into the training of the model. Different from the idea of the backpropagation algorithm, the intelligent optimization algorithm updates ω and θ by iterating to find the optimal individual. Assuming that the input number of the dendritic neural model is n and m is the number of dendritic layers, if a vector represents a feasible solution, then X i (i ∈ [1, Q]) can be expressed as (10).
X i = {ω 11 , ω 12 , . . . , ω mn , θ 11 , θ 12 , . . . , θ mn } (10)
where X i denotes the ith feasible solution and ω and θ are the weights and thresholds in the dendritic layers. Generally, the intelligent optimization algorithm first initializes a certain number of feasible solutions randomly and then iteratively optimizes all feasible solutions according to the characteristics of the different optimization algorithms until the termination conditions are met. The best individual X best found in this way gives the optimal weights and thresholds of the dendritic neural model. In the iterative optimization process, the mean squared error (MSE) is used as a loss function to assess the quality of each feasible solution. The mean squared error is calculated by (11).
where T p is the desired output of the pth sample, and O p is the actual output of the pth sample.
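The encoding of Eq. (10) and the loss of Eq. (11) can be sketched as follows; the function names and the uniform initialization range are illustrative assumptions.

```python
import random

def init_solutions(Q, m, n, lo=-1.0, hi=1.0):
    """Eq. (10): each feasible solution X_i packs the m*n weights followed by
    the m*n thresholds into one flat vector; Q such vectors form the population."""
    return [[random.uniform(lo, hi) for _ in range(2 * m * n)] for _ in range(Q)]

def mse_loss(outputs, targets):
    """Eq. (11): mean squared error between actual outputs O_p and desired
    outputs T_p over all samples."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)
```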

Genetic Algorithm
Referring to the genetic evolution theories of Darwin and Mendel, the genetic algorithm was first proposed by J. Holland in 1975. It was developed by simulating the biological evolution mechanism in nature. An individual of the population is called a chromosome, and the constant updating of chromosomes during iteration is called heredity. Heredity consists of three parts: selection, crossover and mutation. Chromosome quality is usually evaluated by a fitness function. The genetic algorithm begins by generating random individuals, evaluating each individual according to a predetermined loss function and assigning it a fitness value. Through the fitness function, the selected individuals produce the next generation. This operation inherits the idea of survival of the fittest in nature; the selected individuals then combine to produce a new generation through crossover and mutation. The new generation is better than the previous generation because it inherits the excellent characteristics of the previous generation, and the whole population gradually moves towards the optimal solution. According to Algorithm 1, the genetic algorithm is mainly divided into the following three parts.
(a) Selection operator: based on the assessment of individual fitness, the selection operator usually selects individuals with higher fitness and eliminates individuals with lower fitness. Common selection methods include fitness allocation based on proportion or ranking, the roulette selection method and so on.
(b) Crossover operator: in the process of biological evolution in nature, two chromosomes form new chromosomes by gene recombination. Therefore, crossover is the core link of the whole process. The design of the crossover operator needs to be analyzed for each specific problem. Familiar crossover operators include single-point crossover, uniform crossover, multi-point crossover and so on.
(c) Mutation operator: mutation changes genes inherited on chromosomes by random selection. Mutation itself can be seen as a random algorithm; strictly speaking, it is an auxiliary algorithm used to generate new individuals.

Begin:
Randomly initialize a population of chromosomes ({X i }, i ∈ [1, Q])
Evaluate the fitness value of each chromosome using Equation (11)
while Termination criterion not met
    Select the best chromosomes by Roulette Wheel Selection
    Generate new chromosomes through single-point crossover and mutation
    Evaluate the fitness values of the new chromosomes
    Replace the population's worst chromosomes with the best new chromosomes
    t = t + 1
end while
return the best solution
End
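The three operators of Algorithm 1 can be sketched in Python as follows; the fitness inversion (smaller loss gives a larger selection weight) and the mutation range are illustrative choices, not specifics from the paper.

```python
import random

def roulette_select(pop, losses):
    """Selection operator: roulette-wheel selection with probability
    proportional to inverted loss (smaller MSE -> larger weight)."""
    weights = [1.0 / (1e-9 + loss) for loss in losses]
    return random.choices(pop, weights=weights, k=1)[0]

def single_point_crossover(a, b):
    """Crossover operator: cut both parent chromosomes at one random point
    and swap the tails."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(chrom, rate=0.05, lo=-1.0, hi=1.0):
    """Mutation operator: re-randomize each gene with a small probability."""
    return [random.uniform(lo, hi) if random.random() < rate else g
            for g in chrom]
```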

Differential Evolution Algorithm
In 1996, Rainer Storn and Kenneth Price proposed the differential evolution algorithm to solve Chebyshev polynomial fitting problems. The basic idea of DE is to randomly generate an initial population and randomly select three individuals from it; a new individual is generated by summing the vector difference of two individuals with the third individual according to certain rules. Then, the new individual is compared with another individual randomly selected from the population. If the new individual is superior to the compared individual, it is retained and the compared old individual is eliminated; if the new individual is inferior, the algorithm abandons the new individual, retains the old individual and reselects other individuals from the population to generate new individuals. Repeating in this way, the individuals of the initial population are continuously updated until the population reaches a certain optimal state. As shown in Algorithm 2, the process of the differential evolution algorithm includes mutation, crossover and selection operations, which is very similar to the genetic algorithm, but each process has a completely different meaning. DE has a variety of mutation strategies; the strategy adopted in the experiment is DE/rand/1, and the formula is shown in (12).
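The DE/rand/1 strategy referenced as (12) is conventionally written as:

```latex
V_i^{t} = X_{r_1}^{t} + F \cdot \left( X_{r_2}^{t} - X_{r_3}^{t} \right),
\qquad r_1 \neq r_2 \neq r_3 \neq i \quad (12)
```

where $F$ is the mutation (scaling) factor and $r_1, r_2, r_3$ are distinct random indices drawn from the population.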

Begin:
F0: Initial mutation operator; CR: Crossover operator
Q: Population; D: Dimension
Initialize the population and calculate the fitness of each individual
while Termination criterion not met
    Generate mutant vectors according to Equation (12)
    Perform crossover and selection to obtain the next generation
    Evaluate the fitness of each individual using Equation (11)
    t = t + 1
end while
return the best solution
End

Population-Based Incremental Learning Algorithm
PBIL guides the evolution of the population by maintaining a probability vector. The algorithm selects the individual with the best fitness in each generation to update the probability vector, and the next generation population is generated by the updated probability vector sampling. Repeating these steps, the final optimal solution is obtained. The probability vector is the core part of a population-based incremental learning algorithm. According to Algorithm 3, the main operations related to the probability vector in the algorithm are described below.

Begin:
Q: population; LR: learning factor
p m : mutation probability; MS: mutated offset value
Initialize probability vector P t , p t i = 0.5, i = 1, 2, . . . , L
while Termination criterion not met
    Generate the population by sampling according to P t
    Evaluate the fitness of each individual according to Equation (11)
    Find the individual B t with the best fitness in the population
    Update P t according to Equations (13) and (14)
    t = t + 1
end while
return the best solution
End

Assuming that the optimal individual of the current generation is B t = {b t 1 , b t 2 , . . . , b t L }, the probability vector P t of the current generation is updated by Formula (13), where LR indicates the learning factor, which is generally set to 0.01. For each p t i in the probability vector P t , if random(0, 1) is less than the mutation probability p m , the vector value p t i is mutated; otherwise, it is not mutated. The mutation formula is as follows in (14).
where p t * i represents the ith mutated vector value of P t , and MS represents the mutated offset value.
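Equations (13) and (14) can be sketched as follows for a binary-encoded population; the mutation form below (a shift of size MS toward a random bit) is one common PBIL convention and is an assumption here, as are the function names.

```python
import random

def pbil_update(P, best, LR=0.01, p_m=0.05, MS=0.2):
    """Update the probability vector P from the best individual B_t (Eq. 13),
    then mutate each entry with probability p_m (Eq. 14)."""
    new_P = []
    for p_i, b_i in zip(P, best):
        p_i = (1.0 - LR) * p_i + LR * b_i            # learn from B_t (Eq. 13)
        if random.random() < p_m:                    # mutate p_i (Eq. 14)
            p_i = (1.0 - MS) * p_i + MS * random.choice([0.0, 1.0])
        new_P.append(p_i)
    return new_P

def sample_individual(P):
    """Draw one binary individual from the probability vector."""
    return [1 if random.random() < p else 0 for p in P]
```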

Particle Swarm Optimization Algorithm
PSO is a stochastic search algorithm developed by simulating the foraging behavior of birds. The algorithm abstracts individual birds into massless, volumeless particles that carry information. Since the bird flock does not know the specific location of food at first, it only knows that the food is within a certain range, so the flock adopts a stable and simple method: search the area around the bird that is closest to the food. Inspired by this, particle swarm optimization randomly initializes many particles in a solution space, and each particle is given random velocity and position information. These particles then begin to move (or stay in place) with their initialized velocities. After sharing information with the surrounding particles, the approximate direction of the optimal solution and the position information of the particles closest to it are obtained, and each particle moves progressively towards the optimal solution position. The whole process is shown in Algorithm 4. The mathematical formula of the particle velocity update is described as follows in (15).
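The velocity update referenced as (15) is conventionally written with an inertia weight $w$, acceleration coefficients $c_1, c_2$ and random numbers $r_1, r_2 \in (0, 1)$:

```latex
v_i^{t+1} = w\,v_i^{t} + c_1 r_1 \left( pbest_i - x_i^{t} \right)
          + c_2 r_2 \left( gbest - x_i^{t} \right),
\qquad
x_i^{t+1} = x_i^{t} + v_i^{t+1} \quad (15)
```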

Begin:
Initialize a population of particles ({X i }, i ∈ [1, Q])
Evaluate the fitness of each particle using Equation (11)
Set pbest i = X i , gbest = min{pbest}
while Termination criterion not met
    for i = 1: Q
        Update the velocity and position of X i by Equation (15)
        Evaluate the fitness of X i
        if fit(X i ) < fit(pbest i ) then pbest i = X i
        if fit(pbest i ) < fit(gbest) then gbest = pbest i
    end for
    t = t + 1
end while
return the best solution
End

Ant Colony Optimization Algorithm
ACO was first proposed by M. Dorigo and is inspired by real ant colony behavior. Studies found that individual ants communicate with each other through pheromones, which allows them to collaborate on complex tasks. Ants tend to move towards directions with high pheromone concentrations during movement. They not only leave pheromones on the paths they pass but also sense the presence and concentration of pheromones during movement to guide their movement direction and search for food. The ant colony algorithm simulates such optimization mechanisms to find the optimal solution through information exchange and cooperation among individuals. The whole simulation process is shown in Algorithm 5. According to the pheromone quantity and heuristic information on each path, the transition probability of ant k at point i at time t selecting the next movable point j is given by (16).
for j ∈ allowed k , and 0 otherwise, where allowed k = {n − tabu k } represents the set of points that ant k at point i is allowed to choose to go to next (the points not yet visited). α represents the influence of the pheromone remaining on the path on the subsequent ants' choice of the path, β represents the expected heuristic factor, τ ij is the pheromone concentration and η ij is the heuristic function.
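The transition probability (16) can be sketched as follows; the data structures (dicts keyed by point index) and function name are illustrative.

```python
def transition_probs(tau, eta, allowed, alpha=1.0, beta=2.0):
    """Eq. (16): probability that the ant moves from its current point to each
    point j in `allowed`, combining pheromone tau[j]**alpha with heuristic
    eta[j]**beta; points outside `allowed` implicitly get probability 0."""
    weights = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}
```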

Artificial Bee Colony Algorithm
ABC simulates the honey-collecting behavior of honeybees. The location of a nectar source represents a solution, and the pollen quantity of the nectar source represents the value of the fitness function. All bees are separated into three groups: the employed bees, the onlooker bees and the explored bees. The number of employed bees is the same as the number of nectar sources, and the number of onlooker bees is half the number of nectar sources. Firstly, the employed bees take charge of the initial search for nectar sources and collect and share information. Then the onlooker bees are in charge of staying in the hive and collecting nectar according to the information provided by the employed bees. Last, the explored bees are responsible for randomly searching for new nectar sources to replace original ones after the original ones are abandoned. According to Algorithm 6, each stage is described as follows.

(a) Employed bees' stage: each employed bee searches the neighborhood of its current nectar source according to Formula (17), where x kj represents the neighboring nectar source, k is not equal to i and ρ ij is a random constant between −1 and 1. After obtaining the new nectar source through the above formula, the fitness function values of the new and old nectar sources are compared using the greedy algorithm, and the superior one is selected; the best nectar sources are stored in a greedy way, and sources that are not updated are recorded.
(b) Onlooker bees' stage: at this stage, employed bees share nectar information in the dance area. Then onlooker bees analyze the information and adopt a roulette strategy to select a nectar source for tracking and mining, to ensure that the probability of mining a nectar source with a higher fitness value is greater.
(c) Explored bees' stage: if a nectar source has not been renewed after several mining sessions, the nectar source is abandoned and the explored bees' stage starts. The explored bees use Formula (18) to randomly search for new nectar sources to replace the abandoned ones, where x minj and x maxj represent the minimum and maximum values of the jth dimension.

Begin:
Q: Population; D: Dimension
Initialize the population and calculate the fitness of each individual
while Termination criterion not met
    Employed bees update nectar sources by Equation (17), store the best nectar sources in a greedy way and record sources that are not updated
    Onlooker bees select nectar sources by roulette and mine them
    Check whether any nectar source has stagnated; if so, update it through Equation (18)
    Complete the generation update
    t = t + 1
end while
return the best solution
End
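Formulas (17) and (18) can be sketched as follows; the function names are illustrative.

```python
import random

def employed_search(x_i, x_k, j):
    """Eq. (17): v_ij = x_ij + rho_ij * (x_ij - x_kj), perturbing dimension j
    of nectar source x_i relative to a randomly chosen neighbor source x_k."""
    rho = random.uniform(-1.0, 1.0)
    v = list(x_i)
    v[j] = x_i[j] + rho * (x_i[j] - x_k[j])
    return v

def scout_search(x_min, x_max):
    """Eq. (18): replace an exhausted nectar source with a uniformly random
    point inside the search bounds."""
    return [lo + random.random() * (hi - lo) for lo, hi in zip(x_min, x_max)]
```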

Whale Optimization Algorithm
Mirjalili proposed the WOA, inspired by the predation behavior of humpback whales. Humpback whale individuals can identify and surround prey. According to Algorithm 7, the whole process can be abstracted into three stages: searching for prey, surrounding prey and the bubble-net attack.
(a) Surrounding prey: humpback whales can recognize prey and continuously reduce their surrounding range. The optimal solution represents target prey or location close to target prey, and other whales will keep approaching it. The entire mathematical model can be interpreted as follows in (19) and (20).
where X * (t) is the current optimal whale position vector, t is the iteration number, X(t) is the current humpback whale position vector, A·D is the bounding step length and A and C represent different coefficient vectors.
(b) Bubble net attack: when a humpback whale surrounds prey, it spits out bubbles in a spiral way to surround the prey. When |A| ≤ 1, the Formula (21) simulates the spiral hunting behavior of the humpback whale.
where D = |X * (t) − X(t)| is the distance between the current humpback whale position and the prey, b is the constant that determines the shape of the spiral and l is a random constant in [−1, 1]. (c) Searching prey: when |A| > 1, whale individuals are forced to stay away from the optimal whale location of the current generation so that they randomly search for prey and are no longer affected by the current optimal whale. Its mathematical model can be described as follows in (22) and (23).
where X rand (t) represents the random whale location in the current whale population.
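The three behaviors of Equations (19)-(23) can be sketched for a single whale as follows; the 50/50 branch probability between encirclement and the spiral follows the usual WOA formulation, and the function and parameter names are illustrative.

```python
import math
import random

def woa_step(x, x_best, x_rand, a, b=1.0):
    """One WOA position update for whale x. In the full algorithm, `a`
    decays linearly from 2 to 0 over the iterations."""
    A = 2.0 * a * random.random() - a          # coefficient vector A
    C = 2.0 * random.random()                  # coefficient vector C
    if random.random() < 0.5:
        # |A| <= 1: shrink the circle around the best whale (Eqs. 19-20);
        # |A| > 1: explore around a random whale instead (Eqs. 22-23)
        ref = x_best if abs(A) <= 1.0 else x_rand
        return [r - A * abs(C * r - xi) for r, xi in zip(ref, x)]
    # spiral bubble-net attack around the best whale (Eq. 21)
    l = random.uniform(-1.0, 1.0)
    return [abs(xb - xi) * math.exp(b * l) * math.cos(2.0 * math.pi * l) + xb
            for xb, xi in zip(x_best, x)]
```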

Harris Hawks Optimization Algorithm
Inspired by the predation process of the Harris hawk, Heidari proposed HHO. The Harris hawk is a group predator, and all members divide the labor and act in coordination. The exploration phase is a global search process in which the hawks track and detect prey from the air. When the target prey is determined, all members of the group gradually approach the location of the prey, find suitable positions around it and complete the encirclement in preparation for the final attack. The whole simulation process is shown in Algorithm 8. According to the escape behavior of the prey and the chase strategy of the Harris hawk, Harris hawks optimization uses four different strategies to simulate the attack.
(a) When |E| ≥ 0.5 and r ≥ 0.5 (soft besiege), the prey has enough physical strength to try to escape by jumping but is eventually captured; the formulas are as follows in (24) and (25), where ΔX(t) represents the position deviation between the hawk group and the prey after the tth iteration, J is the jumping distance of the prey during its escape and X Prey is the position of the prey.
(b) When |E| < 0.5 and r ≥ 0.5 (hard besiege), the prey does not have sufficient energy and is directly captured by the hawks; the formula can be described as follows in (26).
(c) When |E| ≥ 0.5 and r < 0.5, the prey has plenty of energy and an opportunity to escape. Therefore, the hawks form a more intelligent encirclement. The implementation strategies are as follows in (27) and (28).
(d) When |E| < 0.5 and r < 0.5, the prey lacks energy but still has a chance to escape. Therefore, the hawks form a hard encirclement to narrow the average distance to their prey; the formulas can be interpreted as follows in (29) and (30).
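In the form given in the original HHO paper, the four strategy formulas read as follows, with E the escape energy and r the escape chance:

```latex
% Soft besiege (|E| >= 0.5, r >= 0.5):
X(t+1) = \Delta X(t) - E \left| J\,X_{Prey}(t) - X(t) \right|, \qquad
\Delta X(t) = X_{Prey}(t) - X(t) \quad (24),(25)

% Hard besiege (|E| < 0.5, r >= 0.5):
X(t+1) = X_{Prey}(t) - E \left| \Delta X(t) \right| \quad (26)

% Soft besiege with progressive rapid dives (|E| >= 0.5, r < 0.5):
Y = X_{Prey}(t) - E \left| J\,X_{Prey}(t) - X(t) \right|, \qquad
Z = Y + S \times LF(D) \quad (27),(28)

% Hard besiege with progressive rapid dives (|E| < 0.5, r < 0.5):
Y = X_{Prey}(t) - E \left| J\,X_{Prey}(t) - X_m(t) \right|, \qquad
Z = Y + S \times LF(D) \quad (29),(30)
```

Here $S$ is a random vector, $LF$ is the Lévy-flight function and $X_m(t)$ is the mean position of the hawk group; in the dive strategies, the better of $Y$ and $Z$ is kept.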

Chimp Optimization Algorithm
During chimpanzee hunting, any chimpanzee can randomly change its position in the space around its prey. The mathematical description is as follows in (31) and (32).
where d is the distance between the chimpanzee and the prey, t is the current number of iterations, X prey (t) is the prey position vector, X chimp (t) is the chimpanzee position vector and a, m and c are coefficient vectors. In the chimpanzee community, according to the diversity of intelligence and ability shown by individuals in the process of hunting, chimpanzee groups are classified as "driver", "barrier", "chaser" and "attacker". Each type of chimpanzee has its own ability to think independently and uses its own search strategy to explore and predict the location of prey. While they have their own tasks, they also exhibit chaotic individual hunting behavior at the end of the hunt due to social motivation (the benefits obtained after hunting). The chimp optimization algorithm solves the problem by simulating the cooperative hunting behavior of the four chimpanzee roles. According to Algorithm 9, the formulas for updating the positions of the four chimpanzee groups are described as follows in (33)-(35).
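In the form given in the original ChOA paper, the basic update (31)-(32) and the averaged four-group update of (33)-(35) can be written as:

```latex
d = \left| c\,X_{prey}(t) - m\,X_{chimp}(t) \right| \quad (31)

X_{chimp}(t+1) = X_{prey}(t) - a \cdot d \quad (32)

% Four-group update: each group estimates a prey position X_1..X_4
% from the attacker, barrier, chaser and driver respectively, and the
% next position averages the four estimates:
X(t+1) = \frac{X_1 + X_2 + X_3 + X_4}{4}
```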

Biogeography-Based Optimization Algorithm
BBO is an intelligent algorithm that uses biogeography theory to solve optimization problems. In the biogeography-based optimization algorithm, a habitat represents an individual of the intelligent optimization algorithm, the suitability index variable (SIV) represents the variables in an individual and the habitat suitability index (HSI) represents individual fitness. In nature, the habitat suitability index of each habitat for a biological population is different. Habitats with high HSI can accommodate more species and have high species emigration rates and low species immigration rates. Individual migration can share excellent SIVs between habitats. The migration models in biogeography-based optimization mainly include the linear migration model, trapezoidal migration model, quadratic migration model and cosine migration model. The linear migration model is used in this paper. Formula (36) is the calculation of the immigration rate λ(S), and Formula (37) is the emigration rate µ(S). The entire migration operation is shown in Algorithm 10.
where E represents the maximum emigration probability and I represents the maximum immigration probability.
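Under the linear model, the rates reduce to λ(S) = I·(1 − S/S_max) and µ(S) = E·S/S_max. A minimal sketch of the rates and the migration operation of Algorithm 10 follows (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def migration_rates(species_counts, s_max, I=1.0, E=1.0):
    """Linear migration model.

    Formula (36): lambda(S) = I * (1 - S / S_max)   (immigration)
    Formula (37): mu(S)     = E * S / S_max         (emigration)
    """
    s = np.asarray(species_counts, dtype=float)
    lam = I * (1.0 - s / s_max)  # high for poor habitats (low HSI)
    mu = E * s / s_max           # high for good habitats (high HSI)
    return lam, mu

def migrate(population, lam, mu, rng=None):
    """Share SIVs between habitats: habitat i accepts an SIV with
    probability lam[i], copied from a habitat j chosen with probability
    proportional to mu[j]."""
    rng = rng or np.random.default_rng()
    pop = population.copy()
    n, dim = pop.shape
    for i in range(n):
        for d in range(dim):
            if rng.random() < lam[i]:               # accept immigration
                j = rng.choice(n, p=mu / mu.sum())  # pick emigrating habitat
                pop[i, d] = population[j, d]        # copy its SIV
    return pop

lam, mu = migration_rates([0, 2, 4], s_max=4)
```

With S_max = 4, a habitat holding all 4 species has λ = 0 and µ = E, so good habitats export SIVs and poor habitats import them.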
Algorithm 9 (ChOA):
Begin:
  Initialize f, m, a, and c; Q: population size
  Initialize the population and calculate the fitness of each chimp
  X_Attacker: the best search agent
  X_Barrier: the second-best search agent
  X_Chaser: the third-best search agent
  X_Driver: the fourth-best search agent
  while termination criterion not met
    for i = 1 : Q
      Update f, m, a, and c according to the different group strategies
    end for
    Update X_Attacker, X_Barrier, X_Chaser, X_Driver
    t = t + 1
  end while
  return the best solution
End

The mutation operation simulates the phenomenon that diseases, natural disasters, and other factors change the living environment of a habitat and drive its population away from the equilibrium point. According to Algorithm 11, the mutation probability M_s of the species is calculated by formula (38).
where P_s is the probability of S species in the habitat, P_Smax is the maximum value of P_s, and M_max represents the maximum mutation rate. The above describes how the intelligent optimization algorithms optimize the weights and thresholds of the dendritic neural model, together with the principle, characteristics, key formulas, and pseudocode of each algorithm. Each algorithm has its own characteristics, which determine how well it adapts to a given problem.
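A minimal sketch of the mutation operation of Algorithm 11, assuming the standard form M_s = M_max·(1 − P_s/P_Smax) for formula (38) (helper names are illustrative, not from the paper):

```python
import numpy as np

def mutation_probability(p_s, p_s_max, m_max=0.05):
    """Formula (38): M_s = M_max * (1 - P_s / P_Smax).
    Habitats with unlikely species counts mutate more often."""
    return m_max * (1.0 - np.asarray(p_s, dtype=float) / p_s_max)

def mutate(habitat, m_s, lower, upper, rng=None):
    """Replace each SIV with a uniform random value with probability m_s."""
    rng = rng or np.random.default_rng()
    h = habitat.astype(float).copy()
    mask = rng.random(h.shape) < m_s
    h[mask] = rng.uniform(lower, upper, size=mask.sum())
    return h

# A habitat at the most probable species count (P_s = P_Smax) never mutates
m = mutation_probability([0.1, 0.2], p_s_max=0.2, m_max=0.05)
```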

Experiment
For the classification problem of DNM, the experiment compares the results of ten intelligent optimization algorithms and the traditional backpropagation algorithm on six classical classification datasets. The software environment of the experiment is MATLAB2018a, and the hardware environment is an Intel(R) Core(TM) i5-9500 CPU @ 3.00 GHz with 8.00 GB RAM.
All datasets are from the UCI database [54]. Table 1 details the attributes, sample counts, and classes. All six datasets are binary classification problems, and each is randomly divided into two groups: 70% training and 30% testing. These six datasets cover common real-world classification problems. The Australian Credit Approval and Banknote Authentication datasets are related to finance. The Australian Credit Approval dataset concerns credit card applications and is used to evaluate the credit of new applicants; it helps distinguish good customers from bad ones, provides a basis for issuing cards, and helps banks establish a first line of defense against credit card risk. The Banknote Authentication dataset classifies data extracted from images of genuine and counterfeit banknote samples. The Breast Cancer and Diabetic Retinopathy datasets concern the classification of common breast cancer and diabetes cases; classifying and predicting the disease-related attributes has a positive effect on medical treatment. The last two datasets concern everyday production and life: the Car Evaluation dataset assesses car safety, and the Glass Identification dataset classifies types of glass. To eliminate interference, all experimental results are averaged over 30 independent runs. The experiment compares the performance of eleven algorithms: ten intelligent algorithms (GA, DE, PBIL, PSO, ACO, ABC, WOA, HHO, ChOA, and BBO) and the traditional backpropagation algorithm. The population size of the ten intelligent optimization algorithms is 50, and the number of iterations is 250. Other parameters of the intelligent optimization algorithms are set according to experience and the characteristics of each algorithm; the initialization parameters are shown in Table 2.
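The 70/30 random split described above can be sketched as follows (dataset loading is omitted; array names are illustrative):

```python
import numpy as np

def split_70_30(X, y, seed=0):
    """Randomly shuffle samples and split into 70% training / 30% testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))       # random sample order
    cut = int(0.7 * len(X))             # 70% boundary
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]

# Example with a toy 10-sample, 2-attribute dataset
X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2
Xtr, ytr, Xte, yte = split_70_30(X, y)
```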
The learning rate of the backpropagation algorithm is 0.01, and the number of iterations is 250.
There are four user-defined parameters in DNM: the number of synaptic layers M, the synaptic-layer parameter k, and the parameters k_soma and θ_soma in the activation function of the output layer. The experiment uses Taguchi's method [55] to obtain a reasonable combination of these four DNM parameters. Taguchi's method adopts orthogonal experimental design, a scientific method for selecting appropriate and representative points from a large number of experimental points. Table 3 shows the optimal parameters for the six datasets obtained through the L25(5^4) orthogonal-array test. The orthogonal-array experimental results of the eleven algorithms on the six datasets are recorded in detail in Appendix A. During training, MSE serves as the loss function of the model and as the fitness value of each candidate solution; classification accuracy evaluates the performance of the model during testing. Formula (11) gives the calculation of MSE, and accuracy is calculated as in formula (39).
where TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives, respectively. Figure 3 shows the convergence curves of the algorithms during the iterative optimization of the weights and thresholds of the model, and Figure 4 shows the boxplots of the algorithms after iterative optimization, which intuitively display the distribution of the optimal solutions obtained by each algorithm. Here, the user-defined model parameters take the optimal values in Table 3. Table 4 summarizes the classification accuracy and minimum mean squared error of each algorithm under the optimized parameters.
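The two evaluation measures above can be sketched directly, assuming the standard definitions Accuracy = (TP + TN)/(TP + TN + FP + FN) for formula (39) and mean squared error for formula (11); the paper's exact MSE may include an additional scaling constant:

```python
def accuracy(tp, tn, fp, fn):
    """Formula (39): fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def mse(y_true, y_pred):
    """Training loss / fitness (formula (11)): mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Example: 45 TP, 40 TN, 5 FP, 10 FN out of 100 test samples
acc = accuracy(tp=45, tn=40, fp=5, fn=10)
err = mse([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])
```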

As can be seen from Figure 3, the BP algorithm converges very slowly. On the Australian Credit Approval, Diabetic Retinopathy, and Glass Identification datasets, it falls into local optima and fails to make further progress; its robustness is very poor. ACO and ChOA also converge slowly; on the Australian Credit Approval and Diabetic Retinopathy datasets they, like the BP algorithm, fall into local traps and perform poorly overall. Across the six datasets, BBO has the best convergence, followed by PSO. In the first 50 iterations on the Australian Credit Approval and Diabetic Retinopathy datasets, PSO converges faster than BBO, but after 50 iterations BBO reaches lower values, indicating that BBO has better global search ability than PSO. Unlike the DE algorithm, which works well on some datasets (Breast Cancer, Car Evaluation) and poorly on others (Australian Credit Approval, Diabetic Retinopathy), BBO shows good robustness.
From the distribution of optimal solutions, DNM + BBO still has obvious advantages, with smaller errors and stable performance, followed by DNM + PSO. DNM + ACO and DNM + ChOA produce many outliers, which means these algorithms perform poorly and cannot find good feasible solutions.
As can be seen from Table 4, the classification accuracy of BP on the six datasets is very low: as low as 39% on the Diabetic Retinopathy dataset and at most 86.5%. BBO has the highest classification accuracy on all six datasets, reaching 99.5% on the Banknote Authentication dataset; PSO is second only to BBO. The performance of ACO and ChOA is unstable: on the Diabetic Retinopathy dataset, ACO drops to 38.7% and ChOA to 69.6%. Other algorithms such as WOA, GA, and PBIL perform well, but their overall performance is worse than BBO's. In addition, the MSE, the loss function value of the model, is smallest for BBO on all six datasets, which further shows its good performance.
Among the ten intelligent optimization algorithms, DNM + BBO has the fastest convergence speed and the highest accuracy, followed by DNM + PSO. DNM + WOA, DNM + GA, and others also achieve good convergence speeds and fairly high classification accuracy, but still lower than BBO overall. Meanwhile, in line with the "no free lunch" theorem, not all intelligent optimization algorithms are suitable for training the dendritic neural model for classification; for example, DNM + ACO and DNM + ChOA perform poorly.
In contrast to the BP algorithm, the intelligent optimization algorithms have obvious advantages: the performance of the intelligent dendritic neural model is far better than that of the traditional dendritic neural model.

Conclusions
In this paper, an intelligent dendritic neural model is proposed for the first time, which uses an intelligent optimization algorithm instead of the traditional BP algorithm to train the model. In the experiment, ten intelligent optimization algorithms (GA, DE, PBIL, PSO, ACO, ABC, WOA, HHO, ChOA, and BBO) and the traditional BP algorithm were selected to train and test the model on six datasets (Australian Credit Approval, Banknote Authentication, Breast Cancer, Car Evaluation, Diabetic Retinopathy, Glass Identification). These ten algorithms are representative intelligent algorithms: GA, DE, and BBO are based on evolutionary ideas, PBIL on mathematical statistics, ABC, ACO, PSO, and WOA on swarm intelligence, and HHO and ChOA are newer algorithms. The experiment uses Taguchi's method to obtain a reasonable combination of the four DNM parameters, and compares and analyzes the effectiveness, convergence speed, and classification accuracy of the algorithms. The experimental results show that the intelligent dendritic neural model (DNM-BBO) is clearly superior to the traditional dendritic neural model. At the same time, the comparison of intelligent optimization algorithms shows that the BBO algorithm performs excellently, with the best robustness, accuracy, and convergence speed. The intelligent dendritic neural model established in this study is a powerful tool for solving classification problems and provides more choices in practical engineering applications.