A Cyclical Non-Linear Inertia-Weighted Teaching–Learning-Based Optimization Algorithm

Abstract: After the teaching–learning-based optimization (TLBO) algorithm was proposed, many improved algorithms have been presented in recent years, which simulate the teaching–learning phenomenon of a classroom to effectively solve global optimization problems. In this paper, a cyclical non-linear inertia-weighted teaching–learning-based optimization (CNIWTLBO) algorithm is presented. This algorithm introduces a cyclical non-linear inertia-weighted factor into the basic TLBO to control the memory rate of learners, and uses a non-linear mutation factor to control the learner's random mutation during the learning process. In order to demonstrate the significant performance of the proposed algorithm, it is tested on some classical benchmark functions, and comparison results are provided against the basic TLBO, some variants of TLBO and some other well-known optimization algorithms. The experimental results show that the proposed algorithm has better global search ability and higher search accuracy than the basic TLBO, its variants and the other compared algorithms, and can escape from local minima easily while maintaining a fast convergence rate.


Introduction
As is well known, the research and application of swarm intelligence optimization mostly focus on nature-inspired algorithms. In the past decades, many classical population-based, nature-inspired optimization algorithms have been proposed. These algorithms have been proven effective in solving global optimization problems and specific types of engineering optimization problems, such as GA (genetic algorithm) [1,2], ACO (ant colony optimization) [3,4], PSO (particle swarm optimization) [5,6], ABC (artificial bee colony) [7][8][9] and DE (differential evolution) [10][11][12]. However, every algorithm has its own merits and demerits in solving diverse problems. In order to overcome the demerits, such as being easily trapped in a local optimum and slow convergence, numerous improved algorithms for various swarm intelligence algorithms have been presented, including variants of the individual algorithms and their hybrids. In general, the quality of an optimization algorithm depends on three basic factors: the ability to obtain the true global optimum, a fast convergence speed, and a minimum of control parameters. Therefore, the ultimate aim in practical optimization applications is that the optimization algorithm should have high calculation accuracy, fast convergence speed and a minimum of control parameters for ease of use in practice.
The teaching-learning-based optimization (TLBO) algorithm is a swarm intelligence algorithm which simulates the phenomenon of teaching and learning in a class. The TLBO algorithm was first proposed by Rao et al. [13,14]. TLBO is a parameter-less algorithm [15] requiring only the common control parameters, such as population size and number of generations, and does not need any other algorithm-specific control parameters. Therefore, there is no burden of tuning control parameters in the TLBO algorithm, so TLBO is simpler and more effective, and its computational cost is relatively low. Because the TLBO algorithm can achieve better results at a faster convergence speed than the algorithms mentioned above, it has been successfully applied in many diverse optimization fields [16][17][18][19][20]. Of course, TLBO also has some disadvantages. Although TLBO has high search accuracy and fast convergence speed, it has poor exploiting ability, falls easily into local optima, and exhibits premature convergence on multimodal functions. Therefore, many improved variants of TLBO have been presented. Rao et al. proposed an ETLBO (elitist TLBO) algorithm [15] for solving complex constrained optimization problems, and applied a modified TLBO algorithm [17] to the multi-objective optimization problem. Aiming at neural network training in portable AI (artificial intelligence) devices, Yang et al. [21] proposed the CTLBO (compact teaching-learning-based optimization) algorithm to solve global continuous problems, which reduces the memory requirement while maintaining high performance. Wang et al. [22] presented an improved TLBO algorithm in which, in order to balance diversity and convergence, an efficient subpopulation is employed in the teacher phase and a ranking differential vector is used in the learner phase.
To improve the convergence solution, neighbor learning and differential mutation were introduced into the basic TLBO by Kumar Shukla et al. [23]. Combining TLBO and ABC, Chen et al. [24] proposed a new hybrid teaching-learning-based artificial bee colony (TLABC) to solve the parameter estimation problem of solar photovoltaic models. Furthermore, some other improved TLBO algorithms [25][26][27][28] have been presented for solving global function optimization problems. We previously proposed a non-linear inertia weighted teaching-learning-based optimization algorithm (NIWTLBO) [29], which has a fast convergence rate and high accuracy; however, its exploiting ability is relatively weak.
To enhance the exploiting ability and avoid the premature convergence of NIWTLBO, we propose a new improved TLBO variant, a cyclical non-linear inertia weighted teaching-learning-based optimization algorithm (called CNIWTLBO). This algorithm uses a cyclical non-linear inertia weight factor, replacing the old one, to control the memory rate of learners, and employs a non-linear mutation factor to control the learner's random mutation in the teacher and learner phases. The algorithm is validated on 21 well-known benchmark problems. Simulation results of CNIWTLBO are compared with the original TLBO and other improved TLBO variants. The results show that CNIWTLBO has improved exploitation ability; furthermore, it not only converges faster than the basic TLBO, but also provides higher search accuracy on most of these benchmark problems.
The rest of this paper is organized as follows. The basic TLBO algorithm is briefly introduced in Section 2. In Section 3, the proposed CNIWTLBO algorithm will be described in detail. Section 4 provides the simulation results and discussions demonstrating the performance of CNIWTLBO in comparison with other optimization algorithms. Finally, the conclusion and future work are summarized in Section 5.

Teaching-Learning-Based Optimization
The basic TLBO algorithm is divided into a "Teacher Phase" and a "Learner Phase". Learning from the teacher to bring the student's knowledge level closer to the teacher's is termed the "Teacher Phase", and learning through interaction with other learners to increase one's knowledge is called the "Learner Phase". In TLBO, the population is described as a group of learners, and each learner is considered an individual of the evolutionary algorithm. The subjects offered to the learners are comparable to the different design variables, which are the input parameters of the objective function in the optimization problem. The fitness value of the objective function is the learner's total result. The learner with the best solution of the optimization problem is considered the teacher of the entire population.
The class {X_1, X_2, ..., X_NP} is composed of one teacher and some learners, where X_i = (X_i,1, ..., X_i,j, ..., X_i,D) (i = 1, 2, ..., NP) denotes the i-th learner, NP is the number of learners (i.e., the population size), and D represents the number of major subjects in the class (i.e., the dimension of the design variables). X_i,j represents the result of the i-th learner on the j-th major subject. The class X (i.e., the population) is randomly initialized as an NP × D matrix whose values lie between the lower and upper bounds of the design variables.
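As a concrete illustration, the initialization described above can be sketched as follows. This is a minimal Python sketch, not the authors' code; the function name and the example bounds are ours:

```python
import random

def initialize_population(NP, D, lower, upper):
    """Build an NP x D matrix of learners; each design variable is drawn
    uniformly between the lower and upper bound, as described above."""
    return [[random.uniform(lower, upper) for _ in range(D)]
            for _ in range(NP)]

# Example: a class of 5 learners (NP) studying 3 subjects (D) in [-10, 10]
population = initialize_population(5, 3, -10.0, 10.0)
```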

Teacher Phase
In the teacher phase, the teacher provides knowledge to the learners to increase the mean result of the class. The learner with the best fitness in the current generation is considered as the teacher X_teacher, and the mean result of the learners on a particular subject j (j = 1, 2, ..., D) is represented as M_j. The mean result of a class may thus increase from a low level towards the teacher's level. Due to individual differences and the forgetfulness of memory, it is impossible for the learners to gain all the knowledge of the teacher and reach the teacher's level. The difference between the teacher's result and the mean result on subject j is given by Equation (1), and the solution of each learner is updated by Equation (2):

Difference_j = r (X_teacher,j - T_F M_j)  (1)

X_new_i,j = X_old_i,j + Difference_j  (2)

where X_teacher,j is the result of the teacher in subject j, r is a random number in the range [0, 1], and T_F is the teaching factor, which decides the value of the mean to be changed and takes the value 1 or 2. The values of r and T_F are generated randomly in the algorithm, and neither of them is supplied as an input parameter.
Since the optimization problem is a minimization problem, the optimization goal is to find the minimum of the objective function f. If the new value gives a better function value in an iteration, the old value X_old_i,j is replaced with the new value X_new_i,j. The update formula is given as:

X_i = X_new_i, if f(X_new_i) < f(X_old_i); otherwise X_i = X_old_i  (3)

where f(X_new_i) and f(X_old_i) represent the new and old total results of the i-th student, respectively. All the new values accepted at the end of the teacher phase become the input to the learner phase.
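The teacher-phase update and the greedy selection of Equation (3) can be sketched together in a few lines of Python. This is an illustration under our own naming (the sphere objective in the example is not from the paper):

```python
import random

def sphere(x):                      # illustrative objective: f(x) = sum(x_j^2)
    return sum(v * v for v in x)

def teacher_phase(population, fitness, f):
    NP, D = len(population), len(population[0])
    teacher = list(population[fitness.index(min(fitness))])   # best learner
    mean = [sum(x[j] for x in population) / NP for j in range(D)]  # M_j
    for i in range(NP):
        TF = random.randint(1, 2)          # teaching factor, 1 or 2
        r = random.random()                # r in [0, 1]
        # Equations (1)-(2): shift the learner toward the teacher
        new = [population[i][j] + r * (teacher[j] - TF * mean[j])
               for j in range(D)]
        nf = f(new)
        if nf < fitness[i]:                # Equation (3): keep improvements only
            population[i], fitness[i] = new, nf

random.seed(0)
pop = [[random.uniform(-10, 10) for _ in range(3)] for _ in range(10)]
fit = [sphere(x) for x in pop]
before = min(fit)
teacher_phase(pop, fit, sphere)
after = min(fit)   # never worse than before, thanks to Equation (3)
```

Because of the greedy selection, the best fitness in the class can never deteriorate during this phase.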

Learner Phase
During this phase, the learners increase their knowledge through mutual interaction among themselves. A learner interacts randomly with other learners to acquire new knowledge and thereby raise his or her knowledge level. In the learner phase, let X_i be the i-th learner and X_q (q ≠ i) be another randomly chosen learner. The update formula for the new result of the i-th learner is given as:

X_new_i,j = X_old_i,j + r (X_i,j - X_q,j), if f(X_i) < f(X_q)
X_new_i,j = X_old_i,j + r (X_q,j - X_i,j), otherwise  (4)

where r is a random number between 0 and 1, and f(X_i) and f(X_q) are the fitness values of the learners X_i and X_q, respectively. If the new value yields a better fitness of the objective function, the new value is accepted. As before, the learner is updated using Equation (3).
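The learner-phase interaction of Equation (4), followed by the same greedy selection, can be sketched similarly (again a minimal illustration with our own names, not the authors' code):

```python
import random

def learner_phase(population, fitness, f):
    NP, D = len(population), len(population[0])
    for i in range(NP):
        q = random.choice([k for k in range(NP) if k != i])   # partner, q != i
        r = random.random()
        # Equation (4): move toward the better of the two learners
        if fitness[i] < fitness[q]:
            new = [population[i][j] + r * (population[i][j] - population[q][j])
                   for j in range(D)]
        else:
            new = [population[i][j] + r * (population[q][j] - population[i][j])
                   for j in range(D)]
        nf = f(new)
        if nf < fitness[i]:                # greedy selection, Equation (3)
            population[i], fitness[i] = new, nf

random.seed(1)
pop = [[random.uniform(-5, 5) for _ in range(4)] for _ in range(8)]
fit = [sum(v * v for v in x) for x in pop]
before = list(fit)
learner_phase(pop, fit, lambda x: sum(v * v for v in x))
```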

Algorithm Description
In the basic TLBO algorithm, the teacher tries to shift the mean of the learners towards himself or herself by teaching in the teacher phase, and the learners improve their knowledge through interaction among themselves in the learner phase. The learners improve their level by accumulating knowledge in the learning process, i.e., they learn new knowledge based on their existing knowledge. The teacher hopes that his students will achieve knowledge equal to his own as soon as possible, but this is impossible because of the forgetting characteristics of students.
In the NIWTLBO, a phenomenon was described in which a student usually forgets a part of existing knowledge due to the physiological characteristics of the brain [29]. Moreover, the learning curve and the forgetting curve presented by Ebbinghaus were introduced. As we know, new knowledge needs to be learned many times before it can be firmly remembered, and over time we forget a part of what we have learned. Therefore, we need to review old knowledge again and again to maintain our knowledge level periodically. In order to simulate this learning process, a cyclical memory weight factor is applied to the existing knowledge of the student. This weight factor is a non-linear inertia term that controls the memory rate of the learners cyclically. We therefore introduce the cyclical non-linear inertia weight factor w_c into Equations (1) and (4) of the basic TLBO, where it scales the existing knowledge of the learner when calculating the new value. Compared with the TLBO algorithm, the previous knowledge accumulation of the learners is weighted by the factor w_c before being used to calculate new values.
Let T be the number of iterations of the algorithm in one learning cycle. The cyclical non-linear inertia weight factor is then defined by Equation (5).
where iter is the current iteration number of the algorithm, and MAXITER is the maximum number of allowable iterations, which is an integral multiple of T. The w_cmin is the minimum value of the cyclical non-linear inertia weight factor w_c, and its value should be between 0.5 and 1. The value of w_cmin should not be too small; otherwise, the individuals become worse because they remember too little existing knowledge in each iteration, and it is then difficult for the algorithm to converge to the true global optimal solution. In our experiments, the value is 0.6. The factor w_c is called the cyclical memory rate, and its curve is shown in Figure 1. The cyclical non-linear inertia weight factor w_c is applied to the new update equations shown as Equations (7) and (8). During a learning cycle in this improved TLBO, individuals attempt to search diverse areas of the search space at an early stage. In the latter stage, the individuals move within a small range to adjust the trial solution slightly so as to explore a relatively small local space. This learning cycle is then repeated over and over again. In order to obtain a new set of better learners (i.e., individuals), the difference between the existing mean result and the corresponding result of the teacher is added to the existing learners in the teacher phase. Similarly, in the learner phase, the difference between the existing result of a learner and the corresponding result of another randomly selected learner is added to the existing learner. As Equations (1) and (4) show, the value added to the existing learner is formed from the difference of results and the random number r, so the difference value is largely determined by the random number r in the teacher and learner phases.
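The qualitative behavior of w_c can be illustrated with a short sketch. Note that the exact formula of Equation (5) is not reproduced in the text above, so the quadratic growth used here is our assumption; only the shape matches the description (within each cycle of T iterations, w_c rises non-linearly from w_cmin to 1, then restarts):

```python
def cyclical_weight(it, T, w_cmin=0.6):
    """Illustrative cyclical non-linear weight (placeholder for Equation (5)):
    rises non-linearly from w_cmin to 1 within each cycle of T iterations,
    then restarts at w_cmin."""
    phase = (it % T) / T              # position within the current cycle, in [0, 1)
    return w_cmin + (1.0 - w_cmin) * phase ** 2   # non-linear (quadratic) growth

# w_c restarts at w_cmin every T iterations and approaches 1 late in a cycle
values = [cyclical_weight(it, 200) for it in range(400)]
```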
In our proposed method, the random number r in the basic TLBO is modified as follows:

r' = 0.5 (1 + rand(0, 1))  (6)

where rand(0, 1) is a uniformly distributed random number within the range [0, 1]. Equation (6) generates a random number in the range [0.5, 1], which is similar to the method proposed by Satapathy [27]. The r' was called a dynamic inertia weight by Eberhart [30]. Thus, the mean value of the original random number r is increased from 0.5 to 0.75, so the probability of stochastic variation is increased and the difference value added to the existing learners is enlarged. Meanwhile, w_c increases from small to large in one learning cycle. Under the combined effects of w_c and r', the proposed algorithm does not suffer from premature convergence. Instead, it improves population diversity, avoids prematurity in the search process, and increases the ability of the basic TLBO to escape from local optima, thereby enhancing the algorithm's performance. On the surface of some multimodal functions, the original random number r may cause part of the population to cluster near a local optimal point.
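The effect of Equation (6) on the mean of the random factor is easy to check empirically (a small sketch; the function name is ours):

```python
import random

def dynamic_r():
    """Equation (6): maps rand(0,1) into [0.5, 1], raising the mean of the
    random factor from 0.5 to 0.75."""
    return 0.5 * (1.0 + random.random())

random.seed(42)
samples = [dynamic_r() for _ in range(100000)]
sample_mean = sum(samples) / len(samples)   # close to 0.75
```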
With the new dynamic inertia weight r', the population has more chances to escape from local optima and continuously move towards the global optimum point until reaching the true global optimum. Now, the cyclical non-linear inertia weight factor and the dynamic inertia weight factor are applied to the basic TLBO algorithm. In the teacher phase, the new set of improved learners can be expressed as:

X_new_i,j = w_c X_old_i,j + r' (X_teacher,j - T_F M_j)  (7)

And in the learner phase, the new set of improved learners can be expressed as:

X_new_i,j = w_c X_old_i,j + r' (X_i,j - X_q,j), if f(X_i) < f(X_q)
X_new_i,j = w_c X_old_i,j + r' (X_q,j - X_i,j), otherwise  (8)

where w_c is given by Equation (5), and r' is given by Equation (6). In order to keep the diversity of the population and enhance the global searching ability at the beginning of each learning cycle, the two individuals with the worst solutions are randomly mutated into new individuals. The probability of mutation P_c is expressed by Equation (9). If P_c > rand(0, 1) in an iteration, the worst two individuals are randomly mutated. The mutation process is very simple: the design variables of the two individuals are re-initialized randomly in the search space.
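The modified update rules can be sketched as small helper functions. This is an illustration under our naming, with w_c supplied from outside (however Equation (5) computes it):

```python
import random

def cniw_teacher_update(x_old, teacher, mean, w_c):
    """Equation (7): existing knowledge scaled by w_c; the teacher-mean
    difference scaled by the dynamic weight r' of Equation (6)."""
    TF = random.randint(1, 2)
    r = 0.5 * (1.0 + random.random())          # Equation (6)
    return [w_c * xo + r * (t - TF * m)
            for xo, t, m in zip(x_old, teacher, mean)]

def cniw_learner_update(x_i, x_q, i_is_better, w_c):
    """Equation (8): peer interaction with the same w_c and r' modifications."""
    r = 0.5 * (1.0 + random.random())
    sign = 1.0 if i_is_better else -1.0
    return [w_c * a + sign * r * (a - b) for a, b in zip(x_i, x_q)]

random.seed(3)
new1 = cniw_teacher_update([1.0, 2.0], [0.5, 0.5], [1.5, 1.5], w_c=0.6)
new2 = cniw_learner_update([1.0, 2.0], [0.0, 1.0], i_is_better=True, w_c=0.6)
```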
In this way, the diversity of the population is expanded and premature convergence of the algorithm is restrained.

Behavior Parameter Analysis
In the CNIWTLBO algorithm, there are two parameters, w_cmin and T. The w_cmin is the minimum value of the cyclical non-linear inertia weight factor w_c, and its value lies in the range [0.5, 1]; hence w_c is between 0.5 and 1. As shown in the experiments, the algorithm performs better when w_cmin is 0.6 (i.e., w_c increases from 0.6 to 1 in one learning cycle). If w_cmin is set to a very small value, the initial value of w_c in each learning cycle becomes small, which means the individuals remember too little existing knowledge in the beginning phase; in this case, it is difficult for the algorithm to converge to the true global optimal solution. On the other hand, because the mean value of r' is increased from 0.5 to 0.75, selecting a large w_cmin leads to premature convergence.
The T is the number of iterations of the algorithm in one learning cycle. A complex function that is difficult to converge on needs at least 4 learning cycles; there is no strict limit for simple functions. The value of T depends on the complexity of the function: the higher the complexity, the greater the T value. The maximum number of allowable iterations must be an integral multiple of T. A large number of experiments showed that setting T to around 200 gives better results.

Framework of CNIWTLBO
The framework of the CNIWTLBO algorithm is described as follows (Algorithm 1):

Algorithm 1 The Framework of CNIWTLBO
Step 1: Initialize the parameters of the algorithm. Set the population size (NP, i.e., the number of students), the dimension of the decision variables (D, i.e., the number of subjects), the generation counter iter = 1, and the maximum number of iterations (Maxiter, i.e., the maximum generation number).
Step 2: Initialize the population. Generate a random population of NP solutions within the specified ranges, P = {X_1, X_2, ..., X_NP}, and calculate the fitness value of the objective function f(x) for each individual.
Step 3: Calculate the cyclical non-linear inertia weight factor w_c and the dynamic inertia weight r' according to Equations (5) and (6), respectively.
Step 4: Choose the individual with the best fitness in the population as the teacher X_teacher, and calculate the average result M_j of each subject.
Step 5: Execute the teacher phase. Calculate the new marks of the learners using Equation (7); evaluate all learners by calculating the fitness value of the objective function and update the old values of the individuals using Equation (3).
Step 6: Execute the learner phase. Calculate the new values of the students using Equation (8); re-calculate the fitness value of the objective function and update the old values of the individuals according to Equation (3) in the same way.
Step 7: Execute the mutation strategy. Calculate the probability of mutation P_c using Equation (9). If P_c > rand(0, 1), the two individuals with the worst solutions are randomly mutated into new individuals.
Step 8: Check termination. If the terminating condition is satisfied, i.e., iter > Maxiter, stop the algorithm procedure and output the best solution; otherwise, go to Step 3.
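Putting Steps 1-8 together, a compact Python sketch of the whole loop might look as follows. This is an illustration under stated assumptions, not the authors' code: the forms of w_c (Equation (5)) and P_c (Equation (9)) are our placeholders, since their exact expressions are not reproduced in the text above.

```python
import random

def cniwtlbo(f, D, NP=30, max_iter=500, T=250,
             w_cmin=0.6, lower=-10.0, upper=10.0):
    """Sketch of Algorithm 1. The cyclical weight w_c and the mutation
    probability P_c below are illustrative placeholders for Equations (5)
    and (9)."""
    # Steps 1-2: initialize the population and evaluate its fitness
    pop = [[random.uniform(lower, upper) for _ in range(D)] for _ in range(NP)]
    fit = [f(x) for x in pop]
    for it in range(max_iter):
        # Step 3: cyclical memory rate (placeholder quadratic form)
        w_c = w_cmin + (1.0 - w_cmin) * ((it % T) / T) ** 2
        # Step 4: teacher and per-subject means M_j
        teacher = list(pop[fit.index(min(fit))])
        mean = [sum(x[j] for x in pop) / NP for j in range(D)]
        # Step 5: teacher phase, Equation (7), then greedy selection (Eq. (3))
        for i in range(NP):
            TF = random.randint(1, 2)
            r = 0.5 * (1.0 + random.random())          # Equation (6)
            new = [w_c * pop[i][j] + r * (teacher[j] - TF * mean[j])
                   for j in range(D)]
            nf = f(new)
            if nf < fit[i]:
                pop[i], fit[i] = new, nf
        # Step 6: learner phase, Equation (8), then greedy selection
        for i in range(NP):
            q = random.choice([k for k in range(NP) if k != i])
            r = 0.5 * (1.0 + random.random())
            sign = 1.0 if fit[i] < fit[q] else -1.0
            new = [w_c * pop[i][j] + sign * r * (pop[i][j] - pop[q][j])
                   for j in range(D)]
            nf = f(new)
            if nf < fit[i]:
                pop[i], fit[i] = new, nf
        # Step 7: mutate the two worst learners (placeholder P_c, highest
        # at the start of each learning cycle)
        P_c = 1.0 - (it % T) / T
        if P_c > random.random():
            for i in sorted(range(NP), key=fit.__getitem__)[-2:]:
                pop[i] = [random.uniform(lower, upper) for _ in range(D)]
                fit[i] = f(pop[i])
    # Step 8: return the best solution found
    best = fit.index(min(fit))
    return pop[best], fit[best]

random.seed(7)
best_x, best_f = cniwtlbo(lambda x: sum(v * v for v in x), D=2,
                          NP=20, max_iter=200, T=100)
```

Note that the greedy selection in Steps 5 and 6 guarantees the best individual is never lost, while the Step 7 mutation only re-initializes the two worst learners.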

Benchmark Tests
In this section, CNIWTLBO is tested on some benchmark functions to evaluate its performance, by comparison with the basic TLBO and NIWTLBO as well as with other optimization algorithms mentioned in the literature. All algorithms are coded in the Matlab programming language and run in a Matlab 2017a environment on a laptop with an Intel Core i7 2.60 GHz processor and 8 GB RAM.
In the experiments, 21 well-known benchmark functions with unimodal/multimodal characteristics are adopted. The details of these functions are shown in Table 1, where "C" denotes the characteristic of the function, "D" is the dimension of the function, "Range" is the boundary of the variables of each function, and "MinFunVal" is the theoretical global minimum. Table 1. The benchmark functions adopted in the paper.

(The detailed formulas of the benchmark functions in Table 1 are not reproduced here; for each numbered function, the table lists its definition, characteristic, dimension, range and theoretical minimum.)

CNIWTLBO vs. Particle Swarm Optimization (PSO), Artificial Bee Colony (ABC), Differential Evolution (DE) and Teaching-Learning-Based Optimization (TLBO)
In order to identify the ability of the proposed algorithm to achieve the global optimum value, 20 different benchmark functions from Table 1 are tested using the PSO, ABC, DE, TLBO and CNIWTLBO algorithms. To maintain consistency in the comparison, all algorithms are run with the same maximum number of function evaluations (FEs) and the same values of the common control parameters, such as population size. In general, the algorithm that requires fewer function evaluations to reach the same best solution can be considered better. It should be noted that the FEs consumed per iteration differ significantly between TLBO variants and other meta-heuristic algorithms, as discussed in [28]; accordingly, 2 FEs are counted in each iteration for the original TLBO, CNIWTLBO and most TLBO variants. In this experiment, the maximum number of fitness function evaluations is 80,000 for all benchmark functions, and the other specific parameters of the algorithms are given in Table 2. Table 2. Parameter settings for particle swarm optimization (PSO), artificial bee colony (ABC), differential evolution (DE), teaching-learning-based optimization (TLBO) and cyclical non-linear inertia-weighted teaching-learning-based optimization (CNIWTLBO) algorithms.

Each benchmark function in Table 1 is independently run 30 times with the PSO, ABC, DE, TLBO and CNIWTLBO algorithms and the results are compared. Each algorithm is terminated after running for 80,000 FEs, or earlier if it reaches the global minimum value in advance. The results, in the form of the mean value and average standard deviation (SD) of the objective function obtained over the 30 independent runs, are shown in Table 3.

Moreover, the numbers of fitness evaluations required for the algorithms to reach the true global optimum solution, in the form of the mean value and average standard deviation, are reported in Table 4.
It is observed from the comparative results in Table 3 that CNIWTLBO outperforms PSO, ABC, DE and TLBO on functions f1-f7, f15, f16, f18 and f20. On functions f9-f14, the performance of CNIWTLBO, PSO, ABC, DE and TLBO is very similar, in that all the algorithms can obtain the global optimal value. Moreover, TLBO is better than PSO, ABC and DE on f1-f7 and f15. The performance of CNIWTLBO, DE and TLBO is similar, and better than PSO and ABC, on f17 (Multimod) and f19 (Griewank). On f8 (Rosenbrock), ABC is better than the others. On f21 (Weierstrass), the performance of CNIWTLBO and TLBO is alike, and both outperform PSO, ABC and DE.
From Table 4 it can be seen that a smaller number of fitness evaluations indicates that the algorithm reaches the true global optimum more quickly; in other words, the smaller the number of fitness evaluations, the faster the convergence rate of the algorithm. It is obvious that the CNIWTLBO algorithm requires fewer function evaluations than the basic TLBO, PSO, ABC and DE to achieve the true global optimum for most of the benchmark functions. Therefore, the convergence rate of the CNIWTLBO algorithm is faster than that of TLBO, PSO, ABC and DE for most of the benchmark functions in Table 1, except f13 (Six-Hump Camel Back) and f14 (Goldstein-Price).

CNIWTLBO vs. the Variants of PSO
In order to compare the ability of the CNIWTLBO algorithm to obtain the global optimal value with that of PSO variants such as PSO-w [31], PSO-cf [32], CPSO-H [33] and CLPSO [34], 8 different unimodal and multimodal benchmark functions from Table 1 are tested in this experiment. To maintain consistency, the CNIWTLBO algorithm and the variants of PSO are run with the same maximum number of function evaluations (30,000 FEs) and dimension (10D). As before, the CNIWTLBO algorithm is independently run 30 times for each benchmark function. The comparative results obtained after the 30 independent runs on each benchmark function, in the form of the mean value and average standard deviation, are shown in Table 5. In this experiment, the results of the algorithms other than CNIWTLBO are taken from the literature [28,35], and the population size of each algorithm is 10. From the results in Table 5, it is clearly observed that the CNIWTLBO and TLBO algorithms outperform the PSO-w, PSO-cf, CPSO-H and CLPSO algorithms on f1 (Sphere), f15 (Ackley) and f19 (Griewank). The performance of CNIWTLBO and CLPSO is similar on Rastrigin, NCRastrigin and Weierstrass. On Rosenbrock and Schwefel 2.26, the CNIWTLBO algorithm does not perform as well as the other algorithms.

CNIWTLBO vs. the Variants of ABC, DE
The experiment in this section aims to assess the ability of the CNIWTLBO algorithm to achieve the global optimum value by comparing it with variants of ABC and DE on 7 benchmark functions from Table 1. The ABC variants are the gbest-guided artificial bee colony (GABC) algorithm [36] and the improved artificial bee colony (IABC) algorithm [37], and the DE variants are SaDE and JADE. To make the comparison fair, the parameters of the algorithms are the same as in the literature [27], where the population size is 20 and the dimension is 30. Like the other algorithms, TLBO and CNIWTLBO are tested with the same numbers of function evaluations listed in Table 6. The comparative results, in the form of the mean value and average standard deviation, are listed in Table 6. The results of GABC, IABC, SaDE and JADE are taken directly from the literature [27]. The results of TLBO and CNIWTLBO are obtained after 30 independent runs on each benchmark function in the same way. It can be observed from the results that CNIWTLBO performs much better than GABC, SaDE and JADE on all the benchmark functions in Table 6, and outperforms the IABC algorithm on f1 (Sphere), f4 (Schwefel 1.2), f5 (Schwefel 2.22), f6 (Schwefel 2.21) and f15 (Ackley). Furthermore, CNIWTLBO is better than TLBO on f15 (Ackley) and f18 (Rastrigin). This indicates that the CNIWTLBO algorithm has good performance.

CNIWTLBO vs. the Variants of TLBO in Different Dimensions
In order to assess the performance of the CNIWTLBO, experiments are carried out to compare the CNIWTLBO algorithm with some other variants of TLBO in different dimensions. The adopted TLBO variants include WTLBO [25], ITLBO [26], I-TLBO [28] and NIWTLBO [29].
In the experiments, 9 unimodal and multimodal benchmark functions listed in Table 1 are used to evaluate the performance of the algorithms. In order to make fair comparisons, the CNIWTLBO and all adopted TLBO variants use the same parameters. In this work, the evolutionary generation is used to evaluate the performance of CNIWTLBO and the TLBO variants. The population size is set to 30 and the number of evolutionary generations is set to 2000. In the I-TLBO algorithm, the number of teachers is 4. The learning cycle T in CNIWTLBO is 500. The CNIWTLBO and the TLBO variants are tested on the 9 benchmark functions with 20, 50 and 100 dimensions, respectively. To eliminate the impact of randomness on the results, 30 independent runs are conducted for each algorithm. The experimental results, in the form of mean solutions, are reported in Table 7.

Conclusions
A modified TLBO algorithm called CNIWTLBO has been proposed in this paper for solving global optimization problems. Two learning factors are introduced in the proposed algorithm: a cyclical non-linear inertia weight factor is introduced into the basic TLBO to control the memory rate of the learners, and a non-linear mutation factor is introduced to control the learner's random mutation during the learning process. With these modifications, the CNIWTLBO algorithm has stronger exploration capacity; furthermore, the proposed algorithm is effectively prevented from falling into local minima while maintaining search accuracy. In the experiments, 21 classical benchmark functions have been used to evaluate the performance of the CNIWTLBO, and the experimental results have been compared with those of other meta-heuristic algorithms and their variants available in the literature. Moreover, the comparison results between the CNIWTLBO and other variants of TLBO are reported in this paper. The experimental results show that the performance of the CNIWTLBO for solving global optimization problems is satisfactory.
In future work, the CNIWTLBO algorithm will be tested on some engineering benchmarks. To verify its efficiency, the proposed method will be applied to constrained engineering optimization problems. Furthermore, hybrid methods combining the proposed algorithm with other classic intelligent algorithms will be investigated to further improve the performance of the TLBO algorithm.