An Enhanced Grey Wolf Optimizer with a Velocity-Aided Global Search Mechanism

This paper proposes a novel variant of the Grey Wolf Optimization (GWO) algorithm, named the Velocity-Aided Grey Wolf Optimizer (VAGWO). The original GWO lacks a velocity term in its position-updating procedure, and this is the main factor weakening the exploration capability of the algorithm. In VAGWO, such a term is carefully constructed and incorporated into the updating formula of the GWO. Furthermore, both the exploration and exploitation capabilities of the GWO are enhanced in VAGWO by enlarging the steps each leading wolf takes towards the others in the early iterations and shrinking these steps in the later iterations. The VAGWO is compared with a set of popular and newly proposed meta-heuristic optimization algorithms on 13 high-dimensional shifted standard benchmark functions, 10 complex composition functions derived from the CEC2017 test suite, and three engineering problems. The complexity of the proposed algorithm is also evaluated against that of the original GWO. The results indicate that the VAGWO is a computationally efficient algorithm that generates highly accurate results when employed to optimize high-dimensional and complex problems.


Introduction
Computational intelligence [1] is a sub-branch of artificial intelligence that employs a variety of mechanisms to solve complex problems in different domains. It is applied in many fields, such as computer vision, healthcare, and fog computing. Swarm intelligence (SI) algorithms form one of the most popular families of computational intelligence methods, mimicking the lifestyles of natural communities such as animal herds. SI algorithms focus, on the one hand, on the individual behavior of a swarm's members and, on the other, on the social relations and interactions among those members as they chase and locate food sources. In the last few years, many SI algorithms have been developed, including the Firefly Algorithm (FA) [2], Cuckoo Search (CS) [3], Grey Wolf Optimizer (GWO) [4], Moth-Flame Optimization (MFO) [5], Gradient-Based Optimizer (GBO) [6], Whale Optimization Algorithm (WOA) [7], Arithmetic Optimization Algorithm (AOA) [8], and Aquila Optimizer (AO) [9].
In the same context, the Grey Wolf Optimizer (GWO) is one of the most popular and widely used SI-based techniques [4]. The GWO algorithm is inspired by the behavior of grey wolves in nature when seeking the best means for hunting prey. The algorithm applies the same mechanism and follows the pack hierarchy, assigning each member of the pack a different role in reaching the food depending on its fitness and its rank in the pack. In GWO, four groups of wolves, namely alpha, beta, delta, and omega, are defined, ordered from the highest leadership competence to the lowest. As the alpha, beta, and delta wolves are rated as the guide wolves in GWO, the omega wolves always follow the locations of these guides, just as they do when searching for food in nature. The GWO starts by generating random positions for the grey wolves. The three fittest wolves are then assumed to occupy the best locations in the population and are named the alpha, beta, and delta solutions, and the positions of the other wolves are updated according to their distances from these leading solutions so that potentially better agents can be discovered during the search. The GWO employs effective operators to conduct the search process such that a safe and reliable exploration-exploitation balance can be maintained to avoid premature convergence [10].
As with a wide range of other meta-heuristic algorithms, the GWO suffers from poor performance in global search [11]. The updating equation of the GWO drives the convergence of the algorithm very well but causes premature convergence, as it lacks a strong ability to diversify the search agents during the exploration phase. The performance of the GWO can deteriorate further when facing high-dimensional problems with numerous local optima [12]. To achieve a good balance between maintaining diversity and providing a high rate of convergence in GWO, several GWO variants have been proposed in recent years, which can generally be placed into four categories:
1. Modification of the values of the parameters a and C. A non-linearly decreasing strategy for a was proposed in [13]; this method employs an exponential decay function over the lapse of iterations. A logarithmic decay function was also proposed to modify the conventional formulation of a in [12]. The control parameter a was also dynamically adapted using fuzzy logic in [14].

2. Hybridization with other strong population-based methods. In this way, the weaknesses of the GWO are covered by the strengths of other algorithms, such as genetic algorithms [15], particle swarm optimization [16], differential evolution [17][18][19][20], and the biogeography-based optimizer [21]. The integration of the GWO with local search methods is another idea, accomplished in [22][23][24].
3. Modification of the updating procedure. The main motivation of this category of GWO variants is to increase the diversity of the GWO population so that the algorithm can better perform the exploration phase of the optimization process. Among the variants in this category, the exploration-enhanced GWO (EEGWO) adds a randomly selected search agent from the population to the conventional leading alpha, beta, and delta agents to guide the other individuals [25]. A weighted-distance GWO (wdGWO), which uses a weighted average of the best individuals instead of a simple average, was proposed in [26]. Inspired by PSO, a new updating scheme replacing the alpha, beta, and delta positions with a solution's personal historical best position (Pbest) and the global best position (Gbest) was also proposed in [12].
4. Employment of new operators. A cellular GWO (CGWO) utilizing a topological structure was developed in [27]; in this method, each wolf interacts only with its neighbors, making the search process more local and injecting more diversity into the population. A fuzzy hierarchical operator, a mutation operator, and a Lévy flight operator accompanied by a greedy selection strategy were employed to enhance the exploration capability of the GWO in [28], [29], and [30], respectively. A random walk operator was also suggested in a variant named RW-GWO [31]. Recently, a refraction learning operator was suggested to help the alpha wolf escape local optima in a GWO variant called RL-GWO [11].
The original GWO algorithm lacks a strong and reliable exploration capability, as it uses only acceleration terms to update the wolves. Guiding the wolves based solely on the acceleration each leading wolf applies to the others can leave their global search incomplete, and thus inefficient, because successive interruptions and disruptions may arise in the search agents' paths as they attempt to explore the search space of the optimization problem. In this paper, we propose a novel variant of the GWO, named the Velocity-Aided GWO (VAGWO), to effectively solve this crucial problem of the original GWO. In this algorithm, a new updating procedure is proposed for the wolves, in which each wolf adopts a velocity term as well. The major contributions of this paper can be highlighted as follows:
1. Since the original GWO involves only acceleration terms in updating the positions of the wolves (search agents), these agents may become trapped in local optima, and thereby a large number of good solutions go undetected during the search process. A velocity term can therefore greatly improve the global search mechanism of the GWO; this is the main motivation for proposing VAGWO.

2. The exploration and exploitation capabilities of the GWO are both enhanced by presenting a new formulation for the control parameter a that emphasizes a in the early iterations while de-emphasizing it in the later iterations.
3. The control parameter C is also modified to intensify the search process in the last iterations, ameliorating the performance of the GWO in the exploitation phase. Additionally, the newly proposed calculation formula for C is well adapted to the iterations of the optimization process, yielding a well-balanced exploration-exploitation transition in the VAGWO.
The remainder of this paper is organized as follows: Section 2 introduces the GWO algorithm and describes the modifications made to GWO to yield the proposed VAGWO algorithm. In Section 3.1, the proposed algorithm is applied to two series of high-dimensional benchmark functions and compared with other popular and widely used meta-heuristic algorithms. In Section 3.2, the proposal is compared with a set of newly proposed algorithms on the same test bed used in Section 3.1. In Section 3.3, a Wilcoxon rank-sum test is conducted to reveal the significance of the superiority of the VAGWO over its competitors on the test functions. In Sections 3.4 and 3.5, the computational complexity of the VAGWO is evaluated against the original GWO. In Section 3.6, the competence of the VAGWO in solving real-world engineering problems is evaluated. Finally, Section 4 highlights the main conclusions of this paper.

Original Grey Wolf Optimizer
The GWO algorithm mimics the hunting behavior and social leadership of grey wolves in nature [4]. The GWO starts the optimization process by randomly generating a swarm of wolves (initial solutions). At each iteration, the three best-fitted wolves, named alpha, beta, and delta, are identified as the leaders of the rest of the wolves, named the omega wolves. The omega wolves then encircle the best wolves to find the most promising regions in the search space. These wolves act as the search agents seeking the optimal point of the optimization problem. Since every search agent encircles the three best agents in the search space, the arithmetic average of the updated positions obtained from the alpha, beta, and delta wolves is finally adopted as the updated position of each search agent. This procedure may enhance the exploration capability of the algorithm, as three different leading agents are involved in guiding the other agents. The mathematical formulations used in updating the omega wolves are as follows:

$$D^t_{p,j} = \left| C^t_{p,j} X^t_{p,j} - X^t_{i,j} \right|, \qquad X^{t+1}_{i,j} = X^t_{p,j} - A^t_{p,j} D^t_{p,j},$$

where $t$ stands for the current iteration; $A^t_{p,j} = 2 r_1 a - a$; $C^t_{p,j} = 2 r_2$; $X^t_{p,j}$ is the position of the prey in the $j$th dimension in the $t$th iteration; and $X^t_{i,j}$ is the position of the $i$th grey wolf in the $j$th dimension in the $t$th iteration. Additionally, $r_1$ and $r_2$ are two random numbers generated in $[0, 1]$. Furthermore, $a$ is linearly decreased from 2 to 0 with the lapse of iterations. The factor $A$ maintains an exploration-exploitation balance in the algorithm, while $C$ is multiplied by the prey position to further aid exploration by preventing the wolves from becoming trapped in local optima.
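As a small illustration, the coefficient vectors described above can be sketched in Python as follows (a minimal sketch under the stated formulas; the function and variable names are our own, not the authors' code):

```python
import numpy as np

def gwo_coefficients(t, max_iter, dim, rng):
    """Compute the GWO coefficient vectors A and C for one leader.

    a decreases linearly from 2 to 0 over the iterations;
    A = 2*a*r1 - a drives exploration when |A| > 1 and exploitation
    when |A| < 1; C = 2*r2 stochastically weights the leader's position.
    """
    a = 2.0 * (1.0 - t / max_iter)
    r1 = rng.random(dim)
    r2 = rng.random(dim)
    A = 2.0 * a * r1 - a   # lies in [-a, a]
    C = 2.0 * r2           # lies in [0, 2] throughout the run
    return A, C
```

Note that as `t` approaches `max_iter`, `a` shrinks to 0, so `A` collapses to 0 and the wolves settle around the leaders.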
Each omega wolf is updated according to the alpha, beta, and delta wolves, as formulated below:

$$X^{t+1}_{1,j} = X^t_{\alpha,j} - A^t_{\alpha,j} D^t_{\alpha,j}, \quad X^{t+1}_{2,j} = X^t_{\beta,j} - A^t_{\beta,j} D^t_{\beta,j}, \quad X^{t+1}_{3,j} = X^t_{\delta,j} - A^t_{\delta,j} D^t_{\delta,j},$$

$$X^{t+1}_{i,j} = \frac{X^{t+1}_{1,j} + X^{t+1}_{2,j} + X^{t+1}_{3,j}}{3},$$

where the subscripts $\alpha$, $\beta$, and $\delta$ denote the alpha, beta, and delta wolves, and the other subscripts and superscripts are defined above. As Mirjalili et al. (2014) discussed, in GWO the first half of the iterations is devoted to exploration, when $|A| > 1$, and the second half is dedicated to exploitation, in which $|A| < 1$.
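The omega-wolf update described above, encircling each leader and averaging the three resulting pulls, can be sketched as follows (a minimal sketch; `gwo_update` and its arguments are our own naming, not the authors' implementation):

```python
import numpy as np

def gwo_update(X, f, t, max_iter, rng):
    """One iteration of the original GWO position update (sketch).

    X: (n, dim) array of wolf positions; f: objective (minimized).
    Each wolf is pulled toward alpha, beta, and delta, and its new
    position is the arithmetic average of the three pulls.
    """
    n, dim = X.shape
    a = 2.0 * (1.0 - t / max_iter)
    order = np.argsort([f(x) for x in X])
    leaders = X[order[:3]]                    # alpha, beta, delta
    X_new = np.empty_like(X)
    for i in range(n):
        pulls = []
        for Xp in leaders:
            A = 2.0 * a * rng.random(dim) - a
            C = 2.0 * rng.random(dim)
            D = np.abs(C * Xp - X[i])         # distance to the leader
            pulls.append(Xp - A * D)          # encircling move
        X_new[i] = np.mean(pulls, axis=0)     # average of the three pulls
    return X_new
```

A usage example on the sphere function would call `gwo_update` in a loop, re-evaluating fitness each iteration to refresh the leaders.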

Velocity-Aided Grey Wolf Optimizer (VAGWO)
In the original GWO, the search agents do not have a velocity as a characteristic helping them in the search process. When the moving agents move in the search space only by successively changing their acceleration towards the guide agents, their movements are not made consistently or smoothly from iteration to iteration. In other words, a search agent may move towards a guiding agent in the current iteration, and when this guide changes its position, that agent immediately turns its trajectory to move towards the new guide. As a result, ruptures may occur in the search agents' movements in the search space, leading to potential drifts. These drifts may cause a large number of potentially good positions in the search space to be missed, and thereby the optimization process is doomed to premature convergence. Considering a velocity term helps the search agents maintain their own trajectories, enhancing their explorative capability, balancing the exploration and exploitation phases of the optimization process, and thus avoiding premature convergence. In VAGWO, an initial random velocity is defined for each search agent (wolf) with respect to each of the leading agents (the alpha, beta, and delta wolves), and an initial random position is defined for each agent in the search space. Thus, each search agent carries a velocity and a position in each dimension of the optimization problem. The agents are then evaluated, and the three best-fitted agents are chosen as the alpha, beta, and delta wolves to guide the other agents (the omega wolves). An updating procedure is then applied to the agents at each iteration.
Mathematics 2022, 10, 351

For this purpose, a velocity term is first established for the $i$th search agent when guided by the alpha, beta, and delta wolves, as follows: where $V^t_{\alpha,j}$, $V^t_{\beta,j}$, and $V^t_{\delta,j}$ represent the velocities of a search agent (wolf) in the $j$th dimension when the alpha, beta, and delta wolves, respectively, attract that agent in the updating procedure. In addition, sgn is the sign function. $A^t_{\alpha,j}$, $A^t_{\beta,j}$, and $A^t_{\delta,j}$ represent the acceleration terms of the search agents, calculated as follows: where $r_1$, $r_2$, and $r_3$ are uniformly distributed random numbers generated in $[0, 1]$. Furthermore, $a$ is a linearly decreasing parameter, falling from $\sqrt{2}$ in the first iteration to 0 in the final iteration; according to Equations (13)-(15), the parameter $a$ enters to the power of 2, which means that $a^2$ varies from 2 to 0. The power of 2 in these equations strengthens both the exploration and exploitation capabilities of the proposed algorithm. In other words, when $a^2 \geq 1$, the algorithm is in the exploration phase. In this phase $a^2 \geq a$, since $a \geq 1$ and $a$ is positive. As a result, during the exploration phase of the proposed VAGWO algorithm, $a^2$ is greater than the value $a$ would take in the original GWO algorithm; this means that $|A| > 1$ occurs more strongly, and thus the search agents are enabled to explore the search space more forcefully in the exploration phase. Conversely, when $a^2 \leq 1$, the algorithm is in the exploitation phase. In this phase $a^2 \leq a$, since $a \leq 1$ and $a$ is positive. As a result, $a^2$ is less than the value $a$ would take in the original GWO algorithm, meaning that $|A| < 1$, and thus the exploitation phase is more strongly emphasized in the proposed VAGWO algorithm.
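The schedule of $a$ and the resulting acceleration term can be sketched as follows. This is a hedged illustration: we assume the acceleration takes the form $A = 2 r a^2 - a^2$, consistent with the text's statement that $a$ decreases linearly from $\sqrt{2}$ to 0 and enters the equations squared; the paper's exact Equations (13)-(15) may differ in detail.

```python
import math

def vagwo_acceleration(t, max_iter, r):
    """Acceleration coefficient for one leader/dimension (assumed form).

    a falls linearly from sqrt(2) to 0, so a**2 runs from 2 to 0;
    the squared parameter widens |A| early (stronger exploration)
    and shrinks it late (stronger exploitation).
    """
    a = math.sqrt(2.0) * (1.0 - t / max_iter)
    return 2.0 * r * a**2 - a**2   # lies in [-a**2, a**2]
```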
$D^t_{\alpha,j}$, $D^t_{\beta,j}$, and $D^t_{\delta,j}$ denote the modified distances between a focal search agent (wolf) and the alpha, beta, and delta leading wolves, respectively. These distances in each dimension can be calculated as follows: where $X^t_{\alpha,j}$, $X^t_{\beta,j}$, and $X^t_{\delta,j}$ are the positions of the alpha, beta, and delta wolves in the $j$th dimension in the $t$th iteration; $X^t_{i,j}$ is the position of the $i$th search agent (wolf) in the $j$th dimension in the $t$th iteration; and $C^t_{\alpha,j}$, $C^t_{\beta,j}$, and $C^t_{\delta,j}$ are crucial coefficients multiplied by each leading wolf's position to stochastically emphasize or de-emphasize it, since there is uncertainty in the fitness of each of the alpha, beta, and delta wolves, especially in the early iterations of the optimization process. These coefficients take these uncertainties into account and help the algorithm conduct the exploration phase more effectively. In VAGWO, a new definition is presented for these coefficients, calculated as follows: where $r_4$, $r_5$, and $r_6$ are uniformly distributed random numbers generated in $[0, 1]$, and $c$ is a parameter adaptively determined as follows: The parameter $c$ decreases linearly with the lapse of iterations. As can be seen, the coefficients $C$ are stochastically generated in $[0, 2]$ in the first iteration, but this range gradually contracts over the course of the iterations and terminates at $[1, 1] = \{1\}$. Moreover, the ranges contract at an accelerating pace towards this final range, in which no uncertainty is attributed to the fitness of the leading search agents. This is achieved by raising the parameter $c$ to the power of 2: since $c$ lies within $[0, 1]$ throughout the iterations, $c^2$ is always at most $c$. As a result, the coefficients $C$ are much closer to $[1, 1] = \{1\}$ in the final iterations than in the earlier ones.
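The behavior of the coefficients $C$ can be illustrated with a sketch. We assume the form $C = 1 + (2r - 1)c^2$ with $c$ falling linearly from 1 to 0; this is our own reconstruction, chosen because it reproduces the ranges described in the text ($[0, 2]$ at the first iteration, collapsing to $\{1\}$ at the last), and the paper's exact equation may differ.

```python
def vagwo_leader_coefficient(t, max_iter, r):
    """Leader-emphasis coefficient C (assumed form C = 1 + (2r - 1)*c**2).

    c decreases linearly from 1 to 0, so C spans [0, 2] at t = 0 and
    the span contracts quadratically to exactly 1 at t = max_iter,
    reflecting vanishing uncertainty about the leaders' fitness.
    """
    c = 1.0 - t / max_iter
    return 1.0 + (2.0 * r - 1.0) * c**2
```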
In this way, the exploitation capability of the VAGWO algorithm can be further enhanced while the exploration is also strengthened by incorporating the velocity term into the updating procedure of the search agents.
In Equations (10)-(12), the velocity terms $V^t_{\alpha,j}$, $V^t_{\beta,j}$, and $V^t_{\delta,j}$ adopt the sign of the acceleration terms $A^t_{\alpha,j}$, $A^t_{\beta,j}$, and $A^t_{\delta,j}$. This is very important in the new velocity-incorporated updating procedure proposed in VAGWO; otherwise, a conflict might occur between the velocity and the acceleration, which could, in turn, disrupt the agents' movements in the search space. In Equations (10)-(12), $k$ is a tuning parameter playing the role of the inertia weight, facilitating a suitable and reliable transition from exploration to exploitation; it is calculated iteration by iteration as follows: Finally, the next positions of the search agents (wolves) can be updated by calculating the three positions $X^{t+1}_{1,j}$, $X^{t+1}_{2,j}$, and $X^{t+1}_{3,j}$, as follows: The new position of a search agent is calculated by averaging the three updated positions presented in Equations (25)-(27) as follows: where $X^{t+1}_{i,j}$ is the position of the $i$th search agent (wolf) in the $j$th dimension in the $(t+1)$th iteration. The last modification made in the VAGWO relative to the original GWO is the incorporation of an elitism scheme. Each agent updated at an iteration of the algorithm is compared with the best position it has experienced so far; if its objective value is better, it remains in its present form and is designated as the new best-so-far position; otherwise, the current best-so-far position replaces the updated search agent. As a result, the search agents successively improve over the course of the iterations. The experiments show that equipping the proposed algorithm with the elitism mechanism can greatly improve the optimization results offered by the proposed VAGWO algorithm.
Figure 1 depicts the exploration and exploitation processes conducted by the proposed VAGWO algorithm. The flowchart of the VAGWO is also illustrated in Figure 2.
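The full VAGWO update loop described in this section can be sketched end to end as follows. This is a high-level illustration only: the acceleration form, the coefficient form, the linear inertia schedule for `k`, and the use of best-so-far positions when ranking the leaders are all our own assumptions for the sketch, not the paper's exact equations.

```python
import numpy as np

def vagwo_step(X, V, pbest, f, t, max_iter, rng):
    """One VAGWO iteration (sketch): velocity-aided moves plus elitism.

    X: (n, dim) positions; V: (n, 3, dim) per-leader velocities;
    pbest: (n, dim) best-so-far positions; f: objective (minimized).
    """
    n, dim = X.shape
    a = np.sqrt(2.0) * (1.0 - t / max_iter)   # falls from sqrt(2) to 0
    c = 1.0 - t / max_iter                    # falls from 1 to 0
    k = 0.9 - 0.5 * t / max_iter              # assumed inertia schedule
    order = np.argsort([f(x) for x in pbest])
    leaders = pbest[order[:3]]                # alpha, beta, delta
    X_new = np.empty_like(X)
    for i in range(n):
        moves = []
        for p in range(3):
            Xl = leaders[p]
            A = 2.0 * rng.random(dim) * a**2 - a**2
            C = 1.0 + (2.0 * rng.random(dim) - 1.0) * c**2
            D = np.abs(C * Xl - X[i])
            # velocity adopts the sign of the acceleration (D >= 0)
            V[i, p] = k * V[i, p] + np.sign(A) * np.abs(A) * D
            moves.append(Xl - V[i, p])
        X_new[i] = np.mean(moves, axis=0)     # average of the three moves
    # elitism: keep the better of the new position and the best-so-far
    for i in range(n):
        if f(X_new[i]) < f(pbest[i]):
            pbest[i] = X_new[i]
    return X_new, V, pbest
```

In a usage loop, `vagwo_step` would be called once per iteration; the elitism step guarantees that each agent's best-so-far objective value never worsens.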

Comparison with Popular Meta-Heuristic Algorithms
To assess the capability of the presented VAGWO method, it is first applied to 13 popular standard benchmark functions [32,33]. These functions are broken down into two main categories: uni-modal (F1-F7), and multi-modal (F8-F13). Uni-modal benchmark functions have a single global optimum. Thus, they are suitable for assessing the effectiveness of the search process of any optimization algorithm when conducting the exploitation phase, while multi-modal benchmark functions are favoured for assessing the capability of an optimization method to explore the search space. In these benchmark functions, all the global optima are shifted so that the difficulty of solving such functions is increased, as recommended in [5,34]. For conducting a thorough investigation on the potential abilities of the proposed algorithm to solve the optimization problems, it is also applied to 10 composition functions derived from the CEC2017 test suite [35,36]. These composition functions are the combination of various shifted, rotated, and biased multimodal functions. Thus, they can challenge the proposed VAGWO algorithm's capabilities to solve real-world and complex optimization problems to a greater extent. The optimization process implemented over these benchmark functions is of the minimization type. The number of dimensions was set to 100 for the shifted standard benchmark functions and set to 50 for the composition functions. These settings can make these test problems more challenging for the proposed algorithm and its competitors to solve such high-dimensional and thus hard-to-solve problems.
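As an illustration of the shifted functions, a shifted sphere (a stand-in for the uni-modal F1) can be written as below; the actual suite definitions are those of [32,33], and this sketch only shows how shifting moves the global optimum away from the origin.

```python
import numpy as np

def shifted_sphere(x, shift):
    """Shifted sphere: sum of squared deviations from the shift vector.

    The global minimum value 0 is attained at x = shift rather than
    at the origin, which makes the function harder for algorithms
    biased toward the center of the search space.
    """
    z = np.asarray(x, dtype=float) - shift
    return float(np.sum(z * z))
```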

Parameter Setting of the Algorithms
The VAGWO algorithm was compared with six popular meta-heuristic algorithms, namely the Moth-Flame Optimization (MFO) algorithm [5], Gravitational Search Algorithm (GSA) [37], Particle Swarm Optimization (PSO) [38], Grey Wolf Optimizer (GWO) [4], Genetic Algorithm (GA) [39], and Sine Cosine Algorithm (SCA) [34]. To perform an impartial comparison, the swarm size of all algorithms was set to 30 for the shifted uni- and multi-modal functions and to 50 for the composition functions. In addition, a maximum of 1000 iterations was set for all algorithms on all benchmark functions, and the stopping criterion was assumed to be met when the maximum number of iterations had elapsed. The parameter settings of the VAGWO and the popular algorithms are presented in Table 1.

Table 1. Parameter settings of the VAGWO and the popular algorithms.

Algorithm / Parameter Settings
GA: pr crossover = 0.9; pr mutation = 1

The average, median, best, and standard deviation (std) are computed over all 30 runs and tabulated as the performance measures for each algorithm on each problem. The final results of the methods on the uni-modal, multi-modal, and composition functions are presented in Tables 2-4, respectively, where the best results are emboldened. Moreover, the convergence curves of the methods on the standard uni- and multi-modal benchmark functions are plotted in Figures 3 and 4.

Figure 3. The convergence curves of the VAGWO and the popular algorithms for F1-F7.

Results of VAGWO on the Uni-Modal Benchmark Functions
As the results in Table 2 suggest, the VAGWO strongly outperforms its competitors on 19 out of 28 (68%) of the performance criteria on the uni-modal functions, while the original GWO outperforms its competitors on only 4 out of 28 (14%) of the criteria in this category of test problems. The PSO outperforms its competitors on only 4 criteria, and the MFO and GA each perform better than the other competitors on only one criterion. The results thus suggest the clear superiority of the VAGWO over the other competitive algorithms. The main reason behind this superior performance may lie in the strengthening of both the exploration and exploitation abilities of the VAGWO, as well as the incorporation of the velocity into its updating procedure. The effect of the velocity is not limited to enhancing the exploration capability; it can also enhance the exploitation capability of the proposed VAGWO, as the main problem any optimization algorithm faces when solving uni-modal functions is an insufficient rate of convergence to the single optimal point while avoiding divergence near this optimum.
Involving the velocity of the search agents in their position-updating procedure can speed up the convergence while avoiding divergence, as the inertia weight imposed on the velocity term gradually decreases the agents' progression towards the global optimum. Besides these characteristics, the elitism mechanism embedded in the structure of the VAGWO can be rated as another factor contributing to its high performance on the uni-modal functions, as elitism mainly enhances the exploitation capability of an optimization algorithm. The convergence curves are shown in Figure 3. It can be noticed that the VAGWO rapidly converges to the optimum on F1, F3, F5, F6, and F7. While the performance of all the algorithms is similar on F2, the GWO is superior to the other algorithms in converging to the optimal point of the F3 problem.

Results of VAGWO on the Multi-Modal Benchmark Functions
As the results in Table 3 suggest, the VAGWO is significantly superior to its competitors on F10-F13, while the original GWO performs very poorly in this category. The PSO and GSA show superiority over the other algorithms only on F8 and F9, respectively. The main reasons accounting for the high performance of the VAGWO on such benchmark functions are the preservation of the search agents' trajectories, the main effect of incorporating the agents' velocity into the updating procedure, and the further strengthening of the proposal's exploration capability by increasing the acceleration coefficients, represented by the parameter A, in the exploration phase.
The convergence curves are displayed in Figure 4. As this figure indicates, the VAGWO algorithm converges to the optimal point faster than the others on all the test problems except F8 and F9. The closest rival to the VAGWO on F10-F13 is the PSO algorithm; however, this algorithm outperforms all other competitors only when solving F8.

Comparison on CEC2017 Benchmark Functions
To further investigate the eligibility of the VAGWO algorithm, the composition functions of the CEC2017 test suite [35] were utilized as the test bed. As can be seen in Table 4, the VAGWO is significantly superior to its competitors on CF3, CF4, CF5, CF7, and CF8. Overall, the VAGWO is superior to the other algorithms examined in this sub-section on 24 out of 40 criteria (60%), while its closest rival is PSO, which outperforms the other competitors on 10 criteria (25%), followed by the GWO and SCA, each of which is superior to the other algorithms on only three criteria (8%). Furthermore, the proposal reaches the best averages on 7 out of 10 (70%) of the problems, followed by the PSO, reaching 20% of the best average results, and the GWO, with only 10% outperformance for these criteria. The other examined algorithms perform very poorly on this hard-to-solve category of benchmark problems. As can be seen, the differences in the results obtained by the different algorithms are slight; this highlights the high complexity of this category of test problems, solving which is a great challenge for any algorithm. The main reason the proposed VAGWO algorithm is superior to the other competitive algorithms on these composition functions lies in the unique structure of this algorithm. The VAGWO inherits some advantages from the original GWO, such as having three guide agents, which, in turn, helps considerably preserve the diversity of the solutions in the search space. The other characteristic of the GWO from which the VAGWO benefits is its high exploitation capability. These characteristics are strengthened in VAGWO by adding the velocity into its structure, enabling the algorithm to further preserve diversity and avoid missing good candidate solutions in the search space.
In addition, the aforementioned modifications imposed on the control parameters A and C can boost the ability of the proposed VAGWO to both explore and exploit the promising regions in the search space. Finally, the elitism mechanism can intensify the convergence to the optimal point of the problems and enhance the exploitation capability of the proposed method.
On the composition functions, the VAGWO outperforms the other competitors on 7 out of 10 problems, among which its outperformance is significant on four problems: CF5, CF7, CF8, and CF9. The VAGWO also shows significant dominance over half of the other algorithms, including GA, SCA, and PSO, on CF10. As the composition functions included in CEC2017 are very challenging for optimization algorithms, all the competitive algorithms find these test problems hard to solve, and thus show no significant superiority even when outperforming several other algorithms on most of these problems.

Comparison with Newly Proposed Meta-Heuristic Algorithms
To further evaluate the effectiveness of the proposed VAGWO in solving optimization problems, its performance on the same benchmark functions used as the test bed in the previous sections is compared with that of several newly proposed meta-heuristic algorithms including Arithmetic Optimization Algorithm (AOA) [8], Flow Direction Algorithm (FDA) [40], Aquila Optimizer (AO) [9], Gradient-Based Optimizer (GBO) [6], and the Effective Butterfly Optimizer with Covariance Matrix Adapted Retreat phase (EBOwithCMAR) [41], as the winner of the CEC2017 competition.
The swarm size and the maximum number of iterations considered for these comparisons are the same as those set for the comparisons among the VAGWO and the popular algorithms in the previous section. The parameter settings of the newly proposed algorithms along with the EBOwithCMAR are presented in Table 5. All algorithms are run on the benchmarks 30 times and the final results are shown in Tables 6-8, where the best results are emboldened. Moreover, the convergence curves of the algorithms when applied to the uni- and multi-modal benchmark test functions are plotted in Figures 5 and 6.
Table 5. Parameter settings of the VAGWO and the newly proposed algorithms.

Figure 5. The convergence curves of the VAGWO and the newly proposed algorithms for F1-F7.
Figure 6. The convergence curves of the VAGWO and the newly proposed algorithms for F8-F13.

Results of VAGWO on the Uni-Modal Benchmark Functions
As the results illustrated in Table 6 suggest, the VAGWO outperforms the other competing algorithms on 10 out of 28 (36%) of the performance criteria on the uni-modal functions. Its closest rivals on this category of benchmarks are the GBO and AO algorithms, which outperform the others on 10 out of 28 (36%) and 8 out of 28 (29%) of the criteria, respectively. As can be seen, these results confirm the strong performance of the VAGWO relative to the other competitive algorithms.
Although the exploitation capability of the original GWO algorithm is already strong, adding the velocity term to the updating procedure of the search agents in the VAGWO expedites their convergence to the optimal point, which explains the good performance of the VAGWO on these functions. Other features contributing to the improved exploitation of the VAGWO include the incorporation of an elitism mechanism and several crucial modifications, especially to the coefficients C, which are multiplied by the leading wolves' positions to account for the uncertainty in their fitness.
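For reference, in the original GWO the coefficients A and C are drawn from a parameter a that decreases linearly from 2 to 0 over the run; |A| > 1 pushes agents away from the leaders (exploration), while |A| < 1 pulls them closer (exploitation). The sketch below implements the original GWO schedules only; the VAGWO replaces them with the modified formulations described in the paper:

```python
import random

def gwo_coefficients(t, T):
    """Original GWO control coefficients at iteration t of a T-iteration run.

    a decreases linearly from 2 to 0, so the range of A shrinks over time.
    (VAGWO substitutes modified schedules for both A and C.)
    """
    a = 2.0 * (1.0 - t / T)
    A = 2.0 * a * random.random() - a   # A is drawn uniformly from [-a, a]
    C = 2.0 * random.random()           # C is drawn uniformly from [0, 2]
    return A, C

random.seed(1)
A0, C0 = gwo_coefficients(0, 100)          # early: a = 2, widest range for A
A_end, C_end = gwo_coefficients(100, 100)  # late: a = 0, so A collapses to 0
```

The collapse of A to zero at the end of the run is what drives the original GWO's late-stage exploitation, and it is this schedule that the VAGWO reshapes to enlarge early steps and shrink late ones.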
The convergence curves depicted in Figure 5 show that the VAGWO can converge to the optimum on F2, F3, F5, and F6, with a rate better than or equal to the other competing algorithms. The most serious rival of the VAGWO is GBO, which is very similar to the proposal in behavior when conducting the optimization process on the uni-modal functions.

Results of VAGWO on the Multi-Modal Benchmark Functions
As the results displayed in Table 7 suggest, the VAGWO is highly superior to its competitors on F10-F12, while superiority on the other functions in this category is dispersed among the other algorithms. On F13, the results of the VAGWO are very close to those of the FDA, AO, and GBO, while being far better than those of the AOA. Benefiting from the velocity-aided updating procedure and accelerating the exploration by increasing the coefficients A at the early stages of the optimization process are the two major factors behind the superiority of the VAGWO over its rivals on this category of functions.
The convergence curves are shown in Figure 6. As indicated in this figure, the VAGWO rapidly and greedily converges to the optimal point of F10-F13, while its closest rival, namely the GBO, seems to stagnate during the optimization of all functions after the lapse of several iterations.

Comparison on CEC2017 Benchmark Functions
The VAGWO is investigated to reveal whether its superiority over the popular algorithms on the composition functions derived from the CEC2017 benchmark suite persists when it is compared with the newly proposed algorithms and the winner of the CEC2017 competition under the same conditions. As can be seen in Table 8, the VAGWO is superior to its competitors on CF1, CF3, CF4, CF6, CF7, CF8, and CF9. Overall, the VAGWO is superior to the other algorithms on 23 out of 40 criteria (58%), while its closest rival is found to be the EBOwithCMAR, the winner of the CEC2017 competition, which outperforms the other methods on only 20% of the criteria. This category of test problems is a very good examiner of the overall eligibility of any optimization algorithm, as it contains the toughest problems to solve. Adding the velocity term, defining a new formulation for the coefficients C, accelerating the exploration and exploitation in the early and later iterations, respectively, and incorporating an elitism mechanism are among the strengths of the proposed VAGWO, helping this algorithm to overcome even the newly proposed meta-heuristics and the winner of the CEC2017 competition.

Statistical Analysis
To further analyze the results, a non-parametric test, the Wilcoxon rank-sum test, is applied to determine whether two sets of results are statistically different [42]. This test returns a p-value that quantifies the significance of the difference between a pair of result sets generated by a pair of algorithms. Conventionally, the superiority of one method over another is considered statistically significant when the p-value < 0.05.
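The test can be sketched with the usual normal approximation to the rank-sum statistic, as below. This is a simplified illustration that assumes no tied values across the two samples (in practice a library routine such as SciPy's `ranksums` would be used, which also handles ties):

```python
import math

def ranksum_p(a, b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation.

    Assumes no tied values; adequate for samples of roughly 20+ runs.
    """
    n1, n2 = len(a), len(b)
    ranked = sorted(a + b)
    rank = {v: i + 1 for i, v in enumerate(ranked)}   # unique values assumed
    W = sum(rank[v] for v in a)                       # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (W - mean) / sd
    # two-sided p-value from the standard normal tail
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# clearly separated samples give a significant p; interleaved samples do not
p_diff = ranksum_p(list(range(1, 11)), list(range(11, 21)))
p_same = ranksum_p([1, 3, 5, 7, 9], [2, 4, 6, 8, 10])
```

In the paper's setting, `a` and `b` would be the 30 best-so-far values produced by two algorithms on the same benchmark function.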
The p-values are presented in Tables 9-12. In these tables, the expression N/A stands for "Not Applicable" and marks, for each test problem, the algorithm that achieves the best results; this algorithm is taken as the reference and is compared pairwise with each of the other algorithms. Moreover, the signs "+", "−", and "∼" indicate that the N/A algorithm beats, loses to, or ties with the other algorithm in the test, respectively. The results are broken down into two categories: (1) the results on the 13 shifted standard test functions; and (2) the results on the CEC2017 50-dimensional composition benchmark functions. As can be seen, the VAGWO significantly outperforms the popular meta-heuristic algorithms in 47 out of 54 (87%) of the total cases of its outperformance when applied to the standard functions. The VAGWO also shows significant dominance over the popular meta-heuristics in 19 out of 42 (45%) of cases when implemented on the CEC2017 composition functions. Furthermore, the VAGWO significantly outperforms the newly proposed meta-heuristics in 15 out of 20 (75%) of its dominance cases when applied to the standard benchmarks, closely followed by the GBO, which significantly outperforms the other competitive algorithms in 12 out of 20 (60%) of its total outperformance cases. The proposal also shows significant outperformance in 17 out of 35 (49%) of all its dominance cases when implemented on the CEC2017 functions, while the EBOwithCMAR and GBO show significant dominance in only 2 out of 10 (20%) and 4 out of 5 (80%) of their total cases of outperformance, respectively. As a result, not only can the proposed VAGWO outperform the two sets of popular and newly proposed meta-heuristic algorithms as well as the winner of the CEC2017 competition when optimizing the two sets of test functions, but it can also present significantly better results than its rivals.

Complexity of Algorithm
The complexity of the VAGWO and GWO methods is evaluated according to the standard approach proposed in [35]. The results can be observed in Table 13, where T0 stands for the computing time (in seconds) of the standard loop illustrated in Figure 7, T1 denotes the CPU time (in seconds) of 200,000 evaluations of F18 from the CEC2017 suite, and T2 represents the average CPU time (in seconds) of the methods when solving the same function (i.e., F18) five times. The lowest complexities are highlighted in Table 13.
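Under the CEC protocol [35], the reported complexity is the quantity (T̂2 − T1)/T0, where T̂2 is the mean of the five run times. The sketch below shows the bookkeeping; the standard arithmetic loop follows the CEC definition, while the numbers fed to `algorithm_complexity` are illustrative placeholders, not measurements from the paper:

```python
import math
import time

def time_standard_loop():
    """T0: the fixed arithmetic loop prescribed by the CEC complexity protocol."""
    start = time.perf_counter()
    for i in range(1_000_000):
        x = 0.55 + i
        x = x + x; x = x / 2; x = x * x
        x = math.sqrt(x); x = math.log(x); x = math.exp(x); x = x / (x + 2)
    return time.perf_counter() - start

def algorithm_complexity(T0, T1, T2_runs):
    """Complexity = (mean(T2 runs) - T1) / T0, per the CEC protocol."""
    T2_hat = sum(T2_runs) / len(T2_runs)
    return (T2_hat - T1) / T0

# illustrative timings in seconds, not the values from Table 13
c = algorithm_complexity(T0=0.05, T1=1.0, T2_runs=[1.6, 1.5, 1.7, 1.55, 1.65])
```

Subtracting T1 isolates the algorithm's own overhead from the cost of the objective function evaluations, and dividing by T0 normalizes away machine speed.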

As can be observed from Table 13, the complexity of the VAGWO method is 29%, 20%, and 27% greater than that of the GWO for 10-, 30-, and 50-dimensional problems, respectively. As inferred from Table 13, the complexity of the proposed VAGWO is only slightly greater than that of the GWO as its base algorithm, and their difference in complexity can even decrease as the dimensions of the problem increase. It is worth mentioning that all algorithms were run in the MATLAB R2018b environment installed on the Windows 10 operating system of an Intel quad-core computer with a 2.8 GHz CPU and 16 GB of memory.

Runtime Analysis
The runtimes of the VAGWO and GWO algorithms are evaluated to determine whether the proposed algorithm retains its efficiency despite its increased complexity relative to the original GWO. Tables 14 and 15 show the runtimes on each test problem for the standard benchmark functions and the CEC2017 composition functions, respectively; each reported runtime is the average over 30 independent executions. As the results suggest, the average runtime of the GWO is 5.33 s on the standard functions, while that of the VAGWO on the same functions is 5.94 s, indicating only an 11.45% increase in runtime when using the proposed VAGWO to solve these benchmarks. Moreover, the average runtime of the GWO is 6.06 s on the CEC2017 composition functions, while the VAGWO has an average runtime of 6.64 s on these functions, only a 9.51% increase over the GWO. As can be seen, the runtime, and consequently the complexity, of the VAGWO is only slightly greater than that of the original GWO, revealing that the proposed algorithm can reach considerably better results than the GWO while preserving its efficiency. Meanwhile, the difference in the runtimes of the two algorithms is smaller on the CEC2017 functions. As the CEC2017 functions are much more expensive to evaluate than the standard functions, it can be inferred that the relative overhead of the VAGWO is reduced on the CEC2017 suite because the objective function evaluations account for a large share of the total cost of any algorithm applied to these functions; consequently, the main body of the VAGWO algorithm carries less weight in the total cost of a single run on the CEC2017 functions than on the standard functions.
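The quoted overheads follow directly from the averaged runtimes; the quick check below recomputes them from the rounded averages given above, so the last digit may differ marginally from the figures in the text, which are derived from unrounded data:

```python
def overhead_pct(base, variant):
    """Relative runtime increase of `variant` over `base`, in percent."""
    return 100.0 * (variant - base) / base

std = overhead_pct(5.33, 5.94)   # standard benchmarks: about 11.4%
cec = overhead_pct(6.06, 6.64)   # CEC2017 compositions: about 9.6%
```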
This point is very promising and further encourages the use of the VAGWO, especially when dealing with a complex objective function.

Comparison on Real-World Engineering Design Problems
In this section, the performance of the VAGWO is examined by solving three constrained real-world engineering design problems. To validate the VAGWO in solving such problems, its performance is tested against seven state-of-the-art and widely used optimization algorithms, including Particle Swarm Optimization (PSO) [38], Gravitational Search Algorithm (GSA) [37], Cuckoo Search (CS) [3], Grey Wolf Optimizer (GWO) [4], Whale Optimization Algorithm (WOA) [7], Elephant Herding Optimizer (EHO) [43], and the Simulated Annealing (SA) algorithm [44]. All the results of these algorithms taken for comparison with those achieved by the proposed VAGWO are presented in [45]. For handling the constraints in these problems, scalable penalty functions are utilized in the VAGWO: once a solution becomes infeasible, penalty terms are added to the minimization objective to inflate its cost and thereby penalize that solution.
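A minimal sketch of a penalty-based constraint handler of the kind described above is given below. The quadratic penalty form and the coefficient `rho` are illustrative assumptions; the exact penalty functions used in the VAGWO are not specified in this excerpt:

```python
def penalized_objective(f, constraints, x, rho=1e6):
    """Add a quadratic penalty for every violated constraint g(x) <= 0.

    Feasible solutions are left untouched; infeasible ones have their
    cost inflated so the minimizer is steered back into the feasible region.
    """
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f(x) + rho * violation

# toy example: minimize x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0
f = lambda x: x * x
g = lambda x: 1.0 - x
feasible = penalized_objective(f, [g], 2.0)     # no violation: plain cost 4.0
infeasible = penalized_objective(f, [g], 0.5)   # violated: heavily penalized
```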
In addition, 50 search agents and 1000 iterations are used in the VAGWO, each problem is run 30 times, and the best results, including the best objective values and the best design variables, are reported against those of the other algorithms examined.

Welded Beam Design Problem
In this problem, a welded beam is designed to minimize its construction cost [46]. The problem includes four design variables, h(x1), l(x2), t(x3), and b(x4), as shown in Figure 8. In its standard form [46], the cost function is f(x) = 1.10471x1²x2 + 0.04811x3x4(14 + x2), and it is minimized subject to constraints on the shear stress τ(x), the bending stress σ(x), the buckling load Pc(x), the end deflection δ(x), and the bounds of the design variables.
Figure 8. Welded beam design problem [47].
Table 16 shows the results of solving the welded beam design problem obtained by the VAGWO and the other comparative algorithms. As can be seen, the VAGWO reaches f(x) = 1.6952, the minimum and best cost among all the algorithms. The design variables obtained by the VAGWO are also shown in Table 16 and are taken as the best variables with respect to the objective function value the VAGWO achieves during the optimization process.

Tension/Compression Spring Design Problem
The tension/compression spring design problem [48] aims to minimize the weight of a tension/compression spring, subject to constraints on minimum deflection, outside diameter restrictions, surge frequency, shear stress, and the design variables. The design variables are d(x1), D(x2), and P(x3), as depicted in Figure 9; in its standard form [48], the weight objective is f(x) = (x3 + 2)x2x1².
Figure 9. Tension/compression spring problem [47].
Table 17 shows the final results the VAGWO and its competitors present after solving this problem. As shown in the table, the VAGWO achieves the minimum weight for the tension/compression spring, presenting f(x) = 1.2665 × 10⁻². The first five comparative algorithms yield the same objective function value, while the EHO and SA present the worst objective values among all the algorithms applied to this problem.
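The spring weight objective, f = (x3 + 2)x2x1² in its standard form [48], is simple enough to evaluate directly. The sketch below checks it at a near-optimal design frequently quoted in the literature; the variable values are illustrative and are not the paper's reported design variables:

```python
def spring_weight(d, D, N):
    """Weight of a tension/compression spring: f = (N + 2) * D * d^2,
    with wire diameter d, mean coil diameter D, and N active coils."""
    return (N + 2.0) * D * d * d

# a near-optimal design commonly quoted in the literature (illustrative)
w = spring_weight(d=0.051689, D=0.356718, N=11.288966)   # about 1.2665e-2
```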

Speed Reducer Design Problem
This optimization problem is a constrained one and, similarly to the two previous problems, it aims to minimize the weight of a speed reducer subject to constraints on the bending stress of the gear teeth, surface stress, transverse deflections of the shafts, and stresses in the shafts [49]. The scheme of the speed reducer is shown in Figure 10. In its standard form [49], the weight objective is f(x) = 0.7854x1x2²(3.3333x3² + 14.9334x3 − 43.0934) − 1.508x1(x6² + x7²) + 7.4777(x6³ + x7³) + 0.7854(x4x6² + x5x7²), minimized subject to eleven inequality constraints g1(x)-g11(x), which include, for example, g9(x) = x1/(12x2) − 1 ≤ 0, g10(x) = (1.5x6 + 1.9)/x4 − 1 ≤ 0, and g11(x) = (1.1x7 + 1.9)/x5 − 1 ≤ 0.
Figure 10. Speed reducer design problem [47].
Table 18 shows the results obtained by the VAGWO and the other algorithms when solving this problem. As can be seen, the results of the different algorithms are very close to each other; however, the CS achieves f(x) = 2.9975 × 10³, the best objective function value among all the algorithms, while the VAGWO reaches the most competitive result among the remaining algorithms on this problem.
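The speed reducer weight can likewise be coded directly from the standard formulation [49]. Evaluated at a typical near-optimal design from the literature, it lands close to the roughly 2.99 × 10³ values reported in Table 18; the point used below is illustrative and is not a result from the paper:

```python
def reducer_weight(x1, x2, x3, x4, x5, x6, x7):
    """Speed reducer weight in the standard formulation [49]."""
    return (0.7854 * x1 * x2**2 * (3.3333 * x3**2 + 14.9334 * x3 - 43.0934)
            - 1.508 * x1 * (x6**2 + x7**2)
            + 7.4777 * (x6**3 + x7**3)
            + 0.7854 * (x4 * x6**2 + x5 * x7**2))

# a near-optimal design commonly quoted in the literature (illustrative)
w = reducer_weight(3.5, 0.7, 17.0, 7.3, 7.8, 3.35, 5.29)
```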
As a result, the VAGWO outperforms the other seven algorithms on two out of the three examined real-world engineering design problems, demonstrating its efficacy in solving constrained practical optimization problems as well as a variety of high-dimensional and complex benchmark problems.
Table 18. Minimization results of speed reducer design.


Conclusions
In this paper, a novel variant of the Grey Wolf Optimization (GWO) algorithm, named the Velocity-Aided Grey Wolf Optimizer (VAGWO), was proposed. In this algorithm, a velocity term is added to the position-updating procedure of the original GWO. It was shown that the velocity can significantly improve the GWO when exploring the search space, as it keeps pushing the search agents to continue their global search, preventing a considerable number of good positions from being missed during the optimization process. In the VAGWO, both the exploration and exploitation capabilities of the GWO are also strengthened via modification of the two control parameters of this algorithm. Furthermore, a safe and reliable balance between exploration and exploitation is maintained by emphasizing the positions of the leading search agents in the last iterations and de-emphasizing them in the earlier iterations. In addition, an elitism mechanism is incorporated into the VAGWO to facilitate reaching the optimal solution by intensifying the exploitation. The proposed VAGWO was implemented on 13 shifted high-dimensional standard benchmark functions, a set of composition functions derived from the CEC2017 standard test functions, and three real-world problems. The eligibility of the proposed method was then verified through comparison with a set of popular and newly proposed meta-heuristic algorithms implemented on the same test problems. A Wilcoxon test was also performed to establish the significance of the superiority of the VAGWO over its competitors. The computational complexity of the VAGWO was evaluated and shown to be only slightly greater than that of the original GWO. As a result, the proposal is a computationally efficient algorithm that is capable of tackling the wide range of difficulties posed by different optimization problems.
In future work, we aim to extend the application of the VAGWO to other challenging theoretical and practical test problems to better identify its likely weaknesses and shortcomings and address them, further improving its functionality.

Data Availability Statement:
The data is available upon request.