Article

Performance of a Novel Chaotic Firefly Algorithm with Enhanced Exploration for Tackling Global Optimization Problems: Application for Dropout Regularization

1 Faculty of Informatics and Computing, Singidunum University, Danijelova 32, 11000 Belgrade, Serbia
2 Romanian Institute of Science and Technology, Str. Virgil Fulicea 3, 400022 Cluj-Napoca, Romania
3 Computer Science and Engineering, University of Kurdistan Hewler, 30 Meter Avenue, Erbil 44001, Iraq
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(21), 2705; https://doi.org/10.3390/math9212705
Submission received: 2 October 2021 / Revised: 19 October 2021 / Accepted: 20 October 2021 / Published: 25 October 2021

Abstract:
Swarm intelligence techniques have been created to respond to theoretical and practical global optimization problems. This paper puts forward an enhanced version of the firefly algorithm that corrects the acknowledged drawbacks of the original method, by an explicit exploration mechanism and a chaotic local search strategy. The resulting augmented approach was theoretically tested on two sets of bound-constrained benchmark functions from the CEC suites and practically validated for automatically selecting the optimal dropout rate for the regularization of deep neural networks. Despite their successful applications in a wide spectrum of different fields, one important problem that deep learning algorithms face is overfitting. The traditional way of preventing overfitting is to apply regularization; the first option in this sense is the choice of an adequate value for the dropout parameter. In order to demonstrate its ability in finding an optimal dropout rate, the boosted version of the firefly algorithm has been validated for the deep learning subfield of convolutional neural networks, with respect to five standard benchmark datasets for image processing: MNIST, Fashion-MNIST, Semeion, USPS and CIFAR-10. The performance of the proposed approach in both types of experiments was compared with other recent state-of-the-art methods. To prove that there are significant improvements in results, statistical tests were conducted. Based on the experimental data, it can be concluded that the proposed algorithm clearly outperforms other approaches.

1. Introduction

Swarm intelligence is popular in the field of optimization. However, as the “no free lunch” theorem implies, no single algorithm performs best on all problems. Hence, many techniques inspired by the behaviors of living organisms have been developed and applied to theoretical and practical tasks, including function optimization, parameter and method calibration, and efficiency improvement in industrial scenarios.
The current paper introduces a modified version of the firefly algorithm (FA) and verifies its boosted abilities on global optimization tasks. The FA [1] is a well-known SI algorithm that has shown great promise in the field of optimization based on metaheuristics. The proposed method is theoretically tested on two bound-constrained benchmark sets: (i) with chosen functions from the CEC test suite, with 10, 30, and 100 dimensions; and (ii) with challenging CEC2017 bound-constrained problems. Finally, from the practical perspective, the proposed approach was applied for dropout approximation.
The FA was chosen as the base for augmentation as it has been successfully validated on various NP-hard challenges in the machine learning domain [2,3,4], including the dropout estimation problem [5], and it shows great potential. However, it has also been established that the basic FA suffers from some deficiencies, and it is assumed that its potential can be further improved by modifying the original version. Furthermore, the performance of SI algorithms for the dropout regularization challenge has not been investigated sufficiently.
The field of machine learning suffers from overfitting, otherwise known as high variance, and it appears in every model. The question is not whether the model will overfit, but rather how much it will overfit. When the variance is high, the bias, which is the deviation from the predicted value, is at its lowest. A trade-off between these two quantities has to be made without either of them gravitating towards its extreme value. The difficulty arises when the dataset is modest in size, which leaves less room for adjustment. Convolutional neural networks (CNNs) tackle this problem with dropout regularization, which has proven efficient in such scenarios. In this process, random neurons are excluded from the layers during the training phase. This results in a higher bias, which translates to a more precise model, but the key is moderation, because of the previously mentioned trade-off.
The dropout rate is usually estimated manually and tuned by trial and error, which is unsustainable in cases where the model is complex. Dropout estimation by swarm intelligence algorithms can help solve this problem. Swarm-like heuristics have had success with solving NP-hard problems, including dropout estimation. Both CNNs and SI algorithms are influenced by concepts from nature: CNNs take inspiration from the human visual cortex, and SI algorithms take inspiration from animals that move, live, and gather resources in large groups, called swarms. Recent research shows that hybrid solutions combining machine learning and swarm intelligence provide better results [6,7,8]. These types of hybrid solutions are more optimized and scalable.
The observed results indicate an improvement over the original algorithm on the tested CEC benchmarks and better results with the CNN. As mentioned before, overfitting is unavoidable. One of the solutions is regularization, the process of limiting overfitting, through which the complexity of the model is controlled. Models with a larger number of features have a larger number of weights, since every feature is assigned a certain weight. The loss function returns the difference between the predicted and the actual label. Different regularization techniques exist, among which the most popular are L1, L2, and dropout regularization. The dropout method is of particular importance because it keeps the accuracy of the model high while the loss remains very low. The other techniques perform well in established scenarios; however, there is a certain lack of evidence of stable performance.
The main objective of the approach proposed in this study is, from the theoretical side, to further improve the FA and, from the practical side, to increase the classification performance of CNNs and avoid overfitting by properly setting the dropout regularization parameter. Furthermore, since the potential of metaheuristics for this type of challenge has not been investigated enough, 10 other well-known swarm intelligence approaches were also implemented and tested on this problem. The contribution of this research is three-fold:
  • A novel modified FA algorithm was implemented by specifically targeting the known flaws of the basic implementation of the FA approach;
  • The devised algorithm was later utilized to help establish the proper dropout value and enhance CNN accuracy;
  • Other well-known swarm intelligence metaheuristics were further investigated for the CNN dropout regularization challenge.
The rest of the paper is organized in the following manner. Section 2 describes the fundamental technologies used (swarm intelligence and CNN). Section 3 introduces the modified version of the algorithm, as well as the original one. Section 4 provides the results of the experiments. Section 5 deals with the optimization of the dropout parameter, and the final observations are given in Section 6.

2. Preliminaries and Related Works

Improving an existing solution by modifying an algorithm, i.e., via another metaheuristic approach, yields good results in this field. Metaheuristic solutions are stochastic, and for an algorithm to be categorized as metaheuristic, it must be inspired by a certain process in nature. These processes come from group animal behaviors, in which animals work towards a common goal that is unachievable by working alone. This type of behavior exhibits group intelligence. The intellectual potential of a single unit of a species is not very high; on the contrary, in large groups, even simple organisms successfully perform complex tasks. The solutions inspired by these kinds of animals are metaheuristic and belong to the field of swarm intelligence, which has proven successful in solving NP-hard problems. This has been exploited in algorithm hybridization for improving machine learning algorithms; this type of combination is referred to as learnheuristics.
In this work, dropout regularization improvement was achieved by the previously mentioned methods. Swarm intelligence is a metaheuristic field that adapts animal behavior, specifically of animals that move in swarms, to algorithms used in the field of artificial intelligence [9,10]. The field of SI has a wide range of applications because it is efficient in solving NP-hard problems. SI methods have been frequently used to address different optimization tasks, both theoretical [11] and from various practical fields, including wireless sensor networks (WSNs) [12,13,14,15], task scheduling in the cloud, and edge computing [16,17]. Recently, one of the most important fields of interest has been the hybrid approach combining SI and machine learning. The number of publications in this domain has increased drastically in recent years; some of the most prominent works include hyperparameter optimization [3,18,19], feature selection problems [2], time series prediction tasks, e.g., estimation of COVID-19 cases [6,20], and neural network training [21,22].
Hybridization of these algorithms yields the most benefits. With this approach, it is possible to significantly improve convergence times. SI algorithms apply a stochastic approach in the search for global optima, making them heavily reliant on the number of iterations. This process is divided into two phases, exploration and exploitation, similar to the training and testing phases in machine learning. In exploitation, the focus is on searching the local neighborhood of existing solutions, while exploration covers the search space globally. These phases must be balanced, again similar to the training and testing phases in machine learning. The SI goal is not necessarily to achieve the best possible solution, but rather to quickly provide a good sub-optimal one. The search for the best solution can be greatly enhanced by adding evolutionary principles to the algorithm. Evolutionary algorithms implement a mechanism that transfers knowledge from the previous population to the next one. This is achieved through mutation, crossover, and selection. Mutation keeps a unit from the previous generation but modifies the value it carries, crossover combines two parent solutions, and selection chooses the best units. This is a different approach compared to the random generation of a hive population. Evolution-based swarms tend to converge faster than classic population-based swarm algorithms, but they are more prone to getting stuck in local optima.
The SI method proposed in this paper is an augmentation of the FA algorithm. The improved version was tested on several theoretical benchmark functions before being applied to dropout optimization in a CNN.
Humans are highly visual creatures and rely heavily on this sense. This translates to limitations on the input that can be used for machine learning. While the field has yielded tremendous results in big data and prediction-based insights, most ideas that employ AI require visual input. For the majority of adopters, who are non-professional individuals, the only contact with AI is through software that manipulates visual input. While such tasks are trivial, e.g., changing one’s appearance, the true importance of this adoption lies in the previously mentioned nature of our species. Humans are not computational beings; the human species does not process absorbed information by labeling, tagging, and placing it into tables. This creates a limitation for accurately representing the obtained information in computational form. It is inefficient and too complex a process for an individual to translate the information obtained from a photograph into words (in a way that a program can process them). As a result, CNNs have been widely applied because they excel in these types of tasks, including speech recognition, natural language processing, and computer vision. These models are modeled after the human nervous system [23,24,25]. The most recent applications include facial recognition [26,27,28,29], document analysis [30,31,32], image classification tasks in medicine as support for diagnostic processing and faster illness detection [33,34,35], analysis of climate change and extreme weather prediction [36,37], and many others. The nature-inspired design of the CNN comes from the animal visual cortex, which is built from layers that receive, segment, and integrate visual input. The output of each layer is the input for the next layer. During this process, the data get cleaner as they get deeper, meaning that the data are simplified, which makes them easier to process further while retaining all of the important features. An example of this behavior is edge forming on the first layer, sets of edges and corners on the second layer, sets of corners, contours, and parts of objects on the third layer and, finally, the full object on the last layer. The convolution layer, the pooling layer, and the fully connected layer, in that order, represent the anatomy of a CNN.
Firstly, the convolution layers apply the corresponding operations, which filter the data. It is important to emphasize that the filters are always smaller in size than the input. Widely used sizes are 3 × 3, 5 × 5, and 7 × 7. The convolution operation on the input is:
$$ z_{i,j,k}^{[l]} = w_{k}^{[l]} \cdot x_{i,j}^{[l]} + b_{k}^{[l]} \qquad (1) $$
The symbols in the equation bear the following meaning: $z_{i,j,k}^{[l]}$ denotes the output feature value of the $k$-th feature map at location $(i,j)$, the input is $x_{i,j}^{[l]}$ at location $(i,j)$, $w_{k}^{[l]}$ represents the filter, and the bias is $b_{k}^{[l]}$.
The activation operation is:
$$ g_{i,j,k}^{[l]} = g\left(z_{i,j,k}^{[l]}\right), \qquad (2) $$
where $g(\cdot)$ denotes the non-linear activation function applied to the convolution output.
There are two types of pooling layers: global and local. The most widely used methods are max and average pooling.
The resolution is reduced through the pooling function:
$$ y_{i,j,k}^{[l]} = \mathrm{pooling}\left(g_{i,j,k}^{[l]}\right) \qquad (3) $$
Classification is performed by the fully connected layers. The softmax layer performs multi-classification. In the case of binary classification, the logistic layer is used.
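To make Equations (1)–(3) concrete, the following is a minimal NumPy sketch of a single convolution location, a non-linear activation, and max pooling; the array shapes, the ReLU choice for $g(\cdot)$, and the 2 × 2 pooling window are illustrative assumptions, not the exact layers of the networks used later.
```python
import numpy as np

def conv_single_location(x_patch, w_k, b_k):
    """Eq. (1): z = w_k . x_patch + b_k for one spatial location (i, j)
    and one filter k; x_patch and w_k have the same shape."""
    return np.sum(w_k * x_patch) + b_k

def activation(z):
    """Eq. (2): non-linear function g(.) applied to the convolution output
    (ReLU is assumed here purely for illustration)."""
    return np.maximum(z, 0.0)

def max_pooling(feature_map, size=2):
    """Eq. (3): non-overlapping max pooling that reduces the resolution."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Tiny usage example on random data
rng = np.random.default_rng(0)
x_patch = rng.standard_normal((3, 3))           # 3 x 3 input patch
w_k = rng.standard_normal((3, 3))               # 3 x 3 filter
g_value = activation(conv_single_location(x_patch, w_k, b_k=0.1))
pooled = max_pooling(rng.standard_normal((4, 4)))
```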
As stated in Section 1, several techniques are used to avoid the overfitting issue; one of them is dropout regularization. This research focuses on optimizing the dropout probability ($d_p$); hence, dropout regularization is explained in the following paragraphs.
In light of the proposed CNN model, the dropout technique can be considered a new CNN layer. With this in mind, $r$ denotes the activation or dropout of the $M$ nodes in the observed layer. Every variable $r_j$ is independently assigned the value 1 with probability $p$. If the observed $r_j$ holds the value 1, that unit remains in the network; otherwise, that particular unit is removed from the network along with all of its connections.
Each $r_j$ is independent of the other units in the network and is drawn from the Bernoulli distribution described by Equation (4).
$$ r_j \sim \mathrm{Bernoulli}(p), \quad j = 1, 2, \ldots, M \qquad (4) $$
With this in mind, it is possible to denote the output vector of a layer $L$ during network training with $y^{(L)}$. After applying dropout, the new output vector $\tilde{y}^{(L)}$ can be defined by Equation (5):
$$ \tilde{y}^{(L)} = r \ast y^{(L)} \qquad (5) $$
Finally, during network testing, the weight matrix $W$ is required to be scaled by the ratio $p$ in order to average over all $2^M$ possible networks that have dropped out. This step summarizes the main contribution of the regularization method, because only a single network needs to be tested, as shown in Equation (6).
$$ W_{test}^{(L)} = p\, W^{(L)} \qquad (6) $$
where $W^{(L)}$ denotes the weight matrix at layer $L$.
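As an illustration of Equations (4)–(6), the sketch below is a simplified NumPy version (not the Keras dropout layer used in the experiments) that samples the Bernoulli mask $r$, masks the layer outputs during training, and scales the weights at test time; here $p$ denotes the retention probability, as in Equation (4), and the layer sizes are hypothetical.
```python
import numpy as np

rng = np.random.default_rng(42)

def dropout_train(y_layer, p):
    """Eqs. (4)-(5): sample r_j ~ Bernoulli(p) and mask the outputs y^(L)."""
    r = rng.binomial(n=1, p=p, size=y_layer.shape)   # Eq. (4)
    return r * y_layer                               # Eq. (5)

def dropout_test_weights(W_layer, p):
    """Eq. (6): scale the weight matrix by p so that a single network
    approximates the average of all 2^M thinned networks."""
    return p * W_layer

# Hypothetical layer outputs and weights, for illustration only
y = rng.standard_normal(8)          # outputs of a layer L
W = rng.standard_normal((8, 4))     # weights applied to that layer
y_dropped = dropout_train(y, p=0.5)
W_test = dropout_test_weights(W, p=0.5)
```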

3. Proposed Method

This section first introduces the basic implementation of the FA metaheuristics, followed by a discussion of the known and observed flaws and drawbacks of the original version. Finally, a detailed description of the proposed modified method, devised specifically to overcome these flaws, is provided.

3.1. The Original Firefly Algorithm

The FA metaheuristics, introduced by Yang [1], is motivated by the flashing and social characteristics of fireflies. Since the natural system is relatively complex and sophisticated in the real world, the FA models it by using several approximation rules [1].
The brightness and attractiveness of fireflies are used for modeling fitness functions; attractiveness, in most typical FA implementations, depends on the brightness, which is in turn determined by the objective function value. In the case of minimization problems, it is formulated as [1]:
$$ I(x) = \begin{cases} \dfrac{1}{f(x)}, & \text{if } f(x) > 0 \\ 1 + |f(x)|, & \text{otherwise} \end{cases} \qquad (7) $$
where $I(x)$ represents attractiveness and $f(x)$ denotes the value of the objective function at location $x$.
The light intensity, and hence the attractiveness of an individual, decreases as the distance from the light source increases [1]:
$$ I(r) = \frac{I_0}{1 + \gamma r^2} \qquad (8) $$
where $I(r)$ represents the light intensity at distance $r$, while $I_0$ stands for the light intensity at the source. Furthermore, to model real natural systems, where the light is partially absorbed by its surroundings, the FA makes use of the parameter $\gamma$, which represents the light absorption coefficient. In most FA versions, the combined effect of the inverse square law for distance and the $\gamma$ coefficient is approximated with the following Gaussian form [1]:
$$ I(r) = I_0 \cdot e^{-\gamma r^2} \qquad (9) $$
Moreover, each firefly individual utilizes an attractiveness $\beta$, which is directly proportional to the light intensity of the given firefly and also depends on the distance, as shown in Equation (10).
$$ \beta(r) = \beta_0 \cdot e^{-\gamma r^2} \qquad (10) $$
where the parameter $\beta_0$ designates the attractiveness at distance $r = 0$. It should be noted that, in practice, Equation (10) is often replaced by Equation (11) [1]:
$$ \beta(r) = \frac{\beta_0}{1 + \gamma r^2} \qquad (11) $$
Based on the above, the basic FA search equation, by which a random individual $i$ moves in iteration $t + 1$ to a new location $x_i$ towards an individual $j$ with greater fitness, is given as [1]:
$$ x_i^{t+1} = x_i^t + \beta_0 \cdot e^{-\gamma r_{i,j}^2}\left(x_j^t - x_i^t\right) + \alpha^t\left(\kappa - 0.5\right) \qquad (12) $$
where $\alpha$ stands for the randomization parameter, $\kappa$ is a random number drawn from a Gaussian or uniform distribution, and $r_{i,j}$ represents the distance between the two observed fireflies $i$ and $j$. Typical values of $\beta_0$ and $\alpha$ that establish satisfying results for most problems are 1 and $[0, 1]$, respectively.
The distance $r_{i,j}$ is the Cartesian distance, calculated using Equation (13):
$$ r_{i,j} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{D}\left(x_{i,k} - x_{j,k}\right)^2} \qquad (13) $$
where $D$ marks the number of problem-specific parameters.
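For reference, the following is a minimal NumPy sketch of the basic FA move in Equations (12) and (13) (together with the dynamic randomization parameter of Equation (14) introduced in the next subsection); the population values, parameter settings, and the uniform choice for $\kappa$ are assumptions made for illustration.
```python
import numpy as np

rng = np.random.default_rng(0)

def fa_move(x_i, x_j, beta0, gamma, alpha):
    """Eq. (12): move firefly i towards a brighter firefly j."""
    r_ij = np.linalg.norm(x_i - x_j)                 # Eq. (13), Cartesian distance
    beta = beta0 * np.exp(-gamma * r_ij ** 2)        # Eq. (10), attractiveness
    kappa = rng.random(x_i.shape)                    # uniform random in [0, 1]
    return x_i + beta * (x_j - x_i) + alpha * (kappa - 0.5)

def update_alpha(alpha_t, t, T):
    """Eq. (14): dynamic randomization parameter."""
    return alpha_t * (1.0 - t / T)

# Usage on a hypothetical 10-dimensional problem
D = 10
x_i = rng.uniform(-100, 100, D)
x_j = rng.uniform(-100, 100, D)
x_new = fa_move(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.5)
```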

3.2. Motivation for Improvements

Notwithstanding the outstanding performance of the original FA on many benchmarks [38] and practical challenges [39], findings of previous studies suggest that the basic FA shows some deficiencies in terms of insufficient exploration and an inadequate intensification–diversification balance [40,41,42]. The lack of diversification is particularly emphasized in early iterations, when, in some runs, the algorithm is not able to converge to optimal search space regions, and ultimately worse mean values are obtained. In such scenarios, the basic FA search procedure (Equation (12)), which primarily conducts exploitation, is not able to guide the search towards optimum domains. Conversely, when random solutions are generated by chance in the optimal or near-optimal regions during the initialization phase, the FA manages to obtain satisfying results.
Further, by analyzing the fundamental FA search equation (Equation (12)), it can be observed that it does not encompass an explicit exploration procedure. To address this issue, some FA implementations utilize a dynamic randomization parameter $\alpha$, which is gradually decreased from its initial value $\alpha_0$ towards a predefined threshold $\alpha_{min}$, as shown in Equation (14). In this way, exploration is more emphasized at the beginning of a run, while in later iterations, the balance between intensification and diversification moves towards exploitation [43]. However, based on extensive empirical simulations, it was deduced that the application of dynamic $\alpha$ alone is not enough to enhance FA exploration abilities, and this mechanism only slightly alleviates the issue.
$$ \alpha^{t+1} = \alpha^{t} \cdot \left(1 - \frac{t}{T}\right), \qquad (14) $$
where $t$ and $t+1$ denote the current and next iteration, respectively, while $T$ is the maximum iteration number in one run of the algorithm.
It is also worth noting that previous studies show that the FA exploitation abilities are efficient in tackling various kinds of tasks, and the FA is known as a metaheuristic with robust exploitation capabilities [40,41,42].

3.3. Novel FA Metaheuristics

A novel FA approach proposed in this study addresses issues of the basic FA by assimilating the following procedures:
  • Explicit exploration mechanism based on the solution’s exhaustiveness;
  • gBest chaotic local search (CLS) strategy.
Notwithstanding the outstanding exploitation capabilities of the original FA, intensification can be further improved by using the CLS mechanism, as shown in the empirical section of this manuscript.
Motivated by the proposed enhancements, the novel FA is named chaotic FA with enhanced exploration (CFAEE).

3.3.1. Explicit Exploration Mechanism

The goal of the explicit exploration procedure is to assure that the algorithm converges to the optimum part of the search space in early iterations, while in late phases of execution, it facilitates exploration around the parameter boundaries of the current best individual $x^*$. To incorporate this behavior, each solution is modeled with an additional attribute $trial$, which is incremented every time the solution cannot be improved by the basic FA search (Equation (12)). When the $trial$ parameter of a particular solution reaches a predetermined $limit$ value, the individual is replaced with a random solution drawn from within the boundaries of the search space by utilizing the same procedure as in the initialization phase:
$$ x_{i,j} = l_j + (u_j - l_j) \cdot rand, \qquad (15) $$
where $x_{i,j}$ represents the $j$-th component of the $i$-th individual, $u_j$ and $l_j$ denote the upper and lower search boundaries of the $j$-th parameter, and $rand$ is a uniformly distributed random number from the interval $[0, 1]$.
A solution whose $trial$ exceeds the $limit$ is said to have become exhausted. This idea, as well as the terminology, was adapted from the well-known ABC metaheuristics [44], which is known for its efficient exploration mechanism [45].
Replacing an exhausted solution with a pseudo-random individual stirs up search performance in early iterations, when the algorithm has not yet identified proper parts of the search region. However, in later iterations, following the reasonable assumption that the optimal region has been found, this kind of replacement wastes function evaluations. For that reason, in later iterations, the random replacement procedure is exchanged for a guided replacement mechanism around the lower and upper parameter values of all solutions in the population:
$$ x_{i,j} = Pl_j + (Pu_j - Pl_j) \cdot rand, \qquad (16) $$
where $Pl_j$ and $Pu_j$ represent the lowest and highest values of the $j$-th component in the entire population $P$.
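A minimal sketch of the explicit exploration mechanism described above is given below; the trial-counter bookkeeping and the phase flag are illustrative assumptions about how Equations (15) and (16) can be wired together, not the exact implementation.
```python
import numpy as np

rng = np.random.default_rng(0)

def random_replacement(lb, ub):
    """Eq. (15): draw a fresh solution within the original search boundaries
    (lb and ub are NumPy arrays of per-component bounds)."""
    return lb + (ub - lb) * rng.random(lb.shape)

def guided_replacement(population):
    """Eq. (16): draw a solution within the per-component bounds Pl_j and Pu_j
    of the current population."""
    pl, pu = population.min(axis=0), population.max(axis=0)
    return pl + (pu - pl) * rng.random(pl.shape)

def replace_exhausted(population, trials, limit, lb, ub, early_phase):
    """Replace every solution whose trial counter reached the limit;
    random replacement in early iterations, guided replacement later."""
    for i in range(population.shape[0]):
        if trials[i] >= limit:
            population[i] = (random_replacement(lb, ub) if early_phase
                             else guided_replacement(population))
            trials[i] = 0
    return population, trials
```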

3.3.2. The gBest CLS Strategy

Chaos, as a random-like phenomenon, exists in non-linear deterministic systems and is highly responsive to its initial conditions [46]. From the mathematical perspective, chaotic search is more efficient than ergodic search [47], because a vast number of sequences can be generated by only tweaking its initial values.
Notwithstanding that many chaotic maps exist in the modern literature, after conducting empirical experiments, it was concluded that, in the case of the proposed novel FA, the logistic map obtains the most promising results. We note that the logistic map has been utilized in many swarm intelligence approaches [48,49,50].
The logistic map that the proposed method utilizes executes in $K$ steps and is defined as:
$$ \sigma_{i,j}^{k+1} = \mu\, \sigma_{i,j}^{k}\left(1 - \sigma_{i,j}^{k}\right), \quad k = 1, 2, \ldots, K, \qquad (17) $$
where $\sigma_{i,j}^{k}$ and $\sigma_{i,j}^{k+1}$ represent the chaotic variable for the $j$-th component of the $i$-th solution in steps $k$ and $k+1$, respectively, and $\mu$ is the control variable. The initial value satisfies $\sigma_{i,j} \in (0, 1)$ with $\sigma_{i,j} \notin \{0.25, 0.5, 0.75\}$, and $\mu$ is set to 4, since this value was previously determined empirically [50].
The proposed method incorporates the global best (gBest) CLS strategy because the chaotic search is performed around the $x^*$ solution. In each step $k$, a new $x^*$, denoted as $x^{*\prime}$, is generated with Equations (18) and (19), which are applied to each component $j$ of $x^*$:
$$ x_j^{*\prime} = (1 - \lambda)\, x_j^{*} + \lambda S_j \qquad (18) $$
$$ S_j = l_j + \sigma_j^{k}\left(u_j - l_j\right) \qquad (19) $$
where $\sigma_j^{k}$ is determined by Equation (17) and $\lambda$ is a dynamic shrinkage parameter that depends on the current fitness function evaluation ($FFE$) and on the maximum number of fitness function evaluations ($maxFFE$) in the run:
$$ \lambda = \frac{maxFFE - FFE + 1}{maxFFE} \qquad (20) $$
By using dynamic $\lambda$, a better exploitation–exploration equilibrium around $x^*$ is established. In earlier phases of the execution, a wider search radius around $x^*$ is explored, while in the later phases, a fine-tuned exploitation is performed. The $FFE$ and $maxFFE$ can be replaced with $t$ and $T$ when the maximum number of iterations is taken as the termination condition.
In this way, by using the CLS strategy, $x^*$ is (attempted to be) improved in $K$ steps; if $x^{*\prime}$ obtains better fitness than $x^*$, the CLS procedure is terminated and $x^*$ is replaced with $x^{*\prime}$. However, if $x^*$ could not be improved in $K$ steps, it is retained in the population.
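The gBest CLS strategy of Equations (17)–(20) can be sketched as follows; the fitness function, search bounds, and minimization assumption are placeholders rather than the exact experimental setup.
```python
import numpy as np

def gbest_cls(x_best, f_best, fitness, lb, ub, K, ffe, max_ffe, rng):
    """Chaotic local search around the current best solution x*;
    returns the (possibly improved) solution and its fitness."""
    lam = (max_ffe - ffe + 1) / max_ffe                  # Eq. (20), shrinkage parameter
    sigma = rng.uniform(0.01, 0.99, size=x_best.shape)   # initial chaotic values in (0, 1)
    for _ in range(K):
        sigma = 4.0 * sigma * (1.0 - sigma)              # Eq. (17), logistic map with mu = 4
        s = lb + sigma * (ub - lb)                       # Eq. (19)
        x_new = (1.0 - lam) * x_best + lam * s           # Eq. (18)
        f_new = fitness(x_new)
        if f_new < f_best:                               # minimization assumed
            return x_new, f_new                          # improvement found, stop CLS
    return x_best, f_best                                # retain x* if no improvement
```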

3.3.3. Chaotic FA with Enhanced Exploration Pseudo-Code

In order to efficiently incorporate the exploration mechanism and the gBest CLS strategy into the original FA, a few things should be considered. First, as already suggested in Section 3.3.1, in the early phases of the execution, the random replacement mechanism should be conducted, while in later phases, the guided one generates better results. Second, the gBest CLS strategy would not generate significant improvements in early iterations, because $x^*$ has likely not yet converged to the optimum region, and it would just waste $FFE$s.
To control the above-mentioned behavior, an additional control parameter $\psi$ is included in the following way: if $t < \psi$, the exhausted solutions in the population are replaced with random ones (Equation (15)) and the gBest CLS is not executed; if $t \geq \psi$, the guided replacement mechanism is executed (Equation (16)) and the gBest CLS is triggered.
Moreover, to fine-tune the basic FA search, the proposed method utilizes dynamic $\alpha$, according to Equation (14).
Taking all of the above into account, the pseudo-code of the proposed CFAEE is summarized in Algorithm 1.
Algorithm 1 The CFAEE pseudo-code
Initialize the main metaheuristics control parameters $N$ and $T$
Initialize the search space parameters $D$, $u_j$ and $l_j$
Initialize the CFAEE control parameters $\gamma$, $\beta_0$, $\alpha_0$, $\alpha_{min}$, $K$ and $\psi$
Generate the initial random population $P_{init} = \{x_{i,j}\}$, $i = 1, 2, \ldots, N$; $j = 1, 2, \ldots, D$ using Equation (15) in the search space
while $t < T$ do
 for $i = 1$ to $N$ do
  for $z = 1$ to $i$ do
   if $I_z < I_i$ then
    Move solution $z$ in the direction of individual $i$ in $D$ dimensions (Equation (12))
    Attractiveness changes with distance $r$ as $\exp[-\gamma r^2]$ (Equation (10))
    Evaluate the new solution, replace the worse individual with the better one and update the light intensity (fitness)
   end if
  end for
 end for
 if $t < \psi$ then
  Replace all solutions for which $trial = limit$ with random ones using Equation (15)
 else
  Replace all solutions for which $trial = limit$ with guided replacement using Equation (16)
  for $k = 1$ to $K$ do
   Perform gBest CLS around $x^*$ using Equations (17)–(19) and generate $x^{*\prime}$
   Retain the better solution between $x^*$ and $x^{*\prime}$
  end for
 end if
 Update $\alpha$ and $\lambda$ according to Equations (14) and (20), respectively
end while
Return the best individual $x^*$ from the population
Post-process the results and perform visualization

3.3.4. The CFAEE Complexity and Drawbacks

The number of FFEs can be taken as a metric to determine the complexity of a swarm intelligence algorithm, because the most computationally expensive part is the objective evaluation [38]. The basic FA evaluates the objective function in the initialization and solution updating phases. While updating solutions according to Equation (12), the FA employs one main loop over $T$ iterations and two inner loops going through $N$ solutions [38].
Thus, including the initialization phase, the worst-case complexity of the basic FA metaheuristics is $O(N) + O(N^2 \cdot T)$. However, if $N$ is relatively large, it is possible to use one inner loop by ranking the attractiveness or brightness of all fireflies using sorting algorithms, and in this case, the complexity is $O(N) + O(N \cdot T \cdot \log(N))$ [38].
The complexity of the proposed CFAEE is higher than that of the original FA due to the application of the explicit exploration mechanism and the gBest CLS strategy. In the worst-case scenario, if $limit = 0$, all solutions will be replaced in every iteration, and the gBest CLS strategy will be triggered throughout the whole run if $\psi = 0$. Assuming that the value of $K$ is set to 4, the worst-case CFAEE complexity is given as: $O(N) + O(T \cdot N^2) + O(T \cdot N) + O(4 \cdot T)$. However, in practice, the complexity is much lower because of the $limit$ and $\psi$ control parameter adjustments.
Drawbacks of the proposed CFAEE over the original version involve the utilization of two additional control parameters, $limit$ and $\psi$. However, by conducting empirical simulations, the values of these parameters can be determined relatively easily. Moreover, the employment of these two parameters is justified because the CFAEE exhibits substantial performance improvements over the original FA for benchmark challenges and for the dropout regularization challenge from the machine learning domain, as shown in Section 4 and Section 5.

4. Bound-Constrained Benchmark Simulations

The proposed novel FA was first rigorously tested on a set of standard bound-constrained benchmarks that encompass functions from the well-known Congress on Evolutionary Computation (CEC) benchmark suite and other notable instances. The first benchmark set consists of 18 carefully chosen complex uni-modal, multi-modal, and two-dimensional functions, with the goal of determining the convergence speed and exploration ability of the proposed method. A comparative analysis was performed against other state-of-the-art FA versions. The purpose of the second benchmark set, which includes challenging CEC2017 unconstrained functions, is to measure the robustness and efficiency of the proposed CFAEE against other state-of-the-art swarm intelligence metaheuristics.

4.1. Experimental Setup

Due to the stochastic nature of metaheuristics, the only way to determine proper control parameter values is by performing a “trial and error” approach on a wider set of theoretical problems, such as extensively utilized bound-constrained benchmarks. Afterwards, the results for a set of independent runs are averaged, and control parameters that obtain the best mean performances are utilized in further experiments. This is usual practice for establishing proper control parameter values for novel and improved implementations of existing metaheuristics approaches [1,51,52,53].
Following the above-mentioned firmly established practice, the optimal (or near-optimal) CFAEE control parameter setup was determined by conducting extensive simulations on classical unconstrained benchmarks. The goal was to find control parameter values that would, on average, for all test instances, accomplish satisfying results.
The CFAEE control parameter values utilized in both bound-constrained simulations are shown in Table 1. Since the CFAEE may utilize a different number of $FFE$ in each run, the $maxFFE$ is used as the termination criterion instead of $T$. Expressions for calculating the values of the $limit$ and $\psi$ parameters were also determined empirically.
Both bound-constrained experiments were executed in 50 independent runs and all methods included in the comparative analysis were implemented for the purpose of this research. All algorithms were implemented in Python by using core (built-in), as well as specific data science and machine learning Python libraries: NumPy, SciPy, pandas, scikit-learn, pyplot, and seaborn.
All experiments were conducted on a computer platform with an Intel® Core™ i7-8700K CPU and 32 GB of RAM, running the Windows 10 64-bit operating system.

4.2. Benchmark Problem Set 1

The goal of the first bound-constrained simulation was to validate the convergence speed and exploration ability of the proposed method against other state-of-the-art FA approaches. The same ‘opponent’ algorithms and the same test beds as in [54] were included in the analysis.
Table 2 provides details of the carefully chosen unconstrained benchmark instances used in the experiments. Tests $f_1$, $f_3$, $f_4$, $f_5$, $f_6$, $f_7$, $f_{14}$, and $f_{15}$ are provided by the CEC benchmark suite. The remaining functions represent basic tests used to evaluate the convergence of algorithms and the quality of solutions. In addition, all test functions possess diverse characteristics. Firstly, the complex uni-modal functions that have only the global optimum are $f_1$, $f_2$, $f_5$, $f_7$, $f_8$, $f_{12}$, and $f_{14}$; these functions are used for convergence speed testing. Secondly, the multi-modal functions with a variety of local optima are $f_3$, $f_4$, $f_6$, $f_9$, $f_{10}$, $f_{11}$, $f_{13}$, and $f_{15}$; their purpose is to test the ability of an algorithm to escape from local solutions, which is the measure of exploration ability. Finally, the highly complex two-dimensional functions with various local minima, $f_{16}$, $f_{17}$, and $f_{18}$, are also included.
The state-of-the-art FA versions included in the comparative analysis are the following: the dynamic adaptive weight firefly algorithm (WFA) [55], the chaotic FA based on the logistic map (CLFA) [56], the Levy flights FA (LFA) [57], the variable step size firefly algorithm for numerical optimization (VSSFA) [58], and the dynamically adaptive firefly algorithm with global orientation (GDAFA) [54].
In [54], all of the above-mentioned FA approaches were tested with $N = 20$ and $T = 1000$ per run, which, in the worst case, yields a total of 400,040 $FFE$ (please refer to Section 3.3.4). However, it was determined empirically that not all $N \cdot N$ evaluations were executed in each iteration, and that the best approximation is $FFE/2.5$, which is around 160,000. Thus, in this research, the experiments provided in [54] were recreated with $FFE$ = 160,000 for all methods in order to establish a fair comparative analysis, because the proposed CFAEE utilizes more $FFE$ in each iteration than the other opponent methods. The basic control parameter setups for all FA versions are the same, as shown in Table 1; for their other specific parameters, please refer to [54].
It should be noted that, for all methods except the basic FA, similar results as in [54] were obtained. In the conducted experiments, the basic FA with the dynamic parameter $\alpha$ (Equation (14)) was used, and much better results than those reported in [54] were obtained. The authors in [54] implemented a static FA approach, and the other proposed improved FA methods established better performance.
All simulations were conducted with 10, 30, and 100 dimensions ($D = [10, 30, 100]$) for benchmark function instances $f_1$ to $f_{15}$, and the comparative analysis results are summarized in Table 3, Table 4 and Table 5, respectively. Comparative analysis results for the two-dimensional functions ($f_{16}$–$f_{18}$) are provided in Table 6. In all simulations, the best, worst, and mean values averaged over 50 runs are reported. The results in bold and slightly larger font denote the algorithm that showed the best result for that performance metric.
The overall conclusion from all presented results is that the two best methods are the proposed CFAEE and the GDAFA. Benchmark instances with $D = 10$ are relatively easy to optimize, and both methods obtained optimum results in each run for all benchmarks. The most significant performance difference between the original FA and the other methods can be observed in the $f_{14}$ test, where the basic version completely failed to converge to the optimum region. On the other hand, the basic FA showed very competitive results for the $f_3$ benchmark.
When the benchmarks with $D = 30$ are considered, the proposed CFAEE again obtained superior results, leaving the GDAFA approach in second place. The superiority of the CFAEE can be seen in the $f_5$, $f_7$, $f_8$, and $f_{13}$ benchmarks, where the difference between the CFAEE (first), followed by the GDAFA (second), and all other observed algorithms was the most significant. It is also worth noting that the basic FA implementation again performed well and exhibited competitive performance on the test instances $f_1$, $f_2$, $f_5$, $f_9$, and $f_{10}$, where it outperformed several other enhanced FA implementations.
When the most complex benchmarks ($D = 100$) are observed, the superiority of the proposed CFAEE can be seen once more. This is most obvious in the test instances $f_7$, $f_8$, and $f_{13}$, where the performance of the CFAEE (first), followed closely by the GDAFA (second), was by far the best compared to all other algorithms, with the most significant difference. The GDAFA, on the other hand, performed very well in the test instances $f_6$, $f_9$, and $f_{14}$, finishing in first place, in front of the proposed CFAEE. Again, similar to the $D = 10$ and $D = 30$ benchmarks, the basic FA implementation was very competitive, which can easily be seen for the $f_1$ and $f_6$ benchmarks, where the basic FA performance was close to the CFAEE and GDAFA, while leaving the other enhanced FA implementations behind.
Finally, for the instances with only two dimensions (Table 6), all methods except the FA and WFA managed to reach the optimum in all runs. These complex functions exhibit many local optima, and the FA and WFA did not show satisfactory exploration ability in all runs. This issue of the basic FA is described in Section 3.2.
To make the performance differences clearer for the readers, the number of times that each algorithm outperformed all others, per benchmark and per performance indicator, is counted in Table 7.
Further, to determine whether there is a statistically significant difference in the results, we applied the Wilcoxon signed-rank test to perform pair-wise comparisons between the proposed CFAEE and the other improved FA versions, as well as the original FA algorithm, for the 100-dimensional simulations (Table 5). Following the usual practice for determining whether the results come from different distributions, a significance level of $\alpha = 0.05$ was taken. It should be noted that the results for $D = 10$ and $D = 30$ do not exhibit statistically significant differences, since low-dimensional and medium-dimensional problems are easy tasks for all methods included in the analysis.
The results of the Wilcoxon signed-rank test are summarized in Table 8. As can be seen from the presented table, the calculated p-value is less than the critical level $\alpha = 0.05$ in all cases, and it can be concluded that the proposed CFAEE, on average, significantly outperforms all other approaches.
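For reproducibility, a pair-wise comparison of this kind can be carried out with scipy.stats.wilcoxon; the two vectors below are hypothetical placeholders for the per-function mean results of two algorithms, not values from Table 5.
```python
from scipy.stats import wilcoxon

# Placeholder vectors: mean result per benchmark function for two algorithms
cfaee_means = [1.2e-8, 3.4e-5, 7.7e-9, 2.1e-3, 5.5e-7]
other_means = [4.7e-6, 8.1e-4, 1.3e-5, 9.9e-3, 2.2e-4]

# Two-sided Wilcoxon signed-rank test on the paired differences
stat, p_value = wilcoxon(cfaee_means, other_means)
significant = p_value < 0.05   # significance level alpha = 0.05
```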
Convergence speed graphs of some functions, averaged over 50 runs for all metaheuristics taken for comparative analysis in Table 5, are shown in Figure 1.

4.3. Benchmark Problem Set 2

The second bound-constrained validation of the proposed CFAEE was conducted on the very challenging CEC 2017 benchmark suite [59]. The suite is composed of 30 benchmarks divided into 4 groups: $F_1$–$F_3$ are uni-modal, $F_4$–$F_{10}$ are multi-modal, $F_{11}$–$F_{20}$ belong to the class of hybrid functions, while tests $F_{21}$–$F_{30}$ are very challenging composite functions. The last group contains properties of all uni-modal, multi-modal, and hybrid functions; moreover, these functions are shifted and rotated.
Test instance $F_2$ was removed from the test suite due to its unstable behavior [60], and its results are not reported. Basic details of the CEC 2017 instances are given in Table 9.
Simulations were executed with 30-dimensional instances ($D = 30$), and the mean (average) and standard deviation (std) results over 50 runs are reported. The proposed CFAEE is compared against the basic FA with dynamic $\alpha$, the state-of-the-art improved Harris hawks optimization (IHHO) presented in [61], and other well-known efficient nature-inspired metaheuristics: HHO, DE, GOA, GWO, MFO, MVO, PSO, WOA, and SCA.
In this study, the same experimental setup as in [61] was recreated. The study in [61] reports results with $N = 30$ and $T = 500$. However, as in the case of the first unconstrained experiment, since the CFAEE utilizes more $FFE$ in each run, the $maxFFE$ is used as the termination criterion. All approaches included in the comparative analysis employ one $FFE$ per solution in the initialization and update phases, and to conduct an unbiased comparison, $maxFFE$ was set to 15,030 ($N + N \cdot T$). Control parameter adjustments of the opponent methods can be retrieved from [61].
Comparative analysis results for the CEC 2017 benchmark suite are reported in Table 10. The best results for each performance indicator and instance are marked in bold. Moreover, if two or more algorithms obtained the same result, which was at the same time the best, these results are also underlined. Very similar results as in [61] were obtained, but with subtle discrepancies due to the stochastic nature of metaheuristics.
Table 10 shows that the CFAEE achieved the best results on 21 functions: F1, F3, F5, F6, F7, F8, F11, F12, F13, F15, F17, F19, F20, F21, F22, F23, F25, F26, F28, F29, and F30. In some of these cases, the CFAEE was tied with another algorithm for the best result; in those cases, both results are shown in bold. On these functions, the CFAEE outperformed every other algorithm, including the IHHO.
On some functions from the previously mentioned set, the CFAEE obtained the same results as the IHHO, and in those situations, they were tied for the best result. Such cases are F3, F6, F19, F21, and F29; these results are underlined and in bold. The CFAEE was not tied only with the IHHO: on some functions, it shared the best result with other algorithms, and those results are also underlined and in bold. For function F9, the two best algorithms were the MVO and PSO. The results of the CFAEE and PSO were tied for the best result on function F11. Finally, on functions F13 and F15, the CFAEE was tied with the DE for the best result.
In a minority of cases, the CFAEE was outperformed by the IHHO or another algorithm. The IHHO algorithm was better only on functions F4 and F14. The alternative best solutions only came from the PSO, MVO, and DE. The previously mentioned case of the PSO and MVO being tied for the best result on function F9 is one of them; two other cases where the PSO was the best are functions F10 and F16. The only other algorithm that performed better is the DE, in the cases of F18, F24, and F27.
It is important to note that, in no case, was the original FA better than the improved CFAEE version. For some functions, the CFAEE achieved vastly improved results, more than 1000 times better, as seen on F1. Large differences can be seen on other functions as well, such as F12, F13, F18, and F30.
Considering all of the mentioned cases, there is no doubt that the proposed CFAEE is superior not only to the original FA, but also to every other tested algorithm. Furthermore, the improvement is justified.
The Friedman test [62,63] and the two-way analysis of variance by ranks were performed to determine the significance of the differences between the novel CFAEE and the alternative methods used for comparison. This was conducted to establish the statistical significance of the enhancements, rather than only comparing the raw results. Table 11 and Table 12 present the results achieved by the 12 different algorithms over the 30 functions from the CEC2017 set for the Friedman test ranks and the aligned Friedman test ranks, respectively.
As seen in Table 11, one can conclude that the proposed CFAEE achieves better performance than the 10 other algorithms, as well as the original FA. The original HHO algorithm had an average ranking of 9.483, the modified IHHO had 3.138, and the original FA had 6.655. The improved CFAEE, with an average ranking of 1.551, was more than twice as good as the previous best solution, the IHHO.
Additionally, the Iman and Davenport test [64] was also performed, because the research in [65] proves that this test can provide better results in terms of precision than the $\chi^2$ statistic. A summary of the results from the Friedman and the Iman and Davenport tests can be seen in Table 13.
Upon completion of the calculations, the result of the Iman and Davenport test is 36.95, which, compared against the F-distribution critical value ($F(9, 9 \times 10) = 1.820$), is significantly higher; this test therefore also rejects $H_0$. Furthermore, the Friedman statistic ($\chi_r^2 = 181.50$) is larger than the $\chi^2$ critical value with 10 degrees of freedom at the significance level of $\alpha = 0.05$.
Consequently, it is possible to reject the null hypothesis ($H_0$), and it can be suggested that the CFAEE performed vastly better than the rest of the tested algorithms.
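The two statistics can be reproduced from a results matrix in the following way; the Iman and Davenport statistic is derived from the Friedman $\chi^2$ with $n$ problems and $k$ algorithms, and the small matrix below is a hypothetical placeholder, not the actual data behind Table 13.
```python
import numpy as np
from scipy.stats import friedmanchisquare, f as f_dist

# Placeholder results matrix: rows = benchmark functions, columns = algorithms
results = np.array([
    [1.0, 2.5, 3.1, 4.2],
    [0.9, 2.7, 2.9, 4.5],
    [1.1, 2.4, 3.3, 4.1],
    [0.8, 2.8, 3.0, 4.4],
    [1.2, 2.6, 3.2, 4.3],
])
n, k = results.shape

# Friedman chi-square, one sample (column) per algorithm
chi2_f, p_friedman = friedmanchisquare(*results.T)

# Iman-Davenport correction: F-distributed with (k-1, (k-1)(n-1)) degrees of freedom
f_id = ((n - 1) * chi2_f) / (n * (k - 1) - chi2_f)
p_iman_davenport = f_dist.sf(f_id, k - 1, (k - 1) * (n - 1))
```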
Since the null hypothesis was rejected by both performed statistical tests, the non-parametric post-hoc Holm step-down procedure was also conducted, and the results are presented in Table 14. In this procedure, all methods are sorted according to their p-value and compared with $\alpha/(k - i)$, where $k$ and $i$ represent the degree of freedom and the algorithm number, respectively. In this study, $\alpha$ was set to 0.05 and 0.1. Moreover, it should be noted that the p-value results are provided in scientific notation.
The results given in Table 14 suggest that the proposed algorithm significantly outperformed all opponent algorithms at both significance levels.
Finally, to establish a visual difference between the methods included in the comparison, the dispersion of the results over 50 runs for some benchmark instances and the better-performing methods is shown using box and whiskers diagrams in Figure 2 and Figure 3.

5. Dropout Estimation Simulations

In this section, an empirical study of the proposed CFAEE for a practical problem of dropout regularization in CNN is presented. A basic experimental setup (problem modeling, control parameter setup, and dataset details) is shown first, followed by a presentation of the obtained results, comparative analysis with other metaheuristics-based methods, and a discussion.
For experimental purposes, two CNN structures with default values provided by the Caffe library, which obtained modest performance on the employed datasets, were used. The purpose of the experiment was to further investigate the performance of metaheuristics for optimizing the dropout probability $d_p$. The same experimental conditions as in [5] were utilized.
All metaheuristics, as well as the CNN framework, were developed in Python using its core and data science libraries (scikit-learn, NumPy, SciPy, along with pandas and matplotlib for visualization) and the Keras API. Experiments were conducted on an Intel® Core™ i7-8700K CPU, 64 GB of RAM, and the Windows 10 OS, with 6 × NVIDIA GTX 1080 GPUs.

5.1. Basic Experimental Setup

The study proposed in this manuscript utilizes a similar research setup as shown in [5]. The four parameters that influence the CNN learning process and were taken into consideration in this study are: the learning rate $\eta$, the L1 regularization (penalty, momentum) $\alpha$, the L2 regularization (weight decay) $\lambda$, and the dropout probability $d_p$. However, in all experiments, the tuple ($\eta$, $\alpha$, $\lambda$) was fixed, while the metaheuristic approaches attempted to optimize only the $d_p$ parameter. Therefore, this problem belongs to the group of global optimization challenges, with only one parameter being optimized.
In the conducted simulations, two CNN architectures provided by the well-known Caffe library [66] examples (https://caffe.berkeleyvision.org/, accessed on 10 October 2021) were utilized, as in [5]. The first CNN architecture was used for performing classification tasks on the MNIST, Fashion-MNIST, Semeion, and USPS datasets, while the second was employed for the CIFAR-10 challenge. The only differences in CNN design compared to the provided Caffe CNNs are the following: an extra dropout layer was added, and for the Semeion and USPS simulations, the kernel size was set to 3 × 3 instead of 5 × 5 (as provided in Caffe), due to the lower image resolutions.
Graphical representation of the utilized CNN structures generated by the plot_model Keras function is shown in Figure 4.
The method was tested on five well-known image classification datasets:
  • MNIST—consists of images of handwritten digits “0–9”; it is divided into 60,000 training and 10,000 testing observations; image size 28 × 28 pixels gray-scale (http://yann.lecun.com/exdb/mnist/, accessed on 10 October 2021);
  • Fashion-MNIST—dataset of Zalando’s article images; it is comprised of different clothing images divided into 10 classes; it is split into 60,000 and 10,000 images used for training and testing, respectively; image size 28 × 28 pixels (https://github.com/zalandoresearch/fashion-mnist, accessed on 10 October 2021);
  • Semeion—includes a total of 1593 handwritten digits “0–9” images collected from 80 persons; digits are written accurately (normal way) and inaccurately (fast way); the original dataset is not split into training and testing; image size 16 × 16 grayscale and each pixel is binarized (https://archive.ics.uci.edu/ml/datasets/Semeion+Handwritten+Digit, accessed on 10 October 2021);
  • USPS—contains handwritten digits “0–9” images obtained from the envelopes of the United States Postal Service; dataset is split into 7291 training and 2007 testing images; image size 16 × 16 gray-scale (http://statweb.stanford.edu/tibs/ElemStatLearn/datasets/zip.info.txt, accessed on 10 October 2021);
  • CIFAR-10—consists of various images from 10 classes; subset of 80 million tiny images retrieved and collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton; divided into 50,000 images for training and 10,000 images for testing; image size 32 × 32 color-scale (http://www.cs.toronto.edu/kriz/cifar.html, accessed on 10 October 2021).
The total number of instances per class in the training and testing sets, for all datasets employed in the simulations, is shown in Figure 5. Although some datasets are unbalanced (they do not have the same number of observations for each class) in the original train and test sets, the original split was used in the experiments, and all metaheuristics were tested under the same experimental conditions. The only dataset that was not originally split into training and testing sets was Semeion; for the purpose of this study, it was manually divided into 400 and 993 observations used for training and testing, respectively, as suggested in [5].
The training set for each dataset was further divided into training and validation sets, while the same proportion of instances for each class was maintained. Data preprocessing was not applied. The dataset details, in terms of the split, along with the training batch size (provided in parentheses), are shown in Table 15. The same configuration was employed in [5].
The values of the $\eta$, $\alpha$, and $\lambda$ parameters, as well as the number of training epochs, were set to the default values provided by the Caffe library, with the only exception being the Semeion dataset. In this case, $\eta$ was set to 0.001 (not the Caffe default) due to the smaller number of images in the dataset. The $d_p$, which is subject to optimization, can take any continuous value from the range $[0, 1]$. The parameter setup is summarized in Table 16.
Each solution in the population represents one possible $d_p$ value. The fitness of a solution is calculated in the following way: the CNN with the given $d_p$ is generated, trained on the training set, and validated on the validation set with an early stopping condition (the early stopping is adjusted to 5% of the total number of training epochs); afterwards, the trained CNN is evaluated on the test set and the classification error rate $E_r$ is returned. The fitness is inversely proportional to $E_r$: $fit = 1/E_r$.
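As an illustration of the fitness evaluation described above, the sketch below builds a small Keras CNN with a candidate dropout rate, trains it with early stopping, and returns the reversed error rate; the architecture, epoch count, and optimizer are simplified assumptions and do not reproduce the exact Caffe-derived networks from Figure 4.
```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def fitness_of_dropout(dp, x_train, y_train, x_val, y_val, x_test, y_test,
                       num_classes=10, epochs=20):
    """Train a small CNN with dropout probability dp and return fit = 1 / Er."""
    model = keras.Sequential([
        keras.Input(shape=x_train.shape[1:]),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(dp),                        # the parameter being optimized
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # Early stopping patience set to 5% of the total number of training epochs
    early_stop = keras.callbacks.EarlyStopping(patience=max(1, epochs // 20))
    model.fit(x_train, y_train, validation_data=(x_val, y_val),
              epochs=epochs, callbacks=[early_stop], verbose=0)
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    error_rate = 1.0 - test_acc
    return 1.0 / max(error_rate, 1e-12)            # fitness is the reversed error rate
```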
All metaheuristics were tested with a total number of 77 $FFE$s. The study proposed in [5] evaluated the methods with $N = 7$ and $T = 10$, which also yielded a total of 77 $FFE$s (7 + 7 × 10).
With the goal of visualizing the CNN dropout regularization experiment flow and design, the general CFAEE flowchart and the flowchart for fitness calculation are shown in Figure 6.

5.2. Results, Comparative Analysis, and Discussion

For the purpose of the study proposed in [5], the bat algorithm (BA) [67], cuckoo search (CS) [68], FA [1], and particle swarm optimization (PSO) [69] metaheuristics were implemented and tested. Moreover, to compare the performance of the metaheuristics-determined $d_p$, the results of the standard Caffe CNN with dropout (Dropout Caffe) and without dropout (Caffe) are also provided.
In the study proposed in this paper, all above metaheuristics were also implemented and tested to validate results provided in [5]. Additionally, besides the CFAEE method proposed in this manuscript, the following approaches were also included in the analysis: elephant herding optimization (EHO) [70], whale optimization algorithm (WOA) [53], sine cosine algorithm (SCA) [51], salp swarm algorithm (SSA), grasshopper optimization algorithm (GOA) [52], and biogeography-based optimization (BBO) [71].
The CFAEE was tested with the same control parameter adjustments as in the bound-constrained experiments (Table 1). The control parameters of the other metaheuristic methods included in the analysis are summarized in Table 17.
All metaheuristic methods were tested in 20 separate runs, and the average reported accuracy was used as the comparison metric. Moreover, the mean obtained $d_p$ value is also shown in the comparison table. The comparative analysis results are shown in Table 18.
The results from Table 18 clearly indicate the superior performance of the proposed CFAEE method regarding the $d_p$ value that was subjected to the optimization process. On the MNIST dataset, the proposed CFAEE method obtained a superior accuracy of 99.23% with a determined $d_p$ value of 0.516. All other metaheuristic approaches obtained $d_p$ values below the standard Dropout Caffe value $d_p = 0.5$. In this particular case, the results clearly show that the $d_p$ value should be slightly greater than 0.5 in order to achieve better accuracy, and the proposed CFAEE method was the only one able to achieve this.
A similar conclusion can be derived for the Fashion-MNIST experiment. Most methods included in the analysis generated $d_p$ values lower than 0.5 and reported worse results than those achieved by the Dropout Caffe. However, the BA, SSA, FA, and the proposed CFAEE obtained better accuracy than the Dropout Caffe with $d_p > 0.5$.
On the Semeion dataset, the proposed CFAEE again obtained the best accuracy of 98.46%, with a dp value of 0.719. It is clear that, on this particular dataset, accuracy increases for dp values higher than the standard Dropout Caffe value of 0.5. The second best method was BA, which achieved an accuracy of 98.35% with dp = 0.692. The plain Caffe that does not employ dropout (dp = 0) achieved 97.62% on this dataset, while the Dropout Caffe approach (dp = 0.5) achieved an accuracy of 98.14%.
A similar pattern can be seen on the USPS dataset as well. The proposed CFAEE again achieved the best accuracy of 96.8%, with an obtained dp value of 0.845. As in the previous datasets, increasing the dp value leads to better accuracy. The second best method on this dataset was BA, which achieved 96.45% with dp = 0.762. The improvement in accuracy over the standard Caffe and Dropout Caffe methods is notable, as the proposed CFAEE achieved an accuracy approximately 1% higher than Caffe and about 0.6% higher than Dropout Caffe.
Finally, the results on the CIFAR-10 dataset show a different pattern: if dp is larger than the standard Dropout Caffe value (dp = 0.5), the performance starts to drop and accuracy decreases. In this case, the model drops out too many neurons and is not able to generalize well. At the same time, if dp is too small, the performance drops as well (similar to the standard Caffe, which uses dp = 0). It can be concluded that, on the CIFAR-10 dataset, the best performance is achieved for dp values slightly below 0.5. The proposed CFAEE method achieved the best accuracy of 72.32% with dp = 0.388, and it was the only method that found a dp value below 0.5, as all other metaheuristics determined dp values in the range [0.5, 1].
Finally, the original FA method showed only average performance, and the proposed CFAEE managed to substantially outscore its basic version in all tests. Therefore, as in the case of the unconstrained benchmarks, the improvements over the original approach were also validated on the practical challenge of dropout regularization.
Similarly to what was performed for unconstrained benchmark problem set 1 (Section 4.2), a Wilcoxon signed-rank test was conducted to establish whether there were significant differences in results between the proposed CFAEE and the other methods. The mean classification error rates generated over 20 independent runs and a critical level of α = 0.05 were used for the test.
Results of the Wilcoxon signed-rank test are shown in Table 19. The calculated p-values are in all cases lower than the critical level α = 0.05, which implies that the proposed CFAEE, on average, significantly outperformed all other approaches.
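The reported p-value of 3.125 × 10−2 is consistent with an exact one-sided Wilcoxon signed-rank test over the five per-dataset mean error rates when CFAEE is lower on all of them; the following SciPy snippet illustrates this under that assumption, using the CFAEE and BA columns of Table 19.

```python
from scipy.stats import wilcoxon

# Mean classification error rates over 20 runs (MNIST, Fashion-MNIST, Semeion, USPS, CIFAR-10),
# taken from Table 19 for the proposed CFAEE and for BA.
cfaee_err = [0.74, 7.27, 1.54, 3.12, 27.68]
ba_err    = [0.86, 7.44, 1.65, 3.55, 28.51]

stat, p = wilcoxon(cfaee_err, ba_err, alternative="less")   # one-sided: CFAEE errors are lower
print(p)   # 0.03125, i.e., 3.125 x 10^-2 < alpha = 0.05 when CFAEE wins on all five datasets
```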

6. Conclusions

This manuscript introduced a novel FA approach that enhances both the exploration and exploitation processes of the original method. The CFAEE incorporates an explicit exploration mechanism and a chaotic local search (CLS), and in this way the observed deficiencies of the original FA were suppressed.
Following recent practice in the optimization field, the introduced CFAEE algorithm was first tested on recent CEC benchmark function sets, and the obtained results were compared with other modern metaheuristic methods evaluated under the same experimental conditions. Additionally, statistical tests were executed and provided evidence that the enhanced FA algorithm significantly outscored the other methods, including the original FA.
The second part of the experiment focused on applying the proposed CFAEE to a practical CNN problem: optimization of the dropout probability value. Dropout is crucial for overfitting prevention, and tuning it is an important challenge in the machine learning domain. The CFAEE-driven CNN was tested on five standard datasets: MNIST, Fashion-MNIST, Semeion, USPS, and CIFAR-10. Furthermore, since the potential of metaheuristics for this type of challenge has not been sufficiently investigated, 10 other well-known swarm intelligence approaches were also implemented and tested on this problem. The accuracies achieved on those datasets indicate that the CFAEE has superior performance over the other methods, as well as a promising future in this area.
Accordingly, future work will focus on applying the proposed CFAEE method to other machine learning problems. Due to its promising performance, the CFAEE will be adapted and used to tackle other NP-hard problems, including challenges in wireless sensor networks and cloud computing. Finally, regularization in CNNs can be further addressed by utilizing the CFAEE to fine-tune the α and λ parameters, with the goal of obtaining even better classification accuracy. Moreover, the variables of the convolutional layers, such as the size and depth of the filters, can be parameterized through the CFAEE instead of more classical metaheuristic algorithms [73].

Author Contributions

Conceptualization, N.B., M.Z. and R.S.; methodology, N.B., T.B., A.P. and R.S.; software, N.B., T.B., M.Z.; validation, N.B. and R.S.; formal analysis, M.Z. and A.P.; investigation, T.A.R. and N.B.; data curation, T.A.R., N.B. and R.S.; writing—original draft preparation, A.P. and M.Z.; writing—review and editing, M.Z., T.A.R. and R.S.; visualization, T.B., M.Z. and A.P.; supervision, N.B. and R.S. All authors have read and agreed to the published version of the manuscript.

Funding

R. Stoean acknowledges funding from a grant from the Romanian Ministry of Research and Innovation, CCCDI–UEFISCDI, project number 178PCE/2021, PN-III-P4-ID-PCE-2020-0788, within PNCDI III. Part of her work was also supported by another grant from the Romanian Ministry of Research and Innovation, CCCDI–UEFISCDI, project number 408PED/2020, PN-III-P2-2.1-PED-2019-2227, within PNCDI III. N. Bacanin acknowledges funding from a grant from the Ministry of Education and Science of the Republic of Serbia, grant no. III-44006.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yang, X.S. Firefly Algorithms for Multimodal Optimization. In Stochastic Algorithms: Foundations and Applications; Watanabe, O., Zeugmann, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 169–178. [Google Scholar]
  2. Bezdan, T.; Cvetnic, D.; Gajic, L.; Zivkovic, M.; Strumberger, I.; Bacanin, N. Feature Selection by Firefly Algorithm with Improved Initialization Strategy. In Proceedings of the 7th Conference on the Engineering of Computer Based Systems, Novi Sad, Serbia, 26–27 May 2021; pp. 1–8. [Google Scholar]
  3. Bacanin, N.; Bezdan, T.; Venkatachalam, K.; Al-Turjman, F. Optimized convolutional neural network by firefly algorithm for magnetic resonance image classification of glioma brain tumor grade. J. Real Time Image Process. 2021, 18, 1085–1098. [Google Scholar] [CrossRef]
  4. Kumar, V.; Kumar, D. A systematic review on firefly algorithm: Past, present, and future. Arch. Comput. Methods Eng. 2021, 28, 3269–3291. [Google Scholar] [CrossRef]
  5. de Rosa, G.; Papa, J.; Yang, X.S. Handling Dropout Probability Estimation in Convolution Neural Networks Using Metaheuristics. Soft Comput. 2018, 22, 6147–6156. [Google Scholar] [CrossRef] [Green Version]
  6. Zivkovic, M.; Bacanin, N.; Venkatachalam, K.; Nayyar, A.; Djordjevic, A.; Strumberger, I.; Al-Turjman, F. COVID-19 cases prediction by using hybrid machine learning and beetle antennae search approach. Sustain. Cities Soc. 2021, 66, 102669. [Google Scholar] [CrossRef] [PubMed]
  7. Wainer, J.; Fonseca, P. How to tune the RBF SVM hyperparameters? An empirical evaluation of 18 search algorithms. Artif. Intell. Rev. 2021, 54, 4771–4797. [Google Scholar] [CrossRef]
  8. Basha, J.; Bacanin, N.; Vukobrat, N.; Zivkovic, M.; Venkatachalam, K.; Hubálovskỳ, S.; Trojovskỳ, P. Chaotic Harris Hawks Optimization with Quasi-Reflection-Based Learning: An Application to Enhance CNN Design. Sensors 2021, 21, 6654. [Google Scholar] [CrossRef]
  9. Beni, G. Swarm intelligence. Complex Soc. Behav. Syst. Game Theory Agent Based Model. 2020, 791–818. [Google Scholar] [CrossRef]
  10. Abraham, A.; Guo, H.; Liu, H. Swarm intelligence: Foundations, perspectives and applications. In Swarm Intelligent Systems; Springer: Berlin/Heidelberg, Germany, 2006; pp. 3–25. [Google Scholar]
  11. Li, M.W.; Wang, Y.T.; Geng, J.; Hong, W.C. Chaos cloud quantum bat hybrid optimization algorithm. Nonlinear Dyn. 2021, 103, 1167–1193. [Google Scholar] [CrossRef]
  12. Zivkovic, M.; Bacanin, N.; Tuba, E.; Strumberger, I.; Bezdan, T.; Tuba, M. Wireless Sensor Networks Life Time Optimization Based on the Improved Firefly Algorithm. In Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC), Limassol, Cyprus, 15–19 June 2020; pp. 1176–1181. [Google Scholar]
  13. Zivkovic, M.; Bacanin, N.; Zivkovic, T.; Strumberger, I.; Tuba, E.; Tuba, M. Enhanced Grey Wolf Algorithm for Energy Efficient Wireless Sensor Networks. In Proceedings of the 2020 Zooming Innovation in Consumer Technologies Conference (ZINC), Online, 26–27 May 2020; pp. 87–92. [Google Scholar]
  14. Bacanin, N.; Tuba, E.; Zivkovic, M.; Strumberger, I.; Tuba, M. Whale Optimization Algorithm with Exploratory Move for Wireless Sensor Networks Localization. In Proceedings of the International Conference on Hybrid Intelligent Systems, Sehore, India, 10–12 December 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 328–338. [Google Scholar]
  15. Zivkovic, M.; Zivkovic, T.; Venkatachalam, K.; Bacanin, N. Enhanced Dragonfly Algorithm Adapted for Wireless Sensor Network Lifetime Optimization. In Data Intelligence and Cognitive Informatics; Springer: Berlin/Heidelberg, Germany, 2021; pp. 803–817. [Google Scholar]
  16. Bacanin, N.; Bezdan, T.; Tuba, E.; Strumberger, I.; Tuba, M.; Zivkovic, M. Task scheduling in cloud computing environment by grey wolf optimizer. In Proceedings of the 2019 27th Telecommunications Forum (TELFOR), Belgrade, Serbia, 26–27 November 2019; pp. 1–4. [Google Scholar]
  17. Strumberger, I.; Bacanin, N.; Tuba, M.; Tuba, E. Resource scheduling in cloud computing based on a hybridized whale optimization algorithm. Appl. Sci. 2019, 9, 4893. [Google Scholar] [CrossRef] [Green Version]
  18. Bezdan, T.; Zivkovic, M.; Tuba, E.; Strumberger, I.; Bacanin, N.; Tuba, M. Glioma Brain Tumor Grade Classification from MRI Using Convolutional Neural Networks Designed by Modified FA. In Proceedings of the International Conference on Intelligent and Fuzzy Systems, Istanbul, Turkey, 21–23 July 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 955–963. [Google Scholar]
  19. Bacanin, N.; Bezdan, T.; Tuba, E.; Strumberger, I.; Tuba, M. Monarch butterfly optimization based convolutional neural network design. Mathematics 2020, 8, 936. [Google Scholar] [CrossRef]
  20. Zivkovic, M.; Venkatachalam, K.; Bacanin, N.; Djordjevic, A.; Antonijevic, M.; Strumberger, I.; Rashid, T.A. Hybrid Genetic Algorithm and Machine Learning Method for COVID-19 Cases Prediction. In Proceedings of the International Conference on Sustainable Expert Systems: ICSES 2020, Nepal, South Asia, 17–18 September 2020; Springer: Berlin/Heidelberg, Germany, 2021; Volume 176, p. 169. [Google Scholar]
  21. Milosevic, S.; Bezdan, T.; Zivkovic, M.; Bacanin, N.; Strumberger, I.; Tuba, M. Feed-Forward Neural Network Training by Hybrid Bat Algorithm. In Proceedings of the Modelling and Development of Intelligent Systems: 7th International Conference, MDIS 2020, Sibiu, Romania, 22–24 October 2020; Revised Selected Papers 7. Springer: Berlin/Heidelberg, Germany, 2021; pp. 52–66. [Google Scholar]
  22. Gajic, L.; Cvetnic, D.; Zivkovic, M.; Bezdan, T.; Bacanin, N.; Milosevic, S. Multi-layer Perceptron Training Using Hybridized Bat Algorithm. In Computational Vision and Bio-Inspired Computing; Springer: Berlin/Heidelberg, Germany, 2021; pp. 689–705. [Google Scholar]
  23. Hongtao, L.; Qinchuan, Z. Applications of deep convolutional neural network in computer vision. J. Data Acquis. Process. 2016, 31, 1–17. [Google Scholar]
  24. Xiao, T.; Xu, Y.; Yang, K.; Zhang, J.; Peng, Y.; Zhang, Z. The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 842–850. [Google Scholar]
  25. Zhang, Y.; Zhao, D.; Sun, J.; Zou, G.; Li, W. Adaptive convolutional neural network and its application in face recognition. Neural Process. Lett. 2016, 43, 389–399. [Google Scholar] [CrossRef]
  26. Lawrence, S.; Giles, C.L.; Tsoi, A.C.; Back, A.D. Face recognition: A convolutional neural-network approach. IEEE Trans. Neural Netw. 1997, 8, 98–113. [Google Scholar] [CrossRef] [Green Version]
  27. Ranjan, R.; Sankaranarayanan, S.; Castillo, C.D.; Chellappa, R. An all-in-one convolutional neural network for face analysis. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; pp. 17–24. [Google Scholar]
  28. Matsugu, M.; Mori, K.; Mitari, Y.; Kaneda, Y. Subject independent facial expression recognition with robust face detection using a convolutional neural network. Neural Netw. 2003, 16, 555–559. [Google Scholar] [CrossRef]
  29. Ramaiah, N.P.; Ijjina, E.P.; Mohan, C.K. Illumination invariant face recognition using convolutional neural networks. In Proceedings of the 2015 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES), Kozhikode, India, 19–21 February 2015; pp. 1–4. [Google Scholar]
  30. Simard, P.Y.; Steinkraus, D.; Platt, J.C. Best practices for convolutional neural networks applied to visual document analysis. In Proceedings of the ICDAR, Edinburgh, UK, 3–6 August 2003; Volume 3. [Google Scholar]
  31. Afzal, M.Z.; Capobianco, S.; Malik, M.I.; Marinai, S.; Breuel, T.M.; Dengel, A.; Liwicki, M. Deepdocclassifier: Document classification with deep convolutional neural network. In Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Tunis, Tunisia, 23–26 August 2015; pp. 1111–1115. [Google Scholar]
  32. Stoean, C.; Lichtblau, D. Author Identification Using Chaos Game Representation and Deep Learning. Mathematics 2020, 8, 1933. [Google Scholar] [CrossRef]
  33. Špetlík, R.; Franc, V.; Matas, J. Visual heart rate estimation with convolutional neural network. In Proceedings of the British Machine Vision Conference, Newcastle, UK, 3–6 September 2018; pp. 3–6. [Google Scholar]
  34. Li, Q.; Cai, W.; Wang, X.; Zhou, Y.; Feng, D.D.; Chen, M. Medical image classification with convolutional neural network. In Proceedings of the 2014 13th International Conference on Control Automation Robotics & Vision (ICARCV), Singapore, 10–12 December 2014; pp. 844–848. [Google Scholar]
  35. Ting, F.F.; Tan, Y.J.; Sim, K.S. Convolutional neural network improvement for breast cancer classification. Expert Syst. Appl. 2019, 120, 103–115. [Google Scholar] [CrossRef]
  36. Liu, Y.; Racah, E.; Correa, J.; Khosrowshahi, A.; Lavers, D.; Kunkel, K.; Wehner, M.; Collins, W. Application of deep convolutional neural networks for detecting extreme weather in climate datasets. arXiv 2016, arXiv:1605.01156. [Google Scholar]
  37. Chattopadhyay, A.; Hassanzadeh, P.; Pasha, S. Predicting clustered weather patterns: A test case for applications of convolutional neural networks to spatio-temporal climate data. Sci. Rep. 2020, 10, 1317. [Google Scholar] [CrossRef]
  38. Yang, X.S.; Xingshi, H. Firefly Algorithm: Recent Advances and Applications. Int. J. Swarm Intell. 2013, 1, 36–50. [Google Scholar] [CrossRef] [Green Version]
  39. Strumberger, I.; Tuba, E.; Bacanin, N.; Zivkovic, M.; Beko, M.; Tuba, M. Designing convolutional neural network architecture by the firefly algorithm. In Proceedings of the 2019 International Young Engineers Forum (YEF-ECE), Caparica, Portugal, 10 May 2019; pp. 59–65. [Google Scholar]
  40. Strumberger, I.; Bacanin, N.; Tuba, M. Enhanced Firefly Algorithm for Constrained Numerical Optimization, IEEE Congress on Evolutionary Computation. In Proceedings of the IEEE International Congress on Evolutionary Computation (CEC 2017), Donostia, Spain, 5–8 June 2017; pp. 2120–2127. [Google Scholar]
  41. Xu, G.H.; Zhang, T.W.; Lai, Q. A new firefly algorithm with mean condition partial attraction. Appl. Intell. 2021, 1–14. [Google Scholar] [CrossRef]
  42. Bacanin, N.; Tuba, M. Firefly Algorithm for Cardinality Constrained Mean-Variance Portfolio Optimization Problem with Entropy Diversity Constraint. Sci. World J. Spec. Issue Comput. Intell. Metaheuristic Algorithms Appl. 2014, 2014, 721521. [Google Scholar] [CrossRef] [PubMed]
  43. Wang, H.; Zhou, X.; Sun, H.; Yu, X.; Zhao, J.; Zhang, H.; Cui, L. Firefly algorithm with adaptive control parameters. Soft Comput. 2017, 3, 5091–5102. [Google Scholar] [CrossRef]
  44. Karaboga, D.; Basturk, B. On the performance of artificial bee colony (ABC) algorithm. Appl. Soft Comput. 2008, 8, 687–697. [Google Scholar] [CrossRef]
  45. Moradi, P.; Imanian, N.; Qader, N.N.; Jalili, M. Improving exploration property of velocity-based artificial bee colony algorithm using chaotic systems. Inf. Sci. 2018, 465, 130–143. [Google Scholar] [CrossRef]
  46. Alatas, B. Chaotic bee colony algorithms for global numerical optimization. Expert Syst. Appl. 2010, 37, 5682–5687. [Google Scholar] [CrossRef]
  47. dos Santos Coelho, L.; Mariani, V.C. Use of chaotic sequences in a biologically inspired algorithm for engineering design optimization. Expert Syst. Appl. 2008, 34, 1905–1913. [Google Scholar] [CrossRef]
  48. Li, C.; Zhou, J.; Xiao, J.; Xiao, H. Parameters identification of chaotic system by chaotic gravitational search algorithm. Chaos Solitons Fractals 2012, 45, 539–547. [Google Scholar] [CrossRef]
  49. Chen, H.; Xu, Y.; Wang, M.; Zhao, X. A balanced whale optimization algorithm for constrained engineering design problems. Appl. Math. Model. 2019, 71, 45–59. [Google Scholar] [CrossRef]
  50. Liang, X.; Cai, Z.; Wang, M.; Zhao, X.; Chen, H.; Li, C. Chaotic oppositional sine–cosine method for solving global optimization problems. Eng. Comput. 2020, 1–17. [Google Scholar] [CrossRef]
  51. Mirjalili, S. SCA: A sine cosine algorithm for solving optimization problems. Knowl. Based Syst. 2016, 96, 120–133. [Google Scholar] [CrossRef]
  52. Mirjalili, S.Z.; Mirjalili, S.; Saremi, S.; Faris, H.; Aljarah, I. Grasshopper optimization algorithm for multi-objective optimization problems. Appl. Intell. 2018, 48, 805–820. [Google Scholar] [CrossRef]
  53. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
  54. Liu, J.; Mao, Y.; Liu, X.; Li, Y. A dynamic adaptive firefly algorithm with globally orientation. Math. Comput. Simul. 2020, 174, 76–101. [Google Scholar] [CrossRef]
  55. Zhu, Q.; Xiao, Y.; Chen, W.; Ni, C.; Chen, Y. Research on the improved mobile robot localization approach based on firefly algorithm. Chin. J. Sci. Instrum. 2016, 37, 323–329. [Google Scholar]
  56. Kaveh, A.; Javadi, S. Chaos-based firefly algorithms for optimization of cyclically large-size braced steel domes with multiple frequency constraints. Comput. Struct. 2019, 214, 28–39. [Google Scholar] [CrossRef]
  57. Yang, X.S. Firefly Algorithm, Lévy Flights and Global Optimization. In Research and Development in Intelligent Systems XXVI; Bramer, M., Ellis, R., Petridis, M., Eds.; Springer: London, UK, 2010; pp. 209–218. [Google Scholar]
  58. Yu, S.; Zhu, S.; Ma, Y.; Mao, D. A variable step size firefly algorithm for numerical optimization. Appl. Math. Comput. 2015, 263, 214–220. [Google Scholar] [CrossRef]
  59. Awad, N.; Ali, M.; Liang, J.; Qu, B.; Suganthan, P. Problem Definitions and Evaluation Criteria for the CEC 2017 Special Session and Competition on Single Objective Real-Parameter Numerical Optimization. Technical Report, 2016. Available online: http://home.elka.pw.edu.pl/ (accessed on 4 October 2021). [Google Scholar]
  60. Gupta, S.; Deep, K. Improved sine cosine algorithm with crossover scheme for global optimization. Knowl. Based Syst. 2019, 165, 374–406. [Google Scholar] [CrossRef]
  61. Hussien, A.G.; Amin, M. A self-adaptive Harris Hawks optimization algorithm with opposition-based learning and chaotic local search strategy for global optimization and feature selection. Int. J. Mach. Learn. Cybern. 2021, 1–28. [Google Scholar] [CrossRef]
  62. Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 1937, 32, 675–701. [Google Scholar] [CrossRef]
  63. Friedman, M. A comparison of alternative tests of significance for the problem of m rankings. Ann. Math. Stat. 1940, 11, 86–92. [Google Scholar] [CrossRef]
  64. Iman, R.L.; Davenport, J.M. Approximations of the critical region of the fbietkan statistic. Commun. Stat. Theory Methods 1980, 9, 571–595. [Google Scholar] [CrossRef]
  65. Sheskin, D.J. Handbook of Parametric and Nonparametric Statistical Procedures; Chapman and Hall/CRC: Boca Raton, FL, USA, 2020. [Google Scholar]
  66. Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Guadarrama, S.; Darrell, T. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, FL, USA, 3–7 November 2014; pp. 675–678. [Google Scholar]
  67. Yang, X.S.; Hossein Gandomi, A. Bat algorithm: A novel approach for global engineering optimization. Eng. Comput. 2012, 29, 464–483. [Google Scholar] [CrossRef] [Green Version]
  68. Gandomi, A.H.; Yang, X.S.; Alavi, A.H. Cuckoo search algorithm: A metaheuristic approach to solve structural optimization problems. Eng. Comput. 2013, 29, 17–35. [Google Scholar] [CrossRef]
  69. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
  70. Wang, G.G.; Deb, S.; Gao, X.Z.; Coelho, L.D.S. A new metaheuristic optimisation algorithm motivated by elephant herding behaviour. Int. J. Bio-Inspired Comput. 2016, 8, 394–409. [Google Scholar] [CrossRef]
  71. Simon, D. Biogeography-based optimization. IEEE Trans. Evol. Comput. 2008, 12, 702–713. [Google Scholar] [CrossRef] [Green Version]
  72. Mirjalili, S.; Gandomi, A.H.; Mirjalili, S.Z.; Saremi, S.; Faris, H.; Mirjalili, S.M. Salp Swarm Algorithm: A bio-inspired optimizer for engineering design problems. Adv. Eng. Softw. 2017, 114, 163–191. [Google Scholar] [CrossRef]
  73. Stoean, R. Analysis on the potential of an EA–surrogate modelling tandem for deep learning parametrization: An example for cancer classification from medical images. Neural Comput. Appl. 2018, 32, 313–322. [Google Scholar] [CrossRef]
Figure 1. Mean convergence speed graphs for some benchmark instances (Benchmark set 1).
Figure 2. Dispersion of best results over runs for functions F4, F5, F7, F9, F10, F12 (Benchmark set 2).
Figure 3. Dispersion of best results over runs for functions F15, F18, F20, F22, F26, F30 (Benchmark set 2).
Figure 4. Example instance of MNIST/Fashion-MNIST/Semeion/USPS models (left) and example instance of CIFAR-10 model (right).
Figure 5. Number of instances per each class in the training and testing sets for MNIST, Fashion-MNIST, Semeion, USPS, and CIFAR-10 datasets.
Figure 6. (a) General CFAEE flowchart (left); (b) flowchart for fitness calculation (right).
Table 1. Setup of CFAEE control parameters.
Parameter and Notation | Value
Number of solutions N | 20 (benchmark 1), 30 (benchmark 2)
Maximum number of FFEs (maxFFE) | 160,000 (benchmark 1), 15,030 (benchmark 2)
Absorption coefficient γ | 1.0
Attractiveness at r = 0, β0 | 1.0
Randomization (step) α | changes according to Equation (14)
Initial value of step α0 | 0.5
Minimum value of step αmin | 0.1
Solutions' exhaustiveness limit | maxFFE/N · 2
CLS strategy step number K | 4
CLS strategy λ | changes according to Equation (20)
Parameter ϕ | maxFFE/2
Table 2. Function details for benchmarks problem set I.
IDNameSearch RangeFormulationOptimum
f1Sphere [ 100 , 100 ] D min f ( x ) = i = 1 D x i 2 0
f2Moved Axis Function [ 5.12 , 5.12 ] D min f ( x ) = i = 2 D 5 i x i 2 0
f3Griewank [ 100 , 100 ] D min f ( x ) = i = 1 D x i 2 4000 i = 1 D c o s ( x i i ) + 1 0
f4Rastrigin [ 5.12 , 5.12 ] D min f ( x ) = 10 n + i = 1 D [ x i 2 10 cos ( 2 π x i ) ] 0
f5The Schwefel’s Problem 1.2 [ 100 , 100 ] D min f ( x ) = i = 1 D j = 1 i x j 2 0
f6Ackley [ 32 , 32 ] D min f ( x ) = a × e x p ( b 1 n i = 1 n x i 2 ) e x p ( 1 D i = 1 n cos ( c x i ) ) + a + e x p ( 1 ) , where a = 20 , b = 0.2 0
f7Powell Sum [ 1 , 1 ] D min f ( x ) = i = 1 D | x i | i + 1 0
f8Sum Squares [ 10 , 10 ] D min f ( x ) = i = 1 D i x i 2 0
f9Schwefel 2.22 [ 100 , 100 ] D min f ( x ) = i = 1 D | x i | + i = 1 n | x i | 0
f10Powell Singular [ 4 , 5 ] D min f ( x ) = i = 1 D / 4 [ ( x 4 i 3 + 10 x 4 i 2 ) 2 + 5 ( x 4 i 1 x x i ) 2 + ( x 4 i 2 2 x 4 i 1 ) 4 + 10 ( x 4 i 3 + x 4 i ) 4 ] 0
f11Alpine [ 10 , 10 ] D min f ( x ) = i = 1 D x i sin x i + 0.1 x i 0
f12Inverse Cosine-Wave Function [ 100 , 100 ] D min f ( x ) = i = 1 D 1 ( e x p ( x i 2 + x i + 1 2 + 0.5 x i x i + 1 8 ) × cos ( 4 x i 2 + x i + 1 2 + 0.5 x i x i + 1 ) ) −D+1
f13Pathological [ 100 , 100 ] D min f ( x ) = i = 1 D 1 0.5 + sin 2 100 x i 2 + x i + 1 2 0.5 1 + 0.001 x i 2 2 x i x i + 1 + x i + 1 2 2 2 0
f14Discus [ 100 , 100 ] D min f ( x ) = 10 6 x 1 2 + i = 1 D x i 2 0
f15Happy Cat [ 2 , 2 ] D min f ( x ) = ( | | x i 2 | | D ) 2 α + 1 D ( 0.5 | | x i 2 | | + i = 1 D x i ) + 0.5 , where α = 1 4 0
f16Drop-Wave Function [ 5.2 , 5.2 ] D min f ( x ) = 1 + cos ( 12 x 1 2 + x 2 2 ) ( 0.5 ( x 1 2 + x 2 2 ) + 2 ) −1
f17Schaffer 2 [ 100 , 100 ] D min f ( x ) = 0.5 + sin 2 ( x 1 2 x 2 2 ) 2 0.5 1 + 0.001 ( x 1 2 + x 2 2 ) 2 0
f18Camel Function-Three Hump [ 5 , 5 ] D min f ( x ) = 2 x 1 2 1.05 x 1 4 + x 1 6 6 + x 1 x 2 + x 2 2 0
Table 3. Comparative analysis among CFAEE, original FA, and five other FA implementations for benchmarks with 10 dimensions.
FunctionAlgorithmBest ValueWorst ValueMean ValueFunctionAlgorithmBest ValueWorst ValueMean Value
f 1 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
5.019 × 10 3
0
0
2.91 × 10 5
0
0.531452
0
0.116521
0
0
6.07 × 10 7
0
0.151967
0
0.067858
0
0
f 9 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
5.18 × 10 2
0
0
1.00 × 10 3
0
0.735625
0
0.79956
0
0
2.00 × 10 5
0
0.327158
0
0.431151
0
0
f 2 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
1.64 × 10 3
0
0
5.15 × 10 7
0
5.74765
0
4.005821
0
0
1.06 × 10 8
0
1.32645
0
1.003456
0
0
f 10 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.25 × 10 6
0
0
0
4.82 × 10 2
0
0
3.29 × 10 6
0
14.95923
0
9.18410
0
0
2.28 × 10 6
0
2.736795
0
3.381069
0
0
f 3 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
9.60 × 10 4
0
0
0
8.63 × 10 5
0
0
9.60 × 10 4
0
5.32 × 10 2
0
1.65 × 10 2
0
0
9.60 × 10 4
0
7.45 × 10 3
0
7.32 × 10 3
0
0
f 11 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
1.96 × 10 2
0
0
3.02 × 10 7
0
0.451043
0
0.224532
0
0
6.30 × 10 9
0
0.145892
0
0.131779
0
0
f 4 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
5.969720
1.12073
0
0
2.537912
2.142703
0
5.969720
16.96541
4.352192
0
22.243001
27.135292
0
5.969720
7.962931
2.160021
0
10.984211
11.528380
0
f 12 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
−3.007700
−7.416352
−9
−9
−9
−7.982860
−9
−3.007700
−6.100051
−8.154811
−9
−6.738521
−5.318621
−9
−3.007740
−6.821470
−8.837092
−9
−7.182860
−6.730021
−9
f 5 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
1.15 × 10 2
0
0
1.35 × 10 5
0
1.216521
0
0.668310
0
0
2.81 × 10 7
0
0.371654
0
0.315237
0
0
f 13 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0.502000
1.35 × 10 3
0
0
8.63 × 10 4
1.64 × 10 4
0
0.502000
2.51 × 10 2
8.69 × 10 3
0
8.29 × 10 2
2.52 × 10 2
0
0.502000
7.95 × 10 3
1.72 × 10 3
0
1.82 × 10 2
8.44 × 10 3
0
f 6 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.46 × 10 14
8.88 × 10 16
8.88 × 10 16
8.88 × 10 16
7.63 × 10 2
8.88 × 10 16
8.88 × 10 16
1.90 × 10 3
8.88 × 10 16
1.156728
8.88 × 10 16
1.197652
8.88 × 10 16
8.88 × 10 16
3.8 × 10 5
8.88 × 10 16
0.363197
8.88 × 10 16
0.569403
8.88 × 10 16
8.88 × 10 16
f 14 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
643.025312
0
0
0
2.63 × 10 3
0
0
697.974622
0
0.634750
0
0.177280
0
0
644.124100
0
6.42 × 10 2
0
7.34 × 10 2
0
0
f 7 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.61 × 10 7
0
0
0
4.65 × 10 13
0
0
1.83 × 10 7
0
1.05 × 10 2
0
2.55 × 10 3
0
0
1.62 × 10 7
0
1.52 × 10 3
0
8.15 × 10 9
0
0
f 15 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.753800
0
0
0
0.622315
0
0
1.753800
0.432198
0.453921
0
0.978813
0.635291
0
1.753800
4.62 × 10 2
0.168663
0
0.860170
0.160825
0
f 8 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
1.69 × 10 2
0
0
1.08 × 10 6
0
1.413521
0
0.727350
0
0
2.24 × 10 8
0
0.237733
0
0.355913
0
0
Table 4. Comparative analysis among CFAEE, original FA, and five other FA implementations for benchmarks with 30 dimensions.
FunctionAlgorithmBest ValueWorst ValueMean ValueFunctionAlgorithmBest ValueWorst ValueMean Value
f 1 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
13.62752
9.542603
0
0.143725
8.69 × 10 2
0
3.30 × 10 4
20.168334
17.455290
3.71 × 10 4
0.329325
0.335712
6.25 × 10 6
8.47 × 10 6
17.05007
14.08179
1.61 × 10 5
0.237264
0.194190
1.41 × 10 6
f 9 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
15.668320
7.896512
0
0.759455
0.663490
0
7.15 × 10 3
18.532451
13.652705
4.84 × 10 3
1.652710
2.0693
5.17 × 10 3
7.43 × 10 4
17.168332
11.634482
6.22 × 10 4
1.444582
1.444582
6.09 × 10 4
f 2 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
952.735293
495.234239
0
9.0823
8.458129
0
5.73 × 10 4
1292.759201
932.959210
5.31 × 10 4
27.288553
35.736666
3.52 × 10 5
1.19 × 10 5
1151.53123
831.976505
4.63 × 10 5
15.382611
19.345189
4.52 × 10 6
f 10 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
5.52 × 10 4
740.533299
1297.755023
0
3.545252
5.3022
0
1.48 × 10 3
4352.542059
3675.442951
1.53 × 10 2
33.82541
28.982541
1.13 × 10 3
8.79 × 10 4
2953.135592
2626.920051
1.30 × 10 3
23.710392
17.315642
4.44 × 10 4
f 3 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
9.86 × 10 3
0.453243
0.331970
0
5.43 × 10 3
5.45 × 10 3
0
9.87 × 10 3
0.573032
0.511440
1.84 × 10 5
3.65 × 10 2
1.76 × 10 2
5.79 × 10 5
9.86 × 10 3
0.516954
0.446482
4.65 × 10 6
1.23 × 10 2
9.97 × 10 3
2.33 × 10 6
f 11 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.66 × 10 2
13.115620
6.718345
0
0.245052
0.133675
0
1.86 × 10 2
16.344592
13.539203
3.34 × 10 3
0.731462
0.475093
8.54 × 10 4
1.66 × 10 2
14.957239
10.529380
4.69 × 10 4
0.417792
0.312399
1.03 × 10 4
f 4 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
53.728352
91.368000
26.842502
0
10.562432
48.503233
0
53.728352
145.032962
47.888361
0.293775
70.887502
118.455291
0.163325
53.728352
131.851977
37.948270
6.33 × 10 2
52.398675
89.757932
2.31 × 10 2
f 12 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
−2.745302
−14.281512
−19.773059
−29
−27.135292
−19.932444
−29
−2.738143
−10.236442
−14.387294
−28.981153
−23.462555
−13.572562
−28.975432
−2.741055
−12.601748
−17.381692
−28.995732
−25.463931
−16.488942
−28.997240
f 5 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
188.932905
175.893044
0
1.363823
1.167251
0
8.10 × 10 4
249.742592
248.643292
1.33 × 10 4
5.757921
7.374155
7.35 × 10 5
1.69 × 10 5
229.451399
218.334752
1.51 × 10 5
3.464743
3.888032
5.65 × 10 6
f 13 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
4.821110
4.86 × 10 2
5.92 × 10 2
2.48 × 10 32
4.28 × 10 2
3.85 × 10 2
4.34 × 10 34
4.854329
9.37 × 10 2
0.135155
8.75 × 10 6
0.131320
9.92 × 10 2
3.13 × 10 6
4.831800
7.24 × 10 2
8.63 × 10 2
1.08 × 10 6
7.60 × 10 1
6.16 × 10 2
7.09 × 10 7
f 6 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
3.95 × 10 12
4.340823
3.011800
8.88 × 10 16
0.433632
0.447690
8.88 × 10 16
4.08 × 10 1
4.530665
3.725154
2.51 × 10 2
0.855143
2.853752
1.05 × 10 2
8.17 × 10 2
4.493832
3.383214
2.87 × 10 3
0.688177
1.1317517
4.65 × 10 3
f 14 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.26 × 10 + 4
16.331
12.645
0
3.54 × 10 2
0.10623
0
1.29 × 10 + 4
22.498
20.226
3.01 × 10 4
0.50956
0.42646
1.15 × 10 4
1.26 × 10 + 4
19.8985
16.6886
4.50 × 10 5
0.261277
0.244281
2.29 × 10 5
f 7 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
6.16e-05
123.295565
21.954921
0
4.57E-11
3.52e-08
0
6.18e-05
1158.432456
4319.824940
3.17e-38
1.38e-03
9.83e-03
4.91 × 10 39
6.17 × 10 5
541.478399
1345.324915
7.91 × 10 40
9.67 × 10 5
1.25 × 10 5
3.12 × 10 41
f 15 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
2.3302
2.262251
0.723335
0
1.175841
1.307425
0
2.3302
2.348725
1.466781
0.637052
1.371513
1.743721
0.592563
2.3302
2.305134
1.060788
9.92 × 10 2
1.294690
1.524977
9.98 × 10 2
f 8 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
152.832522
103.285692
0
1.534341
1.561
0
3.17e-04
275.964302
214.365219
5.93 × 10 4
4.781903
5.175238
2.22 × 10 4
8.12 × 10 5
235.952315
173.448925
4.62 × 10 5
3.255770
3.270697
9.29 × 10 6
Table 5. Comparative analysis among CFAEE, original FA, and five other FA implementations for benchmarks with 100 dimensions.
FunctionAlgorithmBest ValueWorst ValueMean ValueFunctionAlgorithmBest ValueWorst ValueMean Value
f 1 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
86.457552
84.743562
0
0.776975
6.0074
0
5.08 × 10 3
94.965352
101.550299
5.33 × 10 3
1.343821
9.086351
3.35 × 10 3
3.06 × 10 4
91.851742
95.331892
1.67 × 10 3
1.123621
7.397248
2.12 × 10 4
f 9 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
7.270000
75.261423
61.271532
0
4.667233
19.453222
0
7.345340
79.183492
69.287492
0.282980
7.217744
29.786432
0.253300
7.272472
76.237822
65.957970
6.75 × 10 2
5.953690
24.393170
8.03 × 10 2
f 2 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
6.45 × 10 2
20,155.732954
20,134.629495
0
183.584823
1439.319025
0
0.861329
23,097.569290
23,511.452949
2.732509
326.044592
2483.724942
0.525656
0.296353
21,435.685432
22,061.730052
0.263728
246.343291
1955.372492
0.255429
f 10 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0.113550
24,876.459003
24,626.324592
0
95.657422
688.787853
0
0.754291
29,942.359392
33,338.728942
0.769235
143.859235
1431.750099
0.621509
0.212700
27,295.176529
29,210.135929
6.36 × 10 2
116.538544
1058.681232
5.31 × 10 2
f 3 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.11 × 10 1
0.783852
0.746025
0
1.16 × 10 2
0.167315
0
0.354451
0.850443
0.837694
1.25 × 10 4
2.35 × 10 2
0.186983
1.38 × 10 4
0.195824
0.811365
0.799866
3.55 × 10 5
9.38 × 10 5
0.166489
2.13 × 10 5
f 11 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
14.775500
68.413502
58.357421
0
1.295332
7.397541
0
15.145392
73.163592
65.772001
2.27 × 10 2
2.282315
11.648522
2.48 × 10 2
14.950132
70.657632
61.465342
4.53 × 10 3
1.686960
9.200357
3.02 × 10 3
f 4 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
436.882200
638.513205
223.195002
0
113.543829
476.735252
0
551.395213
706.697495
263.465402
1.653533
213.352932
613.530234
1.293298
484.606492
668.543402
243.792502
0.589842
178.416452
558.464329
0.539520
f 12 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
-4.445052
−23.0423521
−44.356992
−99
−87.920501
−40.345210
−99
−8.728848
−19.167452
−34.123586
−98.835492
−79.465202
−27.446501
−98.872555
−6.178500
−21.911352
−37.588482
−98.947900
−83.891430
−36.452653
−98.962902
f 5 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
2428.940492
4012.652903
3888.542030
0
33.682005
293.459724
0
2592.352049
4633.727049
4683.634029
0.418092
72.436405
518.965567
0.435304
2433.183441
4315.743555
4365.895902
8.25 × 10 2
53.696570
385.652334
4.67 × 10 2
f 13 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
20.651103
0.162015
0.147652
1.53 × 10 32
0.142465
0.117900
3.65 × 10 33
20.770492
0.187683
0.191543
8.75 × 10 5
0.186110
0.169890
3.33 × 10 5
20.659945
0.169435
0.175026
1.30 × 10 5
0.163519
0.154551
9.66 × 10 6
f 6 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.11 × 10 13
4.924155
4.140800
8.88 × 10 16
0.635644
3.269543
8.88 × 10 16
9.43 × 10 2
5.442632
4.543301
3.59 × 10 2
1.108742
4.236500
2.75 × 10 2
1.89 × 10 2
5.032945
4.412393
1.03 × 10 2
0.845798
3.732709
1.26 × 10 2
f 14 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
1.70 × 10 + 5
93.691550
95.931103
0
0.880193
8.075650
0
1.72 × 10 + 5
105.629021
109.031902
3.28 × 10 3
1.358399
14.832029
2.83 × 10 3
1.71 × 10 + 5
100.355603
103.562900
6.53 × 10 4
1.099707
11.357613
6.85 × 10 4
f 7 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
7.30 × 10 4
1.31 × 10 + 15
1.75 × 10 + 18
0
3.17 × 10 9
21978.054329
0
7.38 × 10 4
6.49 × 10 + 17
1.41 × 10 + 22
2.35 × 10 31
5.56 × 10 3
2.27 × 10 + 11
3.45 × 10 32
7.33 × 10 4
1.38 × 10 + 16
1.76 × 10 + 21
5.16 × 10 33
4.27 × 10 4
9.45 × 10 + 9
6.65 × 10 34
f 15 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
3.158222
3.186900
2.271819
0
1.756300
2.569432
0
3.158339
3.256833
2.492549
0.786523
1.896942
2.835301
0.725431
3.158275
3.225253
2.405293
0.451663
1.8131570
2.704312
0.417902
f 8 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0.185455
4.10 × 10 + 3
3.54 × 10 + 3
0
36.945444
293.842003
0
0.499821
4.63 × 10 + 3
4.79 × 10 + 3
0.567650
73.345992
486.513050
0.475325
0.289011
4.23 × 10 + 3
4.25 × 10 + 3
5.12 × 10 2
53.455374
375.451515
3.19 × 10 2
Table 6. Comparative analysis among CFAEE, original FA, and five other FA implementations for benchmarks with 2 dimensions.
FunctionAlgorithmBest ValueWorst ValueMean ValueFunctionAlgorithmBest ValueWorst ValueMean Value
f 16 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
−1
−1
−1
−1
−1
−1
−1
−1
−1
−1
−1
−0.95357
−1
−1
−1
−1
−1
−1
−0.997534
−1
−1
f 17 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
0
0
0
9.65 × 10 13
0
0
0
0
0
0
1.96 × 10 14
0
0
0
0
0
0
f 18 ( x ) FA
VSSFA
LFA
GDAFA
WFA
CLFA
CFAEE
0
0
0
0
0
0
0
3.02 × 10 12
0
0
0
3.52 × 10 5
0
0
6.29 × 10 14
0
0
0
1.86 × 10 7
0
0
Table 7. Summary of benchmark problem set 1 scores.
Score | FA | VSSFA | LFA | WFA | CLFA | GDAFA | CFAEE
Best 10 dim | 0 | 0 | 0 | 0 | 0 | 0 | 0
Mean 10 dim | 0 | 0 | 0 | 0 | 0 | 0 | 0
Worst 10 dim | 0 | 0 | 0 | 0 | 0 | 0 | 0
Total 10 dim | 0 | 0 | 0 | 0 | 0 | 0 | 0
Best 30 dim | 0 | 0 | 0 | 0 | 0 | 0 | 1
Mean 30 dim | 0 | 0 | 0 | 0 | 0 | 2 | 13
Worst 30 dim | 0 | 0 | 0 | 0 | 0 | 3 | 12
Total 30 dim | 0 | 0 | 0 | 0 | 0 | 5 | 26
Best 100 dim | 0 | 0 | 0 | 0 | 0 | 0 | 1
Mean 100 dim | 0 | 0 | 0 | 0 | 0 | 3 | 12
Worst 100 dim | 0 | 0 | 0 | 0 | 0 | 3 | 12
Total 100 dim | 0 | 0 | 0 | 0 | 0 | 6 | 25
GRAND TOTAL | 0 | 0 | 0 | 0 | 0 | 11 | 51
Table 8. Statistical comparison of results obtained by CFAEE for 100-dimensional benchmarks with other approaches by the Wilcoxon Signed-Rank Test at α = 0.05.
FunctionCFAEEGDAFAFAVSSFALFAWFACLFA
f1 2.12 × 10 4 1.66 × 10 3 3.06 × 10 4 9.16 × 10 + 1 9.42 × 10 + 1 1.027.50
f2 2.554 × 10 1 2.63 × 10 1 2.96 × 10 1 2.14 × 10 + 4 2.21 × 10 + 4 2.47 × 10 + 2 1.96 × 10 + 3
f3 2.13 × 10 5 3.50 × 10 5 5.97 × 10 5 8.11 × 10 1 8.00 × 10 1 2.10 × 10 2 1.66 × 10 1
f4 5.395 × 10 1 5.68 × 10 1 4.85 × 10 + 2 6.69 × 10 + 2 2.42 × 10 + 2 1.79 × 10 + 2 5.56 × 10 + 2
f5 4.67 × 10 2 8.17 × 10 2 2.43 × 10 + 3 4.31 × 10 + 3 4.36 × 10 + 3 5.06 × 10 + 1 3.85 × 10 + 2
f6 1.26 × 10 2 1.08 × 10 2 1.89 × 10 2 5.054.41 8.38 × 10 1 3.61
f7 6.65 × 10 34 8.11 × 10 33 7.35 × 10 5 1.29 × 10 + 17 1.88 × 10 + 21 1.17 × 10 4 9.54 × 10 + 9
f8 3.19 × 10 2 5.42 × 10 2 1.89 × 10 1 4.33 × 10 + 3 4.33 × 10 + 3 5.15 × 10 + 1 3.70 × 10 + 2
f9 8.03 × 10 2 6.85 × 10 2 7.27 7.62 × 10 + 1 6.58 × 10 + 1 5.95 2.44 × 10 + 1
f10 5.31 × 10 2 6.21 × 10 2 2.13 × 10 1 2.73 × 10 + 4 2.92 × 10 + 4 1.18 × 10 + 2 1.06 × 10 + 3
f11 3.02 × 10 3 4.53 × 10 3 1.48 × 10 + 1 7.06 × 10 + 1 6.25 × 10 + 1 1.689.20
f12 9.89 × 10 + 1 9.89 × 10 + 1 −6.18 2.09 × 10 + 1 3.65 × 10 + 1 8.28 × 10 + 1 3.64 × 10 + 1
f13 9.66 × 10 6 1.30 × 10 5 2.07 × 10 + 1 1.54 × 10 1 1.75 × 10 1 1.56 × 10 1 1.53 × 10 1
f14 6.85 × 10 4 6.23 × 10 4 1.70 × 10 + 5 9.93 × 10 + 1 1.02 × 10 + 2 1.10 1.12 × 10 + 1
f15 4.17 × 10 1 4.42 × 10 1 3.163.222.411.812.71
p-value3.125 × 10 2 4.39 × 10 2 2.13 × 10 4 3.05 × 10 5 3.05 × 10 5 3.05 × 10 5 3.05 × 10 5
Table 9. CEC 2017 function details.
ID | Name of the Function | Class | Search Range | Optimum
F1 | Shifted and Rotated Bent Cigar Function | Unimodal | [−100, 100] | 100
F2 | Shifted and Rotated Sum of Different Power Function | Unimodal | [−100, 100] | 200
F3 | Shifted and Rotated Zakharov Function | Unimodal | [−100, 100] | 300
F4 | Shifted and Rotated Rosenbrock's Function | Multimodal | [−100, 100] | 400
F5 | Shifted and Rotated Rastrigin's Function | Multimodal | [−100, 100] | 500
F6 | Shifted and Rotated Expanded Scaffer's Function | Multimodal | [−100, 100] | 600
F7 | Shifted and Rotated Lunacek Bi-Rastrigin Function | Multimodal | [−100, 100] | 700
F8 | Shifted and Rotated Non-Continuous Rastrigin's Function | Multimodal | [−100, 100] | 800
F9 | Shifted and Rotated Lévy Function | Multimodal | [−100, 100] | 900
F10 | Shifted and Rotated Schwefel's Function | Multimodal | [−100, 100] | 1000
F11 | Hybrid Function 1 (N = 3) | Hybrid | [−100, 100] | 1100
F12 | Hybrid Function 2 (N = 3) | Hybrid | [−100, 100] | 1200
F13 | Hybrid Function 3 (N = 3) | Hybrid | [−100, 100] | 1300
F14 | Hybrid Function 4 (N = 4) | Hybrid | [−100, 100] | 1400
F15 | Hybrid Function 5 (N = 4) | Hybrid | [−100, 100] | 1500
F16 | Hybrid Function 6 (N = 4) | Hybrid | [−100, 100] | 1600
F17 | Hybrid Function 6 (N = 5) | Hybrid | [−100, 100] | 1700
F18 | Hybrid Function 6 (N = 5) | Hybrid | [−100, 100] | 1800
F19 | Hybrid Function 6 (N = 5) | Hybrid | [−100, 100] | 1900
F20 | Hybrid Function 6 (N = 6) | Hybrid | [−100, 100] | 2000
F21 | Composition Function 1 (N = 3) | Composition | [−100, 100] | 2100
F22 | Composition Function 2 (N = 3) | Composition | [−100, 100] | 2200
F23 | Composition Function 3 (N = 4) | Composition | [−100, 100] | 2300
F24 | Composition Function 4 (N = 4) | Composition | [−100, 100] | 2400
F25 | Composition Function 5 (N = 5) | Composition | [−100, 100] | 2500
F26 | Composition Function 6 (N = 5) | Composition | [−100, 100] | 2600
F27 | Composition Function 7 (N = 6) | Composition | [−100, 100] | 2700
F28 | Composition Function 8 (N = 6) | Composition | [−100, 100] | 2800
F29 | Composition Function 9 (N = 3) | Composition | [−100, 100] | 2900
F30 | Composition Function 10 (N = 3) | Composition | [−100, 100] | 3000
Table 10. CEC 2017 comparative analysis results.
AlgorithmF1F2F3F4F5
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO 1.86 × 10 + 2 26.921n/an/a3.02 × 10 + 2 52.1524.03 × 10 + 2 2.607 5.05 × 10 + 2 3.251
HHO 1.75 × 10 + 6 4.29 × 10 + 5 n/an/a 6.71 × 10 + 2 3.24 × 10 + 2 4.37 × 10 + 2 53.631 5.35 × 10 + 2 24.927
DE 7.54 × 10 + 7 1.71 × 10 + 7 n/an/a 4.59 × 10 + 3 1.35 × 10 + 3 4.29 × 10 + 2 8.530 5.52 × 10 + 2 6.232
GOA 1.56 × 10 + 5 5.24 × 10 + 4 n/an/a 3.05 × 10 + 2 61.300 4.15 × 10 + 2 19.48 5.25 × 10 + 2 16.803
GWO 1.53 × 10 + 7 4.85 × 10 + 6 n/an/a 3.57 × 10 + 3 2.77 × 10 + 3 4.09 × 10 + 2 10.705 5.19 × 10 + 2 8.543
MFO 7.17 × 10 + 6 2.18 × 10 + 7 n/an/a 9.04 × 10 + 3 9.31 × 10 + 3 4.20 × 10 + 2 27.727 5.31 × 10 + 2 12.860
MVO 1.79 × 10 + 4 7.99 × 10 + 3 n/an/a 3.05 × 10 + 2 46.451 4.06 × 10 + 2 1.392 5.17 × 10 + 2 9.888
PSO 9.49 × 10 + 4 8.42 × 10 + 2 n/an/a 3.49 × 10 + 2 65.409 4.07 × 10 + 2 10.318 5.26 × 10 + 2 7.305
WOA 4.27 × 10 + 7 3.81 × 10 + 6 n/an/a 5.16 × 10 + 3 4.22 × 10 + 2 4.61 × 10 + 2 69.033 5.51 × 10 + 2 17.46
SCA 1.15 × 10 + 8 5.91 × 10 + 7 n/an/a 4.03 × 10 + 3 8.42 × 10 + 2 4.85 × 10 + 2 47.271 5.59 × 10 + 2 9.352
FA 1.61 × 10 + 5 3.77 × 10 + 4 n/an/a 3.09 × 10 + 2 54.991 4.17 × 10 + 2 18.858 5.28 × 10 + 2 19.302
CFAEE1.31 × 10 + 2 14.353n/an/a3.02 × 10 + 2 28.131 4.04 × 10 + 2 2.3725.01 × 10 + 2 3.285
AlgorithmF6F7F8F9F10
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO6.00 × 10 + 2 0.082 7.49 × 10 + 2 10.041 8.11 × 10 + 2 6.526 1.13 × 10 + 3 85.42 1.69 × 10 + 3 1.31 × 10 + 2
HHO 6.38 × 10 + 2 12.320 7.96 × 10 + 2 18.921 8.29 × 10 + 2 5.700 1.44 × 10 + 3 1.24 × 10 + 2 2.03 × 10 + 3 3.42 × 10 + 2
DE 6.28 × 10 + 2 4.744 8.01 × 10 + 2 10.373 8.62 × 10 + 2 6.873 1.76 × 10 + 3 1.48 × 10 + 2 2.09 × 10 + 3 2.01 × 10 + 2
GOA 6.08 × 10 + 2 10.295 7.32 × 10 + 2 11.375 8.31 × 10 + 2 14.512 9.97 × 10 + 2 93.212 1.96 × 10 + 3 3.17 × 10 + 2
GWO 6.01 × 10 + 2 1.909 7.35 × 10 + 2 16.343 8.16 × 10 + 2 5.053 9.14 × 10 + 2 12.11 1.76 × 10 + 3 3.10 × 10 + 2
MFO 6.02 × 10 + 2 2.411 7.46 × 10 + 2 22.655 8.29 × 10 + 2 13.786 1.23 × 10 + 3 2.76 × 10 + 2 2.02 × 10 + 3 3.27 × 10 + 2
MVO 6.03 × 10 + 2 4.365 7.30 × 10 + 2 11.278 8.25 × 10 + 2 12.2169.00 × 10 + 2 0.012 1.82 × 10 + 3 3.60 × 10 + 2
PSO 6.10 × 10 + 2 3.539 7.26 × 10 + 2 9.008 8.19 × 10 + 2 5.9829.00 × 10 + 2 0.0031.50 × 10 + 3 2.84 × 10 + 2
WOA 6.36 × 10 + 2 13.695 7.82 × 10 + 2 23.692 8.45 × 10 + 2 17.470 1.54 × 10 + 3 3.94 × 10 + 2 2.19 × 10 + 3 3.16 × 10 + 2
SCA 6.24 × 10 + 2 4.105 7.84 × 10 + 2 13.299 8.47 × 10 + 2 7.577 1.03 × 10 + 3 85.98 2.51 × 10 + 3 2.18 × 10 + 2
FA 6.71 × 10 + 2 11.393 7.35 × 10 + 2 11.55 8.33 × 10 + 2 13.914 9.97 × 10 + 2 81.44 1.93 × 10 + 3 2.96 × 10 + 2
CFAEE6.00 × 10 + 2 0.0517.23 × 10 + 2 11.3918.08 × 10 + 2 5.422 9.87 × 10 + 2 42.11 1.58 × 10 + 3 1.25 × 10 + 2
AlgorithmF11F12F13F14F15
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO 1.12 × 10 + 3 13.523 4.25 × 10 + 5 3.05 × 10 + 5 4.42 × 10 + 3 2.18 × 10 + 3 1.42 × 10 + 3 1.651 2.15 × 10 + 3 5.65 × 10 + 2
HHO 1.16 × 10 + 3 45.729 2.56 × 10 + 6 1.13 × 10 + 6 1.92 × 10 + 4 1.16 × 10 + 4 1.83 × 10 + 3 2.41 × 10 + 2 8.63 × 10 + 3 5.55 × 10 + 2
DE 1.14 × 10 + 3 36.317 9.15 × 10 + 4 6.58 × 10 + 4 1.35 × 10 + 3 78.355 1.46 × 10 + 3 11.8261.51 × 10 + 3 18.454
GOA 1.17 × 10 + 3 58.009 2.24 × 10 + 6 1.15 × 10 + 6 1.65 × 10 + 4 1.13 × 10 + 4 2.93 × 10 + 3 1.15 × 10 + 3 6.48 × 10 + 3 4.32 × 10 + 3
GWO 1.34 × 10 + 3 183.524 1.31 × 10 + 6 1.54 × 10 + 6 1.26 × 10 + 4 7.82 × 10 + 3 3.19 × 10 + 3 1.82 × 10 + 3 5.63 × 10 + 3 3.16 × 10 + 3
MFO 1.23 × 10 + 3 107.133 2.23 × 10 + 6 4.81 × 10 + 6 1.61 × 10 + 4 1.39 × 10 + 4 8.42 × 10 + 3 5.42 × 10 + 3 1.25 × 10 + 4 1.02 × 10 + 4
MVO 1.14 × 10 + 3 27.331 1.52 × 10 + 6 1.41 × 10 + 6 9.89 × 10 + 3 2.55 × 10 + 3 2.15 × 10 + 3 1.03 × 10 + 3 4.05 × 10 + 3 2.45 × 10 + 3
PSO1.10 × 10 + 3 3.727 4.35 × 10 + 4 1.26 × 10 + 4 1.01 × 10 + 4 7.23 × 10 + 3 1.49 × 10 + 3 88.291 1.81 × 10 + 3 3.75 × 10 + 2
WOA 1.22 × 10 + 3 82.415 4.85 × 10 + 6 5.12 × 10 + 6 1.57 × 10 + 4 1.38 × 10 + 4 3.42 × 10 + 3 9.82 × 10 + 2 1.42 × 10 + 4 9.88 × 10 + 3
SCA 1.24 × 10 + 3 96.535 2.41 × 10 + 7 2.05 × 10 + 7 6.43 × 10 + 4 4.69 × 10 + 4 1.99 × 10 + 3 4.31 × 10 + 2 3.21 × 10 + 3 1.41 × 10 + 3
FA 1.16 × 10 + 3 39.705 2.32 × 10 + 6 1.21 × 10 + 6 1.21 × 10 + 4 1.05 × 10 + 4 1.88 × 10 + 3 3.21 × 10 + 2 3.67 × 10 + 3 2.13 × 10 + 3
CFAEE1.10 × 10 + 3 1.5033.18 × 10 + 4 2.29 × 10 + 4 1.35 × 10 + 3 20.499 1.43 × 10 + 3 21.3501.51 × 10 + 3 10.217
AlgorithmF16F17F18F19F20
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO 1.73 × 10 + 3 59.44 1.73 × 10 + 3 7.519 4.79 × 10 + 3 1.68 × 10 + 3 1.90 × 10 + 3 6.993 2.02 × 10 + 3 19.561
HHO 1.89 × 10 + 3 1.47 × 10 + 2 1.79 × 10 + 3 65.751 2.02 × 10 + 4 1.41 × 10 + 4 1.71 × 10 + 4 1.21 × 10 + 4 2.23 × 10 + 3 86.017
DE 1.69 × 10 + 3 41.15 1.77 × 10 + 3 19.5141.84 × 10 + 3 23.298 2.75 × 10 + 3 8.35 × 10 + 2 2.05 × 10 + 3 23.711
GOA 1.78 × 10 + 3 1.76 × 10 + 2 1.83 × 10 + 3 1.21 × 10 + 2 1.63 × 10 + 4 1.31 × 10 + 4 3.25 × 10 + 3 1.95 × 10 + 3 2.15 × 10 + 3 74.824
GWO 1.79 × 10 + 3 1.11 × 10 + 2 1.77 × 10 + 3 38.759 2.55 × 10 + 4 1.84 × 10 + 4 2.75 × 10 + 4 2.38 × 10 + 4 2.09 × 10 + 3 73.994
MFO 1.85 × 10 + 3 15.23 × 10 + 2 1.78 × 10 + 3 65.311 2.21 × 10 + 4 1.39 × 10 + 4 7.81 × 10 + 3 6.15 × 10 + 3 2.13 × 10 + 3 72.321
MVO 1.80 × 10 + 3 1.44 × 10 + 2 1.80 × 10 + 3 46.126 2.03 × 10 + 4 1.25 × 10 + 4 4.63 × 10 + 3 2.62 × 10 + 3 2.12 × 10 + 3 86.303
PSO1.65 × 10 + 3 65.364 1.72 × 10 + 3 16.123 7.63 × 10 + 3 4.46 × 10 + 3 3.13 × 10 + 3 2.05 × 10 + 3 2.06 × 10 + 3 35.410
WOA 1.96 × 10 + 3 14.92 × 10 + 2 1.82 × 10 + 3 73.459 2.13 × 10 + 4 1.95 × 10 + 2 2.07 × 10 + 5 1.16 × 10 + 5 2.19 × 10 + 3 1.11 × 10 + 2
SCA 1.73 × 10 + 3 95.425 1.80 × 10 + 3 25.30 × 10 3 8.77 × 10 + 4 9.23 × 10 + 2 1.15 × 10 + 4 1.44 × 10 + 3 2.14 × 10 + 3 46.855
FA 1.79 × 10 + 3 1.73 × 10 + 2 1.82 × 10 + 3 1.15 × 10 + 2 1.67 × 10 + 4 1.45 × 10 + 4 3.18 × 10 + 3 1.59 × 10 + 3 2.12 × 10 + 3 71.303
CFAEE 1.70 × 10 + 3 86.3591.71 × 10 + 3 8.442 1.86 × 10 + 3 21.5651.90 × 10 + 3 8.7172.01 × 10 + 3 9.443
AlgorithmF21F22F23F24F25
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO2.20 × 10 + 3 4.615 2.28 × 10 + 3 17.820 2.59 × 10 + 3 14.213 2.68 × 10 + 3 1.31 × 10 + 2 2.87 × 10 + 3 85.338
HHO 2.35 × 10 + 3 53.711 2.32 × 10 + 3 25.234 2.69 × 10 + 3 35.522 2.82 × 10 + 3 93.623 2.95 × 10 + 3 49.573
DE 2.25 × 10 + 3 78.104 2.29 × 10 + 3 17.513 2.63 × 10 + 3 15.1632.66 × 10 + 3 69.502 2.91 × 10 + 3 15.543
GOA 2.30 × 10 + 3 56.877 2.38 × 10 + 3 1.08 × 10 + 2 2.64 × 10 + 3 23.536 2.73 × 10 + 3 57.833 2.93 × 10 + 3 32.598
GWO 2.30 × 10 + 3 32.884 2.31 × 10 + 3 57.573 2.62 × 10 + 3 13.862 2.74 × 10 + 3 25.132 2.94 × 10 + 3 28.256
MFO 2.32 × 10 + 3 29.255 2.35 × 10 + 3 93.557 2.63 × 10 + 3 11.327 2.75 × 10 + 3 76.435 2.96 × 10 + 3 37.776
MVO 2.32 × 10 + 3 11.839 2.33 × 10 + 3 1.11 × 10 + 2 2.65 × 10 + 3 10.445 2.74 × 10 + 3 18.246 2.92 × 10 + 3 84.256
PSO 2.27 × 10 + 3 49.783 2.33 × 10 + 3 1.03 × 10 + 2 2.60 × 10 + 3 72.300 2.70 × 10 + 3 76.143 2.90 × 10 + 3 33.735
WOA 2.34 × 10 + 3 60.021 2.48 × 10 + 3 2.45 × 10 + 2 2.66 × 10 + 3 29.838 2.77 × 10 + 3 85.902 2.98 × 10 + 3 1.03 × 10 + 2
SCA 2.29 × 10 + 3 65.229 2.41 × 10 + 3 66.636 2.67 × 10 + 3 45.449 2.78 × 10 + 3 11.548 2.98 × 10 + 3 37.291
FA 2.29 × 10 + 3 34.701 2.36 × 10 + 3 1.10 × 10 + 2 2.62 × 10 + 3 17.452 2.72 × 10 + 3 1.05 × 10 + 2 2.93 × 10 + 3 47.019
CFAEE2.20 × 10 + 3 48.5522.26 × 10 + 3 13.0402.55 × 10 + 3 21.929 2.67 × 10 + 3 1.72 × 10 + 2 2.81 × 10 + 3 95.429
AlgorithmF26F27F28F29F30
MeanSTDMeanSTDMeanSTDMeanSTDMeanSTD
IHHO 2.93 × 10 + 3 1.66 × 10 + 2 3.19 × 10 + 3 33.657 3.30 × 10 + 3 48.6943.20 × 10 + 3 28.982 2.30 × 10 + 4 1.45 × 10 + 4
HHO 3.62 × 10 + 3 5.39 × 10 + 2 3.18 × 10 + 3 51.306 3.41 × 10 + 3 1.02 × 10 + 2 3.39 × 10 + 3 85.653 1.43 × 10 + 6 1.31 × 10 + 6
DE 2.95 × 10 + 3 95.9293.07 × 10 + 3 2.558 3.28 × 10 + 3 27.035 3.21 × 10 + 3 35.216 3.65 × 10 + 5 2.31 × 10 + 5
GOA 3.01 × 10 + 3 3.65 × 10 + 2 3.11 × 10 + 3 25.326 3.31 × 10 + 3 1.53 × 10 + 2 3.27 × 10 + 3 75.411 5.29 × 10 + 5 3.89 × 10 + 5
GWO 3.36 × 10 + 3 5.05 × 10 + 2 3.10 × 10 + 3 13.541 3.42 × 10 + 3 1.33 × 10 + 2 3.22 × 10 + 3 49.822 6.17 × 10 + 5 4.88 × 10 + 5
MFO 3.05 × 10 + 3 1.13 × 10 + 2 3.09 × 10 + 3 5.722 3.21 × 10 + 3 93.459 3.26 × 10 + 3 55.593 6.36 × 10 + 5 5.93 × 10 + 5
MVO 3.15 × 10 + 3 2.77 × 10 + 2 3.10 × 10 + 3 21.875 3.36 × 10 + 3 1.23 × 10 + 2 3.26 × 10 + 3 75.139 4.62 × 10 + 5 4.07 × 10 + 5
PSO 2.95 × 10 + 3 2.55 × 10 + 2 3.12 × 10 + 3 31.830 3.32 × 10 + 3 1.35 × 10 + 2 3.21 × 10 + 3 62.374 1.13 × 10 + 6 1.09 × 10 + 6
WOA 3.37 × 10 + 3 2.92 × 10 + 2 3.17 × 10 + 3 48.124 3.46 × 10 + 3 1.65 × 10 + 2 3.46 × 10 + 3 1.21 × 10 + 2 1.29 × 10 + 6 7.53 × 10 + 5
SCA 3.15 × 10 + 3 1.82 × 10 + 2 3.13 × 10 + 3 13.152 3.38 × 10 + 3 89.259 3.25 × 10 + 3 48.339 1.49 × 10 + 6 9.77 × 10 + 5
FA 3.02 × 10 + 3 2.03 × 10 + 2 3.10 × 10 + 3 27.015 3.32 × 10 + 3 1.17 × 10 + 2 3.26 × 10 + 3 31.117 4.71 × 10 + 5 4.02 × 10 + 5
CFAEE2.86 × 10 + 3 2.45 × 10 + 2 3.08 × 10 + 3 48.6903.13 × 10 + 3 2.51 × 10 + 2 3.20 × 10 + 3 27.9142.22 × 10 + 4 1.44 × 10 + 4
Table 11. Friedman test ranks for the compared algorithms over 30 CEC2017 functions.
FunctionIHHOHHODEGOAGWOMFOMVOPSOWOASCAFACFAEE
F1271159834101261
F31.57103.58123.5611951.5
F4110965834111272
F5291154836101271
F61.511963457108121.5
F78111245.57329105.51
F826.512836.554101191
F9810125.5391.51.51175.54
F10391074851111262
F1136.54.5812104.51.59116.51.5
F12410385762111291
F133111.510794581261.5
F14153910128411762
F154101.598117312561.5
F165.511317.51092125.57.54
F17374.5124.568.5210.58.510.51
F18371511108491262
F191.510361187412951.5
F20212310586.541196.51
F211.51237.57.59.59.54115.55.51.5
F2225310486.56.5121191
F232126.584.56.59310114.51
F24312167.597.54101152
F252946.58105311.511.56.51
F262123.551078.53.5118.561
F27121117535810952
F284103511286.51296.51
F291.5113.5105883.512681.5
F30211367849101251
Average Ranking3.1389.4835.3626.8626.7248.0175.9144.06910.6219.6036.6551.552
Rank210487953121161
Table 12. Aligned Friedman test ranks for the compared algorithms over 30 CEC2017 functions.
FunctionIHHOHHODEGOAGWOMFOMVOPSOWOASCAFACFAEE
F1273475983434634861
F356.56332758.532333458.5613283266056.5
F4144226211177164192152157255278183146
F5138213242194174206169196241245197135
F6153.5235218173158161165180232212270153.5
F7193264266145155.5184140136244246155.5131
F8151198.5251204167198.5191172229231207143
F914131031889.58028578.578.53139189.587
F108129330125788290957430931721675
F11114159.5127.5185300271127.5103.5265280159.5103.5
F1213191217141615113443451810
F134333139.53245432145463143375339.5
F146670683113163337369320727167
F155232948.53223033356551336556248.5
F16291.530827664298.5305302233312291.5298.5282
F17122225181.5269181.5205239.5113262.5239.5262.5107
F183876354733232582413193385036
F1928.5443033330373431339423228.5
F2098294112260.5150237.5223121286247.522396
F2199.5281129227.5227.5252.5252.5162.527220920999.5
F22110142118258130217170.5170.5297283234101
F23126279202.5223178.5202.5237.5134247.5260.5178.5102
F24119.5288105200.5220236220132.5259267.5175.5111
F25117243168214.5230256195139274.5274.5214.593
F268431585.594306106249.585.5307249.59777
F27284277119.5175.5148132.5148200.5267.5220148125
F2813728712416628992254189.5296273189.583
F29108.5295115.5209123187187115.5304162.5187108.5
F302134222252627233403413432420
Average Ranking108.017221.172158.621169.948188.810205.259144.655122.328291.724244.276147.25991.931
Rank210678943121151
Table 13. Friedman and Iman–Davenport statistical test results summary ( α = 0.05 ).
Friedman Value | χ² Critical Value | p-Value | Iman–Davenport Value | F Critical Value
1.815 × 10+2 | 1.968 × 10+1 | 1.110 × 10−16 | 3.695 × 10+1 | 1.820
Table 14. Results of the Holm step-down procedure.
Comparison | p-Value | Ranking | alpha = 0.05 | alpha = 0.1 | H1 | H2
CFAEE vs. HHO | 0 | 0 | 0.00455 | 0.00909 | TRUE | TRUE
CFAEE vs. WOA | 0 | 1 | 0.00500 | 0.01000 | TRUE | TRUE
CFAEE vs. SCA | 0 | 2 | 0.00556 | 0.01111 | TRUE | TRUE
CFAEE vs. MFO | 4.29 × 10−12 | 3 | 0.00625 | 0.01250 | TRUE | TRUE
CFAEE vs. GOA | 1.02 × 10−8 | 4 | 0.00714 | 0.01429 | TRUE | TRUE
CFAEE vs. GWO | 2.35 × 10−8 | 5 | 0.00833 | 0.01667 | TRUE | TRUE
CFAEE vs. FA | 3.53 × 10−8 | 6 | 0.01000 | 0.02000 | TRUE | TRUE
CFAEE vs. MVO | 2.04 × 10−6 | 7 | 0.01250 | 0.02500 | TRUE | TRUE
CFAEE vs. DE | 2.86 × 10−5 | 8 | 0.01667 | 0.03333 | TRUE | TRUE
CFAEE vs. PSO | 3.92 × 10−3 | 9 | 0.02500 | 0.05000 | TRUE | TRUE
CFAEE vs. IHHO | 4.69 × 10−2 | 10 | 0.05000 | 0.10000 | FALSE | FALSE
Table 15. Datasets split into training, validation, and testing, along with the batch size.
Dataset | Train Set | Validation Set | Testing Set
MNIST | 20,000 (64) | 40,000 (100) | 10,000 (100)
Fashion-MNIST | 20,000 (64) | 40,000 (100) | 10,000 (100)
Semeion | 200 (2) | 400 (400) | 993 (993)
USPS | 2,406 (32) | 4,885 (977) | 2,007 (2,007)
CIFAR-10 | 20,000 (100) | 30,000 (100) | 10,000 (100)
Table 16. Employed CNNs parameters summary.
Dataset | η | α | λ | dp | Epochs
MNIST | 0.01 | 0.9 | 0.0005 | [0, 1] | 10,000
Fashion-MNIST | 0.01 | 0.9 | 0.0005 | [0, 1] | 10,000
Semeion | 0.001 | 0.9 | 0.0005 | [0, 1] | 10,000
USPS | 0.01 | 0.9 | 0.0005 | [0, 1] | 10,000
CIFAR-10 | 0.001 | 0.9 | 0.004 | [0, 1] | 4,000
Table 17. Control parameter setup for metaheuristics included in the analysis.
Algorithm | Parameters
BA [67] | fmin = 0, fmax = 2, A = 0.5, r = 0.5
CS [68] | β = 1.5, p = 0.25, α = 0.8
PSO [69] | c1 = 1.7, c2 = 1.7, ω = 0.7
EHO [70] | nclan = 5, α = 0.5, β = 0.1, nelite = 2
WOA [53] | a1 linearly decreasing from 2 to 0, a2 linearly decreasing from −1 to −2, b = 1
SCA [51] | a = 2, r1 linearly decreasing from 2 to 0
SSA [72] | c1 non-linearly decreasing from 2 to 0, c2 and c3 random from [0, 1]
GOA [52] | c linearly decreasing from 1 to 0
BBO [71] | hmp = 1, imp = 0.1, nbhk = 2
FA [1] | α = 0.2, β0 = 1.0, γ = 1.0
Table 18. Comparative results between the proposed CFAEE and other metaheuristics in terms of mean classification accuracy.
Method | MNIST acc. | MNIST dp | Fashion-MNIST acc. | Fashion-MNIST dp | Semeion acc. | Semeion dp | USPS acc. | USPS dp | CIFAR-10 acc. | CIFAR-10 dp
Caffe | 99.07 | 0 | 91.71 | 0 | 97.62 | 0 | 95.80 | 0 | 71.47 | 0
Dropout Caffe | 99.18 | 0.5 | 92.53 | 0.5 | 98.14 | 0.5 | 96.21 | 0.5 | 72.08 | 0.5
BA | 99.14 | 0.491 | 92.56 | 0.505 | 98.35 | 0.692 | 96.45 | 0.762 | 71.49 | 0.633
CS | 99.14 | 0.489 | 92.41 | 0.491 | 98.21 | 0.544 | 96.31 | 0.715 | 71.21 | 0.669
PSO | 99.16 | 0.493 | 92.38 | 0.481 | 97.79 | 0.371 | 96.33 | 0.725 | 71.51 | 0.621
EHO | 99.13 | 0.475 | 92.36 | 0.470 | 98.11 | 0.481 | 96.24 | 0.682 | 71.15 | 0.705
WOA | 99.15 | 0.489 | 92.43 | 0.493 | 98.23 | 0.561 | 96.32 | 0.722 | 71.23 | 0.685
SCA | 99.17 | 0.496 | 92.53 | 0.501 | 98.25 | 0.580 | 96.29 | 0.705 | 71.54 | 0.597
SSA | 99.19 | 0.499 | 92.63 | 0.527 | 98.31 | 0.642 | 96.41 | 0.753 | 71.58 | 0.529
GOA | 99.16 | 0.492 | 92.44 | 0.494 | 98.15 | 0.513 | 96.15 | 0.481 | 70.95 | 0.849
BBO | 99.13 | 0.474 | 92.35 | 0.468 | 98.16 | 0.515 | 96.17 | 0.483 | 71.08 | 0.768
FA | 99.18 | 0.495 | 92.58 | 0.511 | 98.29 | 0.619 | 96.42 | 0.758 | 71.55 | 0.583
CFAEE | 99.26 | 0.529 | 92.73 | 0.570 | 98.46 | 0.719 | 96.88 | 0.845 | 72.32 | 0.388
Table 19. Statistical comparison of classification error rate metrics obtained by CFAEE for CNN experiments, with other approaches by Wilcoxon Signed-Rank Test at α = 0.05.
Dataset | CFAEE | Caffe | Dropout Caffe | BA | CS | PSO | EHO | WOA | SCA | SSA | GOA | BBO | FA
MNIST | 0.74 | 0.9 | 0.82 | 0.86 | 0.86 | 0.8 | 0.87 | 0.85 | 0.83 | 0.81 | 0.84 | 0.87 | 0.82
Fashion-MNIST | 7.27 | 8.29 | 7.47 | 7.44 | 7.59 | 7.62 | 7.64 | 7.57 | 7.47 | 7.37 | 7.56 | 7.65 | 7.42
Semeion | 1.54 | 2.38 | 1.86 | 1.65 | 1.79 | 2.21 | 1.89 | 1.77 | 1.75 | 1.69 | 1.85 | 1.84 | 1.71
USPS | 3.12 | 4.2 | 3.79 | 3.55 | 3.69 | 3.67 | 3.76 | 3.68 | 3.71 | 3.59 | 3.85 | 3.83 | 3.58
CIFAR-10 | 27.68 | 28.53 | 27.92 | 28.51 | 28.79 | 28.5 | 28.85 | 28.77 | 28.46 | 28.42 | 29.05 | 28.92 | 28.45
p-value | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2 | 3.125 × 10−2
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
