Article

Feature Selection Using Enhanced Particle Swarm Optimisation for Classification Models

1 Computational Intelligence Research Group, Department of Computer and Information Sciences, Faculty of Engineering and Environment, University of Northumbria, Newcastle upon Tyne NE1 8ST, UK
2 Institute for Intelligent Systems Research and Innovation, Deakin University, Waurn Ponds, VIC 3216, Australia
3 College of Tongda, Nanjing University of Posts and Telecommunications, Nanjing 210049, China
4 College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
* Author to whom correspondence should be addressed.
Current affiliation: National Subsea Centre, Robert Gordon University, Aberdeen AB10 7AQ, UK.
Sensors 2021, 21(5), 1816; https://doi.org/10.3390/s21051816
Submission received: 13 January 2021 / Revised: 19 February 2021 / Accepted: 22 February 2021 / Published: 5 March 2021
(This article belongs to the Section Intelligent Sensors)

Abstract

In this research, we propose two Particle Swarm Optimisation (PSO) variants to undertake feature selection tasks. The aim is to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak exploitation around the near-optimal solutions. The first proposed PSO variant incorporates four key operations, i.e., a modified PSO operation with rectified personal and global best signals, spiral search-based local exploitation, Gaussian distribution-based swarm leader enhancement, and mirroring and mutation operations for worst solution improvement. The second proposed PSO model enhances the first through four new strategies, i.e., an adaptive exemplar breeding mechanism incorporating multiple optimal signals, nonlinear function-oriented search coefficients, and exponential and scattering schemes for swarm leader and worst solution enhancement, respectively. In comparison with a set of 15 classical and advanced search methods, the proposed models demonstrate statistically significant superiority in discriminative feature selection across a total of 13 data sets.

1. Introduction

The knowledge discovery processes in real-world applications often involve datasets with large numbers of features [1]. The high dimensionalities of datasets increase the likelihood of overfitting and impair generalization capability. Besides that, the inclusion of redundant or even contradictory features can severely reduce the performance of classification, regression and clustering algorithms [2]. As a result, feature selection and dimensionality reduction become critical in overcoming the aforementioned challenges by eliminating certain irrelevant and redundant features while identifying the most effective and discriminative ones [3,4]. Moreover, for datasets with high dimensionalities, it is computationally impractical to conduct an exhaustive search of all possible combinations of the feature subsets [5]. In addition, the search landscape becomes extremely complicated, owing to the sophisticated confounding effects of various feature interactions in terms of redundancy and complementarity [6]. Therefore, effective and robust search methods are required to thoroughly explore the complex effects of feature interactions while satisfying the practical constraints in terms of computational cost when undertaking large-scale feature selection tasks.
Evolutionary Computation (EC) techniques have been widely employed to comprehensively explore the complex effects of feature interactions, owing to the significant capability of EC in finding global optimality [4]. Inspired by natural evolution, EC techniques employ a population-based evolving mechanism to supervise the individual solutions to move towards the promising search territory iteratively and identify the global optima. In EC-based feature selection methods, the coevolution mechanisms based on diverse evolving operators, e.g., crossover and mutation, are capable of producing various feature representations of the original problem in one single run. Therefore, the confounding effects of feature interactions can be thoroughly explored through the evaluation of various feature constitutions during the iterative process. The effectiveness and superiority of various EC techniques over other methods in undertaking feature selection tasks have been extensively verified in many existing studies, such as feature optimisation using Genetic Algorithm (GA) [7], Differential Evolution (DE) [8,9], Particle Swarm Optimisation (PSO) [10], Moth-flame optimisation (MFO) [11], Firefly Algorithm (FA) [3,12], Ant Colony Optimisation (ACO) [13], Grey Wolf Optimisation (GWO) [14], Whale Optimisation Algorithm (WOA) [15], and Sine Cosine Algorithm (SCA) [16]. Nevertheless, the empirical studies indicated that these original EC algorithms tend to be trapped in local optima, and they can be further improved in terms of search diversity and capability of avoiding local stagnation.
As one of the most acknowledged and widely-used EC algorithms, PSO has been adopted in various optimisation problems, owing to its simplicity, fast convergence speed, as well as effectiveness and robust generalization capability. In PSO, each particle adjusts its search trajectory by learning from two historical best experiences, i.e., its own best position and the global best solution. Despite the great advantages in following both the local and global best signals, PSO suffers from the local optima traps as well as inefficient fine-tuning capabilities owing to its working principles [17,18,19]. As an example, PSO lacks the operation of exchanging information between particles, owing to the fact that only the global best solution is exploited as the reference for coevolution [20]. Secondly, the swarm often tends to revisit previously explored regions, owing to the strict adherence to the historical best experiences of each particle [21]. These limitations in the original PSO model severely constrain the search diversity and search scope, hence resulting in early stagnation and premature convergence. Such constraints of the PSO algorithm become worse when undertaking feature selection tasks with complex problem landscapes.
In this research, we propose two enhanced PSO models to address the identified limitations of the original PSO algorithm as well as undertake complex feature selection problems. Specifically, the research overcomes the lack of cooperation between individual particles and the ineffectiveness of search owing to frequent re-visits to previously explored regions in the original PSO model. The proposed PSO models employ several key strategies, including leader/exemplar generation using dynamic absorption of elite genes, search operations with differentiated nonlinear trajectories, exploitation schemes for swarm leader enhancement, as well as re-dispatching mechanisms for enhancement of the worst solutions. These strategies work cooperatively as augmentations to accelerate convergence while preserving diversity. A summary of the research contributions is presented, as follows:
  • Two new PSO variants for feature selection are proposed to overcome two major shortcomings of the original PSO algorithm, i.e., premature convergence and weak local exploitation capability around the near optimal solutions.
  • The first proposed PSO model, i.e., PSOVA1 (PSO variant 1), comprises the following mechanisms: (1) a modified PSO operation with rectified global and personal best signals, (2) spiral search based local exploitation, (3) Gaussian distribution based swarm leader enhancement, as well as (4) mirroring and DE mutation operations for worst solution improvement.
  • The second proposed PSO model, i.e., PSOVA2 (PSO variant 2), enhances PSOVA1 through four mechanisms: (1) an adaptive exemplar breeding mechanism incorporating multiple optimal signals, (2) search coefficient generation using sine, cosine, and hyperbolic tangent functions, (3) worst solution enhancement using a hybrid re-dispatching scheme, and (4) an exponential exploitation scheme for swarm leader improvement. Moreover, the search diversity and scopes in PSOVA2 are further elevated in comparison with those of PSOVA1. This is owing to the adoption of diverse exemplars to guide the search in each dimension, as well as the employment of versatile search trajectories to calibrate the particle positions.
  • Evaluation using 13 datasets with a wide spectrum of dimensionalities: the empirical results indicate that both proposed models outperform five classical search methods and ten advanced PSO variants with significant advantages, evidenced by the statistical test outcomes.
The rest of the paper is organized as follows. Section 2 introduces the original and diverse PSO models, and their applications to feature selection. We present the two proposed PSO models with elaborations and analysis for each proposed enhancement in Section 3 and Section 4, respectively. Section 5 discusses the evaluation of the proposed and the baseline search methods on a variety of feature selection tasks. Conclusions are drawn and future research directions are presented in Section 6.

2. Related Studies

In this section, we firstly introduce the original PSO model. Then, the state-of-the-art PSO variants are presented. We also conduct a literature review on the application of PSO variants to feature selection. Finally, we discuss the motivation of this research.

2.1. Particle Swarm Optimisation

PSO is a population-based self-adaptive optimisation technique developed by Kennedy and Eberhart [22] based on swarm social behaviors, such as fish in a school and birds in a flock. The PSO algorithm conducts search in the landscape of the objective function by adjusting the trajectories of individual particles in a quasi-stochastic manner [23,24]. Each particle adjusts its velocity and position by following its own best experience in history and the global best solution of the swarm. In the PSO model, the updating equations for the velocity and position of the ith particle in the dth dimension are prescribed in Equations (1) and (2) [22]:
$$v_{id}^{t+1} = w \times v_{id}^{t} + c_1 \times r_1 \times \left(pbest_{id} - x_{id}^{t}\right) + c_2 \times r_2 \times \left(gbest_{d} - x_{id}^{t}\right) \tag{1}$$

$$x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \tag{2}$$
where vi and xi represent the velocity and position of the ith particle, while pbesti and gbest represent the historical best solution of the ith particle and the global best solution, respectively. Besides that, c1 and c2 denote the acceleration constants, while r1 and r2 are two random values generated from [0, 1]. Moreover, t and w represent the current iteration number and the inertia weight, respectively.
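For concreteness, the canonical update in Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration only; the parameter values w, c1, and c2 below are common defaults from the PSO literature, not settings prescribed in this paper.

```python
# A minimal sketch of the PSO update in Equations (1)-(2), operating on
# numpy arrays of shape (n_particles, n_dims).
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
    """One velocity/position update for the whole swarm."""
    r1 = np.random.rand(*x.shape)   # r1, r2 ~ U[0, 1], per particle and dimension
    r2 = np.random.rand(*x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Equation (1)
    x = x + v                                                   # Equation (2)
    return x, v
```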

2.2. PSO Variants

Despite its simplicity and fast convergence speed, the PSO model is subject to local optima traps and premature convergence, owing to the constant reference to the global best solution for all swarm particles. The particle positions also become increasingly similar over iterations. As such, various diversity enhancing strategies have been proposed, e.g., repulsion strategies [23], mutation operators [24], multi-swarm concepts [25,26], multiple leaders [25,27], and hybridization with other search methods [27]. Such strategies enable the search process to balance between convergence and diversity while searching for the global optimality.
Chen et al. [28] proposed a dynamic PSO model with escaping prey schemes (DPSOEP). In DPSOEP, swarm particles were categorized into three sub-swarms according to their fitness scores, i.e., ‘preys’ (top ranked particles), ‘strong particles’ (middle ranked particles), and ‘weak particles’ (lower ranked particles). The particles in these groups subsequently followed distinctive search operations, i.e., Lévy flights, the original PSO position update, and a multivariate normal distribution, respectively, to search for global optimality.
Li et al. [29] proposed a multi-information fusion “triple variables with iteration” inertia weight PSO (MFTIWPSO) model, in which the inertia weight was generated using multiple information, including the particle velocity, position, random disturbance, number of iterations, as well as inertia weight score from the last iteration. The MFTIWPSO outperformed a number of baseline models for solving benchmark functions and hyper-parameter tuning in classification methods.
Wang et al. [24] proposed a diversity enhancing and neighborhood search PSO (DNSPSO) model for solving multimodal high-dimensional benchmark functions. It employed a crossover factor and a DE-based operation for trial particle generation. Moreover, a ring topology was also utilized to facilitate local and global neighborhood search operations. In addition, an eXpanded PSO (XPSO) model was proposed by Xia et al. [30], where the swarm leader and a dynamic neighboring best solution were employed to guide the social component in the PSO operation.
A distributed contribution based quantum-behaved PSO with controlled diversity (DC-QPSO) was proposed by Chen et al. [31] for solving large-scale global optimisation problems. Their model first decomposed the original problem into several sub-problems. A contribution-based mechanism was then employed to ensure that more resources (i.e., a larger number of function evaluations) were awarded to the sub-swarms with comparatively greater fitness enhancement. A diversity control strategy based on genotype diversity (i.e., distance-based diversity) was subsequently used to increase search diversity.
Lin et al. [32] proposed an enhanced genetic learning PSO (GL-PSO) algorithm for global optimisation. In GL-PSO, the genetic operators and a ring topology were employed for the generation of fitter exemplars, which were subsequently used to guide the swarm particles.
Tan et al. [27] proposed an asynchronized learning PSO model, i.e., ALPSO, by incorporating DE, Simulated Annealing (SA) and helix search actions, for hybrid clustering and hyper-parameter fine-tuning in deep Convolutional Neural Networks (CNN) for skin lesion segmentation. Zhang et al. [33] proposed an Enhanced Sine Cosine Algorithm (SCA), which employed two randomly selected neighboring solutions and the Gaussian distribution-based search parameters for the diversification of the global best signal. Moreover, Jordehi [34] proposed an Enhanced Leader PSO (ELPSO) where a five-staged mutation mechanism (e.g., Gaussian, Cauchy and opposition-based mutations) was used for swarm leader enhancement to avoid premature convergence.
Kang et al. [35] proposed a modified PSO algorithm for optimal hyper-parameter selection of Gaussian process regression (GPR). Instead of using the inertial component as in PSO, a momentum element was proposed, which was based on the mean distance of the swarm in two successive iterations. Subsequently a mutation mechanism based on a perturbation function was proposed to further enhance the global best solution.
Yu et al. [36] developed an enhanced DE algorithm for tackling multi-objective optimisation problems. It incorporated a Gaussian mutation operator for the improvement of infeasible solutions as well as a standard DE/rand/1 operation for evolving feasible solutions according to their dominance relationships.
Cao et al. [37] integrated comprehensive learning PSO (CLPSO) with an adaptive local search starting strategy to solve multimodal and CEC 2013 benchmark functions, whereas Xu et al. [38] proposed an accelerated two-stage PSO (ATPSO) method with the employment of intra-cluster distance and intra-cluster cohesion measures as objective functions, respectively, for tackling complex clustering problems. Elbaz et al. [39] developed an improved PSO-adaptive neurofuzzy inference system (ANFIS) model for the prediction of shield performance during tunneling. An improved PSO method with an adaptive inertia weight and a constriction factor was employed for the optimisation of parameters in ANFIS. The empirical results indicate that this PSO-ANFIS model offered better prediction accuracy in comparison with those of ANFIS and GA-ANFIS. Elbaz et al. [40] proposed a GA-based evolving group method of data handling (GMDH)-type neural network (GMDH-GA) model for the prediction of disc cutter life during shield tunneling. GA was adopted to identify the optimal network configurations for the GMDH-type neural network.
Besides the aforementioned studies, there are other related investigations on diversity enhancement. Among them include genetic PSO (GPSO) with a crossover operator [41], a Bare-bones PSO variant (BBPSOVA) with repulsive operations and sub-swarm mechanisms [42], a Micro-GA PSO [43], a PSO with multiple sub-swarms for multimodal function evaluation (MFOPSO) [44], and a modified PSO method (MPSOELM) with time-varying adaptive acceleration coefficients for hyper-parameter optimisation pertaining to an Extreme Learning Machine (ELM) [45].

2.3. PSO for Feature Selection

Feature selection methods can be broadly divided into two categories, i.e., filter and wrapper. The filter approach ranks the features individually based on certain statistical criteria, such as the chi-square test [46] and mutual information [47]. The feature ranking scores indicate their relative importance to the problem. It is challenging to identify the cut-off point for selecting the most important features. Besides that, the individual-based ranking mechanisms are incapable of measuring the confounding effects of feature interactions and feature composition [1]. Instead of measuring the impact of individual features, the wrapper approach evaluates the quality of various feature subsets by taking feature interaction into account, with the learning algorithm wrapped inside. Therefore, the wrapper technique interacts with the classifier to capture feature dependencies.
In addition, PSO and its variants have been widely employed as the search engines in wrapper-based feature selection methods, owing to their fast convergence speed and powerful discriminative search capabilities [3,4,10,42,43]. As an example, Gu et al. [48] proposed a Competitive Swarm Optimiser, i.e., CSO, to undertake high-dimensional feature selection tasks. In CSO, the swarm was randomly divided into two sub-swarms, and pairwise competitions were conducted between particles from each sub-swarm. The winning particle was passed on to the next generation, while the defeated particle updated its position by learning from the position of the winning particle in the cognitive component as well as the mean position of the swarm in the social component. The CSO model outperformed several existing algorithms with various initialisation strategies for discriminative feature selection.
Moradi and Gholampour [49] proposed a hybrid PSO variant, i.e., HPSO-LS, for feature selection by integrating a local search strategy into the original PSO model. Two operators, i.e., “Add” and “Delete”, were employed to enhance the local search capability of PSO. Specifically, the “Add” operator inserted the dissimilar features into the particle, while the similar features were deleted from the particle by the “Delete” operator. Evaluated with 13 classification problems, HPSO-LS significantly outperformed a number of existing dimension reduction methods. Another hybrid PSO model, i.e., HPSO-SSM, was proposed by Chen et al. [19]. Specifically, the Logistic chaotic map was used to generate the inertia weight. Subsequently, two dynamic nonlinear correction factors were employed as the search parameters in the position updating operation. A spiral search mechanism was also incorporated to increase search diversity. Evaluated with 20 UCI datasets, HPSO-SSM outperformed several feature selection methods, such as CatfishBPSO (binary PSO with catfish effect). Tan et al. [50] proposed a hybrid learning PSO model, i.e., HLPSO, to identify the most discriminative elements from the shape, color, and texture features extracted from dermoscopic images for the identification of malignant skin lesions. HLPSO adopted three probability distributions, i.e., Gaussian, Cauchy, and Levy distributions, to further enhance the top 50% promising particles. Modified FA and spiral search actions were also employed to guide the lower-ranking 50% particles. Moreover, Xue et al. [4] conducted a comprehensive review on the applications of PSO as well as other EC techniques for tackling feature selection problems.

2.4. Research Motivations

Table 1 depicts a detailed comparison between several existing studies (including the original PSO algorithm) and this research. The original PSO model employs a search process led by a single swarm leader. Comparatively, both proposed PSOVA1 and PSOVA2 models employ multiple hybrid global optimal signals and a number of cooperative search operations to mitigate premature convergence. In particular, PSOVA2 employs versatile search operations with diverse specified sine, cosine, and hyperbolic tangent search trajectories to overcome stagnation. Both proposed models show superior capabilities in accelerating convergence while preserving diversity, in order to mitigate premature convergence.
The research motivations of the proposed models are as follows. The classical PSO algorithm explores the search space by following a single leader and each particle’s own personal best experience; it therefore lacks interactions with the neighboring elite solutions accumulated through coevolution. Owing to this monotonous search operation led by a single leader, the particle positions become increasingly similar over iterations. In this research, PSOVA1 is firstly proposed to enhance the local and global optimal signals through the use of neighboring historical best experiences. A set of effective cooperative search strategies is introduced to overcome the limitations of the original PSO algorithm, namely a modified PSO operation with rectified local and global best signals, spiral-based local exploitation, enhancement of the swarm leader and the worst solutions using Gaussian distributions, as well as mirroring and DE-based mutations.
Secondly, PSOVA2 is further proposed to enhance the best leader generation and the search operation embedded in PSOVA1. In particular, it employs an adaptive exemplar breeding mechanism incorporating multiple local and global best solutions to guide the search process. A new search action is also proposed by embedding diverse search coefficients yielded using sine, cosine, and hyperbolic tangent formulae. In comparison with PSOVA1 where the search mainly focuses on a modified PSO operation in principle, the aforementioned new search operations equip the search process with a variety of distinctive search behaviors and irregular search trajectories. In short, the search mechanisms in PSOVA1 and PSOVA2 work in a collaborative manner to increase search diversity and mitigate premature convergence.
Moreover, most of the aforementioned existing PSO variants employed only the single global best solution [19,22,24,31,32,34,35,39,41,44,45,51] to guide the search process. In addition, except for a few studies such as Lin et al. [32], Srisukkham et al. [42], Tan et al. [27], and Yu et al. [36], the existing work did not adopt any exemplar breeding strategies to enhance the optimal signals or generate hybrid leaders. Although some studies adopted diverse search mechanisms [19,24,27,33,42,52], the search processes in many existing studies [22,31,32,34,35,36,39,41,44,45,51] are mainly conducted by a single position updating formula. Therefore, they are more likely to suffer from premature convergence. In comparison with these existing methods, the proposed PSOVA1 and PSOVA2 models employ exemplar breeding mechanisms as well as multiple global best signals to lead the search process and avoid local optima traps. A number of position updating operations (such as local and global based search actions) are embedded in both models. When a certain search operation becomes stagnant (e.g., the global search in PSOVA1 or sine-based search in PSOVA2), the proposed models are able to adopt an alternative search action (e.g., local search in PSOVA1 or cosine-based search in PSOVA2) to drive the search out of stagnation. In addition, swarm leader and worst solution enhancement is also conducted in both methods to reduce the probabilities of being trapped in local optima. The proposed search strategies in both models work cooperatively to overcome premature convergence and increase the chances of finding global optimality.

3. The Proposed PSOVA1 Model

In this research, we propose two PSO variants for feature selection, which aim to overcome two major shortcomings of the original PSO model, i.e., premature convergence and weak local exploitation near the optimal solutions [4,22]. We introduce the first proposed PSO model, i.e., PSOVA1, in this section. Specifically, the proposed PSOVA1 model employs four major strategies, including (1) Gaussian distribution-based swarm leader improvement, (2) DE and mirroring schemes for worst solution enhancement, (3) a modified PSO position updating strategy based on ameliorated pbest and gbest, and (4) spiral based local exploitation. The implementation of these four mechanisms is able to increase population and search diversity, therefore increasing the likelihood of attaining global optimality as compared with the original PSO algorithm.
The novel aspects of the proposed PSOVA1 model are presented below. Firstly, we propose a modified PSO operation where the rectified forms of gbest and pbest, as well as the Logistic map-oriented chaotic inertia weight are used to increase global exploration. In particular, the personal and global best signals in the search operation are further enhanced using remote and randomly selected promising neighboring solutions to overcome stagnation. Secondly, a logarithmic spiral search mechanism oriented by gbest is used to intensify local exploitation. A dynamic switching probability is designed to enable the search process to balance between the aforementioned global (first) and local (second) search operations. Thirdly, Gaussian distribution is used to enhance the swarm leader. It enables gbest to conduct local exploitation to avoid being trapped in local optima. Then, the mirroring and DE-based mutation operations are employed to improve the three weakest particles in the swarm. The details of the proposed PSOVA1 model are illustrated in Algorithm 1.
Overall, the Gaussian distribution based gbest enhancement, the mutation strategies for enhancement of the worst solutions, exploration schemes assisted by ameliorated gbest and pbest, as well as the intensified fine-tuning capability using the spiral search operation, cooperate with and benefit from each other to effectively avoid being trapped in local optima and increase the likelihood of attaining global optimality. We introduce each of the four proposed strategies in detail below.
Algorithm 1. The pseudo-code of the proposed PSOVA1 model.
1   Start
2   Initialise a particle swarm using the Logistic chaotic map;
3   Evaluate each particle using the objective function f(x) and identify the pbest solution of each particle, and the global best solution, gbest;
4   Construct a Worst_memory, which stores the three weakest particles with the lowest fitness values, and identify the worst solution as gworst;
5   While (termination criteria are not met)
6   {
7   Conduct swarm leader enhancement using Gaussian distribution as defined in Equation (3); use the new solution to replace gbest if it is fitter;
8   For (each particle i in the population) do
9   {
10   If (particle i belongs to Worst_memory)
11   {
12   If (particle i is gworst)
13   {
14   Construct an offspring solution by employing the local mutation operation based on gbest as defined in Equation (4), and use it to replace the global worst solution if the new offspring solution is fitter;
15   Else
16   Construct an offspring solution by employing the DE-based mutation operation based on three randomly selected pbest solutions as defined in Equations (5)–(6);
17   Evaluate the offspring solution and update the position of particle i in Worst_memory based on the annealing schedule as defined in Equation (7);
18   } End If
19   Update the pbest and gbest solutions;
20   } End If
21   } End For
22   For (each particle i in the population) do
23   {
24   If Rand < pswitch
25   {
26   Establish a memory of groupi which includes all neighboring pbest solutions with higher or equal fitness scores than that of the pbest solution of the current particle i, i.e., pbesti;
27   Identify the neighboring fitter pbest solution in groupi with the highest degree of dissimilarity to gbest, denoted as pbestD;
28   Calculate the ameliorated gbest solution, i.e., gbestM, by averaging the following two solutions, i.e., pbestD and gbest, as indicated in Equation (8);
29   Randomly select another neighboring fitter pbest solution from groupi, denoted as pbestR;
30   Calculate the ameliorated pbest solution, i.e., pbestM, by averaging pbestR and the personal best solution of particle i, pbesti, as shown in Equation (9);
31   Conduct position updating using gbestM and pbestM for particle i as defined in Equation (10);
32   Else
33   Move particle i around gbest by following a logarithmic spiral search path as shown in Equation (11);
34   } End If
35   } End For
36   For (each particle i in the population) do
37   {
38   Evaluate each particle i using the objective function;
39   Update the pbest and gbest solutions;
40   } End For
41   } End While
42   Output gbest;
43   End

3.1. A Swarm Leader Enhancing Mechanism

In the context of feature selection, both the elimination of critical features and inclusion of contradictory attributes can impose significant consequences on classification performance. Therefore, a swarm leader enhancing mechanism using the skewed Gaussian distributions is proposed to equip gbest with further discriminative capabilities to overcome local optima traps. Such Gaussian distributions and random walk strategies have also been widely adopted in existing studies for leader or swarm enhancement [33,34,36,50,55]. As shown in Equation (3), gbest is mutated successively based on three Gaussian distributions with different skewness settings. Specifically, on the basis of the gbest solution, the Gaussian distribution with a positive skewness (right-skewed) is likely to eliminate noisy or irrelevant features, while the operation with a negative skewness (left-skewed) is more inclined to include more discriminative features. In addition, the standard Gaussian distribution (non-skewed) is employed to conduct local exploitation of gbest with neutrality in determining the feature numbers [34,55,56].
$$gbest'_{d} = gbest_{d} + \alpha \times Gaussian(h) \times \left(U_{d} - L_{d}\right) \tag{3}$$
where gbest’d represents the enhanced global best solution. Parameter α denotes the step size, and is assigned as 0.1 based on the recommendation of related studies [56]. Parameter h represents the skewness of the Gaussian distribution, and is set as −1, 1, and 0 for the left-, right- and non-skewed Gaussian distributions, respectively, based on extensive trial-and-error processes. Besides that, Ud and Ld represent the upper and lower boundaries of the dth dimension, respectively. The new solution generated by the Gaussian distribution is used to replace gbest if it is fitter.
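A minimal sketch of this leader enhancement is given below, assuming the skewed sampler Gaussian(h) is realised with scipy’s skew-normal distribution; the fitness function and the bounds U and L are supplied by the surrounding feature selection model, and a candidate replaces gbest only if it is fitter, as stated above.

```python
# A hedged sketch of the swarm leader enhancement in Equation (3);
# fitness() and the per-dimension bounds U, L are assumed externals.
import numpy as np
from scipy.stats import skewnorm

def enhance_leader(gbest, fitness, U, L, alpha=0.1):
    best, best_fit = gbest, fitness(gbest)
    for h in (-1.0, 1.0, 0.0):                  # left-, right-, and non-skewed mutations
        step = skewnorm.rvs(h, size=gbest.shape)
        candidate = gbest + alpha * step * (U - L)   # Equation (3)
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:                 # replace gbest only if fitter
            best, best_fit = candidate, cand_fit
    return best
```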

3.2. Mutation-Based Worst Solution Enhancement

We subsequently enhance the weak particles in the swarm by conducting the mirroring mutation on the swarm leader and a DE-based operation on the local elite solutions.
Firstly, a gbest-based local mutation scheme is proposed to enhance the global worst solution in the swarm. As in Equation (4), the new particle is produced by conducting the mirroring effects and reversing the sign of gbest with a mutation probability, rmu, in each dimension. This simulates the effects of randomly activating or de-selecting some of the features on the basis of the current best feature subset represented by gbest. In short, the gbest-based local mutation scheme guarantees a balance between preserving effective information captured by the current gbest solution and introducing stochastic perturbations to create new momentum for the newly generated solution. Such mirroring actions were also widely adopted in existing studies [34,57] to increase population diversity:
$$x_{d}^{new} = \begin{cases} -gbest_{d} & \text{if } rand \ge r_{mu}, \\ gbest_{d} & \text{otherwise}, \end{cases} \tag{4}$$
where rmu represents the mutation probability, and is set to 0.9 based on trial-and-error and the recommendations in related studies [56]. When a randomly generated value is greater than or equal to rmu, the new offspring is assigned the mirrored value, i.e., −gbestd, in the dth dimension; otherwise, it is assigned the value of the gbest solution in that dimension. This operation is used to yield a new offspring solution to replace the worst particle in the swarm, if it is fitter.
Secondly, a DE-based mechanism is proposed to improve the second and third worst individuals in the swarm. It produces new particles by following the mutation and crossover operations of DE using three pbest solutions randomly selected from the collection of all pbest individuals in the swarm, as shown in Equations (5) and (6). The differential weight, F, in Equation (5) is generated using the Sinusoidal chaotic map, in order to increase the variety of perturbations for the donor vector, xdonord, in each dimension. Furthermore, the crossover parameter, cr, is generated by the Logistic chaotic map to introduce more randomness to the crossover process in each dimension and exploit more feature interactions on a global scale. When a randomly generated value is more than cr, the current dimension in the new solution is inherited from the corresponding dimension of the personal best solution, otherwise it is inherited from that of the newly generated donor solution. Owing to the adoption of several distinctive personal best solutions in the search operations, this DE-based mutation operation is able to increase population diversity significantly when the pbest solutions of the particles illustrate sufficient variance from one another in the early search stage:
$$x_{d}^{donor} = pbest_{d}^{1} + F \times \left(pbest_{d}^{2} - pbest_{d}^{3}\right) \tag{5}$$

$$x_{d}^{new} = \begin{cases} x_{d}^{donor} & \text{if } rand \le cr, \\ pbest_{id} & \text{otherwise}, \end{cases} \tag{6}$$
where pbest1d, pbest2d, and pbest3d represent three randomly selected pbest solutions of the swarm particles in the dth dimension, while pbesti represents the pbest solution of the current particle i. xdonord and xnewd denote the donor and new solutions in the dth dimension, respectively. In addition, F and cr represent the differential weight and crossover factor, respectively.
The newly generated fitter solution is accepted directly while the acceptance of a weaker mutated solution is determined by an annealing schedule, as defined in Equation (7) [56]:
$$p = \exp\left(-\frac{\Delta f}{T}\right) > \delta \tag{7}$$
where T represents the temperature for controlling the annealing process, and Δf indicates the fitness difference between the mutated and original solutions. Constant δ is a randomly generated value in the range of [0, 1]. A geometric cooling schedule is employed to decrease the temperature, i.e., T = σT, where σ is assigned as 0.9 according to [56].
The two mutation operations, i.e., the gbest mirroring and DE-based schemes, operate in parallel to improve the weak particles in the swarm.
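The two mutation schemes and the annealing acceptance test can be sketched as follows. For brevity, the Sinusoidal and Logistic chaotic maps used to generate F and cr are replaced with fixed illustrative defaults, and fitness is assumed to be maximised.

```python
# A sketch of the worst-solution mutations (Equations (4)-(6)) and the
# annealing acceptance test (Equation (7)); chaotic F and cr are replaced
# with plain constants here.
import numpy as np

def mirror_mutation(gbest, r_mu=0.9):
    """Equation (4): mirror gbest per dimension when rand >= r_mu."""
    mask = np.random.rand(gbest.size) >= r_mu
    return np.where(mask, -gbest, gbest)

def de_mutation(pbests, pbest_i, F=0.5, cr=0.5):
    """Equations (5)-(6): DE mutation/crossover over three random pbests."""
    p1, p2, p3 = pbests[np.random.choice(len(pbests), 3, replace=False)]
    donor = p1 + F * (p2 - p3)                                          # Equation (5)
    return np.where(np.random.rand(donor.size) <= cr, donor, pbest_i)   # Equation (6)

def anneal_accept(delta_f, T):
    """Equation (7): accept a weaker offspring with probability exp(-delta_f / T)."""
    return np.exp(-delta_f / T) > np.random.rand()
```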

3.3. Diversity-Enhanced PSO Evolving Strategy

In order to address stagnations in the original PSO model, we construct two distinctive search mechanisms, i.e., a modified PSO search strategy and an intensified spiral exploitation action, to increase diversification and intensification. A dynamic switching probability schedule is also proposed to achieve the best trade-off between both mechanisms and exploit the merits of both search operations to the maximum extent.
We firstly upgrade the position updating strategy in the original PSO operation by introducing ameliorated pbest and gbest, combined with the Logistic chaotic map, to enhance search diversity. As indicated in Equation (8), the global best experience is ameliorated by adopting the mean position of two solutions, i.e., the gbest solution and a neighboring superior pbest solution, i.e., pbestD, possessing the highest degree of dissimilarity to gbest. The dissimilarity measure between gbest and any pbest solution is determined by the number of distinctive units in their binary forms, which are converted by following the existing studies [10]. In other words, the pbest solution that has the least number of shared selected features in comparison with those recommended by gbest is selected as pbestD. Moreover, as defined in Equation (9), the local best experience is ameliorated by adopting the mean position of the particle’s own pbest and another randomly chosen superior pbest solution, i.e., pbestR, in the neighborhood. Equation (10) is used to conduct position updating, which employs the enhanced global and local optimal signals defined in Equations (8) and (9), respectively:
$$gbest_{d}^{M} = \left(gbest_{d} + pbest_{d}^{D}\right)/2 \tag{8}$$

$$pbest_{d}^{M} = \left(pbest_{id} + pbest_{d}^{R}\right)/2 \tag{9}$$

$$v_{id}^{t+1} = \sigma \times v_{id}^{t} + c_1 \times r_1 \times \left(pbest_{d}^{M} - x_{id}^{t}\right) + c_2 \times r_2 \times \left(gbest_{d}^{M} - x_{id}^{t}\right) \tag{10}$$
where pbestD represents the pbest solution with the highest degree of dissimilarity to gbest among all neighboring superior pbest solutions, while pbestR represents a randomly chosen superior pbest solution. Moreover, gbestM and pbestM represent the enhanced global and local optimal indicators in the proposed position updating strategy, respectively, while σ represents the inertia weight generated by the Logistic chaotic map.
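A sketch of this modified updating operation is given below. The binarise helper, which thresholds a continuous particle into a 0/1 feature mask, is a hypothetical stand-in for the conversion procedure of [10], and groupi is assumed to hold the neighboring pbest solutions that are fitter than pbesti.

```python
# A sketch of the ameliorated best signals and velocity update in
# Equations (8)-(10); dissimilarity is measured as Hamming distance
# between binary feature masks.
import numpy as np

def binarise(x):
    return (x > 0).astype(int)   # assumed conversion to a 0/1 feature mask

def modified_update(x_i, v_i, pbest_i, gbest, groupi, sigma, c1=2.0, c2=2.0):
    # pbestD: the fitter neighbour most dissimilar to gbest
    dissim = [np.sum(binarise(p) != binarise(gbest)) for p in groupi]
    pbest_D = groupi[int(np.argmax(dissim))]
    pbest_R = groupi[np.random.randint(len(groupi))]  # random fitter neighbour
    gbest_M = (gbest + pbest_D) / 2                   # Equation (8)
    pbest_M = (pbest_i + pbest_R) / 2                 # Equation (9)
    r1, r2 = np.random.rand(2)
    v_i = sigma * v_i + c1 * r1 * (pbest_M - x_i) + c2 * r2 * (gbest_M - x_i)  # Equation (10)
    return x_i + v_i, v_i
```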

3.4. An Intensified Spiral Exploitation Scheme

An intensified spiral exploitation scheme is introduced to overcome the limitations of the fine-tuning capability of the original PSO algorithm in the near-optimal regions. The logarithmic spiral search was originally proposed in the MFO algorithm [11]. We employ this spiral operation to fine-tune the swarm particles in the final iterations. By conducting this local spiral search action, a search space of hyper-ellipse around gbest is constructed on each dimension using the spiral function, as defined in Equations (11) and (12) [11]. As a result, the exploitation around the near-optimal solutions can be significantly intensified:
$$x_{id}^{t+1} = D \times \exp(b \times l) \times \cos(2\pi l) + gbest_{d} \tag{11}$$

$$D = \left|gbest_{d} - x_{id}^{t}\right| \tag{12}$$
where D denotes the distance between gbest and particle i in the dth dimension, b is a constant controlling the shape of the logarithmic spiral, and l is a random number in the range of [−1, 1].
Moreover, we propose a dynamic switching probability schedule with the aim to achieve a trade-off between global exploration and local exploitation in the PSOVA1 model, as demonstrated in Equation (13):
$$p_{switch} = 1 - \left(iter / Max\_iter\right)^{2} \tag{13}$$
where pswitch denotes the switching probability, while iter and Max_iter represent the current and maximum iterations, respectively. In each iteration, when the switching probability pswitch is higher than a randomly generated value in the range of [0, 1], i.e., pswitch > rand, the modified PSO operation discussed in Section 3.3 is conducted. Otherwise, the spiral search action depicted in this section is conducted. In general, the proposed dynamic schedule of pswitch not only ensures sufficient global exploration to identify the promising regions in the early search stage, but also guarantees thorough exploitations in the near optimal region before converging in the final iterations.
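The spiral exploitation and the switching schedule can be sketched as follows; the spiral shape constant b = 1 is an illustrative choice, as the excerpt does not prescribe its value.

```python
# A sketch of the logarithmic spiral exploitation (Equations (11)-(12))
# and the dynamic switching schedule (Equation (13)).
import numpy as np

def spiral_step(x_i, gbest, b=1.0):
    l = np.random.uniform(-1.0, 1.0, size=x_i.shape)
    D = np.abs(gbest - x_i)                                   # Equation (12)
    return D * np.exp(b * l) * np.cos(2 * np.pi * l) + gbest  # Equation (11)

def p_switch(it, max_it):
    return 1.0 - (it / max_it) ** 2                           # Equation (13)

# Usage: global exploration while p_switch is high, spiral fine-tuning later.
# if np.random.rand() < p_switch(it, max_it): ...modified PSO update...
# else: x_i = spiral_step(x_i, gbest)
```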

4. The Proposed PSOVA2 Model

We further enhance the PSOVA1 model by incorporating new search actions accompanied with diverse nonlinear search trajectories to extend search territory. Specifically, we propose four new strategies in PSOVA2 to refine the transition between search diversity and swarm convergence, i.e., (1) an adaptive exemplar breeding mechanism incorporating multiple local and global best solutions, (2) search coefficient generation using sine, cosine, and hyperbolic tangent functions, (3) worst solution enhancement using a hybrid re-dispatching scheme, and (4) an exponential exploitation mechanism for swarm leader improvement.
PSOVA2 further strengthens PSOVA1 by providing new search mechanisms on the best leader generation and position updating operations. The novel aspects of the proposed PSOVA2 model are as follows. Firstly, an adaptive exemplar breeding mechanism is proposed, which produces a new exemplar by incorporating multiple local and global best solutions to guide the search process. On top of it, a new search action is proposed by embedding diverse search coefficients yielded using sine, cosine, and hyperbolic tangent formulae. In comparison with PSOVA1 where the search mainly focuses on a modified PSO operation in principle, the aforementioned new search operations equip the search process with a variety of distinctive search behaviors and irregular search trajectories. In addition, scattering and random permutations from the pbest solutions are incorporated for enhancement of the worst solutions. An adaptive exponential search flight is also used for swarm leader improvement. These new strategies demonstrate great capabilities in accelerating convergence while preserving search diversity. The pseudo-code of PSOVA2 is provided in Algorithm 2. We introduce each proposed strategy in the following sub-sections.
Algorithm 2. The pseudo-code of the proposed PSOVA2 model.
1   Start
2   Initialise a particle swarm using the Logistic chaotic map;
3   Evaluate each particle using the objective function f(x) and identify the pbest solution of each particle, and the global best solution, gbest;
4   While (termination criteria are not met)
5   {
6   Conduct swarm leader enhancement as defined in Equations (26)–(27);
7   Implement the worst solution enhancement as defined in Equations (23)–(25);
8   For (each particle i in the population) do
9   {
10   Construct a breeding exemplar as defined in Equations (15)–(18);
11   Select a coefficient generation function from Equations (19)–(22) randomly;
12   For (each dimension j) do
13   { % Choose the target optimal signal to follow in each dimension
14   If Rand < 0.4
15   {
16   Choose the breeding exemplar as the target signal for position updating;
17   Else
18   Choose the gbest solution as the target signal for position updating;
19   } End If
20   Update the position of particle i on dimension j as defined in Equation (14);
21   } End For
22   } End For
23   For (each particle i in the population) do
24   {
25   Evaluate each particle i using the objective function;
26   Update pbest and gbest solutions;
27   } End For
28   } End While
29   Output gbest;
30   End

4.1. A New Attraction Operation with Differentiated Search Trajectories

Firstly, a new search operation is proposed. It includes an exemplar breeding strategy and a search coefficient generation scheme using four nonlinear formulae. Equation (14) defines the proposed search action:
$$x_{id}^{t+1} = x_{id}^{t} + f_{s} \times \left(x_{d}^{target} - x_{id}^{t}\right) + Gaussian(t) \tag{14}$$
where fs denotes a search coefficient generated by customized sine, cosine, and hyperbolic tangent functions, respectively, and xtarget represents a target optimal indicator such as the exemplar or the swarm leader. Gaussian (t) indicates a random walk following a Gaussian distribution. Equation (14) is used for position updating in PSOVA2. We introduce the exemplar breeding and nonlinear search coefficient generation in detail in Section 4.1.1 and Section 4.1.2, respectively.

4.1.1. Exemplar Generation Using Adaptive Incorporation of Multiple Optimal Solutions

Instead of completely following the gbest solution over the search course as in the original PSO algorithm, an adaptive exemplar generation scheme is proposed. It incorporates two adaptive operations for exemplar generation, i.e., (1) stochastic recombination and dynamic incorporation of different numbers of the elite pbest solutions, and (2) adaptive weight generation to attenuate the impact imposed by the pbest solutions while amplifying the influence of the gbest solution over the search course. Specifically, an exemplar is generated through the proposed breeding mechanism between the pbest and gbest solutions for each particle through three steps. Firstly, a predefined number of the pbest solutions (i.e., three or fewer) are randomly selected, and then aggregated into one offspring solution by multiplying random but normalized weights on each dimension, as illustrated in Equation (15). Secondly, the adaptive weights for governing the priority of the aggregated offspring and the gbest solution during the breeding operation are generated by two proposed mathematical formulae defined in Equations (16) and (17). Figure 1 presents a visualization of the adaptive weight generation defined in Equations (16) and (17). Lastly, the exemplar solution is produced by conducting weighted aggregation between the gbest solution and the offspring solution yielded by Equation (15) in each dimension, as defined in Equation (18):
$$x_{d}^{offspring} = \begin{cases} \left(c_1 \times pbest_{d}^{1} + c_2 \times pbest_{d}^{2} + c_3 \times pbest_{d}^{3}\right)/\left(c_1 + c_2 + c_3\right) & \text{if } iter \in [1, 25], \\ \left(c_1 \times pbest_{d}^{1} + c_2 \times pbest_{d}^{2}\right)/\left(c_1 + c_2\right) & \text{if } iter \in [26, 50], \\ pbest_{d}^{1} & \text{if } iter \in [51, 75], \\ 0 & \text{if } iter \in [76, 100], \end{cases} \tag{15}$$

$$m_1 = 0.4 + 0.5 \times \sin\left(\frac{\pi}{2} \times \frac{iter}{Max\_iter}\right) \times \sinh\left(\frac{iter}{Max\_iter}\right) \tag{16}$$

$$m_2 = 0.4 \times \cos\left(\frac{\pi}{2} \times \frac{iter}{Max\_iter}\right) \times \cosh\left(\frac{iter}{Max\_iter}\right) \tag{17}$$

$$x_{d}^{exemplar} = m_1 \times gbest_{d} + m_2 \times x_{d}^{offspring} \tag{18}$$
where xoffspring and xexemplar represent the offspring solution generated from randomly sampled pbest solutions and the obtained exemplar solution through the breeding mechanism, while m1 and m2 represent the adaptive weights for gbest and xoffspring respectively. Parameters c1, c2, and c3 possess randomly generated scores within [0, 1].
Specifically, we prescribe a decreasing process to control the number of selected pbest solutions for exemplar generation. It starts with three pbest solutions being randomly selected for breeding, with one eliminated every 25 iterations. As a result, four different cases are produced through the iterative process for pbest selection, i.e., three for iterations [1, 25], two for iterations [26, 50], one for iterations [51, 75], and none for iterations [76, 100]. At the beginning of the search process, the higher number of selected pbest solutions aims to introduce more perturbation to gbest during breeding owing to the higher degree of optimal signal diversity and less similarity among the pbest solutions. This can further facilitate the exploration in previously unexploited search territory. By eliminating the selected pbest solutions through the iterative process, as well as the higher similarity among elite solutions owing to a gradually converged population, the disturbance produced by breeding on gbest becomes less significant as compared with that from the early stages. Therefore, the search becomes more accelerated through the incorporation of elite genes from gbest while maintaining the necessary level of diversity owing to distinctive elements from the recombination effect among the pbest solutions. When the search comes to the final stage, none of the pbest solutions are selected. As such, the exemplar becomes identical to the gbest solution, in order to facilitate the exploitation around the most optimal regions. As a result, the exploration at the early stage is intensified, and search diversity can be maintained through a dynamic incorporation of the pbest solutions.
In addition to the above proposed mechanisms, we introduce two adaptive trajectories for regulating the impact of the gbest and pbest signals during breeding over the entire iterative process, as illustrated in Figure 1. The weighting factor of the gbest signal (m1) keeps increasing from 0.4 to approximately 1 as the iteration increases, while that of the pbest signal (m2) keeps decreasing from 0.4 to 0. Moreover, the slopes for both adaptive trajectories change slowly at the beginning of the iteration, and then gradually ascend as the number of iterations increases. As such, the impact of the pbest indicators would not diminish too fast, in order to maintain diversity. At the same time, the influence of the gbest solution becomes strengthened over iterations, in order to accelerate convergence. In other words, the proposed adaptive search trajectories enable the exemplar to conduct more exploration attempts in the early stage by receiving significant and diverse influence from the pbest signals while ensuring a thorough exploitation around the promising regions in the final stage by receiving a dominant impact from the gbest solution. As a result, the proposed trajectories are capable of accelerating convergence while preserving diversity.
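The exemplar breeding procedure in Equations (15)–(18) can be sketched as follows, assuming Max_iter = 100 in line with the iteration brackets of Equation (15), and a pool pbests holding one personal best solution per particle (at least three rows).

```python
# A hedged sketch of the adaptive exemplar breeding in Equations (15)-(18).
import numpy as np

def breed_exemplar(gbest, pbests, it, max_it=100):
    c = np.random.rand(3)                               # c1, c2, c3 in [0, 1]
    p1, p2, p3 = pbests[np.random.choice(len(pbests), 3, replace=False)]
    if it <= 25:                                        # Equation (15), stage by stage
        offspring = (c[0]*p1 + c[1]*p2 + c[2]*p3) / c.sum()
    elif it <= 50:
        offspring = (c[0]*p1 + c[1]*p2) / (c[0] + c[1])
    elif it <= 75:
        offspring = p1
    else:
        offspring = np.zeros_like(gbest)
    r = it / max_it
    m1 = 0.4 + 0.5 * np.sin(np.pi/2 * r) * np.sinh(r)   # Equation (16)
    m2 = 0.4 * np.cos(np.pi/2 * r) * np.cosh(r)         # Equation (17)
    return m1 * gbest + m2 * offspring                  # Equation (18)
```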
The generated exemplar is subsequently used to guide the search operation. To further increase diversification and avoid stagnation, we employ diverse search coefficients yielded using sine, cosine, and hyperbolic tangent functions, which are explained in detail in the next section.

4.1.2. Nonlinear Search Coefficient Generation

We further devise four nonlinear functions for coefficient generation in Equation (14). The objective is to conduct distinctive yet complementary search operations around the exemplar and the gbest solution, in order to further increase diversity and overcome stagnation. The proposed coefficient generation functions are presented in Equations (19)–(22) and plotted in Figure 2. In general, the first two functions, i.e., f1 and f2, enable the particles to jump randomly in all directions around the destination optimal signal. The next two functions, i.e., f3 and f4, allow the particles to approach the optimal indicator with various speeds and intensities. Specifically, as illustrated in Figure 2 (blue line), f1 takes a hyperbolic tangent formula, 2/3 × tanh(2x − 1/2), as defined in Equation (19). It increases gradually within the range of [−0.3, 0.5]. It facilitates the particles to deploy a thorough exploration around the target optimal signal in two ways, i.e., approaching it slowly when positive values are generated and distancing from it mildly when negative values are yielded. In contrast, as illustrated in Figure 2 (red line), f2 takes a sin(cos(2π × x²)) formula, as defined in Equation (20). Compared with the other three functions, it changes more rapidly, with a wider range of approximately [−0.9, 0.9] for coefficient generation. It enables the particles to perform larger jumps to escape from local stagnation.
As illustrated in Figure 2 (yellow line), f3 takes a cos(sin(π/2 × x²)) formula, as defined in Equation (21). It remains on a high plateau in the range of [0.5, 1]. It regulates the particles to march towards the target optimal solution with large steps, in order to accelerate convergence. On the contrary, as indicated in Figure 2 (purple line), f4 takes a cos⁻¹(cos(π/4 × x²)) formula, as defined in Equation (22). It increases gradually in the range of [0, 0.7]. It enables the particles to deploy an intensive exploitation in the promising regions. In each iteration, each particle is able to choose from the aforementioned four coefficient generation strategies with equal probability to maintain search diversity. In comparison with the standard sine, cosine, and hyperbolic tangent functions, the proposed refined formulae offer more erratic and irregular search trajectories:
$$f_1 = \frac{2}{3} \times \tanh\left(2x - \frac{1}{2}\right) \tag{19}$$

$$f_2 = \sin\left(\cos\left(2\pi \times x^{2}\right)\right) \tag{20}$$

$$f_3 = \cos\left(\sin\left(\frac{\pi}{2} \times x^{2}\right)\right) \tag{21}$$

$$f_4 = \cos^{-1}\left(\cos\left(\frac{\pi}{4} \times x^{2}\right)\right) \tag{22}$$
where x is a randomly generated value within [0, 1], while f1, f2, f3, and f4 are the four coefficient generation functions (i.e., specified sine, cosine, and hyperbolic tangent functions). The generated coefficients are used as search parameters fs in the position updating procedure as defined in Equation (14).
As discussed earlier, we compose these four distinctive coefficient generation strategies in a complementary manner as an effort to enhance search diversity. When the particles are trapped at local optima, large jumps and reverse directions are able to drive the search out of stagnation. On the other hand, minor movements become dominant when a detailed, near optimal exploitation is needed. Moreover, in the entire input range, the generated coefficients from these four functions are always dominated by positive signals, i.e., at least three positive outputs among four, which are able to lead the swarm to the promising regions in an accelerated manner.
The parameter generation strategies are incorporated with the proposed exemplar breeding scheme to leverage their respective advantages, i.e., diversification of the movement strategies and the destination signals. Specifically, in every position updating process, each particle is able to choose one coefficient generation function from the proposed four formulae randomly. Then, in each dimension, the particle is able to choose one best signal to follow from the breeding exemplar and the gbest solution.
To be specific, the four coefficient generation strategies possess equal probabilities to be chosen for each particle. Note that gbest is allocated a higher probability to be chosen when updating the particle position in each dimension, as shown in lines 8–22 in Algorithm 2. A threshold of 0.4 is determined based on trial-and-error. Such a setting is able to achieve a reasonable balance between introducing a proper perturbation and inheriting benign signals from the swarm leader.
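Putting the pieces together, a sketch of the coefficient generation and the per-dimension position update of Equation (14) is given below. The Gaussian random walk term Gaussian(t) is realised here as a standard normal draw, which is one plausible reading of the description above.

```python
# A sketch of the four coefficient generators (Equations (19)-(22)) and the
# position update of Equation (14); the 0.4 exemplar-versus-gbest threshold
# follows the trial-and-error setting described above.
import numpy as np

COEFFS = [
    lambda x: 2/3 * np.tanh(2*x - 0.5),            # f1, Equation (19)
    lambda x: np.sin(np.cos(2*np.pi * x**2)),      # f2, Equation (20)
    lambda x: np.cos(np.sin(np.pi/2 * x**2)),      # f3, Equation (21)
    lambda x: np.arccos(np.cos(np.pi/4 * x**2)),   # f4, Equation (22)
]

def update_position(x_i, exemplar, gbest):
    f = COEFFS[np.random.randint(4)]               # equal probability per particle
    new_x = np.empty_like(x_i)
    for d in range(x_i.size):
        target = exemplar[d] if np.random.rand() < 0.4 else gbest[d]
        fs = f(np.random.rand())
        # Equation (14): attraction to the target plus a Gaussian random walk
        new_x[d] = x_i[d] + fs * (target - x_i[d]) + np.random.normal()
    return new_x
```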

4.2. A Hybrid Re-Dispatching Scheme for Enhancement of the Worst Solutions

To further accelerate convergence, we enhance several of the worst solutions by diverting them to the optimal regions using a hybrid re-dispatching scheme. Specifically, we enhance the worst solutions by exploiting the personal best solutions as well as the stochastic disturbance induced by random initialisation. As shown in Equations (23) and (24), two donor vectors, denoted as xdonor1 and xdonor2, are generated by random initialisation and random permutations of the pbest solutions, respectively. In particular, the element in each dimension of xdonor2 is obtained by inheriting the value from the corresponding dimension of a randomly selected pbest solution. A random number is generated for each dimension as the determinant for the hybridization process, as shown in Equation (25). The element is inherited from the corresponding dimension of xdonor1 when the determinant is smaller than or equal to 0.5. Otherwise, it inherits the corresponding element from xdonor2.
$$x_{d}^{donor1} = L_{d} + \beta \times \left(U_{d} - L_{d}\right) \tag{23}$$

$$x_{d}^{donor2} = pbest_{d}^{random} \tag{24}$$

$$x_{d}^{new} = \begin{cases} x_{d}^{donor1} & \text{if } rand \le 0.5, \\ x_{d}^{donor2} & \text{otherwise}, \end{cases} \tag{25}$$
where xdonor1 and xdonor2 represent the donor vectors generated by random initialisation and random selection from the pbest solutions, respectively, while β is a random number within [0, 1]. pbestdrandom denotes the d-th dimension of a randomly selected personal best solution. This worst solution enhancement procedure is conducted three times to generate three offspring solutions, xnew, which are subsequently used to replace the three worst particles with the lowest fitness scores in the swarm.
Compared with a complete random initialisation, this hybridization scheme for enhancement of the worst solutions is capable of enhancing such solutions by exploiting elite genes from the population to accelerate convergence.
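A sketch of the re-dispatching scheme is given below, assuming pbests is a 2D array with one personal best solution per row, and U and L are the per-dimension bounds.

```python
# A sketch of the hybrid re-dispatching scheme in Equations (23)-(25).
import numpy as np

def redispatch(pbests, U, L):
    n_dims = pbests.shape[1]
    beta = np.random.rand(n_dims)
    donor1 = L + beta * (U - L)                        # Equation (23)
    rand_rows = np.random.randint(len(pbests), size=n_dims)
    donor2 = pbests[rand_rows, np.arange(n_dims)]      # Equation (24), per dimension
    mask = np.random.rand(n_dims) <= 0.5
    return np.where(mask, donor1, donor2)              # Equation (25)

# Called three times; the offspring replace the three worst particles.
```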

4.3. Swarm Leader Enhancement Using an Adaptive Exponential Search Flight

In addition, we propose an exponential function to generate random search steps for enhancement of the swarm leader, as defined in Equation (26).
As depicted in Figure 3, the generated step g is confined within [−0.5, 0.5] with an input value between [0, 1]. As a result, a smaller magnitude of steps enables the swarm leader to examine thoroughly within its vicinity from all directions, in order to discover a better position to further improve its quality. Equation (27) is used to generate an offspring solution of gbest using the newly generated search step g.
$g = \dfrac{2}{1 + e^{(1 - 2x)}} - 1$ (26)

$gbest_d' = gbest_d + g \times (U_d - L_d)$ (27)
where x is a randomly generated value within [0, 1], while Ud and Ld denote the upper and lower boundaries of the d-th dimension, respectively. This new gbest’ solution is used to replace gbest if it is fitter.
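As a quick illustration, Equations (26) and (27) can be realised as below. Whether g is drawn once per flight or independently per dimension is not fixed by the text, so this sketch draws it per dimension as an assumption, and the final boundary clipping is likewise our addition.

```python
import numpy as np

def leader_flight(gbest, lower, upper):
    """Perturb the swarm leader using the exponential step of Equations (26)-(27)."""
    x = np.random.rand(gbest.size)                  # random inputs within [0, 1]
    g = 2.0 / (1.0 + np.exp(1.0 - 2.0 * x)) - 1.0   # steps confined within [-0.5, 0.5]
    candidate = gbest + g * (upper - lower)         # Equation (27)
    return np.clip(candidate, lower, upper)         # assumption: keep offspring in bounds

# The candidate (gbest') replaces gbest only if it obtains a better fitness score.
```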
Overall, the proposed PSOVA2 model incorporates the aforementioned four major improvements to further enhance search dynamics and diversity. They include an adaptive exemplar breeding mechanism, search coefficient generation using nonlinear functions, exponential exploitation and re-dispatching schemes for swarm leader and worst solution enhancement. They account for the efficiency of PSOVA2 in accelerating convergence while maintaining diversity.
Both proposed PSO variants are integrated with a K-Nearest-Neighbor (KNN) classifier to conduct fitness evaluation during the search process. Equation (28) [3,4,42,43] defines the objective function, which is used to assess the fitness of each particle:
$fitness(x) = k_1 \times accuracy_x + k_2 \times (num\_of\_features_x)^{-1}$ (28)
where k1 and k2 denote the weights of classification accuracy and the number of selected features, respectively. We assign k1 = 0.9 and k2 = 0.1 by following the recommendation in previous studies [3,4,42,43].
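Equation (28) translates into a one-line function; the sketch below assumes an accuracy in [0, 1] and a non-empty feature subset.

```python
def fitness(accuracy, num_selected, k1=0.9, k2=0.1):
    """Equation (28): weighted accuracy plus the inverse of the subset size."""
    return k1 * accuracy + k2 * (1.0 / num_selected)
```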
The optimisation objective of the proposed PSO models is to identify the most discriminative feature subset from a given database. The fitness function aims to maximize the classification accuracy rate while reducing the number of selected features. The search for the most significant feature subset is conducted as follows. The particles are initialised with continuous values in each dimension using the Logistic map at the beginning of the search process. Each particle represents an initial randomly assigned feature subset, where the particle dimension is the same as the number of features in a given dataset. During fitness evaluation, we convert each element of each particle into a binary value, i.e., 1 or 0, representing the selection (1) or non-selection (0) of a particular feature. The feature subset recommended by each particle is evaluated using the training data set. The KNN model with five neighbors, as recommended in related studies [19,58], is employed to evaluate the fitness of the selected feature subset with a 10-fold cross-validation method. A fitness score is calculated using Equation (28). The final swarm leader represents the identified optimal feature subset. We subsequently evaluate the efficiency of this selected feature subset using the unseen test set in the test phase. The aforementioned feature selection process using each proposed PSO model combined with KNN is also illustrated in Algorithm 3. We evaluate the effectiveness of both proposed PSO variants in feature selection tasks in Section 5.
Algorithm 3. The pseudo-code of the hybrid PSOVA1/PSOVA2-KNN feature selection model.
1   Start
2   Initialise a particle swarm using the Logistic chaotic map;
3   For (each particle i in the population) do
4   {
5     Convert particle i into a corresponding feature subset by selecting the features on the dimensions where positive values are assigned;
6     Calculate the classification performance of the feature subset encoded in particle i on the training data set using the KNN classifier;
7     Evaluate the fitness score of particle i based on its classification performance and the number of selected features using the proposed objective function f(x), as shown in Equation (28);
8     Identify the pbest solution of each particle and the global best solution gbest;
9   } End For
10  While (termination criteria are not met)
11  {
12    Evolve swarm particles using the proposed mechanisms in PSOVA1 (i.e., lines 7–35 in Algorithm 1) or PSOVA2 (i.e., lines 6–22 in Algorithm 2);
13    For (each particle i in the population) do
14    {
15      Evaluate particle i using the objective function on the training set;
16      Update the pbest and gbest solutions;
17    } End For
18  } End While
19  Output gbest;
20  Convert gbest into the identified optimal feature subset;
21  Calculate the classification performance on the unseen test set based on the yielded optimal feature subset using the KNN classifier;
22  Output the test classification results and the selected features;
23  End
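A minimal sketch of one wrapper-style fitness evaluation (lines 5–7 of Algorithm 3) is given below, assuming a scikit-learn KNN as the classifier implementation and reusing the fitness helper sketched after Equation (28); the guard for an empty subset is our addition.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def evaluate_particle(position, X_train, y_train):
    """Binarise a continuous particle and score its feature subset via Equation (28)."""
    mask = position > 0                  # positive entries select the feature
    if not mask.any():                   # an empty subset cannot be classified
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5)
    accuracy = cross_val_score(knn, X_train[:, mask], y_train, cv=10).mean()
    return fitness(accuracy, int(mask.sum()))
```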

5. Evaluation and Discussion

We employ a total of 13 datasets to investigate the efficiency of the proposed PSO models for feature selection. The employed datasets pose diverse challenges to feature selection problems, owing to a great variety of dimensionalities as well as complicated class distributions. The proposed PSO variants are integrated with a KNN-based wrapper model to conduct feature optimisation, where the number of nearest neighbors is set to 5 according to the recommendation in previous studies [19,58]. Three performance indicators are used to examine the effectiveness of the proposed PSO variants, i.e., classification accuracy, number of selected features, and F-score. Furthermore, we compare the proposed PSO variants against five classical search algorithms, i.e., PSO [22], DE [59], SCA [60], DA [61], and GWO [62], as well as ten PSO variants, i.e., CSO [52], HPSO-SSM [19], binary PSO (BPSO) [63], modified binary PSO with local search and a swarm variability controlling scheme (MBPSO) [53], CatfishBPSO [54], GPSO [41], MPSOELM [45], MFOPSO [44], BBPSOVA [42], and ALPSO [27]. To ensure a fair comparison, we employ the same number of function evaluations (i.e., population size × the maximum number of iterations) as the stopping criterion for all search methods. In our experiments, the population size and the maximum number of iterations are set to 30 and 100, respectively, based on trial and error. We conduct 30 runs for each experiment.

5.1. Data Sets

We employ the ALL-IDB2 database [64] for Acute Lymphoblastic Leukaemia (ALL) diagnosis, as well as ten UCI data sets [65], namely Arcene, MicroMass, Parkinson's disease (Parkinson), Human activity recognition (Activity), LSVT voice rehabilitation (Voice), Grammatical facial expressions (Facial Expression), Heart disease (Heart), Ionosphere, Epileptic seizure (Seizure) and Wisconsin breast cancer diagnostic data set (Wdbc), for evaluation. Besides that, two additional microarray gene expression data sets, i.e., Crohn's disease (Crohn) and Multiple Myeloma (Myeloma), from the Gene Expression Omnibus repository [66], are employed for evaluation. The details of each data set are shown in Table 2. These data sets pose diverse challenges to feature selection models, owing to a great variety of dimensionalities and class numbers, as well as complex data distributions. Specifically, the dimensionality of the employed data sets spans from 30 to 22,283, while the number of classes ranges from 2 to 10. Moreover, according to previous studies [42,67,68], the employed data sets contain significant challenging factors (e.g., small inter-class and large intra-class variations) which can severely affect classification performance. Overall, a comprehensive evaluation can be established for the proposed PSO variants, in view of the dimensionality, number of classes, and sample distributions pertaining to the data sets used for evaluation.

5.2. Parameter Settings

We compare the proposed PSO variants against fifteen baseline methods, namely five classical search algorithms (PSO, DE, SCA, DA, and GWO) and ten advanced PSO variants (CSO, HPSO-SSM, BPSO, MBPSO, CatfishBPSO, GPSO, MPSOELM, MFOPSO, BBPSOVA, and ALPSO). The parameter settings of each baseline method employed in this study follow the recommendations in their original studies. The detailed parameters of the proposed PSO models and the fifteen baseline methods are presented in Table 3.

5.3. Results and Discussion

A comprehensive evaluation of the proposed PSO variants is conducted. Specifically, we adopt four different performance measures in our experiments, i.e., classification accuracy, the F-score measure, number of selected features, and convergence performance. A total of 30 runs are conducted in each experiment, and the average results are computed for comparison. Table 4 and Table 5 summarise the classification accuracy rates and the F-scores, respectively, together with their corresponding standard deviations, while Table 8 presents the numbers of selected features for all the search methods. The best results are marked in bold accordingly.

5.3.1. Classification Performance

With respect to classification accuracy in Table 4, PSOVA1 and PSOVA2 achieve the highest accuracy scores on all thirteen classification tasks, outperforming all fifteen baseline algorithms consistently. Specifically, PSOVA1 produces the highest accuracy scores on two datasets, i.e., Parkinsons and Facial Expression, while PSOVA2 yields the best accuracy scores on the remaining eleven datasets. Moreover, the empirical results reveal the advantages of the proposed models over the fifteen baseline methods, especially on data sets with high dimensionalities, e.g., Crohn (22,283), Myeloma (12,625), and MicroMass (1300), as well as data sets with fuzzy boundaries and small inter-class variations, e.g., Heart (72).
Specifically, on the Heart data set, PSOVA2 outperforms the top three best-performing baseline methods, i.e., SCA, HPSO-SSM, and DE, by 6.21%, 7.97%, and 8.06%, respectively. On the MicroMass data set, PSOVA2 outperforms the top three best-performing methods, i.e., GWO, BBPSOVA, and SCA, by 4.88%, 4.94%, and 5.51%, respectively. Evident performance gaps can also be observed between PSOVA1 and the fifteen baseline methods. The effectiveness of both proposed PSO models is further ascertained by the F-score measure, as shown in Table 5. Similar to the accuracy measures, the proposed PSO models achieve the highest F-score performances on all thirteen data sets and outperform the fifteen baseline methods with significant performance gaps, especially on feature selection tasks with high complexities, e.g., the MicroMass and Heart data sets. Moreover, in comparison with those of the fifteen baseline models, our proposed PSO variants demonstrate smaller or similar standard deviation results with respect to both the accuracy and F-score measures. This indicates the reliability of the proposed PSOVA1 and PSOVA2 models in producing superior classification performances across the employed feature selection tasks with various dimensionalities. The reliability of the proposed PSO variants is further examined using the Wilcoxon statistical test.
We subsequently analyze the performance gaps with respect to the challenges posed by several example data sets, as well as the superiority of both proposed models. With respect to ALL, the proposed models successfully identify the clinical features critical to ALL diagnosis, e.g., cytoplasm and nucleus areas, the ratio between the nucleus and cytoplasm areas, form factor, compactness, perimeter, and eccentricity [42,67]. These features are commonly selected more than 15 times out of 30 trials by both proposed models. Specifically, as an important indicator of cell irregularity and eccentricity, the inclusion of the ratio between the nucleus and cytoplasm areas in the selected feature subsets can make a significant difference to an accurate diagnosis of ALL. However, the baseline models often fail to consider the interactive impact between cytoplasm and nucleus, because either of them is neglected in the selected features. Overall, the feature selection results further indicate the effectiveness of both proposed models in identifying the most discriminatory characteristics for ALL diagnosis. In comparison, the baseline models often only partially identify these important discriminative features, or overlook some aspects of sophisticated feature interactions, owing to stagnation at local optima. Likewise, with respect to the diagnosis of coronary heart disease with three different severity levels [69], the feature subsets generated by the proposed PSO models reveal a number of key features, e.g., chest pain type, serum cholesterol, maximum heart rate, and ST depression. These have been identified as critical clinical features for the diagnosis of heart disease in existing studies [70].
The Wilcoxon rank sum test is conducted based on the mean classification accuracy rates, in order to further indicate the statistical difference between both proposed PSO models and the baseline methods. As illustrated in Table 6 and Table 7, most of the test results are lower than 0.05, confirming that both proposed PSO models significantly outperform the fifteen baseline models on most of the data sets. Compared with PSOVA1, PSOVA2 achieves stronger statistical superiority. Specifically, PSOVA1 outperforms all the baseline methods for five data sets (Crohn, Myeloma, MicroMass, Parkinsons, and Activity), while PSOVA2 outperforms all the baseline methods for eight data sets (Crohn, Myeloma, Arcene, MicroMass, Parkinsons, Activity, Seizure, and Heart), with statistical significance. Out of 180 evaluations (12 data sets × 15 baseline algorithms), PSOVA1 does not show statistically significant differences with respect to the baseline methods in eighteen instances, as compared with eleven instances for PSOVA2.
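For reproducibility, such a pairwise comparison can be performed with SciPy as sketched below, assuming the test is applied to the 30 per-run accuracy values of each pair of methods on a given data set.

```python
from scipy.stats import ranksums

def is_significant(acc_proposed, acc_baseline, alpha=0.05):
    """Two-sided Wilcoxon rank sum test over two sets of 30 per-run accuracies."""
    statistic, p_value = ranksums(acc_proposed, acc_baseline)
    return p_value < alpha, p_value
```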
The top three baseline models with the most competitive performances in comparison with those of our proposed PSO variants are DE, BBPSOVA, and SCA. Specifically, PSOVA1 shows similar result distributions to those of DE on ALL, Ionosphere, and Wdbc, to those of BBPSOVA on Voice, Ionosphere, and Wdbc, as well as to those of SCA on Arcene, Heart, and Ionosphere data sets, whereas PSOVA2 demonstrates similar performance distributions to those of DE on ALL and Wdbc, to those of BBPSOVA on Voice and Wdbc, as well as to those of SCA on Ionosphere data set.
The advantages of the proposed PSO models become more apparent on classification tasks with higher dimensionalities and sophisticated class distributions, i.e., Crohn (22,283), Myeloma (12,625), MicroMass (1300), Parkinsons (753), and Activity (561). PSOVA2 demonstrates statistically significant superiority against all the baseline methods on these high-dimensional data sets. This is because of the adoption of diverse exemplars to guide the search in each dimension, as well as the employment of versatile search trajectories to rectify particle positions.
The search strategies in most of the baseline models are monotonous and are therefore more likely to be trapped in local optima in NP-hard problems, such as feature selection tasks. Owing to the proposed comprehensive strategies for avoiding local optima traps, the search diversity and robustness are significantly enhanced in both proposed PSO models, thereby increasing the likelihood of attaining the global optima. Overall, the statistical results confirm the significant superiority of both proposed PSO models over the five classical search methods and ten advanced PSO variants, especially in feature selection tasks with higher complexities.

5.3.2. Selected Feature Sizes

With respect to the number of selected features, as shown in Table 8, CSO selects the fewest features on eight data sets, i.e., Crohn, Myeloma, Arcene, Voice, Facial Expression, Seizure, ALL, and Wdbc, while GWO obtains the smallest feature sizes on four data sets, i.e., MicroMass, Parkinsons, Activity, and Heart. Owing to the excessive elimination of essential features, CSO achieves the lowest classification accuracy rates on five data sets, i.e., MicroMass, Voice, ALL, Wdbc, and Crohn. This indicates that CSO falls into local optima on the above data sets during training, which leads to stagnation in performance. According to the fitness evaluation illustrated in Equation (28), this phenomenon in turn results in the severe removal of features, in order to further improve the fitness scores. As such, very small feature subsets are identified during the feature selection process, which may not capture sufficient characteristics, leading to severe performance deterioration in the test stage. On the contrary, the proposed PSO variants succeed in achieving an efficient trade-off between eliminating redundant features and improving performance. They select comparatively smaller feature subsets than those from many search methods in most of the test cases, while achieving the highest accuracy rates and F-score results on all thirteen test data sets. In particular, the proportions of features eliminated by PSOVA2 are 65.46%, 66.22%, 65.88%, 63.32%, 63.51%, 66.93%, 64.84%, and 67.54% on eight high-dimensional data sets, i.e., Crohn, Myeloma, Arcene, MicroMass, Parkinsons, Activity, Voice, and Facial Expression, respectively. A similar feature elimination capability is also depicted by PSOVA1. In short, the empirical results indicate the significant capabilities of the proposed PSO models in removing irrelevant and noisy features while identifying the most discriminative and effective ones, without falling into local optima traps during the search process.

5.3.3. Convergence Rates and Computational Costs

The mean convergence curves over a set of 30 runs for each search method on two high-dimensional data sets, i.e., Myeloma and Crohn, respectively, are provided to indicate model efficiency in the training stage.
As illustrated in Figure 4 (Myeloma) and Figure 5 (Crohn), both proposed PSO models (the two dashed lines) achieve promising results. Specifically, the proposed PSO models illustrate faster convergence rates than those of the baseline models, while maintaining the momentum to improve the fitness score throughout the entire search course. PSOVA2 performs better than PSOVA1, especially during the later stage of the search course. The proposed exemplar breeding mechanisms and diverse attraction operations with non-linear parameters account for the superior capabilities of PSOVA2 in preserving diversity and overcoming local stagnation. Moreover, CSO illustrates faster convergence rates than those of the proposed models, but at the expense of the excessive elimination of a large number of features. It is likely that CSO is trapped in local optima, and its performance becomes stagnant. This is supported by its deterioration in classification accuracy and F-measure results, as indicated in Table 4 and Table 5. On the contrary, the proposed models achieve a comparatively balanced trade-off between feature elimination and performance improvement.
Since fitness evaluation is the most time-consuming procedure during the search cycle, the computational load of PSOVA1 and PSOVA2 primarily hinges on the population size × the maximum number of iterations. Note that all the search methods employ the same maximum number of function evaluations during the training stage. As such, all the search methods have a similar computational cost in principle, which is governed by the time taken for fitness evaluation. On the other hand, the internal search mechanisms differ from one algorithm to another, therefore the computational cost of each algorithm differs slightly. Table 9 depicts the average computational costs during training with respect to the proposed PSO models and other search methods over 30 runs on the Crohn, Myeloma, and Seizure data sets. These data sets are selected for computational cost analysis since they have either high-dimensional features or large sample sizes. The computational costs of all the methods on other data sets may vary in accordance with the training sample sizes and dimensions. As indicated in Table 9, in most of the cases, both proposed PSO models show comparatively lower or comparable computational costs in comparison with those of most of the baseline methods. CSO, GWO, and PSOVA2 achieve the most efficient training computational costs for the Crohn, Myeloma, and Seizure data sets, respectively.

5.3.4. Evaluation of The Proposed Mechanisms in PSOVA1 and PSOVA2

We subsequently demonstrate the efficiency of each proposed mechanism in both PSOVA1 and PSOVA2 using the Seizure and Voice data sets. The mean classification accuracy rates over 30 runs are shown in Table 10. The empirical results indicate that each strategy in each proposed model is able to drive the search out of stagnation and enhance the feature selection performance. The results conform to the principles of the introduced mechanisms. In particular, the exemplar breeding mechanism and the versatile search operations using compound sine, cosine, and hyperbolic tangent functions in PSOVA2 are comparatively more effective than the modified PSO operation with ameliorated optimal signals and the spiral-based local exploitation in PSOVA1. This is primarily owing to the employment of diverse exemplars to lead the search in each dimension, as well as the adoption of versatile search courses to rectify the particle positions in PSOVA2.
In comparison with the original PSO model and PSOVA1, instead of using a single leader or rectified separate global and personal best experiences to guide the search process, an exemplar generation scheme with adaptive aggregation of the local and global optimal signals is used in PSOVA2. As such, the impact of the local optimal indicators is more significant at the beginning of the search process, while the influence of the global best solution is more dominant towards the final iterations. Such an exemplar breeding scheme in PSOVA2 is more capable of overcoming stagnation. Unlike PSOVA1, where the search mainly relies on a modified PSO algorithm, PSOVA2 employs four search strategies implemented using refined sine, cosine, and hyperbolic tangent formulae for the position updating procedure to increase search diversification.
The mechanisms proposed in both PSO models work in a collaborative manner to diversify the search process and mitigate premature convergence. In PSOVA1, when the modified PSO algorithm with rectified optimal signals becomes stagnant over the iterations, the local exploitation mechanism based on the spiral search action is able to further explore the near-optimal regions and drive the search out of stagnation. In PSOVA2, when the customized sine-based search operation is trapped in local optima, other search mechanisms such as the cosine and hyperbolic tangent oriented search actions are able to extend the search territory to overcome early stagnation. In short, the empirical results indicate that the proposed mechanisms in each model are highly effective in mitigating premature convergence, accelerating convergence while preserving diversity.
Besides the above, we further evaluate the efficiency of each proposed strategy in PSOVA1 and PSOVA2 in tackling minimization problems using a set of 11 benchmark functions. They include four multimodal functions (i.e., Ackley, Griewank, Rastrigin, and Powell) and seven unimodal landscapes (i.e., Dixon-Price, Rotated Hyper-Ellipsoid, Rosenbrock, Sphere, Sum of Different Powers, Sum Squares, and Zakharov). The definitions of these benchmark functions are provided in [23,34,50,71]. The following experimental settings are employed for model evaluation, i.e., population size = 30, dimension = 30, maximum number of iterations = 500, and trials = 30. Table 11 illustrates the mean, maximum, minimum, and standard deviation results for all the test functions, with the best results highlighted in bold. As shown in Table 11, both the mean and minimum results over 30 runs indicate that our models with individual or composite proposed mechanisms all significantly outperform the standard PSO model. For each of the proposed PSO variants, sequential aggregation of the proposed mechanisms leads to better search efficiency and capability, as evidenced by the enhanced performances. Moreover, PSOVA2 outperforms PSOVA1 on 9 out of 11 test functions. Overall, the empirical results on the test functions demonstrate the great superiority of the proposed models. The search mechanisms in PSOVA1 and PSOVA2 work in cooperation to achieve the best performances owing to the advanced trade-offs between diversification and intensification.
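As a pointer for replication, two of the listed benchmarks are sketched below with their standard textbook definitions (the remaining functions follow the definitions in [23,34,50,71]); both attain their global minimum of 0 at the origin.

```python
import numpy as np

def sphere(x):
    """Unimodal Sphere function: f(x) = sum(x_i^2)."""
    return np.sum(x ** 2)

def ackley(x, a=20.0, b=0.2, c=2.0 * np.pi):
    """Multimodal Ackley function with its standard parameter set."""
    d = x.size
    return (-a * np.exp(-b * np.sqrt(np.sum(x ** 2) / d))
            - np.exp(np.sum(np.cos(c * x)) / d) + a + np.e)
```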

5.3.5. Discussion

The empirical results of classification performance, feature elimination effects, as well as convergence rates all indicate the superiority of the proposed PSO variants over other baseline methods in undertaking feature selection tasks, i.e., constructing simplified but valid feature subsets while improving classification performance.
Both proposed PSOVA1 and PSOVA2 models adopt hybrid leader signals and diversified search operations to overcome local optima traps. In essence, PSOVA2 inherits all merits of PSOVA1. It further endows the particles with a higher degree of freedom in terms of (1) the choice of destination signals, and (2) the choice of movements to approach the destination solutions. Besides the generation of the combined best leader by adaptively incorporating both local and global best signals, PSOVA2 implements multiple movement operations towards the destination signal where the search coefficients are delivered by four distinctive yet complementary nonlinear functions. These search mechanisms offer the choices of either a large jump to propel the convergence or a gradual stroll to intensify the exploitation, as well as the choices of either marching towards or distancing from the destination signals. As a result, PSOVA2 is likely to attain global optimality successfully, while preventing stagnation at the local optima traps effectively.
In contrast, for the employed baseline classical search methods, certain limitations have been identified and widely discussed in the literature. Specifically, the search capability of DE can be severely compromised, owing to the failure to generate promising solutions within a limited number of function evaluations [56]. GWO demonstrates a strong bias towards the origin of the coordinate system, attributed to its simulated model, as well as stagnation at local optima traps owing to its poor exploration capability [72]. DA suffers from a poor exploitation capability, owing to the fact that it does not keep track of the elite solutions [2,61]. In addition, most of the existing PSO variants are equipped with improvements from the perspective of either exploration or exploitation, rather than comprehensively taking into account the trade-off between both operations. Overall, the proposed PSO models demonstrate great superiority over the baseline methods in attaining global optimality, owing to a delicate consideration of both global exploration and local exploitation. This is realised through attraction towards the elite solutions, as well as exploitation with diverse steps and possible movements in all directions, respectively. Therefore, both proposed PSO models are capable of improving classification performance by identifying the most discriminative features and eliminating noisy and irrelevant ones, as evidenced by the empirical results along with the statistical tests. Moreover, PSOVA2 performs better than PSOVA1 in undertaking feature selection problems, owing to the enhanced diversity induced by a greater freedom in choosing the exemplar signals to guide the search in each dimension, as well as a greater versatility in the ways of approaching such destination solutions.

6. Conclusions

In this research, we proposed two PSO models, namely PSOVA1 and PSOVA2, for undertaking a variety of feature selection tasks. Each of the proposed models incorporates a number of distinctive search mechanisms to enhance the exploration of undiscovered search regions, guided by hybrid leader signals. These formulated strategies in each model work cooperatively to produce diverse search behaviors in terms of search flights and directions. In particular, PSOVA2 elevates search diversity by adopting adaptive exemplars as well as four search operations whose search coefficients are implemented using refined sine, cosine, and hyperbolic tangent functions to overcome stagnation.
Evaluated using a total of 13 data sets, with diverse dimensionalities from 30 to 22,283, both models outperform five classical search methods and ten advanced PSO variants significantly in most test cases, as evidenced by the empirical and statistical test results. Specifically, PSOVA1 outperforms all the baseline methods for five data sets (Crohn, Myeloma, MicroMass, Parkinsons, and Activity), while PSOVA2 outperforms all the baseline methods for eight data sets (Crohn, Myeloma, Arcene, MicroMass, Parkinsons, Activity, Seizure and Heart), with statistical significance.
In future directions, other hybrid leader breeding mechanisms will be explored to further enhance performance. Moreover, we also aim to evaluate the proposed models using complex computer vision tasks, e.g., deep architecture generation for object detection and classification [51,73,74,75] as well as image description generation [76,77].

Author Contributions

Conceptualization, H.X. and L.Z.; methodology, H.X. and L.Z.; software, H.X.; validation, H.X.; formal analysis, H.X., L.Z. and C.P.L.; investigation, H.X. and L.Z.; resources, L.Z.; data curation, H.X., L.Z., C.P.L., Y.Y. and H.L.; writing—original draft preparation, H.X. and L.Z.; writing—review and editing, L.Z., C.P.L., Y.Y. and H.L.; visualization, H.X.; supervision, L.Z., C.P.L., Y.Y. and H.L.; project administration, L.Z.; funding acquisition, L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the RDF PhD Studentship at Northumbria University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sets employed in this study are publicly available at the site of UCI Machine Learning Repository, https://archive.ics.uci.edu/ml/datasets.php.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gheyas, I.A.; Smith, L.S. Feature subset selection in large dimensionality domains. Pattern Recognit. 2010, 43, 5–13.
2. Mafarja, M.; Heidari, A.A.; Faris, H.; Mirjalili, S.; Aljarah, I. Dragonfly Algorithm: Theory, Literature Review, and Application in Feature Selection. In Nature-Inspired Optimizers: Theories, Literature Reviews and Applications; Mirjalili, S., Dong, J.S., Lewis, A., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 47–67.
3. Zhang, L.; Mistry, K.; Lim, C.P.; Neoh, S.C. Feature selection using firefly optimization for classification and regression models. Decis. Support Syst. 2018, 106, 64–85.
4. Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A Survey on Evolutionary Computation Approaches to Feature Selection. IEEE Trans. Evol. Comput. 2016, 20, 606–626.
5. Dash, M.; Liu, H. Feature selection for classification. Intell. Data Anal. 1997, 1, 131–156.
6. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
7. Zhou, T.; Lu, H.; Wang, W.; Yong, X. GA-SVM based feature selection and parameter optimization in hospitalization expense modeling. Appl. Soft Comput. 2019, 75, 323–332.
8. Baig, M.Z.; Aslam, N.; Shum, H.P.; Zhang, L. Differential evolution algorithm as a tool for optimal feature subset selection in motor imagery EEG. Expert Syst. Appl. 2017, 90, 184–195.
9. Ghosh, A.; Datta, A.; Ghosh, S. Self-adaptive differential evolution for feature selection in hyperspectral image data. Appl. Soft Comput. 2013, 13, 1969–1977.
10. Xue, B.; Zhang, M.; Browne, W.N. Particle Swarm Optimization for Feature Selection in Classification: A Multi-Objective Approach. IEEE Trans. Cybern. 2013, 43, 1656–1671.
11. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl. Based Syst. 2015, 89, 228–249.
12. Jothi, G.; Hannah, H.I. Hybrid Tolerance Rough Set–Firefly based supervised feature selection for MRI brain tumor image classification. Appl. Soft Comput. 2016, 46, 639–651.
13. Singh, U.; Singh, S.N. A new optimal feature selection scheme for classification of power quality disturbances based on ant colony framework. Appl. Soft Comput. 2019, 74, 216–225.
14. Abdel-Basset, M.; El-Shahat, D.; El-henawy, I.; Albuquerque, V.H.C.; Mirjalili, S. A new fusion of grey wolf optimizer algorithm with a two-phase mutation for feature selection. Expert Syst. Appl. 2020, 139, 112824.
15. Mafarja, M.; Mirjalili, S. Whale optimization approaches for wrapper feature selection. Appl. Soft Comput. 2018, 62, 441–453.
16. Sindhu, R.; Ngadiran, R.; Yacob, Y.M.; Zahri, N.A.H.; Hariharan, M. Sine–cosine algorithm for feature selection with elitism strategy and new updating mechanism. Neural Comput. Appl. 2017, 28, 2947–2958.
17. Hsieh, S.-T.; Sun, T.-Y.; Liu, C.-C.; Tsai, S.-J. Efficient Population Utilization Strategy for Particle Swarm Optimizer. IEEE Trans. Syst. Man Cybern. Part B 2008, 39, 444–456.
18. Liang, J.; Qin, A.; Suganthan, P.; Baskar, S. Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Trans. Evol. Comput. 2006, 10, 281–295.
19. Chen, K.; Zhou, F.-Y.; Yuan, X.-F. Hybrid particle swarm optimization with spiral-shaped mechanism for feature selection. Expert Syst. Appl. 2019, 128, 140–156.
20. Ahn, C.W.; An, J.; Yoo, J.-C. Estimation of particle swarm distribution algorithms: Combining the benefits of PSO and EDAs. Inf. Sci. 2012, 192, 109–119.
21. Iqbal, M.; de Oca, M.A.M. An Estimation of Distribution Particle Swarm Optimization Algorithm. In Ant Colony Optimization and Swarm Intelligence; Springer: Berlin/Heidelberg, Germany, 2006.
22. Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN'95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
23. Pandit, D.; Zhang, L.; Chattopadhyay, S.; Lim, C.P.; Liu, C. A scattering and repulsive swarm intelligence algorithm for solving global optimization problems. Knowl. Based Syst. 2018, 156, 12–42.
24. Wang, H.; Sun, H.; Li, C.; Rahnamayan, S.; Pan, J.-S. Diversity enhanced particle swarm optimization with neighborhood search. Inf. Sci. 2013, 223, 119–135.
25. Tan, T.Y.; Zhang, L.; Neoh, S.C.; Lim, C.P. Intelligent skin cancer detection using enhanced particle swarm optimization. Knowl. Based Syst. 2018, 158, 118–135.
26. Tan, T.Y.; Zhang, L.; Lim, C.P. Intelligent skin cancer diagnosis using improved particle swarm optimization and deep learning models. Appl. Soft Comput. 2019, 84, 105725.
27. Tan, T.Y.; Zhang, L.; Lim, C.P.; Fielding, B.; Yu, Y.; Anderson, E. Evolving Ensemble Models for Image Segmentation Using Enhanced Particle Swarm Optimization. IEEE Access 2019, 7, 34004–34019.
28. Chen, J.; Zheng, J.; Wu, P.; Zhang, L.; Wu, Q. Dynamic particle swarm optimizer with escaping prey for solving constrained non-convex and piecewise optimization problems. Expert Syst. Appl. 2017, 86, 208–223.
29. Li, M.; Chen, H.; Shi, X.; Liu, S.; Zhang, M.; Lu, S. A multi-information fusion "triple variables with iteration" inertia weight PSO algorithm and its application. Appl. Soft Comput. 2019, 84, 105677.
30. Xia, X.; Gui, L.; He, G.; Wei, B.; Zhang, Y.; Yu, F.; Wu, H.; Zhan, Z. An expanded particle swarm optimization based on multi-exemplar and forgetting ability. Inf. Sci. 2020, 508, 105–120.
31. Chen, Q.; Sun, J.; Palade, V. Distributed Contribution-Based Quantum-Behaved Particle Swarm Optimization With Controlled Diversity for Large-Scale Global Optimization Problems. IEEE Access 2019, 7, 150093–150104.
32. Lin, A.; Sun, W.; Yu, H.; Wu, G.; Tang, H. Global genetic learning particle swarm optimization with diversity enhancement by ring topology. Swarm Evol. Comput. 2019, 44, 571–583.
33. Zhang, Z.; Yu, Y.; Zheng, S.; Todo, Y.; Gao, S. Exploitation Enhanced Sine Cosine Algorithm with Compromised Population Diversity for Optimization. In Proceedings of the 2018 IEEE International Conference on Progress in Informatics and Computing (PIC), Suzhou, China, 14–16 December 2018; pp. 1–7.
34. Jordehi, A.R. Enhanced leader PSO (ELPSO): A new PSO variant for solving global optimisation problems. Appl. Soft Comput. 2015, 26, 401–417.
35. Kang, L.; Chen, R.-S.; Xiong, N.; Chen, Y.-C.; Hu, Y.-X.; Chen, C.-M. Selecting Hyper-Parameters of Gaussian Process Regression Based on Non-Inertial Particle Swarm Optimization in Internet of Things. IEEE Access 2019, 7, 59504–59513.
36. Yu, X.; Yu, X.; Lu, Y.; Yen, G.G.; Cai, M. Differential evolution mutation operators for constrained multi-objective optimization. Appl. Soft Comput. 2018, 67, 452–466.
37. Cao, Y.; Zhang, H.; Li, W.; Zhou, M.; Zhang, Y.; Chaovalitwongse, W.A. Comprehensive learning Particle Swarm Optimization algorithm with local search for multimodal functions. IEEE Trans. Evol. Comput. 2019, 23, 718–731.
38. Xu, X.; Li, J.; Zhou, M.; Xu, J.; Cao, J. Accelerated Two-Stage Particle Swarm Optimization for Clustering Not-Well-Separated Data. IEEE Trans. Syst. Man Cybern. Syst. 2020, 50, 4212–4223.
39. Elbaz, K.; Shen, S.-L.; Sun, W.-J.; Yin, Z.-Y.; Zhou, A. Prediction Model of Shield Performance During Tunneling via Incorporating Improved Particle Swarm Optimization Into ANFIS. IEEE Access 2020, 8, 39659–39671.
40. Elbaz, K.; Shen, S.-L.; Zhou, A.; Yin, Z.-Y.; Lyu, H.-M. Prediction of Disc Cutter Life During Shield Tunneling with AI via the Incorporation of a Genetic Algorithm into a GMDH-Type Neural Network. Engineering 2020.
41. Chen, Q.; Chen, Y.; Jiang, W. Genetic Particle Swarm Optimization-Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection. Sensors 2016, 16, 1204.
42. Srisukkham, W.; Zhang, L.; Neoh, S.C.; Todryk, S.; Lim, C.P. Intelligent leukaemia diagnosis with bare-bones PSO based feature optimization. Appl. Soft Comput. 2017, 56, 405–419.
43. Mistry, K.; Zhang, L.; Neoh, S.C.; Lim, C.P.; Fielding, B. A Micro-GA Embedded PSO Feature Selection Approach to Intelligent Facial Emotion Recognition. IEEE Trans. Cybern. 2017, 47, 1496–1509.
44. Chang, W.-D. A modified particle swarm optimization with multiple subpopulations for multimodal function optimization problems. Appl. Soft Comput. 2015, 33, 170–182.
45. Nayak, D.R.; Dash, R.; Majhi, B. Discrete ripplet-II transform and modified PSO based improved evolutionary extreme learning machine for pathological brain detection. Neurocomputing 2018, 282, 232–247.
46. Jin, X.; Xu, A.; Bie, R.; Guo, P. Machine learning techniques and chi-square feature selection for cancer classification using SAGE gene expression profiles. In Lecture Notes in Computer Science; Springer: Singapore, 2006; pp. 106–115.
47. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
48. Gu, S.; Cheng, R.; Jin, Y. Feature selection for high-dimensional classification using a competitive swarm optimizer. Soft Comput. 2018, 22, 811–822.
49. Moradi, P.; Gholampour, M. A hybrid particle swarm optimization for feature subset selection by integrating a novel local search strategy. Appl. Soft Comput. 2016, 43, 117–130.
50. Tan, T.Y.; Zhang, L.; Lim, C.P. Adaptive melanoma diagnosis using evolving clustering, ensemble and deep neural networks. Knowl. Based Syst. 2020, 187, 104807.
51. Fielding, B.; Zhang, L. Evolving Image Classification Architectures with Enhanced Particle Swarm Optimisation. IEEE Access 2018, 6, 68560–68575.
52. Cheng, R.; Jin, Y. A Competitive Swarm Optimizer for Large Scale Optimization. IEEE Trans. Cybern. 2015, 45, 191–204.
53. Vieira, S.M.; Mendonca, L.F.; Farinha, G.J.; Sousa, J.M.C. Modified binary PSO for feature selection using SVM applied to mortality prediction of septic patients. Appl. Soft Comput. 2013, 13, 3494–3504.
54. Chuang, L.-Y.; Tsai, S.-W.; Yang, C.-H. Improved binary particle swarm optimization using catfish effect for feature selection. Expert Syst. Appl. 2011, 38, 12699–12707.
55. Zhang, Y.; Zhang, L.; Neoh, S.C.; Mistry, K.; Hossain, M.A. Intelligent affect regression for bodily expressions using hybrid particle swarm optimization and adaptive ensembles. Expert Syst. Appl. 2015, 42, 8678–8697.
56. Yang, X.-S. Nature-Inspired Optimization Algorithms; Yang, X.-S., Ed.; Elsevier: Oxford, UK, 2014; pp. 77–87.
57. Verma, O.P.; Aggarwal, D.; Patodi, T. Opposition and dimensional based modified firefly algorithm. Expert Syst. Appl. 2016, 44, 168–176.
58. Emary, E.; Zawbaa, H.M.; Hassanien, A.E. Binary ant lion approaches for feature selection. Neurocomputing 2016, 213, 54–65.
59. Storn, R.; Price, K. Differential Evolution—A Simple and Efficient Heuristic for global Optimization over Continuous Spaces. J. Glob. Optim. 1997, 11, 341–359.
60. Mirjalili, S. SCA: A Sine Cosine Algorithm for solving optimization problems. Knowl. Based Syst. 2016, 96, 120–133.
61. Mirjalili, S. Dragonfly algorithm: A new meta-heuristic optimization technique for solving single-objective, discrete, and multi-objective problems. Neural Comput. Appl. 2016, 27, 1053–1073.
62. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
63. Marinakis, Y.; Marinaki, M.; Dounias, G. Particle swarm optimization for pap-smear diagnosis. Expert Syst. Appl. 2008, 35, 1645–1656.
64. Labati, R.D.; Piuri, V.; Scotti, F. All-IDB: The acute lymphoblastic leukemia image database for image processing. In Proceedings of the 2011 18th IEEE International Conference on Image Processing, Brussels, Belgium, 11–14 September 2011; pp. 2045–2048.
65. Blake, C.; Merz, C. UCI Repository of Machine Learning Databases; University of California: Irvine, CA, USA, 1998.
66. Edgar, R.; Domrachev, M.; Lash, A.E. Gene Expression Omnibus: NCBI Gene Expression and Hybridization Array Data Repository. Nucleic Acids Res. 2002, 30, 207–210.
67. Neoh, S.C.; Srisukkham, W.; Zhang, L.; Todryk, S.; Greystoke, B.; Lim, C.P.; Hossain, M.A.; Aslam, N. An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images. Sci. Rep. 2015, 5, 14938.
68. Mahé, P.; Arsac, M.; Chatellier, S.; Monnin, V.; Perrot, N.; Mailler, S.; Girard, V.; Ramjeet, M.; Surre, J.; Lacroix, B.; et al. Automatic identification of mixed bacterial species fingerprints in a MALDI-TOF mass-spectrum. Bioinformatics 2014, 30, 1280–1286.
69. Detrano, R.; Janosi, A.; Steinbrunn, W.; Pfisterer, M.; Schmid, J.-J.; Sandhu, S.; Guppy, K.H.; Lee, S.; Froelicher, V. International application of a new probability algorithm for the diagnosis of coronary artery disease. Am. J. Cardiol. 1989, 64, 304–310.
70. Latha, C.B.C.; Jeeva, S.C. Improving the accuracy of prediction of heart disease risk based on ensemble classification techniques. Inform. Med. Unlocked 2019, 16, 100203.
71. Zhang, L.; Srisukkham, W.; Neoh, S.C.; Lim, C.P.; Pandit, D. Classifier ensemble reduction using a modified firefly algorithm: An empirical evaluation. Expert Syst. Appl. 2018, 93, 395–422.
72. Long, W.; Jian, J.; Liang, X.; Tang, M. An exploration-enhanced grey wolf optimizer to solve high-dimensional numerical optimization. Eng. Appl. Artif. Intell. 2018, 68, 63–80.
73. Lawrence, T.; Zhang, L. IoTNet: An Efficient and Accurate Convolutional Neural Network for IoT Devices. Sensors 2019, 19, 5541.
74. Xie, H.; Zhang, L.; Lim, C.P.; Yu, Y.; Liu, C.; Liu, H.; Walters, J. Improving K-means clustering with enhanced Firefly Algorithms. Appl. Soft Comput. 2019, 84, 105763.
75. Zhang, L.; Mistry, K.; Neoh, S.C.; Lim, C.P. Intelligent facial emotion recognition using moth-firefly optimization. Knowl. Based Syst. 2016, 111, 248–267.
76. Kinghorn, P.; Zhang, L.; Shao, L. A region-based image caption generator with refined descriptions. Neurocomputing 2018, 272, 416–424.
77. Kinghorn, P.; Zhang, L.; Shao, L. A hierarchical and regional deep learning architecture for image description generation. Pattern Recognit. Lett. 2019, 119, 77–85.
Figure 1. Adaptive coefficients for the gbest solution (blue) and the pbest signal (red) for exemplar generation (where the x axis denotes a randomly generated value between 0 and 1, and the y axis denotes the weight parameters, i.e., m1 and m2 defined in Equations (16) and (17)).
Figure 2. An illustration of four distinctive coefficient generation functions defined in Equations (19)–(22) (where the x axis denotes a randomly generated value between 0 and 1, while the y axis signifies fs as defined in Equation (14)).
Figure 3. The governing function for generating the random step g (where the x axis represents a randomly generated value between 0 and 1, while the y axis signifies the search step g, as defined in Equation (26)).
Figure 4. Mean convergence curves over 30 runs for the classical search methods (left) and advanced PSO variants (right) for the Multiple Myeloma data set (where the x and y axes denote the iteration number and the fitness score, respectively).
Figure 5. Mean convergence curves over 30 runs for the classical search methods (left) and advanced PSO variants (right) for the Crohn's disease data set (where the x and y axes denote the iteration number and the fitness score, respectively).
Table 1. Comparison between existing studies and this research.

Studies | Population Initialisation | Multiple Leaders | Exemplar Breeding Strategies | Modification of Existing Search Operations | Novel Search Mechanisms | Leader Enhancement | Other Diversity Enhancing Strategies
PSO [22] | Random | No (single leader) | No | No (the original PSO operation) | No | No | No
Wang et al. [24] | Random | No | No | No | Local and global neighborhood search based on the ring topology | No | Trial particle generation using a crossover factor & a DE operation
Lin et al. [32] | Random | No | Ring topology for exemplar generation | The updated PSO operation with the exemplar and the adaptive parameters | No | No | No
Chen et al. [31] | Random | No | No | Expansion-contraction coefficient and diversity measurement used in position updating | No | No | Genotype diversity measure and contribution-based fitness evaluation allocation
Chang [44] (MFOPSO) | Random | No | No | The search led by each sub-swarm leader | No | No | Multiple sub-swarms
Fielding et al. [51] | Random | No | No | Cosine-based adaptive search parameters | No | No | No
Srisukkham et al. [42] (BBPSOVA) | Random | The mean of all the personal bests | The average of the local and global best solutions | The average of the local and global optimal signals leading the attraction action | An evading action led by the mean of the worst indicators | No | Two sub-swarms
Tan et al. [27] (ALPSO) | Random | Two remote swarm leaders | The best leader and a remote second leader | Using helix search coefficients | Hybridization with SA and DE operations | No | No
Chen et al. [41] (GPSO) | Random | No | No | No | No | No | A crossover operator for population diversification
Nayak et al. [45] (MPSOELM) | Random | No | No | Using time-varying acceleration coefficients and an adaptive inertia weight | No | No | No
Jordehi [34] (ELPSO) | Random | No | No | No | No | 5-staged mutation | No
Kang et al. [35] | Random | No | No | A momentum element is used to replace the inertial component | No | Mutation-based leader enhancement | No
Zhang et al. [33] | Random | No | No | No | Local search action using two randomly selected particles with a Gaussian search step | No | Distance-based population diversity estimation
Yu et al. [36] | Random | No | Solution selection based on domination relationships and density measurement | No | No | No | Infeasible solution enhancement using Gaussian mutation
Chen et al. [19] (HPSO-SSM) | Random | No | No | Using a logistic map to generate the inertia weight | Local exploitation using a spiral search operation | No | Nonlinear coefficients used for velocity updating
Cheng and Jin [52] (CSO) | Random | Winners from pairwise competition | No | Using a logarithmic linear regression relationship to generate the coefficient for the social component | Position updating by learning from the winner solution | No | No
Vieira et al. [53] (MBPSO) | Random | No | No | No | Resetting the swarm leader by deselecting features, and mutation on personal best solutions by flipping randomly | – | Using a mirroring operation when the maximum velocity is reached
Chuang et al. [54] (CatfishBPSO) | Random | No | No | No | 10% worst solutions replaced by dimension-wise random assignment | No | No
Elbaz et al. [39] | Random | No | No | Using a time-varying adaptive inertia weight and a constriction factor for velocity updating | No | No | No
PSOVA1 (This research) | Logistic map | An enhanced hybrid global best signal | Enhancing local and global best solutions using neighboring personal best experiences | The updated PSO operation with enhanced local and global best signals | Local exploitation using a spiral search operation | Swarm leader enhancement using Gaussian distributions | Mutation and DE-based worst solution enhancement
PSOVA2 (This research) | Logistic map | An adaptive exemplar incorporating multiple local and global best solutions | Exemplar generation using adaptive weightings between local and global optimal signals, as well as a dynamic number of local best solutions | N/A | A new search operation using the exemplar or the swarm leader as the best signal, with search coefficients generated using sine, cosine and hyperbolic tangent functions | Swarm leader enhancement using an adaptive exponential function | Worst solution enhancement using a hybrid re-dispatching scheme
Table 2. Introduction of the thirteen data sets for evaluation.

Data Set | Number of Attributes | Number of Classes | Number of Instances
Crohn | 22,283 | 2 | 127
Myeloma | 12,625 | 2 | 173
Arcene | 10,000 | 2 | 200
MicroMass | 1300 | 10 | 360
Parkinsons | 753 | 2 | 756
Activity | 561 | 6 | 1000
Voice | 310 | 2 | 126
Facial Expression | 301 | 2 | 1062
Seizure | 178 | 2 | 4600
ALL | 80 | 2 | 180
Heart | 72 | 4 | 124
Ionosphere | 33 | 2 | 253
Wdbc | 30 | 2 | 569
Table 3. Parameter settings of each algorithm.

Algorithm | Parameters
PSO [22] | cognitive component c1 = 2, social component c2 = 2, inertial weight w = 0.9 − m × (0.9 − 0.4)/max_iter, where m and max_iter denote the current and maximum iteration numbers, respectively.
DE [59] | differential weight F ∈ (0, 1), crossover parameter Cr = 0.4.
SCA [60] | r1 = a − m × a/max_iter, where a = 3; r2 = 2π × rand; r3 = 2 × rand; r4 = rand. r1, r2, r3 and r4 are the four main search parameters.
DA [61] | separation factor = 0.1, alignment factor = 0.1, cohesion factor = 0.7, food factor = 1, enemy factor = 1, inertial weight = 0.9 − m × (0.9 − 0.4)/max_iter.
GWO [62] | A = 2 × a × r1 − a, where a decreases linearly from 2 to 0, and r1 ∈ (0, 1); C = 2 × r2, where r2 ∈ (0, 1). A and C are both coefficient vectors.
CSO [52] | r1, r2, r3 ∈ (0, 1) are search parameters randomly selected within [0, 1]; controlling parameter Φ = 0.1.
HPSO-SSM [19] | cognitive component c1 = 2, social component c2 = 2, inertial weight w = Logistic map; R1 = 1/(1 + exp(−a × (min(SP)/max(SP))))^t, where SP is the particle position vector, t is the current iteration, and a = 2; R2 = 1 − R1.
BPSO [63] | cognitive component c1 = 2, social component c2 = 2, wmax = 0.9, wmin = 0.01, inertial weight w = wmax − m × (wmax − wmin)/max_iter.
MBPSO [53] | cognitive component c1 = 2, social component c2 = 2, inertial weight w = 1.4, mutation probability rmu = 1/Nt, where Nt represents the dimensionality of the problem domain.
CatfishBPSO [54] | cognitive component c1 = 2, social component c2 = 2, inertial weight w = 1, replacing rate = 0.1.
GPSO [41] | inertia weight = 0.9, cognitive component c1 = 2.6, social component c2 = 1.5, crossover probability = 0.7, mutation probability = 0.3.
MPSOELM [45] | time-varying acceleration coefficients and an adaptive inertia weight.
MFOPSO [44] | inertia weight = 0.9, cognitive component c1 = 2, social component c2 = 2.
BBPSOVA [42] | search coefficients yielded by the Logistic map.
ALPSO [27] | inertia weight = 0.6, search parameters produced by helix functions.
Prop. PSOVA1 | cognitive component c1 = 2, social component c2 = 2, inertial weight w = Logistic map, mutation probability threshold rmu = 0.9, F = Sinusoidal map.
Prop. PSOVA2 | switching probability for exemplar adoption = 0.4, initial weight for gbest = 0.4, search coefficients implemented using exponential, sine, cosine, and hyperbolic tangent functions.
Table 4. The mean results of the classification accuracy rates over 30 runs.

Data Sets | Metrics | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | mean | 0.7556 | 0.7624 | 0.7479 | 0.7427 | 0.7786 | 0.7197 | 0.7675 | 0.7803 | 0.8128 | 0.8333
Crohn | std. | 6.74E-02 | 3.10E-02 | 3.18E-02 | 3.28E-02 | 3.07E-02 | 7.16E-02 | 3.10E-02 | 3.73E-02 | 2.90E-02 | 3.09E-02
Myeloma | mean | 0.7096 | 0.7288 | 0.7013 | 0.7032 | 0.7212 | 0.6917 | 0.7128 | 0.6910 | 0.7442 | 0.7545
Myeloma | std. | 2.60E-02 | 2.29E-02 | 2.03E-02 | 2.42E-02 | 2.37E-02 | 6.01E-02 | 2.48E-02 | 1.56E-02 | 2.68E-02 | 2.66E-02
Arcene | mean | 0.7217 | 0.7244 | 0.7372 | 0.7183 | 0.7211 | 0.7372 | 0.7122 | 0.7100 | 0.7411 | 0.7694
Arcene | std. | 2.66E-02 | 2.78E-02 | 3.98E-02 | 3.71E-02 | 2.95E-02 | 3.79E-02 | 3.28E-02 | 3.77E-02 | 2.81E-02 | 3.58E-02
MicroMass | mean | 0.5897 | 0.6052 | 0.6061 | 0.5933 | 0.6124 | 0.5409 | 0.5903 | 0.5836 | 0.6455 | 0.6612
MicroMass | std. | 4.34E-02 | 3.85E-02 | 5.13E-02 | 4.07E-02 | 4.38E-02 | 2.79E-02 | 4.12E-02 | 3.92E-02 | 4.59E-02 | 4.38E-02
Parkinsons | mean | 0.7949 | 0.7990 | 0.7922 | 0.7862 | 0.7940 | 0.7985 | 0.8000 | 0.7994 | 0.8115 | 0.8094
Parkinsons | std. | 1.74E-02 | 1.63E-02 | 2.48E-02 | 2.15E-02 | 1.91E-02 | 1.30E-02 | 1.77E-02 | 1.56E-02 | 1.88E-02 | 1.60E-02
Activity | mean | 0.8813 | 0.8919 | 0.8826 | 0.8785 | 0.8929 | 0.8876 | 0.8860 | 0.8785 | 0.9025 | 0.9117
Activity | std. | 1.64E-02 | 1.55E-02 | 1.86E-02 | 2.23E-02 | 1.44E-02 | 1.60E-02 | 1.95E-02 | 1.42E-02 | 1.28E-02 | 1.53E-02
Voice | mean | 0.8237 | 0.8149 | 0.8202 | 0.8272 | 0.8219 | 0.7789 | 0.8237 | 0.8193 | 0.8526 | 0.8632
Voice | std. | 5.00E-02 | 5.58E-02 | 4.66E-02 | 5.83E-02 | 5.42E-02 | 8.37E-02 | 5.09E-02 | 3.95E-02 | 4.28E-02 | 4.37E-02
Facial Expression | mean | 0.7187 | 0.6748 | 0.6891 | 0.6635 | 0.6844 | 0.6861 | 0.6914 | 0.6998 | 0.7351 | 0.7340
Facial Expression | std. | 4.64E-02 | 4.70E-02 | 4.05E-02 | 3.37E-02 | 4.68E-02 | 5.14E-02 | 3.86E-02 | 4.21E-02 | 4.60E-02 | 4.24E-02
Seizure | mean | 0.8459 | 0.8590 | 0.8543 | 0.8577 | 0.8655 | 0.8490 | 0.8461 | 0.8516 | 0.8698 | 0.8860
Seizure | std. | 5.08E-03 | 6.69E-03 | 1.12E-02 | 1.00E-02 | 2.01E-02 | 9.22E-03 | 5.28E-03 | 8.12E-03 | 5.13E-03 | 6.12E-03
ALL | mean | 0.8951 | 0.9167 | 0.9037 | 0.9025 | 0.8858 | 0.8728 | 0.8944 | 0.9123 | 0.9185 | 0.9241
ALL | std. | 2.84E-02 | 2.69E-02 | 2.21E-02 | 1.91E-02 | 4.25E-02 | 5.59E-02 | 4.76E-02 | 3.28E-02 | 3.23E-02 | 3.26E-02
Heart | mean | 0.5963 | 0.6435 | 0.6620 | 0.5537 | 0.6398 | 0.5713 | 0.6444 | 0.5769 | 0.6731 | 0.7241
Heart | std. | 8.33E-02 | 5.18E-02 | 5.56E-02 | 6.13E-02 | 6.35E-02 | 4.34E-02 | 4.83E-02 | 7.16E-02 | 4.63E-02 | 5.42E-02
Ionosphere | mean | 0.8171 | 0.8285 | 0.8320 | 0.8101 | 0.8197 | 0.8184 | 0.8189 | 0.8066 | 0.8351 | 0.8434
Ionosphere | std. | 2.70E-02 | 3.10E-02 | 2.94E-02 | 2.62E-02 | 2.28E-02 | 2.89E-02 | 2.60E-02 | 2.89E-02 | 2.49E-02 | 2.16E-02
Wdbc | mean | 0.9520 | 0.9534 | 0.9191 | 0.9458 | 0.9386 | 0.8828 | 0.9261 | 0.9497 | 0.9571 | 0.9585
Wdbc | std. | 1.04E-02 | 1.60E-02 | 4.19E-02 | 2.36E-02 | 3.30E-02 | 3.33E-02 | 3.60E-02 | 1.67E-02 | 1.33E-02 | 9.59E-03

Data Sets | Metrics | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | mean | 0.7427 | 0.7795 | 0.7504 | 0.7479 | 0.7726 | 0.7684 | 0.7889 | 0.8128 | 0.8333
Crohn | std. | 3.00E-02 | 2.25E-02 | 1.86E-02 | 3.45E-02 | 3.67E-02 | 3.00E-02 | 3.08E-02 | 2.90E-02 | 3.09E-02
Myeloma | mean | 0.6942 | 0.7051 | 0.7045 | 0.6917 | 0.7154 | 0.7128 | 0.7051 | 0.7442 | 0.7545
Myeloma | std. | 2.13E-02 | 1.94E-02 | 2.30E-02 | 2.56E-02 | 2.24E-02 | 2.81E-02 | 1.94E-02 | 2.68E-02 | 2.66E-02
Arcene | mean | 0.7111 | 0.7117 | 0.7022 | 0.7106 | 0.7372 | 0.7200 | 0.7161 | 0.7411 | 0.7694
Arcene | std. | 3.53E-02 | 2.79E-02 | 3.54E-02 | 3.18E-02 | 3.62E-02 | 4.23E-02 | 2.80E-02 | 2.81E-02 | 3.58E-02
MicroMass | mean | 0.5758 | 0.5785 | 0.6052 | 0.5879 | 0.5915 | 0.6118 | 0.5994 | 0.6455 | 0.6612
MicroMass | std. | 3.58E-02 | 4.01E-02 | 3.46E-02 | 4.31E-02 | 4.77E-02 | 3.99E-02 | 5.30E-02 | 4.59E-02 | 4.38E-02
Parkinsons | mean | 0.7988 | 0.7962 | 0.7953 | 0.7890 | 0.7822 | 0.7907 | 0.7950 | 0.8115 | 0.8094
Parkinsons | std. | 1.97E-02 | 1.97E-02 | 1.85E-02 | 1.84E-02 | 2.49E-02 | 1.97E-02 | 2.00E-02 | 1.88E-02 | 1.60E-02
Activity | mean | 0.8725 | 0.8775 | 0.8864 | 0.8785 | 0.8806 | 0.8848 | 0.8810 | 0.9025 | 0.9117
Activity | std. | 1.59E-02 | 1.15E-02 | 1.23E-02 | 1.54E-02 | 1.80E-02 | 1.52E-02 | 1.74E-02 | 1.28E-02 | 1.53E-02
Voice | mean | 0.8263 | 0.8246 | 0.8526 | 0.8298 | 0.8377 | 0.8439 | 0.8175 | 0.8526 | 0.8632
Voice | std. | 4.43E-02 | 4.72E-02 | 5.07E-02 | 3.83E-02 | 7.03E-02 | 5.72E-02 | 6.19E-02 | 4.28E-02 | 4.37E-02
Facial Expression | mean | 0.7170 | 0.7274 | 0.7177 | 0.7234 | 0.7031 | 0.7061 | 0.7032 | 0.7351 | 0.7340
Facial Expression | std. | 3.56E-02 | 3.93E-02 | 4.33E-02 | 4.67E-02 | 5.89E-02 | 4.62E-02 | 4.40E-02 | 4.60E-02 | 4.24E-02
Seizure | mean | 0.8370 | 0.8388 | 0.8492 | 0.8400 | 0.8519 | 0.8496 | 0.8430 | 0.8698 | 0.8860
Seizure | std. | 4.74E-03 | 4.41E-03 | 5.62E-03 | 5.84E-03 | 5.14E-03 | 6.87E-03 | 7.06E-03 | 5.13E-03 | 6.12E-03
ALL | mean | 0.8938 | 0.8988 | 0.9068 | 0.8981 | 0.9000 | 0.9019 | 0.9025 | 0.9185 | 0.9241
ALL | std. | 1.97E-02 | 3.32E-02 | 2.68E-02 | 2.78E-02 | 2.64E-02 | 2.25E-02 | 2.62E-02 | 3.23E-02 | 3.26E-02
Heart | mean | 0.5815 | 0.5750 | 0.5944 | 0.5991 | 0.6426 | 0.6250 | 0.6333 | 0.6731 | 0.7241
Heart | std. | 5.91E-02 | 6.50E-02 | 5.67E-02 | 7.30E-02 | 8.86E-02 | 6.87E-02 | 7.29E-02 | 4.63E-02 | 5.42E-02
Ionosphere | mean | 0.8276 | 0.8110 | 0.8189 | 0.8140 | 0.8171 | 0.8228 | 0.8197 | 0.8351 | 0.8434
Ionosphere | std. | 2.60E-02 | 3.27E-02 | 2.03E-02 | 3.18E-02 | 2.83E-02 | 2.30E-02 | 2.26E-02 | 2.49E-02 | 2.16E-02
Wdbc | mean | 0.9501 | 0.9454 | 0.9517 | 0.9481 | 0.9509 | 0.9540 | 0.9501 | 0.9571 | 0.9585
Wdbc | std. | 1.10E-02 | 2.12E-02 | 9.18E-03 | 1.63E-02 | 1.31E-02 | 1.06E-02 | 1.24E-02 | 1.33E-02 | 9.59E-03
Table 5. The mean results of the F-score measures over 30 runs.
Data Sets | Metrics | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | mean | 0.8202 | 0.8052 | 0.7943 | 0.7906 | 0.8236 | 0.7765 | 0.8137 | 0.8263 | 0.8550 | 0.8585
 | std. | 3.69E-02 | 2.42E-02 | 2.44E-02 | 2.52E-02 | 2.42E-02 | 5.71E-02 | 2.39E-02 | 2.90E-02 | 2.24E-02 | 2.43E-02
Myeloma | mean | 0.8219 | 0.8411 | 0.8091 | 0.8105 | 0.8286 | 0.8026 | 0.8229 | 0.8034 | 0.8500 | 0.8551
 | std. | 1.64E-02 | 1.45E-02 | 1.27E-02 | 1.47E-02 | 1.40E-02 | 4.68E-02 | 1.58E-02 | 1.01E-02 | 1.57E-02 | 1.63E-02
Arcene | mean | 0.6759 | 0.6757 | 0.6963 | 0.6780 | 0.6783 | 0.6959 | 0.6646 | 0.6574 | 0.6977 | 0.7130
 | std. | 3.85E-02 | 3.87E-02 | 4.94E-02 | 4.89E-02 | 3.31E-02 | 5.40E-02 | 4.60E-02 | 5.14E-02 | 3.16E-02 | 4.18E-02
MicroMass | mean | 0.6349 | 0.6469 | 0.6428 | 0.6314 | 0.6445 | 0.5982 | 0.6350 | 0.6275 | 0.6759 | 0.6967
 | std. | 4.26E-02 | 3.48E-02 | 4.79E-02 | 3.94E-02 | 4.36E-02 | 2.17E-02 | 4.21E-02 | 3.88E-02 | 4.19E-02 | 4.03E-02
Parkinsons | mean | 0.8691 | 0.8712 | 0.8670 | 0.8631 | 0.8686 | 0.8701 | 0.8720 | 0.8719 | 0.8798 | 0.8783
 | std. | 1.15E-02 | 1.10E-02 | 1.73E-02 | 1.41E-02 | 1.33E-02 | 8.93E-03 | 1.13E-02 | 1.01E-02 | 1.32E-02 | 1.02E-02
Activity | mean | 0.8864 | 0.8962 | 0.8874 | 0.8833 | 0.8971 | 0.8930 | 0.8901 | 0.8838 | 0.9067 | 0.9131
 | std. | 1.53E-02 | 1.49E-02 | 1.76E-02 | 2.16E-02 | 1.37E-02 | 1.65E-02 | 1.90E-02 | 1.34E-02 | 1.24E-02 | 1.44E-02
Voice | mean | 0.7180 | 0.7381 | 0.7265 | 0.7316 | 0.7208 | 0.6890 | 0.7339 | 0.7328 | 0.7764 | 0.7852
 | std. | 9.23E-02 | 7.09E-02 | 8.03E-02 | 1.07E-01 | 9.79E-02 | 1.06E-01 | 8.13E-02 | 7.23E-02 | 6.94E-02 | 7.54E-02
Facial Expression | mean | 0.6458 | 0.6191 | 0.6288 | 0.6175 | 0.6287 | 0.5670 | 0.6292 | 0.6342 | 0.6572 | 0.6562
 | std. | 3.18E-02 | 3.10E-02 | 2.51E-02 | 1.86E-02 | 3.14E-02 | 1.92E-01 | 2.54E-02 | 2.81E-02 | 3.02E-02 | 2.87E-02
Seizure | mean | 0.8197 | 0.8384 | 0.8359 | 0.8364 | 0.8486 | 0.8434 | 0.8199 | 0.8279 | 0.8526 | 0.8759
 | std. | 7.33E-03 | 8.96E-03 | 1.50E-02 | 1.41E-02 | 2.90E-02 | 9.36E-03 | 8.21E-03 | 1.11E-02 | 8.08E-03 | 8.76E-03
ALL | mean | 0.9204 | 0.9345 | 0.9250 | 0.9266 | 0.9084 | 0.9037 | 0.9168 | 0.9331 | 0.9361 | 0.9408
 | std. | 2.28E-02 | 2.17E-02 | 1.62E-02 | 1.37E-02 | 3.93E-02 | 4.34E-02 | 4.37E-02 | 2.51E-02 | 2.67E-02 | 2.60E-02
Heart | mean | 0.6039 | 0.6502 | 0.6661 | 0.5616 | 0.6436 | 0.5823 | 0.6513 | 0.5881 | 0.6783 | 0.7271
 | std. | 8.59E-02 | 5.25E-02 | 5.68E-02 | 6.81E-02 | 6.72E-02 | 4.53E-02 | 4.88E-02 | 7.28E-02 | 4.63E-02 | 5.49E-02
Ionosphere | mean | 0.8439 | 0.8516 | 0.8550 | 0.8375 | 0.8427 | 0.8418 | 0.8452 | 0.8371 | 0.8562 | 0.8625
 | std. | 2.06E-02 | 2.48E-02 | 2.33E-02 | 2.04E-02 | 2.23E-02 | 2.52E-02 | 2.05E-02 | 2.18E-02 | 2.05E-02 | 1.77E-02
Wdbc | mean | 0.9340 | 0.9355 | 0.8836 | 0.9246 | 0.9146 | 0.8286 | 0.8957 | 0.9308 | 0.9415 | 0.9432
 | std. | 1.47E-02 | 2.34E-02 | 6.53E-02 | 3.57E-02 | 4.84E-02 | 5.04E-02 | 5.38E-02 | 2.47E-02 | 1.94E-02 | 1.31E-02
Data Sets | Metrics | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | mean | 0.7889 | 0.8220 | 0.7945 | 0.7937 | 0.8188 | 0.8153 | 0.8306 | 0.8550 | 0.8585
 | std. | 2.19E-02 | 1.68E-02 | 1.35E-02 | 2.53E-02 | 2.89E-02 | 2.29E-02 | 2.40E-02 | 2.24E-02 | 2.43E-02
Myeloma | mean | 0.8057 | 0.8189 | 0.8186 | 0.8031 | 0.8248 | 0.8234 | 0.8189 | 0.8500 | 0.8551
 | std. | 1.32E-02 | 1.23E-02 | 1.41E-02 | 1.58E-02 | 1.39E-02 | 1.74E-02 | 1.22E-02 | 1.57E-02 | 1.63E-02
Arcene | mean | 0.6573 | 0.6590 | 0.6460 | 0.6602 | 0.6985 | 0.6732 | 0.6673 | 0.6977 | 0.7130
 | std. | 4.56E-02 | 3.58E-02 | 4.65E-02 | 4.13E-02 | 4.78E-02 | 5.20E-02 | 2.36E-02 | 3.16E-02 | 4.18E-02
MicroMass | mean | 0.6219 | 0.6200 | 0.6451 | 0.6360 | 0.6308 | 0.6449 | 0.6444 | 0.6759 | 0.6967
 | std. | 4.05E-02 | 3.59E-02 | 3.05E-02 | 3.95E-02 | 4.30E-02 | 3.92E-02 | 5.00E-02 | 4.19E-02 | 4.03E-02
Parkinsons | mean | 0.8716 | 0.8702 | 0.8695 | 0.8656 | 0.8612 | 0.8662 | 0.8688 | 0.8798 | 0.8783
 | std. | 1.30E-02 | 1.31E-02 | 1.26E-02 | 1.22E-02 | 1.70E-02 | 1.32E-02 | 5.02E-02 | 1.32E-02 | 1.02E-02
Activity | mean | 0.8783 | 0.8824 | 0.8913 | 0.8842 | 0.8854 | 0.8895 | 0.8854 | 0.9067 | 0.9131
 | std. | 1.55E-02 | 1.14E-02 | 1.16E-02 | 1.47E-02 | 1.61E-02 | 1.45E-02 | 1.70E-02 | 1.24E-02 | 1.44E-02
Voice | mean | 0.7368 | 0.7399 | 0.7804 | 0.7398 | 0.7598 | 0.7656 | 0.7272 | 0.7764 | 0.7852
 | std. | 7.58E-02 | 7.76E-02 | 7.78E-02 | 6.51E-02 | 1.06E-01 | 8.83E-02 | 5.07E-02 | 6.94E-02 | 7.54E-02
Facial Expression | mean | 0.6527 | 0.6556 | 0.6372 | 0.6537 | 0.6404 | 0.6360 | 0.6371 | 0.6572 | 0.6562
 | std. | 2.63E-02 | 2.93E-02 | 3.04E-02 | 3.50E-02 | 4.15E-02 | 2.97E-02 | 4.81E-02 | 3.02E-02 | 2.87E-02
Seizure | mean | 0.8066 | 0.8094 | 0.8243 | 0.8111 | 0.8282 | 0.8251 | 0.8155 | 0.8526 | 0.8759
 | std. | 6.61E-03 | 6.33E-03 | 7.72E-03 | 8.31E-03 | 6.98E-03 | 9.39E-03 | 9.87E-03 | 8.08E-03 | 8.76E-03
ALL | mean | 0.9195 | 0.9241 | 0.9283 | 0.9237 | 0.9244 | 0.9215 | 0.9253 | 0.9361 | 0.9408
 | std. | 1.55E-02 | 2.44E-02 | 2.07E-02 | 2.02E-02 | 1.96E-02 | 1.94E-02 | 3.53E-02 | 2.67E-02 | 2.60E-02
Heart | mean | 0.5904 | 0.5788 | 0.6006 | 0.6166 | 0.6442 | 0.6319 | 0.6381 | 0.6783 | 0.7271
 | std. | 6.62E-02 | 7.86E-02 | 6.33E-02 | 7.14E-02 | 8.75E-02 | 6.94E-02 | 7.52E-02 | 4.63E-02 | 5.49E-02
Ionosphere | mean | 0.8521 | 0.8380 | 0.8452 | 0.8419 | 0.8426 | 0.8476 | 0.8453 | 0.8562 | 0.8625
 | std. | 1.82E-02 | 2.51E-02 | 1.50E-02 | 2.44E-02 | 2.45E-02 | 1.89E-02 | 3.31E-02 | 2.05E-02 | 1.77E-02
Wdbc | mean | 0.9312 | 0.9239 | 0.9338 | 0.9286 | 0.9325 | 0.9366 | 0.9321 | 0.9415 | 0.9432
 | std. | 1.55E-02 | 3.10E-02 | 1.29E-02 | 2.33E-02 | 1.85E-02 | 1.53E-02 | 9.79E-03 | 1.94E-02 | 1.31E-02
Table 6. The Wilcoxon rank sum test results of the proposed PSOVA1 model.
Data Sets | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO
Crohn | 2.25E-04 | 4.45E-07 | 5.04E-09 | 4.90E-10 | 9.82E-05 | 4.42E-08 | 2.14E-06 | 6.80E-04
Myeloma | 1.35E-05 | 3.40E-02 | 1.25E-07 | 1.26E-06 | 9.24E-04 | 8.59E-05 | 7.27E-05 | 6.63E-10
Arcene | 1.53E-02 | 3.53E-02 | 8.75E-01 | 2.44E-02 | 1.93E-02 | 6.16E-01 | 1.48E-03 | 4.41E-04
MicroMass | 2.47E-04 | 7.55E-03 | 8.69E-03 | 3.50E-04 | 4.11E-02 | 1.05E-09 | 2.12E-04 | 2.90E-05
Parkinsons | 1.65E-03 | 3.15E-02 | 6.60E-03 | 1.99E-05 | 2.38E-03 | 3.35E-02 | 4.69E-02 | 4.52E-02
Activity | 3.93E-06 | 6.61E-03 | 1.27E-04 | 1.40E-05 | 1.19E-02 | 4.51E-05 | 1.05E-03 | 1.49E-07
Voice | 3.21E-02 | 6.20E-03 | 9.98E-03 | 4.48E-02 | 2.78E-02 | 9.85E-04 | 3.35E-02 | 4.04E-03
Facial Expression | 5.24E-01 | 8.72E-05 | 1.23E-03 | 1.75E-06 | 5.63E-04 | 4.14E-05 | 5.06E-04 | 4.69E-03
Seizure | 3.07E-11 | 1.16E-03 | 3.33E-05 | 2.05E-04 | 5.49E-01 | 1.23E-08 | 7.52E-11 | 1.23E-07
ALL | 7.85E-03 | 7.75E-01 | 4.79E-02 | 2.92E-02 | 3.45E-03 | 1.35E-03 | 3.82E-02 | 4.76E-01
Heart | 1.44E-04 | 2.16E-02 | 2.94E-01 | 2.20E-09 | 3.15E-02 | 1.21E-09 | 3.84E-02 | 1.29E-06
Ionosphere | 1.16E-02 | 6.10E-01 | 8.11E-01 | 1.15E-03 | 4.18E-02 | 3.82E-02 | 2.77E-02 | 2.06E-04
Wdbc | 2.48E-02 | 5.23E-01 | 3.02E-05 | 1.30E-02 | 3.54E-02 | 5.44E-09 | 1.84E-04 | 1.84E-02
Data Sets | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO
Crohn | 1.45E-09 | 2.47E-05 | 2.25E-10 | 7.14E-09 | 2.94E-05 | 3.11E-06 | 4.13E-03
Myeloma | 1.44E-08 | 5.89E-07 | 1.00E-06 | 2.53E-08 | 1.14E-04 | 1.73E-04 | 4.49E-07
Arcene | 6.08E-04 | 6.28E-04 | 5.12E-05 | 6.38E-04 | 8.34E-01 | 2.15E-02 | 3.18E-03
MicroMass | 5.30E-06 | 1.13E-05 | 5.31E-03 | 1.38E-04 | 6.88E-04 | 1.99E-02 | 4.66E-03
Parkinsons | 3.93E-02 | 3.31E-02 | 4.74E-03 | 4.31E-04 | 4.41E-05 | 3.81E-04 | 6.72E-03
Activity | 1.07E-08 | 2.12E-08 | 3.99E-05 | 2.71E-07 | 6.55E-06 | 4.71E-05 | 9.65E-06
Voice | 2.91E-02 | 1.83E-02 | 8.87E-01 | 3.84E-02 | 4.52E-01 | 6.03E-01 | 1.61E-02
Facial Expression | 1.92E-02 | 3.40E-01 | 3.56E-02 | 3.28E-01 | 4.65E-02 | 1.45E-02 | 4.03E-02
Seizure | 2.92E-11 | 2.91E-11 | 4.85E-10 | 2.89E-11 | 3.40E-09 | 2.96E-09 | 4.62E-11
ALL | 1.98E-03 | 3.11E-02 | 1.29E-01 | 1.77E-02 | 2.38E-02 | 3.03E-02 | 4.75E-02
Heart | 2.87E-07 | 1.26E-07 | 1.56E-06 | 4.88E-05 | 4.85E-02 | 2.28E-03 | 9.35E-03
Ionosphere | 7.87E-01 | 4.58E-03 | 2.04E-02 | 3.37E-02 | 2.40E-02 | 1.22E-01 | 3.59E-02
Wdbc | 1.82E-02 | 4.16E-03 | 2.61E-02 | 4.01E-02 | 4.90E-02 | 2.13E-01 | 4.50E-02
Table 7. The Wilcoxon rank sum test results of the proposed PSOVA2 model.
Data Sets | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO
Crohn | 5.69E-05 | 2.58E-07 | 4.33E-09 | 8.51E-10 | 4.23E-05 | 2.57E-08 | 1.19E-06 | 2.44E-04
Myeloma | 2.30E-07 | 5.30E-04 | 1.13E-09 | 1.12E-08 | 8.27E-06 | 2.13E-06 | 6.73E-07 | 5.56E-11
Arcene | 1.83E-06 | 5.23E-06 | 1.85E-03 | 3.00E-06 | 2.25E-06 | 7.92E-04 | 6.24E-07 | 1.48E-06
MicroMass | 1.43E-06 | 4.26E-05 | 2.06E-04 | 6.32E-06 | 6.27E-04 | 5.29E-11 | 2.36E-06 | 3.98E-07
Parkinsons | 1.48E-03 | 2.70E-02 | 2.87E-03 | 6.40E-05 | 1.92E-03 | 1.06E-02 | 4.84E-02 | 3.30E-02
Activity | 6.47E-08 | 3.65E-05 | 1.19E-06 | 1.79E-07 | 4.88E-05 | 1.24E-07 | 7.41E-06 | 2.49E-09
Voice | 8.81E-03 | 2.39E-03 | 3.45E-03 | 2.01E-02 | 7.35E-03 | 1.86E-04 | 8.08E-03 | 1.67E-03
Facial Expression | 4.15E-01 | 2.59E-05 | 1.04E-03 | 2.12E-07 | 3.30E-04 | 1.14E-04 | 2.71E-04 | 6.89E-03
Seizure | 2.94E-11 | 7.60E-11 | 1.18E-10 | 1.21E-09 | 9.69E-04 | 2.97E-11 | 2.95E-11 | 3.98E-11
ALL | 2.22E-03 | 5.08E-01 | 1.65E-02 | 1.03E-02 | 1.00E-03 | 3.80E-04 | 1.52E-02 | 2.86E-01
Heart | 6.52E-07 | 8.33E-07 | 5.36E-05 | 1.30E-09 | 5.29E-06 | 1.30E-09 | 1.32E-07 | 1.09E-08
Ionosphere | 2.53E-04 | 3.50E-02 | 1.00E-01 | 7.89E-06 | 3.26E-04 | 6.06E-04 | 5.16E-04 | 4.17E-06
Wdbc | 1.33E-02 | 3.68E-01 | 1.84E-05 | 5.02E-03 | 1.93E-02 | 5.09E-09 | 1.22E-04 | 1.05E-02
Data Sets | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO
Crohn | 1.41E-09 | 1.23E-05 | 3.38E-10 | 6.63E-09 | 1.64E-05 | 1.36E-06 | 1.35E-03
Myeloma | 4.06E-10 | 3.70E-09 | 1.76E-08 | 1.21E-09 | 1.09E-06 | 3.52E-06 | 3.78E-09
Arcene | 7.05E-07 | 1.51E-07 | 1.40E-07 | 3.06E-07 | 6.70E-04 | 4.40E-05 | 4.58E-07
MicroMass | 2.43E-09 | 9.54E-08 | 1.13E-05 | 4.80E-07 | 9.00E-06 | 3.26E-04 | 9.12E-05
Parkinsons | 2.97E-02 | 1.20E-02 | 5.39E-03 | 5.00E-05 | 1.90E-05 | 2.32E-04 | 5.81E-03
Activity | 6.21E-10 | 8.34E-10 | 1.42E-07 | 4.88E-09 | 3.97E-08 | 2.17E-07 | 6.88E-08
Voice | 8.18E-03 | 7.50E-03 | 3.97E-01 | 1.16E-02 | 1.47E-01 | 2.22E-01 | 5.33E-03
Facial Expression | 6.16E-02 | 4.91E-01 | 4.89E-02 | 3.94E-01 | 4.64E-02 | 2.83E-02 | 3.61E-02
Seizure | 2.93E-11 | 2.92E-11 | 3.04E-11 | 2.90E-11 | 2.96E-11 | 2.95E-11 | 2.97E-11
ALL | 5.91E-04 | 1.09E-02 | 5.54E-02 | 5.70E-03 | 6.72E-03 | 1.44E-02 | 2.55E-02
Heart | 5.54E-09 | 1.76E-08 | 1.96E-08 | 1.02E-07 | 1.10E-04 | 1.18E-06 | 1.35E-05
Ionosphere | 2.88E-02 | 5.43E-05 | 1.26E-04 | 4.39E-04 | 2.69E-04 | 1.66E-03 | 3.35E-04
Wdbc | 4.84E-03 | 2.25E-03 | 8.69E-03 | 1.14E-02 | 2.00E-02 | 1.10E-01 | 1.07E-02
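The p-values in Tables 6 and 7 compare, for each data set, the 30-run accuracy sample of a baseline against that of the proposed model, with significance typically judged at the 0.05 level. A minimal sketch of such a test is shown below, using SciPy's rank-sum implementation; the accuracy arrays here are synthetic placeholders, not the study's data:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(42)

# Placeholder 30-run accuracy samples; in the study each array would hold the
# 30 test accuracy rates of one search method on one data set.
acc_baseline = rng.normal(loc=0.845, scale=0.005, size=30)  # e.g., a PSO baseline
acc_proposed = rng.normal(loc=0.886, scale=0.006, size=30)  # e.g., PSOVA2

stat, p_value = ranksums(acc_baseline, acc_proposed)
print(f"p = {p_value:.2e}, significant at 0.05: {p_value < 0.05}")
```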
Table 8. The mean results of the number of selected features over 30 runs.
Data Sets | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | 9468.8 | 8942.4 | 7594.7 | 8423.4 | 6292.6 | 1151.5 | 8846.2 | 9364.5 | 7026.6 | 7697.6
Myeloma | 5654.6 | 5130.2 | 4462.9 | 4740.1 | 3680.3 | 1633.5 | 5236.9 | 5476.4 | 4059.0 | 4264.5
Arcene | 3976.1 | 4046.1 | 3388.6 | 3695.4 | 2770.4 | 2545.3 | 3967.2 | 4424.8 | 3395.0 | 3412.4
MicroMass | 548.6 | 527.2 | 439.8 | 485.9 | 330.6 | 1123.0 | 554.3 | 588.8 | 461.3 | 476.8
Parkinsons | 323.3 | 310.2 | 266.3 | 283.2 | 209.8 | 492.0 | 323.6 | 361.6 | 273.1 | 274.8
Activity | 237.6 | 222.9 | 184.0 | 208.2 | 146.3 | 394.4 | 232.7 | 255.7 | 194.0 | 185.5
Voice | 128.0 | 121.4 | 108.3 | 118.1 | 86.7 | 65.0 | 122.0 | 140.2 | 108.6 | 109.5
Facial Expression | 131.4 | 112.8 | 88.4 | 72.0 | 80.7 | 60.1 | 84.6 | 121.6 | 92.7 | 97.7
Seizure | 61.0 | 38.4 | 25.3 | 33.4 | 19.7 | 5.1 | 58.0 | 39.7 | 19.4 | 12.2
ALL | 26.5 | 23.0 | 18.4 | 29.5 | 12.8 | 9.5 | 25.4 | 28.8 | 19.0 | 15.8
Heart | 28.8 | 23.9 | 20.9 | 27.8 | 17.8 | 56.7 | 26.4 | 31.9 | 21.8 | 24.6
Ionosphere | 12.5 | 9.3 | 9.6 | 11.8 | 9.4 | 9.6 | 11.3 | 13.1 | 10.3 | 6.9
Wdbc | 9.9 | 5.5 | 3.9 | 9.4 | 4.7 | 33.4 | 4.7 | 10.4 | 9.8 | 7.9
Data Sets | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | 11,134.8 | 11,106.7 | 10,030.2 | 10,188.4 | 6886.1 | 9093.0 | 9178.7 | 7026.6 | 7697.6
Myeloma | 6298.8 | 6299.0 | 5817.9 | 5924.5 | 4073.2 | 5299.6 | 5191.3 | 4059.0 | 4264.5
Arcene | 4977.2 | 4974.0 | 4484.6 | 4541.9 | 3014.3 | 4078.2 | 4051.2 | 3395.0 | 3412.4
MicroMass | 646.2 | 641.5 | 611.5 | 619.5 | 439.6 | 562.1 | 569.5 | 461.3 | 476.8
Parkinsons | 378.1 | 374.4 | 356.4 | 360.8 | 260.2 | 327.0 | 310.0 | 273.1 | 274.8
Activity | 277.2 | 277.8 | 261.4 | 272.9 | 195.1 | 237.8 | 241.7 | 194.0 | 185.5
Voice | 152.9 | 148.2 | 140.0 | 147.3 | 101.9 | 131.1 | 134.4 | 108.6 | 109.5
Facial Expression | 146.2 | 142.0 | 129.4 | 135.2 | 95.7 | 122.6 | 115.7 | 92.7 | 97.7
Seizure | 80.1 | 74.5 | 57.2 | 68.6 | 38.7 | 49.9 | 54.4 | 19.4 | 12.2
ALL | 35.4 | 33.3 | 27.9 | 33.6 | 23.1 | 23.7 | 31.6 | 19.0 | 15.8
Heart | 34.0 | 30.9 | 32.0 | 35.0 | 25.1 | 27.4 | 32.0 | 21.8 | 24.6
Ionosphere | 12.5 | 10.6 | 9.1 | 13.3 | 8.6 | 9.1 | 9.9 | 10.3 | 6.9
Wdbc | 10.8 | 6.8 | 9.1 | 11.8 | 7.6 | 8.6 | 9.6 | 9.8 | 7.9
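Before the counts in Table 8 can be taken, each particle must be decoded into a concrete feature subset. A common scheme, shown here purely as an assumed illustration (the paper's exact encoding may differ), thresholds each continuous position dimension to decide whether the corresponding feature is kept:

```python
import numpy as np

def decode_feature_subset(position, threshold=0.5):
    """Map a continuous particle position in [0, 1]^d to a binary feature mask.
    A dimension above the threshold marks its feature as selected."""
    return position > threshold

rng = np.random.default_rng(0)
position = rng.random(30)  # a particle for a 30-feature set such as Wdbc
mask = decode_feature_subset(position)
print(int(mask.sum()), "features selected out of", mask.size)
```

Averaging the mask sums over 30 runs would then yield mean selected-feature counts of the kind reported above.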
Table 9. The mean training computational costs over a set of 30 runs (in seconds).
Data Sets | PSO | DE | SCA | DA | GWO | CSO | HPSO-SSM | Catfish-BPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | 3.60E-01 | 3.16E-01 | 3.57E-01 | 3.50E-01 | 2.96E-01 | 2.91E-01 | 3.18E-01 | 3.25E-01 | 3.47E-01 | 3.17E-01
Myeloma | 3.00E-01 | 2.78E-01 | 3.10E-01 | 2.90E-01 | 2.48E-01 | 3.00E-01 | 2.80E-01 | 2.88E-01 | 2.90E-01 | 2.66E-01
Seizure | 1.24E+01 | 1.24E+01 | 1.25E+01 | 1.25E+01 | 1.25E+01 | 1.25E+01 | 1.33E+01 | 1.24E+01 | 1.26E+01 | 1.16E+01
Data Sets | BPSO | MBPSO | GPSO | MPSO-ELM | MFO-PSO | BBPSO-VA | ALPSO | Prop. PSOVA1 | Prop. PSOVA2
Crohn | 3.83E-01 | 3.55E-01 | 5.38E-01 | 3.80E-01 | 4.90E-01 | 4.30E-01 | 4.36E-01 | 3.47E-01 | 3.17E-01
Myeloma | 3.17E-01 | 3.09E-01 | 3.98E-01 | 3.20E-01 | 3.59E-01 | 3.47E-01 | 3.57E-01 | 2.90E-01 | 2.66E-01
Seizure | 1.26E+01 | 1.24E+01 | 1.27E+01 | 1.24E+01 | 1.26E+01 | 1.25E+01 | 1.25E+01 | 1.26E+01 | 1.16E+01
Table 10. The mean classification accuracy rates over 30 runs for the mechanisms in PSOVA1 and PSOVA2 using the Seizure and Voice data sets.
PSOVA1 | Mean Accuracy (Seizure) | Mean Accuracy (Voice) | PSOVA2 | Mean Accuracy (Seizure) | Mean Accuracy (Voice)
PSO | 0.8459 | 0.8237 | PSO | 0.8459 | 0.8237
PSO + Leader enhancement | 0.8475 | 0.8281 | PSO + Leader enhancement | 0.8463 | 0.8254
PSO + Leader & worse solution enhancement | 0.8510 | 0.8316 | PSO + Leader & worse solution enhancement | 0.8495 | 0.8298
Leader & worse solution enhancement + ameliorated signals | 0.8672 | 0.8491 | Leader & worse solution enhancement + exemplar breeding | 0.8733 | 0.8535
Leader & worse solution enhancement + ameliorated signals + spiral search | 0.8698 | 0.8526 | Leader & worse solution enhancement + exemplar breeding + coefficient generation | 0.8860 | 0.8632
Table 11. Evaluation results for 11 benchmark functions with dimension = 30.
Function | Metric | Standard PSO | PSO + Mirroring | PSOVA1: 1 (Leader Enhancement) | PSOVA1: 1+2 (Worse Enhancement) | PSOVA1: 1+2+3 (Diverse Signals) | PSOVA1: 1+2+3+4 (Spiral) | PSOVA2: 1 (Leader Enhancement) | PSOVA2: 1+2 (Worse Enhancement) | PSOVA2: 1+2+3 (Exemplar) | PSOVA2: 1+2+3+4 (Coefficient)
Ackley | MEAN | 1.97E+01 | 1.76E+01 | 1.62E+01 | 7.19E+00 | 3.12E+00 | 1.69E+00 | 9.33E+00 | 6.79E+00 | 1.37E+00 | 9.07E-01
 | MIN | 1.89E+01 | 1.46E+01 | 5.98E+00 | 5.15E+00 | 2.11E+00 | 4.92E-01 | 2.50E+00 | 3.67E+00 | 2.29E-01 | 1.36E-01
 | MAX | 1.98E+01 | 1.87E+01 | 1.99E+01 | 9.47E+00 | 4.48E+00 | 2.44E+00 | 1.44E+01 | 8.77E+00 | 2.43E+00 | 2.11E+00
 | STD | 1.68E-01 | 9.73E-01 | 4.81E+00 | 1.03E+00 | 4.77E-01 | 4.95E-01 | 3.05E+00 | 1.14E+00 | 5.90E-01 | 6.28E-01
Dixon | MEAN | 2.22E+05 | 1.17E+03 | 7.25E+02 | 6.81E+01 | 3.98E+01 | 9.40E+00 | 1.12E+02 | 5.09E+01 | 1.15E+01 | 6.49E+00
 | MIN | 1.40E+01 | 1.58E+02 | 1.03E+02 | 4.82E+00 | 8.32E+00 | 1.66E+00 | 2.60E+00 | 4.67E+00 | 1.88E+00 | 1.39E+00
 | MAX | 9.77E+05 | 2.85E+03 | 2.91E+03 | 2.01E+02 | 1.56E+02 | 2.72E+01 | 3.40E+02 | 3.44E+02 | 5.45E+01 | 3.46E+01
 | STD | 2.45E+05 | 9.10E+02 | 5.37E+02 | 4.67E+01 | 3.29E+01 | 5.27E+00 | 1.38E+02 | 6.81E+01 | 1.17E+01 | 6.74E+00
Griewank | MEAN | 1.24E+02 | 1.52E+01 | 4.54E+00 | 9.28E-01 | 4.11E-01 | 1.76E-01 | 3.79E+00 | 9.86E-01 | 1.76E-02 | 6.28E-03
 | MIN | 1.04E+00 | 3.47E+00 | 1.03E+00 | 5.99E-01 | 2.21E-02 | 2.09E-02 | 1.40E-01 | 2.61E-01 | 4.41E-03 | 2.21E-03
 | MAX | 2.71E+02 | 3.16E+01 | 1.75E+01 | 1.15E+00 | 8.34E-01 | 5.76E-01 | 9.10E+01 | 2.13E+00 | 4.55E-02 | 1.46E-02
 | STD | 6.48E+01 | 6.98E+00 | 3.70E+00 | 1.66E-01 | 2.47E-01 | 1.28E-01 | 1.65E+01 | 4.14E-01 | 1.07E-02 | 3.45E-03
Rastrigin | MEAN | 3.24E+02 | 2.43E+02 | 2.23E+02 | 1.14E+02 | 8.54E+01 | 5.79E+01 | 1.30E+02 | 1.07E+02 | 7.71E+01 | 6.43E+01
 | MIN | 2.69E+02 | 1.85E+02 | 1.48E+02 | 2.28E+01 | 4.07E+01 | 2.73E+01 | 7.59E+01 | 5.65E+01 | 4.13E+01 | 3.45E+01
 | MAX | 3.96E+02 | 3.09E+02 | 3.08E+02 | 2.28E+02 | 1.31E+02 | 9.66E+01 | 1.78E+02 | 1.70E+02 | 1.14E+02 | 9.44E+01
 | STD | 3.64E+01 | 2.94E+01 | 4.05E+01 | 4.59E+01 | 2.17E+01 | 1.84E+01 | 2.63E+01 | 2.79E+01 | 1.86E+01 | 1.47E+01
Rothyp | MEAN | 1.02E+05 | 4.39E+04 | 1.63E+04 | 1.04E+04 | 7.69E+02 | 5.94E+00 | 1.30E+04 | 4.41E+03 | 5.47E+00 | 2.11E+00
 | MIN | 1.70E+04 | 2.12E+04 | 4.23E+03 | 2.99E+03 | 2.52E+02 | 7.97E-01 | 3.15E+00 | 2.00E+01 | 1.29E+00 | 4.93E-01
 | MAX | 2.07E+05 | 8.25E+04 | 3.35E+04 | 2.48E+04 | 1.55E+03 | 1.98E+01 | 5.90E+04 | 2.56E+04 | 1.86E+01 | 4.78E+00
 | STD | 6.31E+04 | 1.51E+04 | 7.49E+03 | 4.72E+03 | 3.21E+02 | 5.14E+00 | 1.64E+04 | 6.36E+03 | 3.95E+00 | 1.08E+00
Rosenbrock | MEAN | 6.21E+05 | 2.63E+04 | 1.12E+04 | 9.80E+03 | 3.43E+02 | 7.43E+01 | 2.35E+04 | 2.94E+03 | 8.48E+01 | 6.56E+01
 | MIN | 2.84E+05 | 9.42E+03 | 3.54E+03 | 2.76E+03 | 1.64E+02 | 2.52E+01 | 5.20E+01 | 8.18E+01 | 3.13E+01 | 3.10E+01
 | MAX | 1.47E+06 | 5.45E+04 | 3.95E+04 | 2.03E+04 | 7.57E+02 | 1.58E+02 | 8.17E+04 | 2.52E+04 | 1.90E+02 | 2.21E+02
 | STD | 2.32E+05 | 1.19E+04 | 8.35E+03 | 4.72E+03 | 1.50E+02 | 4.07E+01 | 2.90E+04 | 6.26E+03 | 4.91E+01 | 5.24E+01
Sphere | MEAN | 2.81E+01 | 1.42E+01 | 8.48E+00 | 4.04E+00 | 3.76E-01 | 9.16E-02 | 3.53E+00 | 8.80E-01 | 6.10E-02 | 4.00E-02
 | MIN | 1.15E-02 | 5.75E+00 | 3.17E+00 | 1.96E+00 | 1.79E-01 | 2.90E-02 | 6.78E-03 | 3.18E-05 | 2.70E-02 | 2.24E-02
 | MAX | 7.87E+01 | 2.87E+01 | 1.81E+01 | 7.26E+00 | 7.48E-01 | 2.07E-01 | 2.63E+01 | 2.62E+01 | 1.08E-01 | 7.52E-02
 | STD | 2.47E+01 | 5.35E+00 | 3.33E+00 | 1.45E+00 | 1.39E-01 | 5.02E-02 | 9.06E+00 | 4.79E+00 | 2.26E-02 | 1.33E-02
Sumpow | MEAN | 7.07E-02 | 5.68E-02 | 1.28E-02 | 4.52E-03 | 9.81E-05 | 1.24E-06 | 3.55E-02 | 5.02E-03 | 2.87E-05 | 2.13E-05
 | MIN | 9.19E-04 | 1.10E-03 | 6.47E-04 | 1.13E-04 | 2.22E-06 | 1.37E-09 | 3.54E-03 | 1.23E-04 | 2.27E-06 | 1.21E-06
 | MAX | 8.16E-01 | 1.82E-01 | 6.53E-02 | 1.36E-02 | 4.86E-04 | 1.50E-05 | 1.80E-01 | 3.68E-02 | 1.65E-04 | 7.90E-05
 | STD | 1.59E-01 | 4.33E-02 | 1.67E-02 | 4.38E-03 | 9.77E-05 | 2.85E-06 | 3.53E-02 | 8.04E-03 | 2.98E-05 | 1.78E-05
Zakharov | MEAN | 6.27E+02 | 4.11E+02 | 3.25E+02 | 1.70E+02 | 1.01E+02 | 8.27E+01 | 2.99E+02 | 1.56E+02 | 9.61E+01 | 7.39E+01
 | MIN | 5.53E+02 | 3.38E+02 | 2.03E+02 | 7.22E+01 | 5.65E+01 | 5.07E+01 | 2.00E+02 | 6.55E+01 | 5.58E+01 | 4.81E+01
 | MAX | 7.63E+02 | 4.52E+02 | 4.32E+02 | 2.90E+02 | 1.52E+02 | 1.49E+02 | 3.84E+02 | 2.19E+02 | 1.34E+02 | 1.04E+02
 | STD | 5.56E+01 | 2.81E+01 | 6.42E+01 | 4.41E+01 | 2.20E+01 | 2.07E+01 | 4.57E+01 | 4.12E+01 | 1.81E+01 | 1.46E+01
Sumsqu | MEAN | 6.82E+02 | 4.26E+02 | 2.38E+02 | 6.48E+01 | 5.13E+00 | 2.95E+00 | 2.02E+02 | 4.40E+01 | 3.99E+00 | 2.07E+00
 | MIN | 7.92E+01 | 2.26E+02 | 1.22E+02 | 2.56E+01 | 1.65E+00 | 6.65E-01 | 1.09E-02 | 4.40E-02 | 1.16E+00 | 6.20E-01
 | MAX | 1.34E+03 | 7.25E+02 | 3.63E+02 | 1.45E+02 | 1.04E+01 | 8.44E+00 | 4.98E+02 | 3.47E+02 | 1.12E+01 | 9.05E+00
 | STD | 3.35E+02 | 1.15E+02 | 7.35E+01 | 2.83E+01 | 2.28E+00 | 1.86E+00 | 1.39E+02 | 7.25E+01 | 2.90E+00 | 1.67E+00
Powell | MEAN | 4.91E+03 | 2.85E+03 | 5.03E+02 | 4.43E+02 | 3.72E+01 | 1.91E+01 | 3.08E+02 | 2.34E+02 | 2.63E+01 | 1.02E+01
 | MIN | 5.46E+02 | 4.94E+02 | 4.01E+02 | 3.20E+02 | 1.89E+00 | 9.97E-01 | 2.87E-01 | 1.47E+00 | 7.82E+00 | 1.83E+00
 | MAX | 8.11E+03 | 6.87E+03 | 6.19E+02 | 5.62E+02 | 1.34E+02 | 9.65E+01 | 2.96E+03 | 1.88E+03 | 9.61E+01 | 3.90E+01
 | STD | 2.21E+03 | 1.96E+03 | 5.86E+01 | 6.24E+01 | 2.74E+01 | 1.89E+01 | 6.18E+02 | 4.20E+02 | 1.83E+01 | 7.72E+00
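For reference, standard textbook definitions of three of the benchmark functions in Table 11 are sketched below (Sphere, Rastrigin, and Ackley, each with a global minimum of 0 at the origin). These are the commonly used forms and are assumed here, not quoted from the paper:

```python
import numpy as np

def sphere(x):
    # Sphere: f(x) = sum(x_i^2); unimodal, minimum 0 at the origin.
    return np.sum(x**2)

def rastrigin(x):
    # Rastrigin: 10*d + sum(x_i^2 - 10*cos(2*pi*x_i)); highly multimodal.
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def ackley(x):
    # Ackley: nearly flat outer region with a narrow central basin.
    d = x.size
    return (-20 * np.exp(-0.2 * np.sqrt(np.sum(x**2) / d))
            - np.exp(np.sum(np.cos(2 * np.pi * x)) / d) + 20 + np.e)

x = np.zeros(30)  # dimension = 30, as in Table 11
print(sphere(x), rastrigin(x), ackley(x))  # all ~0 at the global optimum
```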
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
