Data
  • Article
  • Open Access

25 January 2024

An Optimized Hybrid Approach for Feature Selection Based on Chi-Square and Particle Swarm Optimization Algorithms

1
Faculty of Computing, Arab Open University, El-Shorouk 51, Cairo 11211, Egypt
2
Faculty of Computers & Artificial Intelligence, Helwan University (HU), Ain Helwan, Cairo 11795, Egypt
*
Authors to whom correspondence should be addressed.
This article belongs to the Section Information Systems and Data Management

Abstract

Feature selection is a significant issue in the machine learning process. Most datasets include features that are not needed for the problem being studied. These irrelevant features reduce both the efficiency and accuracy of the algorithm. It is possible to think about feature selection as an optimization problem. Swarm intelligence algorithms are promising techniques for solving this problem. This research paper presents a hybrid approach for tackling the problem of feature selection. A filter method (chi-square) and two wrapper swarm intelligence algorithms (grey wolf optimization (GWO) and particle swarm optimization (PSO)) are used in two different techniques to improve feature selection accuracy and system execution time. The performance of the two phases of the proposed approach is assessed using two distinct datasets. The results show that PSOGWO yields a maximum accuracy boost of 95.3%, while chi2-PSOGWO yields a maximum accuracy improvement of 95.961% for feature selection. The experimental results show that the proposed approach performs better than the compared approaches.

1. Introduction

Feature selection is a process that seeks to discover and remove features from a dataset that are not relevant or useful. These features are often perceived as unnecessary or extraneous to the problem being analyzed. Feature selection is used to generate a subset of attributes to use in constructing models for classification purposes [1]. Feature selection has been applied in a range of intelligent and expert systems such as intrusion detection [2], cancer detection [3], sentiment analysis [4], and disease detection and classification [5].
Feature selection methods can be categorized into wrapper-based methods, filter-based methods, and hybrid-based methods that combine elements from both approaches [6,7]. Filter-based methods (such as information gain [8], chi-square [9], minimum redundancy maximum relevance (MRMR) [1]) use statistical methods to rank and select the most pertinent features. This technique is applied prior to running the machine learning classifier and does not interact directly with it [10]. Wrapper-based methods utilize an optimization algorithm in conjunction with the classifier to identify the most suitable features. Wrapper-based methods are typically employed for feature selection because of their efficiency in decreasing the amount of features and increasing the classifier’s accuracy, as they have a direct connection to the classifier being used [10]. Wrapper-based methods are slower and more computationally expensive than filter-based feature selection techniques. Hybrid methods combine two distinct methods in order to reap the benefits of both (e.g., ECCSPSOA [11], CS-BPSO [12], ISSA [10]).
Metaheuristic algorithms are employed in feature selection techniques to reduce computational complexity. These algorithms efficiently and accurately optimize feature selection problems. Swarm intelligence (SI) and evolutionary algorithms (EA) are the two primary categories into which these algorithms can be classified.
I.
An evolutionary algorithm takes advantage of processes such as reproduction, mutation, recombination, and selection, which are modelled after biological evolution. The fitness function identifies the quality of candidate solutions to the optimization problem, which act as members of a population. The original population changes after several iterations of the evolutionary algorithm, moving towards global optimization [13].
II.
Swarm intelligence: The foundation of swarm intelligence is self-organizing group behavior, that is, the intelligence generated by the collective contributions of numerous individuals. Since the tight connections seen in nature, for example among bees or ants, do not exist naturally in humans, technology uses swarm artificial intelligence (AI) to provide feedback to human members, as demonstrated in [14,15]. Collective behavior shows that united systems do better than the majority of single individuals. Ad hoc data and sharing are dynamically generated by the group, and agreement is based on the dissemination of collective wisdom. Briefly stated, swarm intelligence relies on the "knowledge of the public" and can be applied to a wide range of problems [16].
One of the most popular swarm intelligence techniques is the particle swarm optimization (PSO) algorithm. Compared to other metaheuristic algorithms such as the genetic algorithm (GA) and genetic programming (GP), PSO has been demonstrated to be computationally less expensive and to converge more quickly. Additionally, PSO tends to be easy to implement; in terms of speed and memory requirements, it is computationally cheap and has few adjustable parameters [17]. PSO can be customized to perform feature subset selection, aiming to find the optimal combination of features to include in the model. This process is essential for enhancing model performance by reducing the dimensionality of the data and eliminating irrelevant or redundant features. PSO has therefore been used as an effective technique in many fields, including feature selection [18]. The grey wolf optimizer (GWO) is one of the most recent and popular swarm intelligence algorithms. Compared to other swarm intelligence optimization methods, GWO offers the following benefits: no parameters to tune, ease of implementation and adaptation to optimization challenges, flexibility, and scalability. GWO has been widely used as a feature selection approach in several fields over the past few years, including intrusion detection [19], big data analytics [20], and image classification [21].
This paper is organized as follows: recent research on feature selection is included in Section 2. Section 3 presents the complete chi2-PSOGWO feature selection technique, encompassing all its phases. A background study of the algorithms used is provided in Section 4. Section 5 presents the experimental results obtained, while a summary of and observations derived from the experiments are presented in Section 6. Section 7 comprises the conclusion and outlines future work.

3. Background

3.1. Overview of PSO

As shown in Algorithm 1, PSO models how knowledge of social behavior grows over time and how groups communicate when exchanging private knowledge about migratory patterns, flocking, or hunting. The population is known as a swarm and its members as particles, and each particle represents a candidate solution. Using its own and its neighbors' information, a particle changes its position.
Algorithm 1. PSO pseudocode:
1:  initialize population of particles and velocities
2:  while t < maximum number of iterations
3:      calculate the fitness of all particles
4:      update position and fitness of particles
5:      choose the particle with the best fitness value and the Gbest of all particles
6:      for each particle
7:          calculate the velocity of the particle by Equation (2)
8:          update the particle position by Equation (1)
9:      end for
10: end while
The swarm starts by producing a collection of random particles, along with their positions and velocities. Equations (1) and (2) represent the method that is used to update the particles’ positions:
$x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1)$  (1)
$v_{ij}(t+1) = w\,v_{ij}(t) + c_1 r_1 \left(x_{ij}^{p}(t) - x_{ij}(t)\right) + c_2 r_2 \left(x_{ij}^{g}(t) - x_{ij}(t)\right)$  (2)
where t is the current iteration and w denotes an inertia weight used to speed up population convergence, $x_{ij}$ is the position of the i-th particle in the j-th dimension, and $v_{ij}$ is its velocity in the j-th dimension. Acceleration coefficients are expressed by the constants c1 and c2. The terms $x_{ij}^{p}(t)$ and $x_{ij}^{g}(t)$ denote particle i's best previous position and the swarm's global best position in the j-th dimension, respectively. r1 and r2 are random parameters between 0 and 1.
Afterwards, each particle is assessed by PSO’s main loop using a fitness function, and the results are checked against the local best and global best values [30].
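As a concrete illustration, one iteration of Equations (1) and (2) can be sketched in NumPy. The helper name `pso_step` and the coefficient defaults are illustrative choices, not values taken from this paper:

```python
import numpy as np

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One PSO iteration: Equation (2) updates velocities, Equation (1) positions."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(positions.shape)   # random factors in [0, 1)
    r2 = rng.random(positions.shape)
    velocities = (w * velocities
                  + c1 * r1 * (pbest - positions)    # pull toward each particle's best
                  + c2 * r2 * (gbest - positions))   # pull toward the swarm's global best
    return positions + velocities, velocities
```

In the main loop, each particle's new position is then evaluated with the fitness function and checked against the local and global bests, exactly as in Algorithm 1.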

3.2. Overview of GWO

The leadership organization and hunting tactics of grey wolves are modelled by the GWO algorithm. Four grades of grey wolves, named alpha, beta, delta, and omega, are used to imitate the leadership hierarchy. The three essential elements of hunting (searching for prey, encircling prey, and attacking prey) are also utilized. The pseudocode of the GWO algorithm is presented in Algorithm 2.
Algorithm 2. GWO pseudocode:
1:  initialize grey wolf populations
2:  initialize a, A, and C values
3:  calculate the fitness of each search agent
4:  x_α = the best search agent
    x_β = the second-best search agent
    x_δ = the third-best search agent
5:  while t < maximum number of iterations
6:      for each GWO search agent
7:          update the position of the current search agent by Equation (5)
8:      end for
9:      update a, A, and C
10:     calculate the fitness of all search agents
11:     update x_α, x_β, x_δ
12: end while
The grey wolf hunting technique can be summarized as follows: it is reasonable to assume that the alpha (the best candidate solution), beta, and delta have the greatest understanding of prospective prey locations. The three best solutions obtained so far are therefore retained, and the other search agents, including the omegas, are forced to update their positions according to the positions of these best search agents. Grey wolves update their locations using the following equations [31].
$D_{\alpha} = |C_1 x_{\alpha} - x|,\quad D_{\beta} = |C_2 x_{\beta} - x|,\quad D_{\delta} = |C_3 x_{\delta} - x|$  (3)
$x_1 = x_{\alpha} - A_1 D_{\alpha},\quad x_2 = x_{\beta} - A_2 D_{\beta},\quad x_3 = x_{\delta} - A_3 D_{\delta}$  (4)
$x(t+1) = \dfrac{x_1 + x_2 + x_3}{3}$  (5)
where $D_{\alpha}$, $D_{\beta}$, $D_{\delta}$ are the distances between the α, β, and δ wolves and the prey, and $x_1$, $x_2$, $x_3$ are the resulting candidate positions. t indicates the current iteration, and A, C are coefficient vectors given by $A = 2a r_1 - a$ and $C = 2 r_2$.
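Equations (3)-(5) can be sketched as follows; the helper name `gwo_step` and the treatment of `a` as a scalar passed in by the caller are illustrative assumptions:

```python
import numpy as np

def gwo_step(wolves, x_alpha, x_beta, x_delta, a, rng=None):
    """One GWO position update: Equations (3)-(5) applied to every search agent."""
    rng = np.random.default_rng() if rng is None else rng
    new_positions = np.empty_like(wolves)
    for i, x in enumerate(wolves):
        estimates = []
        for leader in (x_alpha, x_beta, x_delta):
            A = 2 * a * rng.random(x.shape) - a   # A = 2*a*r1 - a, shrinks as a decays
            C = 2 * rng.random(x.shape)           # C = 2*r2
            D = np.abs(C * leader - x)            # distance to this leader, Equation (3)
            estimates.append(leader - A * D)      # candidate position, Equation (4)
        new_positions[i] = np.mean(estimates, axis=0)  # average x1, x2, x3, Equation (5)
    return new_positions
```

In the full algorithm, `a` decays linearly from 2 to 0 over the iterations, and the α, β, δ leaders are refreshed from the fitness ranking after every update.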

3.3. Overview of Chi-Square

For feature selection, the chi-square (χ²) score of each feature with respect to the target is computed, and the top-ranked features are chosen. According to the logic behind the χ² score, a feature with a low χ² score is independent of the target class, which suggests that it is useless for categorizing data samples [32]. Chi-square feature selection assesses the independence of events for a collection of data: using Equation (6), it evaluates the independence of two events, the occurrence of a feature and the occurrence of a class, by comparing their occurrence rates [32]:
$\chi^2 = \sum \dfrac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}}$  (6)
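A minimal sketch of Equation (6), assuming non-negative feature values (e.g., counts) so that per-class observed and expected frequencies are well defined; the function name is hypothetical:

```python
import numpy as np

def chi_square_scores(X, y):
    """Per-feature chi-square score, Equation (6): sum over classes of
    (observed - expected)^2 / expected, for non-negative feature values X."""
    classes = np.unique(y)
    observed = np.array([X[y == c].sum(axis=0) for c in classes])  # per-class feature totals
    class_prob = np.array([(y == c).mean() for c in classes])
    expected = np.outer(class_prob, X.sum(axis=0))  # totals spread by class frequency
    return ((observed - expected) ** 2 / expected).sum(axis=0)
```

A feature distributed identically across classes scores 0 and would be filtered out first; ranking features by this score and keeping the top k reproduces the filter stage.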

4. Proposed Approach

In this paper, the proposed approach aims to enhance the accuracy of feature selection using two wrapper methods (PSO and GWO). The popular swarm intelligence algorithm PSO operates concurrently with a promising algorithm, GWO, the pair serving together as a wrapper method. The main architecture of this phase is shown in Figure 1. In phase 2, a further enhancement improves the execution time by applying a filtering method (chi-square) before this combination. The chi-square filter is integrated into the process to eliminate the most irrelevant features, aiming to decrease the execution time of the feature selection process. To ensure the precision of the results, the data were categorized based on three parameters before analyzing the results: the dimensions of the features, the number of records, and the number of class attributes. The first phase of the proposed approach is mainly compared to a hybrid approach that combines the salp swarm algorithm (SSA) with particle swarm optimization (SSAPSO), as well as to pure PSO and other pure algorithms such as SSA, a bat algorithm (BAT), and a genetic algorithm (GA). This evaluation was conducted using seven distinct datasets. The subsequent phase of the approach was compared to similar hybrid algorithms that incorporate both filtering and wrapper methods, such as algorithms that combine MRMR with PSO (MRMR-PSO), SSA (MRMR-SSA), GA (MRMR-GA), ant colony optimization (MRMR-ACO), and ant lion optimization (MRMR-ALO). This comparison was performed using nine different datasets. Finally, both phases of the approach (before and after the addition of the chi-square filter) were compared to the plain PSO algorithm, and their respective advantages and disadvantages were discussed.
Figure 1. Architecture of phase 1 (PSOGWO).

4.1. PSOGWO (Phase 1)

In this section, the structure of PSOGWO is outlined, as shown in Figure 1. PSOGWO combines two wrapper algorithms called PSO and GWO. The objective of this phase is to assess the effectiveness of integrating GWO and PSO algorithms that have different search strategies, as shown in [30,31]. PSO is a widely used feature selection algorithm in the literature. With this change, the PSO’s update mechanism is integrated into the GWO’s main structure. The detailed structure of the PSOGWO approach is given in phase 1 of Figure 2.
Figure 2. Flow chart for chi2-PSOGWO.
In PSOGWO, the first and second steps are to establish the parameters and create a population that represents a collection of potential solutions to a given problem (feature selection). The effectiveness of each solution is then assessed by computing its fitness function and selecting the best one. The PSOGWO algorithm’s subsequent stage involves updating the population using the GWO and PSO algorithms, which run simultaneously. The fitness functions of grey wolves and the global best are then compared, and their values are updated. The fitness function is computed using Equation (7) [33]:
$fitness = weight\_acc \times accuracy(agent) + weight\_feature \times \dfrac{tot\_feat - sel\_feat}{tot\_feat}$  (7)
where tot_feat is the total number of features contained in the agent, sel_feat is the number of features the agent has chosen, and accuracy(agent) is the classification accuracy supplied by the agent.
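Equation (7) transcribes directly into code; the weight defaults below are illustrative placeholders, not the paper's settings:

```python
def subset_fitness(accuracy, sel_feat, tot_feat, weight_acc=0.99, weight_feature=0.01):
    """Fitness of a candidate feature subset, Equation (7): a weighted sum of the
    classifier accuracy and the fraction of features removed."""
    return weight_acc * accuracy + weight_feature * (tot_feat - sel_feat) / tot_feat
```

With this form, two subsets of equal accuracy are distinguished by size: the smaller subset scores higher.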
After that, the grey wolves and the global best change their positions according to these new values and the parameters are updated based on the new position. This operation is repeated until the end conditions are met. The result is a vector of ones and zeros that indicates whether a feature was selected or dropped. Phase 1 of the proposed approach is shown in #phase 1 of Algorithm 3.
Algorithm 3. Chi-square-PSOGWO pseudocode:
1:  initialize dataset
#Phase 2
2:  rank features using the chi-square filter method
3:  indicate the chosen features in the minimized dataset
#Phase 1
4:  initialize population of particles and velocities
5:  initialize grey wolf populations
6:  initialize w, a, A, and C values
7:  calculate the fitness of each search agent and particle
8:  while t < maximum number of iterations
9:      for each particle
10:         update velocity by Equation (2)
            update position of particles by Equation (1)
11:     end for
12:     for each GWO search agent
13:         update the position of the current search agent by Equation (5)
14:     end for
15:     compare the fitness of Gbest, Localbest, and x_α, x_β, x_δ, and update Gbest, x_α, x_β, x_δ with the best values, respectively (Gbest, Localbest = x_α > x_β > x_δ)
16:     update A, C, w
17: end while

4.2. Chi-Square PSOGWO (Phase 2)

In this section, the integration of PSOGWO with a chi-square filtering algorithm (phase 2) is described, as illustrated in Figure 3. This phase is composed of the following stages:
Figure 3. Architecture of phase 2 (chi-square-PSOGWO).
The chi-square technique is used to filter the dataset and select only the necessary and relevant features. The chi-square score of each feature is determined.
To allow experts to apply their own domain insight about which extraneous or redundant information to discard, the proposed approach enables the user to hand-pick the number of selected feature subsets.
After the filtered features are provided, the hybrid wrapper phase (PSOGWO) initializes the search space. This part involves deciding whether to include or discard each feature, so the output of this stage is a string of binary numbers: one for selected features and zero for non-selected features. Figure 2 explains the flowchart of the chi2-PSOGWO approach, and Algorithm 3 presents its detailed pseudocode. A random forest (RF) classifier was applied to measure the accuracy of the selected features.
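The two stages can be sketched with scikit-learn, assuming its `SelectKBest`/`chi2` utilities stand in for the chi-square ranking and a random binary mask stands in for the PSOGWO search; the dataset, the value of `k`, and the seeds are illustrative:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def evaluate_mask(X_tr, X_te, y_tr, y_te, mask):
    """Score a binary feature mask (the wrapper stage's output) with an RF classifier."""
    cols = np.flatnonzero(mask)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_tr[:, cols], y_tr)
    return accuracy_score(y_te, clf.predict(X_te[:, cols]))

X, y = load_breast_cancer(return_X_y=True)
# Filter stage: chi-square ranking keeps the k user-chosen features
X_filtered = SelectKBest(chi2, k=15).fit_transform(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_filtered, y, test_size=0.3, random_state=0)
# Wrapper stage: PSOGWO would search over masks; a random mask stands in for it here
mask = np.random.default_rng(0).integers(0, 2, X_filtered.shape[1])
mask[0] = 1  # guarantee at least one selected feature
acc = evaluate_mask(X_tr, X_te, y_tr, y_te, mask)
```

In the actual approach, `evaluate_mask` would be called inside the fitness function for every candidate mask, so shrinking the filtered feature set directly shrinks the wrapper's search space.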

5. Experiments and Results

In this section, the performances of the suggested phases (PSOGWO and chi2-PSOGWO) are evaluated by comparing the two phases of the proposed approach to other similar algorithms tackling feature selection [1,30,34].

5.1. Parameter Settings

Python was used to carry out the proposed hybrid structure’s overall implementation. A personal computer (PC) driven by an Intel i5 processor under Windows 10 and with 8 GB RAM was used. Table 1 presents the parameter settings of the proposed approach.
Table 1. Parameter settings of algorithms used.

5.2. General Data Settings

According to the literature [1], for every attribute selection problem, the feature count ranges from 0 to 19 in the lowest category, from 18 to 46 in the medium category, and 50 and above in the highest category. The number of records is categorized following the same idea as the feature categorization, as shown in Table 2.
Table 2. Summary of categories of used datasets.
To prepare the datasets for the task, both real and categorical values are transformed into numeric data. The data are unstructured and have a wide range of values, and this variation creates problems in training a model, so a minimum-maximum scaling strategy is used. The MinMaxScaler normalization technique ensures that the scales of all the data in the database are similar by bringing them all to a common range; all data values are scaled to values between 0 and 1. The normalization performed by MinMaxScaler is given by Equations (8) and (9).
$X_{std} = \dfrac{X - X_{min}}{X_{max} - X_{min}}$  (8)
$X_{scaled} = X_{std} \times (max - min) + min$  (9)
In Equation (8), $X_{min}$ and $X_{max}$ denote the minimum and maximum values of the feature X under consideration, while min and max in Equation (9) denote the bounds of the target range (here 0 and 1). Together, Equations (8) and (9) give the normalized values of a feature. All of the values in the datasets are fitted and transformed before being utilized for training and testing [35].
Each dataset is split into test and training instances using the 70–30 rule (70% of training instances, 30% of testing instances).
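This preprocessing can be sketched with scikit-learn's `MinMaxScaler` and a 70-30 `train_test_split`; the toy data below are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0], [4.0, 500.0]])
y = np.array([0, 1, 0, 1])

# Equations (8) and (9): every column is mapped into the common range [0, 1]
X_scaled = MinMaxScaler().fit_transform(X)

# 70-30 rule: 70% training instances, 30% testing instances
X_tr, X_te, y_tr, y_te = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
```

Note that, as described above, the scaler is fitted on the full dataset before the split.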

5.3. Experiment 1

The proposed PSOGWO approach is compared to another hybrid algorithm, SSAPSO, which is an integration of particle swarm optimization (PSO) and the salp swarm algorithm (SSA). It is also compared to the pure PSO algorithm and other pure algorithms, namely the BAT and GA algorithms [30,34].

5.3.1. Dataset Settings

A set of UCI datasets is utilized to test the first phase [36]. Table 3 shows the seven datasets used in this experiment.
Table 3. PSOGWO dataset description.

5.3.2. Results and Discussion

The performance of PSOGWO is assessed by passing the selected features to the classifier to determine the accuracy of feature selection. To achieve a fair comparison, the same datasets and the same classifier (KNN) as the competing algorithms were used.
As demonstrated in Figure 4, the highest accuracy was achieved by PSOGWO in the ionosphere, hepatitis, and heart datasets, which are in the medium (M) and low (L) categories in terms of their number of features and records, respectively. However, its accuracy decreases in the breast cancer dataset, which has a higher number of records, and the sonar dataset, which has a higher number of features. Table 4 shows that the PSOGWO phase ranks first in three datasets and second in one of the seven datasets used. It performs better with a medium number of records and fewer features; it can also be observed throughout this experiment that phase 1 performs better with a lower number of classes.
Figure 4. Comparing accuracy of feature selection.
Table 4. The results of the accuracy (%) evaluation of feature selection in all datasets.

5.4. Experiment 2

This experiment compares the results of the chi2-PSOGWO approach against MRMR-PSO, MRMR-SSA, MRMR-GA, MRMR-ALO, and MRMR-ACO algorithms explained in [1]. In the initial stage of the proposed configured model, the chi-squared algorithm is parameterized in terms of counts of features to identify the most important features from the various features in the datasets. Consequently, the obtained features of the first stage are evaluated by the second stage (PSOGWO) in order to reach the final subset of features.

5.4.1. Dataset Settings

The suggested chi2-PSOGWO hybrid approach is evaluated through a number of trials using different Kaggle datasets [37]. Table 5 gives an overview of the datasets that were used. The name of the dataset, the number of features or attributes, the record number in each dataset, and the class variables to which each dataset belongs are the four different types of descriptions that are included in the table.
Table 5. Chi2-PSOGWO dataset description.

5.4.2. Result and Discussion

This section examines the outcomes of the hybrid chi2-PSOGWO approach in comparison with other hybrid filter–wrapper approaches, namely MRMR-SSA, MRMR-PSO, MRMR-GA, MRMR-ACO, and MRMR-ALO [1]. The evaluation is based on their accuracy (in %) as assessed by the RF classifier and uses the dataset group of the competitor algorithms, as depicted in Table 5.
An analysis presented in Table 6 shows that chi2-PSOGWO is the most successful in the breast cancer, heart disease, habitats, and wine datasets. Additionally, it showed better performance when the number of features and records was medium (M) or low (L). However, when the number of records increased (e.g., in the banknote authentication dataset), accuracy diminished. The approach also operates more effectively with fewer class attributes. As demonstrated in Figure 5, the suggested approach performs better with a medium dimension of features and a lower number of records.
Table 6. Comparing accuracy using RF classifier (%).
Figure 5. Comparing accuracy of chi2-PSOGWO.

5.5. Experiment 3

In this section, the two phases of the approach proposed in this paper, chi2-PSOGWO and PSOGWO, are compared to the main PSO algorithm in terms of accuracy. The goal is to evaluate the improvement achieved by combining the main particle swarm intelligence algorithms with other wrapper-based methods (GWO) and filter-based methods (chi-square). Execution time is used to determine if incorporating the filter-based algorithm (chi-square) affects the performance of the PSOGWO combination.

5.5.1. Dataset Settings

The same dataset group (Table 2) as in experiment 1 is used to compare the two phases of the proposed approach.

5.5.2. Results and Discussion

The effectiveness of the PSO and GWO combination can be showcased through experiments carried out on datasets of diverse feature dimensions, record sizes, and class attributes. The experiments, as depicted in Figure 6, reveal a noticeable enhancement in the accuracy of PSO when integrated with GWO on datasets such as ionosphere, hepatitis, heart, waveform, and lymphography. Furthermore, in the case of the breast cancer, hepatitis, and waveform datasets, chi2-PSOGWO ranks second. As shown in Table 7, the two phases of the proposed approach show improved performance in categories with a medium number of features, with no noticeable impact from the number of records or class attributes. Table 8 demonstrates the execution time of both PSOGWO and chi2-PSOGWO across all the datasets, and Figure 7 compares the execution times of the two phases. The chi2-PSOGWO approach exhibits a reduced execution time compared to the PSOGWO method in the majority of datasets. This is attributed to the filtering performed by the chi-square algorithm, which effectively reduces the number of features entering the subsequent stage and thereby minimizes computational time.
Figure 6. Comparing accuracy of PSO, PSOGWO, and chi2-PSOGWO.
Table 7. Comparing accuracy of PSO, PSOGWO, and chi2-PSOGWO.
Table 8. Comparing execution time of PSOGWO and chi2-PSOGWO.
Figure 7. Comparing execution time of PSOGWO and chi2-PSOGWO.

6. Summary and Observations

The proposed approach entails assessing the accuracy and execution time of the PSO algorithm after its hybridization with both a wrapper method (GWO) and a filter method (chi-square). The performance of this hybrid approach was then compared with other pure and hybrid algorithms to determine its effectiveness. The datasets were categorized based on their parameters, such as the number of features, records, and class attributes (Table 3 and Table 5). The datasets were preprocessed by converting the non-numeric data into numeric data; any null values were removed to ensure accurate performance, and the minimum-maximum scaler was applied to scale the records to a common range. To ensure a precise comparison, the same classifiers (RF, KNN) were employed across the compared algorithms.
The proposed approach excels in two phases: surpassing the pure PSO algorithm (Figure 6, Table 7) and outperforming other significant pure and hybrid algorithms in terms of accuracy (Table 4 and Table 6, Figure 4 and Figure 5). It can be observed that a learning algorithm's running time may be significantly decreased by considerably reducing the number of redundant features, which aids in understanding the fundamental complexities of a practical classification problem (Figure 8). The cases where the chi2-PSOGWO approach has lower accuracy than PSOGWO could be attributed to the filter overlooking some important features. Another possible explanation lies in the fitness function, which uses the ratio of the number of removed features to the original feature count; when a smaller (filtered) dataset is used, this ratio can decrease, resulting in a lower theoretical fitness than when no filter is applied, as Equation (7) shows.
Figure 8. Evaluation of accuracy and execution time.
However, the accuracy measured by chi2-PSOGWO was acceptable for effectively reducing running time and maintaining a high prediction accuracy, as shown in Figure 7. As a conclusion, the results were very encouraging in terms of achieving superior performance and outperforming benchmark algorithms in many cases compared to other similar approaches.

7. Conclusions and Future Work

This paper set out to effectively tackle the challenges associated with feature selection and presented an approach that resolves these issues. Various difficulties related to feature selection, such as classification accuracy and execution time, have been thoroughly evaluated. The proposed approach was implemented in two distinct phases, with each phase evaluated on a distinct dataset group to achieve best practice. Based on the experimental data, the proposed approach demonstrated its superiority over existing feature selection techniques. A comparative analysis was conducted between the two phases of the proposed approach and several established methods. The experimental results further confirmed that both the chi2-PSOGWO and PSOGWO algorithms demonstrated a noticeable enhancement in accuracy compared to other hybrid techniques; the chi2-PSOGWO algorithm, moreover, exhibited a superior improvement in both accuracy and execution time. This research holds the potential for further improvement and application in various domains, including multi-objective problems, engineering design, parameter estimation, text clustering, text summarization, text categorization, image segmentation, mathematical benchmark functions, and other feature selection applications.

Author Contributions

Conceptualization, A.A. and R.M.; methodology, L.A.-H.; software, R.M.; validation, A.A., R.M. and L.A.-H.; formal analysis, L.A.-H.; investigation, R.M.; resources, L.A.-H.; data curation, R.M.; writing—original draft preparation, R.M.; writing—review and editing, L.A.-H.; visualization, R.M.; supervision, A.A.; project administration, A.A.; funding acquisition, R.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The experiments were conducted over two groups of datasets. The datasets are provided on the Kaggle repository (www.kaggle.com), accessed on 1 April 2022 and UCI repository (https://archive.ics.uci.edu/mL/datasets.php), accessed on 1 May 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mahapatra, M.; Majhi, S.K.; Dhal, S.K. MRMR-SSA: A hybrid approach for optimal feature selection. Evol. Intell. 2022, 15, 2017–2036. [Google Scholar] [CrossRef]
  2. Di Mauro, M.; Galatro, G.; Fortino, G.; Liotta, A. Supervised feature selection techniques in network intrusion detection: A critical review. Eng. Appl. Artif. Intell. 2021, 101, 104216. [Google Scholar] [CrossRef]
  3. Liu, B.; Yu, H.; Zeng, X.; Zhang, D.; Gong, J.; Tian, L.; Qian, J.; Zhao, L.; Zhang, S.; Liu, R. Lung cancer detection via breath by electronic nose enhanced with a sparse group feature selection approach. Sens. Actuators B Chem. 2021, 339, 129896. [Google Scholar] [CrossRef]
  4. Zhao, H.; Liu, Z.; Yao, X.; Yang, Q. A machine learning-based sentiment analysis of online product reviews with a novel term weighting and feature selection approach. Inf. Process. Manag. 2021, 58, 102656. [Google Scholar] [CrossRef]
  5. Sen, S.; Saha, S.; Chatterjee, S.; Mirjalili, S.; Sarkar, R. A bi-stage feature selection approach for COVID-19 prediction using chest CT images. Appl. Intell. 2021, 51, 8985–9000. [Google Scholar] [CrossRef] [PubMed]
  6. Guyon, I.; Elisseeff, A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003, 3, 1157–1182. [Google Scholar]
  7. Othman, G.; Zeebaree, D.Q. The applications of discrete wavelet transform in image processing: A review. J. Soft Comput. Data Min. 2020, 1, 31–43. [Google Scholar]
  8. Omuya, E.O.; Okeyo, G.O.; Kimwele, M.W. Feature selection for classification using principal component analysis and information gain. Expert Syst. Appl. 2021, 174, 114765. [Google Scholar] [CrossRef]
  9. Bahassine, S.; Madani, A.; Al-Sarem, M.; Kissi, M. Feature selection using an improved Chi-square for Arabic text classification. J. King Saud Univ. -Comput. Inf. Sci. 2020, 32, 225–231. [Google Scholar] [CrossRef]
  10. Tubishat, M.; Idris, N.; Shuib, L.; Abushariah, M.A.; Mirjalili, S. Improved Salp Swarm Algorithm based on opposition based learning and novel local search algorithm for feature selection. Expert Syst. Appl. 2020, 145, 113122. [Google Scholar] [CrossRef]
  11. Adamu, A.; Abdullahi, M.; Junaidu, S.B.; Hassan, I.H. An hybrid particle swarm optimization with crow search algorithm for feature selection. Mach. Learn. Appl. 2021, 6, 100108. [Google Scholar] [CrossRef]
  12. BinSaeedan, W.; Alramlawi, S. CS-BPSO: Hybrid feature selection based on chi-square and binary PSO algorithm for Arabic email authorship analysis. Knowl.-Based Syst. 2021, 227, 107224. [Google Scholar] [CrossRef]
  13. Gong, D.; Xu, B.; Zhang, Y.; Guo, Y.; Yang, S. A similarity-based cooperative co-evolutionary algorithm for dynamic interval multiobjective optimization problems. IEEE Trans. Evol. Comput. 2019, 24, 142–156. [Google Scholar] [CrossRef]
  14. Wang, M.; Wu, C.; Wang, L.; Xiang, D.; Huang, X. A feature selection approach for hyperspectral image based on modified ant lion optimizer. Knowl.-Based Syst. 2019, 168, 39–48. [Google Scholar] [CrossRef]
  15. Hanbay, K. A new standard error based artificial bee colony algorithm and its applications in feature selection. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 4554–4567. [Google Scholar] [CrossRef]
  16. Chang, A.C. Intelligence-Based Medicine: Artificial Intelligence and Human Cognition in Clinical Medicine and Healthcare; Academic Press: Cambridge, MA, USA, 2020. [Google Scholar]
  17. Papazoglou, G.; Biskas, P. Review and Comparison of Genetic Algorithm and Particle Swarm Optimization in the Optimal Power Flow Problem. Energies 2023, 16, 1152. [Google Scholar] [CrossRef]
  18. Cao, Y.; Liu, G.; Sun, J.; Bavirisetti, D.P.; Xiao, G. PSO-Stacking improved ensemble model for campus building energy consumption forecasting based on priority feature selection. J. Build. Eng. 2023, 72, 106589. [Google Scholar] [CrossRef]
  19. Abd Elaziz, M.; Al-qaness, M.A.; Dahou, A.; Ibrahim, R.A.; Abd El-Latif, A.A. Intrusion detection approach for cloud and IoT environments using deep learning and Capuchin Search Algorithm. Adv. Eng. Softw. 2023, 176, 103402. [Google Scholar] [CrossRef]
  20. Qin, J. Analysis of factors influencing the image perception of tourism scenic area planning and development based on big data. Appl. Math. Nonlinear Sci. 2023; ahead of print. [Google Scholar] [CrossRef]
  21. Yawale, N.M.; Sahu, N.; Khalsa, N.N. Design of a Hybrid GWO CNN Model for Identification of Synthetic Images via Transfer Learning Process. Int. J. Intell. Eng. Syst. 2023, 16, 292–301. [Google Scholar]
  22. Seyyedabbasi, A. Binary Sand Cat Swarm Optimization Algorithm for Wrapper Feature Selection on Biological Data. Biomimetics 2023, 8, 310. [Google Scholar] [CrossRef]
  23. Zivkovic, M.; Stoean, C.; Chhabra, A.; Budimirovic, N.; Petrovic, A.; Bacanin, N. Novel improved salp swarm algorithm: An application for feature selection. Sensors 2022, 22, 1711. [Google Scholar] [CrossRef] [PubMed]
  24. Zouache, D.; Abdelaziz, F.B. A cooperative swarm intelligence algorithm based on quantum-inspired and rough sets for feature selection. Comput. Ind. Eng. 2018, 115, 26–36. [Google Scholar] [CrossRef]
  25. Sheykhizadeh, S.; Naseri, A. An efficient swarm intelligence approach to feature selection based on invasive weed optimization: Application to multivariate calibration and classification using spectroscopic data. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2018, 194, 202–210. [Google Scholar] [CrossRef] [PubMed]
  26. Chen, K.; Zhou, F.-Y.; Yuan, X.-F. Hybrid particle swarm optimization with spiral-shaped mechanism for feature selection. Expert Syst. Appl. 2019, 128, 140–156. [Google Scholar] [CrossRef]
  27. Mostafa, R.R.; Ewees, A.A.; Ghoniem, R.M.; Abualigah, L.; Hashim, F.A. Boosting chameleon swarm algorithm with consumption AEO operator for global optimization and feature selection. Knowl.-Based Syst. 2022, 246, 108743. [Google Scholar] [CrossRef]
  28. El-Kenawy, E.-S.; Eid, M. Hybrid gray wolf and particle swarm optimization for feature selection. Int. J. Innov. Comput. Inf. Control 2020, 16, 831–844. [Google Scholar]
  29. Alrefai, N.; Ibrahim, O. Optimized feature selection method using particle swarm intelligence with ensemble learning for cancer classification based on microarray datasets. Neural Comput. Appl. 2022, 34, 13513–13528. [Google Scholar] [CrossRef]
  30. Ibrahim, R.A.; Ewees, A.A.; Oliva, D.; Abd Elaziz, M.; Lu, S. Improved salp swarm algorithm based on particle swarm optimization for feature selection. J. Ambient Intell. Humaniz. Comput. 2019, 10, 3155–3169. [Google Scholar] [CrossRef]
  31. Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]
  32. Thakkar, A.; Lohiya, R. Attack classification using feature selection techniques: A comparative study. J. Ambient Intell. Humaniz. Comput. 2021, 12, 1249–1266. [Google Scholar] [CrossRef]
  33. Guha, R.; Chatterjee, B.; Khalid Hassan, S.; Ahmed, S.; Bhattacharyya, T.; Sarkar, R. Py_fs: A python package for feature selection using meta-heuristic optimization algorithms. In Computational Intelligence in Pattern Recognition: Proceedings of CIPR 2021; Springer: Singapore, 2022; pp. 495–504. [Google Scholar]
  34. Gad, A.G. Particle swarm optimization algorithm and its applications: A systematic review. Arch. Comput. Methods Eng. 2022, 29, 2531–2561. [Google Scholar] [CrossRef]
  35. Deepa, B.; Ramesh, K. Epileptic seizure detection using deep learning through min max scaler normalization. Int. J. Health Sci. 2022, 6, 10981–10996. [Google Scholar] [CrossRef]
  36. UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/datasets.php (accessed on 1 May 2022).
  37. Kaggle. Available online: https://www.kaggle.com/ (accessed on 1 April 2022).