Article

Enhance Teaching-Learning-Based Optimization for Tsallis-Entropy-Based Feature Selection Classification Approach

1 School of Education and Music, Sanming University, Sanming 365004, China
2 School of Information Engineering, Sanming University, Sanming 365004, China
3 Faculty of Computer Sciences and Informatics, Amman Arab University, Amman 11953, Jordan
4 School of Computer Science, Universiti Sains Malaysia, Gelugor 11800, Penang, Malaysia
5 School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
6 Department of Management Information System, College of Business Administration, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
* Authors to whom correspondence should be addressed.
Processes 2022, 10(2), 360; https://doi.org/10.3390/pr10020360
Submission received: 9 January 2022 / Revised: 8 February 2022 / Accepted: 11 February 2022 / Published: 14 February 2022
(This article belongs to the Special Issue Evolutionary Process for Engineering Optimization)

Abstract

Feature selection is an effective method to reduce the number of data features, which boosts classification performance in machine learning. This paper uses Tsallis-entropy-based feature selection to identify the significant features. The Support Vector Machine (SVM) is adopted as the classifier for the classification task. We propose an enhanced Teaching-Learning-Based Optimization (ETLBO) to optimize the SVM and Tsallis entropy parameters in order to improve classification accuracy. An adaptive weight strategy and the Kent chaotic map are used to enhance the optimization ability of the traditional TLBO. The proposed method aims to avoid the main weaknesses of the original TLBO, namely being trapped in local optima and the imbalance between its search mechanisms. Experiments on 16 classical datasets are used to test the performance of the ETLBO, and the results are compared with other well-established optimization algorithms. The obtained results illustrate that the proposed method achieves better classification accuracy.

1. Introduction

Machine learning has been widely used in many practical applications such as data mining, text processing, pattern recognition, and medical image analysis, which often rely on large data sets [1,2]. In terms of how they utilize label information, feature selection algorithms are mainly categorized into filter and wrapper approaches [3,4]. Wrapper-based methods are commonly used to finish the classification task [5]. The main steps include choosing a classifier, defining the feature evaluation criterion, and searching for the optimal features [6].
The SVM algorithm is one of the most popular supervised models and is regarded as one of the most robust methods in the machine learning field [7,8]. SVM has some robust characteristics compared to other methods, such as excellent generalization performance and the ability to generate high-quality decision boundaries from a small subset of training data points [9]. The largest problems encountered in setting up an SVM model are selecting the kernel function and its parameter values; inappropriate parameter settings lead to poor classification results [10].
Swarm intelligence algorithms can solve complex engineering problems, but different optimization algorithms solve different engineering problems with different effects [11,12]. Optimization algorithms can reduce the computation time and improve accuracy. Many optimization algorithms have been proposed, such as the Genetic Algorithm (GA) [13], Particle Swarm Optimization (PSO) [14], Differential Evolution (DE) [15], Ant Colony Optimization (ACO) [16], the Artificial Bee Colony (ABC) algorithm [17], the Grey Wolf Optimizer (GWO) [18], the Ant Lion Optimizer (ALO) [19], Moth-Flame Optimization (MFO) [20], the Whale Optimization Algorithm (WOA) [21], the Invasive Weed Optimization algorithm [22], and the Flower Pollination Algorithm [23]. Although every algorithm has its advantages, the no-free-lunch (NFL) theorem [24] proves that no single algorithm can solve all optimization problems.
There is no perfect optimization algorithm, and optimization algorithms should be improved to solve engineering problems better. Many scholars have studied strategies for improving optimization algorithms; commonly used strategies include the adaptive weight strategy and the chaotic map. Zhang Y. proposed an improved particle swarm optimization algorithm with an adaptive learning strategy [25]; the adaptive learning strategy increased the population diversity of PSO. Dong Z. proposed a self-adaptive weight vector adjustment strategy based on a chain segmentation strategy [26]; the self-adaptive adjustment handled multi-objective problems whose true Pareto front (PF) has a complex shape. Li E. proposed a multi-objective decomposition algorithm based on an adaptive weight vector and a matching strategy [27]; the adaptive weight vector mitigated the degradation of the solution-set performance. The chaotic map is also a general nonlinear phenomenon, and its behavior is complex and semi-random; it is mathematically defined as the randomness generated by a simple deterministic system [28]. Xu C. proposed an improved boundary bird swarm algorithm [29], which retained the good global convergence and robustness of the bird swarm algorithm. Tran, N. T. presented a method for fatigue life prediction of a 2-DOF compliant mechanism that combined the differential evolution algorithm with an adaptive neuro-fuzzy inference system [30]; the experimental results showed that the accuracy of the proposed method was high.
Teaching-Learning-Based Optimization (TLBO) was proposed by R. V. Rao et al. to solve global optimization problems of continuous nonlinear functions [31]. The TLBO approach works on the philosophy of teaching and learning. Many scholars have studied strategies to improve its optimization ability for different problems. Gunji A. B. proposed an improved TLBO for solving assembly sequence problems [32]. Zhang H. proposed a hybridized TLBO [33], which enables better tracking accuracy and efficiency. Ho, N.L. presented a hybrid Taguchi-teaching-learning-based optimization algorithm (HTLBO) [34], whose results showed good agreement with the predictions. Such strategies can improve the optimization ability of TLBO. In this paper, to address the problems of learning efficiency and initial parameter setting, we use several strategies to enhance the optimization ability of the TLBO.
The main contribution of our work includes:
(1)
An enhanced Teaching-Learning-Based Optimization (ETLBO) is proposed to improve the optimization ability. The adaptive weight strategy and the Kent chaotic map are used to enhance the TLBO; these two strategies improve the searching ability of the students and teachers in TLBO.
(2)
We adopt the Tsallis-entropy-based feature selection method to find the crucial features. The selected feature x and the parameter α of Tsallis entropy are optimized by the ETLBO.
(3)
The parameter c of the SVM classifier is optimized by the ETLBO to obtain high classification accuracy. The core idea of this method is to automatically determine the parameter α of Tsallis entropy and the parameter c of the SVM for different data.
The proposed method is tested on several feature selection and classification problems in terms of several common evaluation measures. The results are compared with other well-established optimization methods. The obtained results show that the proposed ETLBO obtains better and promising results in almost all the tested problems compared to the other methods.
The rest of the paper is organized as follows: Section 2 introduces the Tsallis-entropy-based feature selection formulation. Section 3 presents the enhanced teaching-learning-based optimization and the ETLBO-based feature selection design. Sections 4 and 5 give the feature selection results and the algorithm analysis. Finally, the conclusions are summarized in Section 6.

2. Related Work

2.1. Tsallis Entropy-Based Feature Selection (TEFS)

TEFS estimates the importance of a feature by calculating its information gain (IG) with respect to the target feature. The IG is calculated by subtracting the Tsallis entropy of the target conditioned on the feature from the total entropy of the target feature. The Tsallis entropy and IG are defined as follows:
$$H(m) = \frac{1}{1-\alpha}\log\sum_{i=1}^{n} p_i^{\alpha}$$
$$IG(m|n) = H(m) - H(m|n)$$
where H(m) represents the Tsallis entropy of a feature m, H(m|n) is the entropy of the target m conditioned on a feature n, IG(m|n) represents the information gain of the target m with respect to the feature n, p_i is the probability of the i-th value of the feature, and α is the entropic parameter.
IG measures the significance of a feature by calculating how much information the feature gives us about the target.
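For illustration, the following Python sketch computes the Tsallis entropy of Equation (1) and the information gain of Equation (2) for discrete features. The way the conditional entropy H(m|n) is computed here (the target entropy within each feature value, weighted by value frequency) is an assumption, since the paper does not spell this step out, and the toy data are purely illustrative.

```python
import numpy as np

def tsallis_entropy(values, alpha=0.5):
    """Tsallis-type entropy of a discrete feature, following Equation (1):
    H = 1/(1 - alpha) * log(sum_i p_i^alpha)."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def conditional_entropy(target, feature, alpha=0.5):
    """H(m|n): entropy of the target within each feature value, weighted by frequency."""
    return sum(np.mean(feature == v) * tsallis_entropy(target[feature == v], alpha)
               for v in np.unique(feature))

def information_gain(target, feature, alpha=0.5):
    """IG(m|n) = H(m) - H(m|n), as in Equation (2)."""
    return tsallis_entropy(target, alpha) - conditional_entropy(target, feature, alpha)

# Toy example: rank two hypothetical candidate features by information gain.
y  = np.array([0, 0, 1, 1, 1, 0])
f1 = np.array([0, 0, 1, 1, 1, 0])   # perfectly informative about y
f2 = np.array([0, 1, 0, 1, 0, 1])   # nearly uninformative
print(information_gain(y, f1), information_gain(y, f2))
```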

2.2. SVM Classifier

SVM finds the optimal separating hyperplane between classes by focusing on the training cases that lie at the edges of the class distributions; the remaining training cases are effectively discarded. From training samples F = {(x_1, y_1), ..., (x_n, y_n)} in the input space, a classifier can be accurately constructed. The core of the SVM is finding a suitable kernel function k(x_i, x_j) = ϕ(x_i)·ϕ(x_j), where ϕ(·) is a nonlinear function used to transfer the nonlinear sample input space into a feature space in which the classes can be separated by hyperplanes. The decision function can be written as:
$$f(x) = w \cdot \phi(x) + b$$
where w is the weight vector, b is the threshold value, and (·) represents the inner product operation. The objective of the SVM is to determine w and b by minimizing w^T w/2 subject to the classification constraints, which can be seen below:
$$\min \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\xi_i$$
where ξ_i is the slack variable and C is the penalty parameter.
The most commonly used kernel is the Gaussian kernel, used for data conversion in SVM. The Gaussian kernel is defined as:
$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^{2}}{2\delta^{2}}\right)$$
where δ > 0 denotes the kernel width parameter, and δ controls the mapping result.
The strategy of reducing multi-class problems to a set of dichotomies enables support vector machines to be applied to multi-class data; alternatively, all classes can be considered at once to obtain a multi-class support vector machine. One way to do this is to solve a single optimization problem, similar to the "one-against-all" approach on a fundamental basis. There are n decision functions (hyperplanes), and the problems can be converted into one problem as:
$$\min \frac{1}{2}\sum_{i=1}^{n} w_i^{T} w_i + C\sum_{j=1}^{m}\sum_{i \neq y_j}\xi_j^{i} \quad \text{s.t.} \quad w_{y_j}^{T}\varphi(x_j) + b_{y_j} \ge w_i^{T}\varphi(x_j) + b_i + 2 - \xi_j^{i}, \; i \neq y_j$$
where ξ_j^i ≥ 0. The resulting decision function can be represented as:
$$\arg\max_{i}\left(w_i^{T}\varphi(x) + b_i\right)$$
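For concreteness, a minimal scikit-learn sketch of an RBF-kernel SVM is given below. The paper's experiments were implemented in MATLAB, so this is only an illustrative stand-in; the dataset, the penalty value C, and the gamma derived from an assumed kernel width δ are placeholders, not the settings used in the paper.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# C is the penalty parameter from the soft-margin objective above, and the RBF
# "gamma" corresponds to 1 / (2 * delta^2) for the kernel width delta; both
# values here are arbitrary placeholders rather than tuned settings.
delta = 1.0
clf = SVC(kernel="rbf", C=1.0, gamma=1.0 / (2.0 * delta ** 2),
          decision_function_shape="ovr")
print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```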

2.3. Fitness Function Design

The main indexes influencing feature selection (FS) are the classification error and the number of features, so balancing the number of features against the classification performance is the essential key of the FS problem. Here, f_1 is the Normalized Mutual Information (NMI) [13], which can be formulated as follows:
$$f_1(x) = NMI(X, S) = \frac{MI(X;S)}{[G(X) + G(S)]/2}$$
where X is the set of clusters and S is the set of classes. MI is the mutual information between X and S [35], which can be defined as follows:
$$MI(X;S) = \sum_{k}\sum_{j} P(X_k \cap S_j)\log\frac{P(X_k \cap S_j)}{P(X_k)P(S_j)} = \sum_{k}\sum_{j}\frac{|X_k \cap S_j|}{N}\log\frac{N\,|X_k \cap S_j|}{|X_k|\,|S_j|}$$
where P(X_k), P(S_j), and P(X_k ∩ S_j) are the probabilities of X_k, S_j, and X_k ∩ S_j, respectively, and N is the total number of samples. G(X) comes from the maximum likelihood estimation of the probabilities:
$$G(X) = -\sum_{k} P(X_k)\log P(X_k) = -\sum_{k}\frac{|X_k|}{N}\log\frac{|X_k|}{N}$$
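A small Python sketch of the NMI term f_1, computed directly from the formulas above for two discrete label vectors, is shown below; natural logarithms are assumed and the toy labels are purely illustrative.

```python
import numpy as np

def entropy(labels):
    """G(.) above: Shannon entropy from maximum-likelihood probability estimates."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_information(x, s):
    """MI(X; S) above, for two discrete label vectors of equal length."""
    mi = 0.0
    for xv in np.unique(x):
        for sv in np.unique(s):
            joint = np.mean((x == xv) & (s == sv))
            if joint > 0:
                mi += joint * np.log(joint / (np.mean(x == xv) * np.mean(s == sv)))
    return mi

def nmi(x, s):
    """f1 above: mutual information normalised by the mean of the two entropies."""
    return mutual_information(x, s) / ((entropy(x) + entropy(s)) / 2.0)

clusters = np.array([0, 0, 1, 1, 2, 2])   # toy cluster assignments
classes  = np.array([0, 0, 1, 1, 1, 1])   # toy ground-truth classes
print(nmi(clusters, classes))
```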

3. Enhance Teaching-Learning-Based Optimization (ETLBO)

In this section, we introduce the proposed method in detail. Firstly, we introduce the TLBO and the strategies used in the proposed method; then, the ETLBO is introduced. Finally, the flowchart of the proposed method is described.

3.1. Teacher Phase

The teacher phase is the first part of the algorithm, in which the learner with the highest marks acts as the teacher, and the teacher's task is to increase the mean marks of the class. The update process of the i-th learner in the teacher phase is formulated as:
$$X_{i,new} = X_i + rand \times (X_{teacher} - T_F \times X_{ave})$$
where X_i is the solution of the i-th learner, X_teacher represents the teacher's solution, X_ave is the average of all learners, rand is a random number in (0, 1), and T_F is the teaching factor that decides the value of the mean to be changed. Its value can be either 1 or 2, which is again a heuristic step decided randomly with equal probability as T_F = round[1 + rand(0,1){2 − 1}].
In addition, the new solution X_{i,new} is accepted only if it is better than the previous solution; this can be formulated as:
$$X_i = \begin{cases} X_{i,new} & f(X_{i,new}) > f(X_i) \\ X_i & \text{otherwise} \end{cases}$$
where, f means the fitness function.

3.2. Learner Phase

The second part of the algorithm is where a learner updates its knowledge through interaction with other learners. In each iteration, two learners X_m and X_n interact, and the learner with more knowledge improves the marks of the other. In the learner phase, one learner learns new things if the other learner has more knowledge than itself. The phenomenon is described as follows:
$$X_{m,new} = \begin{cases} X_m + rand \times (X_m - X_n) & f(X_m) > f(X_n) \\ X_m + rand \times (X_n - X_m) & f(X_n) > f(X_m) \end{cases}$$
The temporary solution is accepted only if it is better than the previous solution; it can be formulated as:
$$X_m = \begin{cases} X_{m,new} & f(X_{m,new}) > f(X_m) \\ X_m & \text{otherwise} \end{cases}$$
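Putting the two phases together, the following sketch performs one TLBO generation for a real-valued population. It follows the update and greedy-acceptance rules above under the assumption that a larger fitness value is better; boundary handling and the stopping criterion are omitted, and it is not the MATLAB implementation used in the experiments.

```python
import numpy as np

def tlbo_generation(pop, fitness, f):
    """One TLBO generation (teacher phase, then learner phase) for a real-valued
    population `pop` of shape (n_learners, dim); higher fitness is assumed better."""
    n, dim = pop.shape
    # Teacher phase: the best learner teaches; learners move toward the teacher
    # and away from T_F times the class mean.
    teacher = pop[np.argmax(fitness)]
    mean = pop.mean(axis=0)
    for i in range(n):
        tf = np.random.randint(1, 3)  # teaching factor, 1 or 2 with equal probability
        candidate = pop[i] + np.random.rand(dim) * (teacher - tf * mean)
        f_cand = f(candidate)
        if f_cand > fitness[i]:       # greedy acceptance
            pop[i], fitness[i] = candidate, f_cand
    # Learner phase: each learner interacts with a random peer and moves toward
    # the better of the two.
    for m in range(n):
        k = np.random.choice([j for j in range(n) if j != m])
        if fitness[m] > fitness[k]:
            candidate = pop[m] + np.random.rand(dim) * (pop[m] - pop[k])
        else:
            candidate = pop[m] + np.random.rand(dim) * (pop[k] - pop[m])
        f_cand = f(candidate)
        if f_cand > fitness[m]:
            pop[m], fitness[m] = candidate, f_cand
    return pop, fitness
```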

3.3. Adaptive Weight Strategy

The adaptive weight strategy makes it easier to jump out of local minima, facilitating global optimization. When the TLBO solves complex optimization functions, the algorithm easily falls into local optima, whereas a smaller inertia factor is beneficial for a precise local search of the current search domain. We therefore design a new adaptive weight t, which can be written as follows:
$$t = \left(1 - \frac{iter}{Max\_iter}\right)^{1 - \sin\left(\frac{\pi \cdot iter}{Max\_iter}\right)}$$
where iter is the current iteration number and Max_iter is the maximum number of iterations.

3.4. Kent Chaotic Map (KCM)

Chaotic mapping is a kind of nonlinear mapping that can generate a random number sequence. It is sensitive to initial values, which ensures that it can generate uncorrelated sequences. There are many kinds of chaotic maps, such as the Logistic map, the Kent map, etc. In this paper, we use the Kent map as the improvement strategy. The formula of the Kent map is as follows:
$$f(x) = \begin{cases} \dfrac{x}{a} & 0 < x \le a \\ \dfrac{1-x}{1-a} & a < x < 1 \end{cases}$$
where a is a control parameter and x(0) is the initial value of the sequence. In this paper, a = 0.5.
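A tiny sketch of the Kent map as a chaotic sequence generator (with a = 0.5 as in the paper) is given below. How exactly the sequence seeds or perturbs the teacher is an implementation detail the paper leaves open, so the usage suggested in the comment is only an assumption.

```python
def kent_map_sequence(x0, length, a=0.5):
    """Generate a chaotic sequence from the Kent map; x0 should lie in (0, 1), x0 != a."""
    seq, x = [], x0
    for _ in range(length):
        x = x / a if x <= a else (1.0 - x) / (1.0 - a)
        seq.append(x)
    return seq

# e.g. numbers that could diversify the teacher's initial state in the ETLBO
print(kent_map_sequence(0.26, 5))   # ≈ [0.52, 0.96, 0.08, 0.16, 0.32]
```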

3.5. Proposed Method

There are two phases in the basic TLBO search process that update an individual's position. In the teacher phase, we use the Kent chaotic map to improve the initial state of the teacher; the teacher can thus be endowed with different abilities to teach different students, and this strategy allows the abilities of different teachers to be demonstrated. In the learner phase, we design a learning-efficiency weight to improve the students' learning state. The adaptive weight adjusts itself as the iterations increase: the students learn more knowledge in the early phase of the search, while at the end of the search the students have already obtained enough knowledge and the adaptive weight becomes small, so the students learn different amounts of knowledge in different phases. The modified learner-phase update can be represented as follows:
$$X_{m,new} = \begin{cases} X_m \times t + rand \times (X_m - X_n) & f(X_m) > f(X_n) \\ X_m \times t + rand \times (X_n - X_m) & f(X_n) > f(X_m) \end{cases}$$
where, t is the adaptive weight.
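A sketch of this modified learner phase is shown below. Note that the adaptive weight formula above had to be reconstructed from a garbled equation, so the exact expression in adaptive_weight is an assumption; the intent, a weight that starts near 1 and shrinks toward 0 as the search proceeds, follows the description in the text.

```python
import numpy as np

def adaptive_weight(it, max_it):
    """Adaptive weight t: close to 1 early in the search and shrinking toward 0
    near the end (exact expression reconstructed, see the note above)."""
    r = it / max_it
    return (1.0 - r) ** (1.0 - np.sin(np.pi * r))

def etlbo_learner_phase(pop, fitness, f, it, max_it):
    """Learner phase with the adaptive weight t applied to the current learner X_m."""
    t = adaptive_weight(it, max_it)
    n, dim = pop.shape
    for m in range(n):
        k = np.random.choice([j for j in range(n) if j != m])
        if fitness[m] > fitness[k]:
            candidate = pop[m] * t + np.random.rand(dim) * (pop[m] - pop[k])
        else:
            candidate = pop[m] * t + np.random.rand(dim) * (pop[k] - pop[m])
        f_cand = f(candidate)
        if f_cand > fitness[m]:      # keep the move only if it improves the fitness
            pop[m], fitness[m] = candidate, f_cand
    return pop, fitness
```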
The proposed classification method can be divided into two parts: feature selection and the parameter selection of the SVM. At first, the Tsallis entropy of the target is calculated using Equation (1). Then the entropy of each feature concerning the target is calculated and subtracted from the target’s entropy using Equation (2). In this process, the selected feature x and the parameter α of Tsallis entropy are optimized by the ETLBO. The parameter α can decide the ability of the Tsallis entropy.
In the second part, we use the ETLBO to optimize the parameter c of the SVM. The penalty coefficient c is the compromise between the smoothness of the fitting function and the classification accuracy. When c is too large, the training accuracy is high but the generalization ability is poor; when c is too small, errors increase. Therefore, a reasonable selection of the parameter c can obviously improve the model's classification accuracy and generalization ability.
Finally, the selected feature x, the parameter α of Tsallis entropy, and the parameter c of the SVM are optimized by the ETLBO. We use the parameters optimized by the ETLBO together with the SVM to classify the test dataset, and the SVM classifier outputs the classification result. The flowchart of the proposed method is shown in Figure 1.
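To make the overall loop concrete, the sketch below shows how one ETLBO candidate solution could be decoded and scored. The encoding (per-feature scores thresholded at 0.5, followed by α and c) and the use of plain cross-validated SVM accuracy as a surrogate fitness are assumptions for illustration; the paper's fitness design (Section 2.3) is based on the NMI term f_1 and the number of features, and α would drive the Tsallis-entropy ranking used to select features.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def evaluate_candidate(solution, X, y):
    """Decode one candidate vector and return a surrogate fitness (higher is better).
    Assumed layout: solution = [feature scores ..., alpha, c]."""
    solution = np.asarray(solution, dtype=float)
    n_feat = X.shape[1]
    mask = solution[:n_feat] > 0.5       # thresholded scores give the selected subset x
    alpha = solution[n_feat]             # Tsallis parameter (would drive the TEFS ranking)
    c = max(solution[n_feat + 1], 1e-3)  # SVM penalty parameter, kept strictly positive
    if not mask.any():
        return 0.0                       # an empty feature subset cannot classify
    clf = SVC(kernel="rbf", C=c)
    return cross_val_score(clf, X[:, mask], y, cv=5).mean()
```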

4. Experiment and Result

To analyze the effectiveness of the proposed method, six optimization algorithms are used for comparison: PSO [14], WOA [21], HHO [36], TLBO [31], HSOA [37], and HTLBO [34]. The PSO, WOA, HHO, and TLBO are original optimization algorithms. These algorithms have a strong ability to find the optimal values of mathematical functions, but when they are applied to engineering problems their optimization performance is not always good, so many scholars have studied strategies to improve them. The HSOA and HTLBO are improved methods; these two algorithms use hybridization to enhance the optimization ability of the SOA and TLBO, and they show excellent performance on the problems described in [34,37]. However, these algorithms may not solve all problems. Therefore, we select these algorithms as comparison algorithms to test the performance of the proposed method.
The parameter settings of the compared algorithms are the same as in their original references. All the methods are coded and implemented in MATLAB 2018b. To keep the comparison fair, each algorithm runs 30 times independently, with a population size of 30 and a maximum of 500 iterations. Experiments are run on a computer with an i7-11800H central processing unit.
The results of the proposed method are described in this section. First, the fitness values obtained by the different optimization algorithms are compared to show the performances of these approaches. Then, we analyze the classification result of the compared algorithms. Finally, the discussion of the proposed method is described.

4.1. Datasets and Evaluation Index

The benchmark datasets used in the evaluations are introduced here. We select 16 standard datasets from the University of California Irvine (UCI) data repository [38]. Table 1 records the primary information of these selected datasets.
To evaluate the classification results, we use the F-score, the classification accuracy, and the CPU time as the metric indexes.
The F-score is defined as follows:
$$F\text{-}score = \frac{(1+\beta^{2}) \cdot Precision \cdot Recall}{\beta^{2} \cdot Precision + Recall}$$
$$Precision = \frac{T_p}{T_p + F_p}$$
$$Recall = \frac{T_p}{T_p + F_n}$$
where T_p is the number of true positives, F_p is the number of false positives, T_n is the number of true negatives, and F_n is the number of false negatives.
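As a quick worked example of these definitions (with β = 1, the usual F1 setting, and made-up counts):

```python
def f_score(tp, fp, fn, beta=1.0):
    """Precision, recall and F-score as defined above (beta = 1 gives the usual F1)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, f

print(f_score(tp=45, fp=5, fn=10))   # precision 0.900, recall ≈ 0.818, F1 ≈ 0.857
```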

4.2. Experiment 1: Feature Selection

Table 2 shows the fitness values of the compared algorithms. The table shows that when the number of features is small, all of the compared algorithms can reduce the number of features, whereas when the number of features increases, it poses a huge challenge for the optimization algorithms. The ETLBO obtains better performance than the compared algorithms. Table 3 shows the standard deviation (std) of the fitness values; it can be seen from the table that the ETLBO has strong robustness.
Table 4 shows the average number of selected attributes. The compared algorithms can all reduce the number of features. When the number of attributes is small, the compared algorithms obtain the same result, but when the number of attributes is large, the ETLBO selects the fewest attributes among the compared algorithms. In terms of the total number of attributes over all datasets, the ETLBO also selects the fewest. This means that the ETLBO can reduce the number of features; however, reducing the number of features does not by itself mean that the classification accuracy is high.
Table 5 shows the parameters obtained by the ETLBO. It can be seen from the table that the ETLBO obtains different values for the different datasets. The ETLBO not only reduces the number of features but also acquires the parameter α of Tsallis entropy and the parameter c of the SVM. We test the classification performance of the compared algorithms in the next section.

4.3. Experiment 2: Classification

Table 6 shows the classification accuracy of the compared algorithms, and Table 7 shows the F-scores of the compared methods. The results show that the ETLBO is better than the original TLBO, so the strategies improve the optimization ability of the TLBO. At the same time, the HSOA and ETLBO are better than the other algorithms, which means that the strategies significantly boost the original optimization algorithms. In terms of their F-score results, the methods can be ordered as follows: ETLBO > HTLBO > HSOA > HHO > PSO > WOA > TLBO. To sum up, the ETLBO obtains high F-score values.
In summary, the ETLBO obtains the best results among the compared algorithms: it not only reduces the number of features but also obtains high classification accuracy. Table 8 shows the std of the classification accuracy. The ETLBO is more stable than the other algorithms, so the proposed method has strong robustness in finishing the classification task.
A statistical test is an essential and vital measure to evaluate and prove the performance of the tested methods. Parametric statistical tests are based on various assumptions, so this section uses a well-known non-parametric statistical test, Wilcoxon's rank-sum test [39]. Table 9 shows the results of the Wilcoxon rank-sum test; it can be seen that the ETLBO is significantly different from the other methods.
The CPU time is also an important index for practical engineering problems. The CPU times of the compared algorithms can be seen in Table 10. The CPU time ordering of the algorithms is: TLBO < PSO < WOA < HHO < ETLBO < HTLBO < HSOA. Although the ETLBO costs considerable CPU time, its classification accuracy is good. At the same time, the ETLBO uses less CPU time than the HSOA, which means that the strategies have good adaptive effectiveness for the TLBO and enhance it at a lower CPU cost than that improved method.

4.4. Experiment 3: Compared with Different Classifiers

In this section, we compare the proposed method with different classifiers. The compared classifiers are K-Nearest Neighbor (KNN), the original SVM, and Random Forest (RF) [40]. Table 11 shows the configuration parameters and characteristics of the classifier models.
Table 12 demonstrates the evaluation indexes of the compared algorithms. The ETLBO obtains the best results among the compared classifiers on all indexes, outperforming KNN, SVM, and RF with improvements of 3.45%, 2.94%, and 1.62% in the F-score index, respectively. To sum up, the optimization algorithm obtains the optimal parameters of the SVM, and the resulting classification accuracy is higher than that of the other compared classifiers.

5. Discussion

The proposed method has the optimization ability to solve the Tsallis-entropy-based feature selection problem in the feature selection domain. The ETLBO selects suitable parameters for the Tsallis entropy, and at the same time the proposed method successfully reduces the number of features. Optimization algorithms have a robust optimization ability; however, they do not adapt automatically to different optimization problems, so adaptive strategies are very effective for improving them.
The proposed method obtains better classification accuracy than the compared algorithms in the classification field. The proposed method finds a proper penalty parameter c for the SVM classifier, and it has higher classification accuracy and stronger robustness than the compared algorithms. At the same time, the proposed method is better than the other compared classifiers. Therefore, the ETLBO algorithm can be used in the classification task field.
The proposed method’s limitation is that the optimization algorithm needs iteration to find the optimal solution, which is time-consuming. Improving the optimization capability and reducing the number of iterations can solve this problem. Therefore, it is necessary to search for powerful optimization algorithms and new strategies in future work.

6. Conclusions

In this paper, an enhanced teaching-learning-based optimization is proposed. The adaptive weight strategy and the Kent chaotic map are used to enhance the TLBO. The ETLBO optimizes the selected feature x, the parameter α of Tsallis entropy, and the parameter c of the SVM. The experiments on UCI data show that the proposed method reduces the number of features and finds the critical features for classification. Finally, the classification accuracy of the proposed method is better than that of the compared algorithms.
In future work, we will design an effective and useful function to further reduce the number of features. We will focus on reducing the randomness of the TLBO and obtaining more stable parameters for the fitness function. At the same time, we will also test novel strategies to boost the TLBO.

Author Contributions

Conceptualization, D.W. and H.J.; methodology, D.W. and H.J.; software, D.W. and Z.X.; validation, H.J. and Z.X.; formal analysis, D.W., R.Z. and H.W.; investigation, D.W. and H.J.; writing—original draft preparation, D.W. and M.A.; writing—review and editing, D.W., L.A., M.A. and H.J.; visualization, D.W., H.W., M.A. and H.J.; funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by National fund cultivation project of Sanming University (PYS2107), the Sanming University Introduces High-level Talents to Start Scientific Research Funding Support Project (21YG01S), The 14th five year plan of Educational Science in Fujian Province (FJJKBK21-149), Bidding project for higher education research of Sanming University (SHE2101), Research project on education and teaching reform of undergraduate colleges and universities in Fujian Province (FBJG20210338), Fujian innovation strategy research joint project (2020R0135). This study was financially supported via a funding grant by Deanship of Scientific Research, Taif University Researchers Supporting Project number (TURSP-2020/300), Taif University, Taif, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ji, B.; Lu, X.; Sun, G.; Zhang, W.; Li, J.; Xiao, Y. Bio-inspired feature selection: An improved binary particle swarm optimization approach. IEEE Access 2020, 8, 85989–86002. [Google Scholar] [CrossRef]
  2. Kumar, S.; Tejani, G.G.; Pholdee, N.; Bureerat, S. Multiobjecitve structural optimization using improved heat transfer search. Knowl.-Based Syst. 2021, 219, 106811. [Google Scholar] [CrossRef]
  3. Sun, L.; Wang, L.; Ding, W.; Qian, Y.; Xu, J. Feature selection using fuzzy neighborhood entropy-based uncertainty measures for fuzzy neighborhood multigranulation rough sets. IEEE Trans. Fuzzy Syst. 2020, 99, 1–14. [Google Scholar] [CrossRef]
  4. Zhao, J.; Liang, J.; Dong, Z.; Tang, D.; Liu, Z. NEC: A nested equivalence class-based dependency calculation approach for fast feature selection using rough set theory. Inform. Sci. 2020, 536, 431–453. [Google Scholar] [CrossRef]
  5. Liu, H.; Zhao, Z. Manipulating data and dimension reduction methods: Feature selection. In Encyclopedia Complexity Systems Science; Springer: New York, NY, USA, 2009; pp. 5348–5359. [Google Scholar]
  6. Al-Tashi, Q.; Said, J.A.; Helmi, M.R.; Seyedali, M.; Hitham, A. Binary optimization using hybrid grey wolf optimization for feature selection. IEEE Access 2019, 7, 39496–39508. [Google Scholar] [CrossRef]
  7. Homayoun, H.; Mahdi, J.; Xinghuo, Y. An opinion formation based binary optimization approach for feature selection. Phys. A Stat. Mech. Its Appl. 2018, 491, 142–152. [Google Scholar]
  8. Mafarja, M.M.; Mirjalili, S. Hybrid whale optimization algorithm with simulated annealing for feature selection. Neurocomputing 2017, 260, 302–312. [Google Scholar] [CrossRef]
  9. Aljarah, L.; Ai-zoubl, A.M.; Faris, H.; Hassonah, M.A.; Mirjalili, S.; Saadeh, H. Simultaneous feature selection and support vector machine optimization using the grasshopper optimization algorithm. Cogn. Comput. 2018, 2, 1–18. [Google Scholar] [CrossRef] [Green Version]
  10. Lin, S.; Ying, K.; Chen, S.; Lee, Z. Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst. Appl. 2008, 35, 1817–1824. [Google Scholar] [CrossRef]
  11. Sherpa, S.R.; Wolfe, D.W.; Van Es, H.M. Sampling and data analysis optimization for estimating soil organic carbon stocks in agroecosystems. Soil Sci. Soc. Am. J. 2016, 80, 1377. [Google Scholar] [CrossRef]
  12. Lee, H.M.; Yoo, D.G.; Sadollah, A.; Kim, J.H. Optimal cost design of water distribution networks using a decomposition approach. Eng. Optim. 2016, 48, 16. [Google Scholar] [CrossRef]
  13. Roberge, V.; Tarbouchi, M.; Okou, F. Strategies to accelerate harmonic minimization in multilevel inverters using a parallel genetic algorithm on graphical processing unit. IEEE Trans. Power Electron. 2014, 29, 5087–5090. [Google Scholar] [CrossRef]
  14. Russell, E.; James, K. A new optimizer using particle swarm theory. In Proceedings of the 6th International Symposium on Micro Machine and Human Science, MHS’95, Nagoya, Japan, 4–6 October 1995; pp. 39–43. [Google Scholar]
  15. Rahnamayan, S.; Tizhoosh, H.; Salama, M. Opposition-based differential evolution. IEEE Trans. Evolut. Comput. 2008, 12, 64–79. [Google Scholar] [CrossRef] [Green Version]
  16. Liao, T.; Socha, K.; Marco, A.; Stutzle, T.; Dorigo, M. Ant colony optimization for mixed-variable optimization problems. IEEE Trans. Evolut. Comput. 2013, 18, 53–518. [Google Scholar] [CrossRef]
  17. Taran, S.; Bajaj, V. Sleep apnea detection using artificial bee colony optimize hermite basis functions for eeg signals. IEEE Trans. Instrum. Meas. 2019, 69, 608–616. [Google Scholar] [CrossRef]
  18. Precup, R.; David, R.; Petriu, E.M. Grey wolf optimizer algorithm-based tuning of fuzzy control systems with reduced parametric sensitivity. IEEE Trans. Ind. Electron. 2017, 64, 527–534. [Google Scholar] [CrossRef]
  19. Hatata, A.Y.; Lafi, A. Ant lion optimizer for optimal coordination of doc relays in distribution systems containing dgs. IEEE Access 2018, 6, 72241–72252. [Google Scholar] [CrossRef]
  20. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015, 89, 228–249. [Google Scholar] [CrossRef]
  21. Mohamed, A.; Ahmed, A.; Aboul, E. Whale optimization algorithm and moth-flame optimization for multilevel thresholding image segmentation. Expert Syst. Appl. 2017, 83, 51–67. [Google Scholar]
  22. Sang, H.; Pan, Q.; Li, J.; Wang, P.; Han, Y.; Gao, K.; Duan, P. Effective invasive weed optimization algorithms for distributed assembly permutation flowshop problem with total flowtime criterion. Swarm Evolut. Comput. 2019, 444, 64–73. [Google Scholar] [CrossRef]
  23. Zhou, Y.; Luo, Q.; Chen, H.; He, A.; Wu, J. A discrete invasive weed optimization algorithm for solving traveling salesman problem. Neurocomputing 2015, 151, 1227–1236. [Google Scholar] [CrossRef]
  24. Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evolut. Comput. 1997, 1, 67–82. [Google Scholar] [CrossRef] [Green Version]
  25. Zhang, Y.; Liu, X.; Bao, F.; Chi, J.; Zhang, C.; Liu, P. Particle swarm optimization with adaptive learning strategy. Knowl.-Based Syst. 2020, 196, 105789. [Google Scholar] [CrossRef]
  26. Dong, Z.; Wang, X.; Tang, L. Moea/d with a self-adaptive weight vector adjustment strategy based on chain segmentation. Inform. Sci. 2020, 521, 209–230. [Google Scholar] [CrossRef]
  27. Li, E.; Chen, R. Multi-objective decomposition optimization algorithm based on adaptive weight vector and matching strategy. Appl. Intell. 2020, 6, 1–17. [Google Scholar] [CrossRef]
  28. Feng, J.; Zhang, J.; Zhu, X.; Lian, W. A novel chaos optimization algorithm. Multimedia Tools Appl. 2016, 76, 1–32. [Google Scholar] [CrossRef]
  29. Xu, C.B.; Yang, R. Parameter estimation for chaotic systems using improved bird swarm algorithm. Mod. Phys. Lett. B 2017, 1, 1750346. [Google Scholar] [CrossRef]
  30. Tran, N.T.; Dao, T.-P.; Nguyen-Trang, T.; Ha, C.-N. Prediction of Fatigue Life for a New 2-DOF Compliant Mechanism by Clustering-Based ANFIS Approach. Math. Probl. Eng. 2021, 2021, 1–14. [Google Scholar] [CrossRef]
  31. Rao, R.V.; Savsani, V.J.; Vakharia, D.P. Teaching learning based optimization: A novel method for constrained mechanical design optimization problems. Comput. Aided Des. 2011, 43, 303–315. [Google Scholar] [CrossRef]
  32. Gunji, A.B.; Deepak, B.; Bahubalendruni, C.; Biswal, D. An optimal robotic assembly sequence planning by assembly subsets detection method using teaching learning-based optimization algorithm. IEEE Trans. Autom. Sci. Eng. 2018, 1, 1–17. [Google Scholar] [CrossRef]
  33. Zhang, H.; Gao, Z.; Ma, X.; Jie, Z.; Zhang, J. Hybridizing teaching-learning-based optimization with adaptive grasshopper optimization algorithm for abrupt motion tracking. IEEE Access 2019, 7, 168575–168592. [Google Scholar] [CrossRef]
  34. Ho, N.L.; Dao, T.-P.; Le Chau, N.; Huang, S.-C. Multi-objective optimization design of a compliant microgripper based on hybrid teaching learning-based optimization algorithm. Microsyst. Technol. 2018, 25, 2067–2083. [Google Scholar] [CrossRef]
  35. Estevez, P.A.; Tesmer, M.; Perez, C.; Zurada, J. Normalized mutual information feature selection. IEEE Trans. Neural Netw. 2009, 20, 189–201. [Google Scholar] [CrossRef] [Green Version]
  36. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872. [Google Scholar] [CrossRef]
  37. Jia, H.; Xing, Z.; Song, W. A new hybrid seagull optimization algorithm for feature selection. IEEE Access 2019, 12, 49614–49631. [Google Scholar] [CrossRef]
  38. Newman, D.J.; Hettich, S.; Blake, C.L.; Merz, C.J. UCI Repository of Machine Learning Databases. Available online: http://www.ics.uci.edu/~mlearn/MLRepository.html (accessed on 1 June 2016).
  39. Derrac, J.S.; Garcia, D.; Molina, F.; Herrera, A. Practical tutorial on the use of non-parametric statistical test as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evolut. Comput. 2011, 1, 13–18. [Google Scholar] [CrossRef]
  40. Machaka, R. Machine learning-based prediction of phases in high-entropy alloys. Comput. Mater. Sci. 2020, 188, 110244. [Google Scholar] [CrossRef]
Figure 1. The flowchart of the proposed method.
Table 1. The datasets used in the experiments.
No.  Dataset  Samples  Features
1  Iris  150  4
2  Wine  178  13
3  Sonar  208  60
4  Vehicle  846  18
5  Balancescale  625  4
6  CMC  1473  9
7  Cancer  683  9
8  Vowel  871  3
9  Thyroid  215  5
10  WDBC  569  30
11  HeartEW  270  13
12  Lymphography  148  18
13  SonarEW  208  60
14  IonosphereEW  351  34
15  Vote  300  16
16  WaveformEW  5000  40
Table 2. The fitness values of compared algorithms.
Dataset  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris  0.0271  0.0271  0.0271  0.0271  0.0271  0.0271  0.0271
Wine  0.1846  0.1657  0.1594  0.1709  0.1585  0.1576  0.1521
Sonar  0.2655  0.2147  0.2356  0.2386  0.2268  0.2258  0.2145
Vehicle  0.2972  0.2575  0.2746  0.2528  0.2432  0.2741  0.2716
Balancescale  0.0619  0.0519  0.0540  0.0613  0.0521  0.0556  0.0516
CMC  0.1937  0.1855  0.1837  0.1777  0.1701  0.1758  0.1621
Cancer  0.0629  0.0710  0.0660  0.0645  0.0634  0.0645  0.0612
Vowel  0.1251  0.1251  0.1251  0.1251  0.1251  0.1251  0.1251
Thyroid  0.0934  0.0853  0.0995  0.0922  0.0925  0.0921  0.0851
WDBC  0.3828  0.3155  0.3678  0.3258  0.3234  0.3221  0.3154
HeartEW  0.2942  0.2993  0.3325  0.2869  0.3077  0.2989  0.2814
Lymphography  0.2208  0.2040  0.2127  0.1965  0.1999  0.1989  0.1951
SonarEW  0.3650  0.4065  0.3892  0.3721  0.3610  0.3589  0.3514
IonosphereEW  0.3390  0.3425  0.3718  0.3339  0.3426  0.3411  0.3226
Vote  0.2274  0.2567  0.2460  0.2542  0.2215  0.2218  0.2158
WaveformEW  0.4348  0.4151  0.3998  0.4436  0.4278  0.4025  0.3915
Table 3. The std of fitness values.
Dataset  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris4.56 × 10−44.69 × 10−45.40 × 10−44.60 × 10−44.86 × 10−44.88 × 10−44.50 × 10−4
Wine6.84 × 10−46.53 × 10−46.07 × 10−46.22 × 10−45.79 × 10−45.82 × 10−45.60 × 10−4
Sonar1.27 × 10−41.12 × 10−51.00 × 10−51.08 × 10−51.10 × 10−51.11 × 10−51.00 × 10−5
Vehicle6.54 × 10−46.16 × 10−46.37 × 10−46.95 × 10−46.13 × 10−46.15 × 10−46.10 × 10−4
Balancescale4.25 × 10−44.49 × 10−44.50 × 10−44.89 × 10−44.17 × 10−44.20 × 10−44.10 × 10−4
CMC4.13 × 10−43.47 × 10−43.96 × 10−43.56 × 10−43.40 × 10−43.41 × 10−43.30 × 10−4
Cancer1.15 × 10−31.13 × 10−31.03 × 10−31.03 × 10−31.02 × 10−31.02 × 10−49.60 × 10−4
Vowel7.80 × 10−47.68 × 10−48.44 × 10−47.94 × 10−47.74 × 10−47.78 × 10−47.50 × 10−4
Thyroid6.82 × 10−46.79 × 10−47.33 × 10−47.22 × 10−47.26 × 10−47.26 × 10−46.70 × 10−4
WDBC1.05 × 10−31.01 × 10−31.02 × 10−31.01 × 10−39.74 × 10−49.78 × 10−49.40 × 10−4
HeartEW9.67 × 10−49.34 × 10−49.08 × 10−41.00 × 10−38.91 × 10−48.93 × 10−48.80 × 10−4
Lymphography8.62 × 10−47.11 × 10−48.00 × 10−48.13 × 10−47.44 × 10−47.47 × 10−46.90 × 10−4
SonarEW4.54 × 10−43.81 × 10−43.91 × 10−44.50 × 10−44.00 × 10−44.02 × 10−43.80 × 10−4
IonosphereEW9.56 × 10−48.85 × 10−48.22 × 10−47.72 × 10−47.82 × 10−47.88 × 10−47.40 × 10−4
Vote8.81 × 10−49.30 × 10−49.87 × 10−49.64 × 10−49.23 × 10−49.25 × 10−18.50 × 10−4
WaveformEW4.34 × 10−44.51 × 10−44.28 × 10−44.34 × 10−43.92 × 10−43.93 × 10−43.90 × 10−4
Table 4. The average number of selected attributes.
Dataset  Attributes  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris  4  3  3  3  3  3  3  3
Wine  13  6  6  5  10  6  6  5
Sonar  60  31  32  32  48  29  29  28
Vehicle  18  4  4  4  4  4  4  4
Balancescale  4  4  3  4  4  4  3  3
CMC  9  7  6  6  8  7  8  6
Cancer  9  5  6  5  7  6  6  5
Vowel  3  3  3  3  3  3  3  3
Thyroid  5  4  4  3  4  4  4  3
WDBC  30  9  8  7  10  7  8  6
HeartEW  13  5  5  4  6  5  5  4
Lymphography  18  6  7  7  6  8  6  5
SonarEW  60  19  18  19  22  15  13  12
IonosphereEW  34  25  23  20  17  15  14  12
Vote  16  8  7  9  8  6  6  6
WaveformEW  40  21  26  24  23  20  18  16
Total  336  160  161  155  183  142  136  121
Table 5. The parameter obtained by ETLBO.
Dataset  α  c
Iris  0.52  1.14
Wine  0.64  1.20
Sonar  0.80  1.24
Vehicle  0.22  1.83
Balancescale  0.77  1.17
CMC  0.26  1.22
Cancer  0.15  0.65
Vowel  0.07  1.82
Thyroid  0.61  1.01
WDBC  0.81  1.17
HeartEW  0.78  0.95
Lymphography  0.47  1.64
SonarEW  0.08  1.82
IonosphereEW  0.55  0.99
Vote  0.73  1.58
WaveformEW  0.55  0.73
Total  0.73  1.83
Table 6. The classification accuracy of compared algorithms.
Dataset  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris0.9545 0.9481 0.9320 0.9167 0.9548 0.9546 0.9579
Wine0.9413 0.9303 0.9232 0.9369 0.9411 0.9479 0.9488
Sonar0.9160 0.9375 0.9242 0.9025 0.9347 0.9359 0.9435
Vehicle0.9012 0.9306 0.9147 0.9358 0.9345 0.9339 0.9414
Balancescale0.9338 0.9241 0.9308 0.9287 0.9450 0.9442 0.9465
CMC0.9440 0.9271 0.9259 0.9253 0.9387 0.9371 0.9454
Cancer0.9060 0.9180 0.9403 0.9356 0.9349 0.9408 0.9438
Vowel0.9600 0.9477 0.9429 0.9559 0.9628 0.9581 0.9651
Thyroid0.9031 0.9249 0.9048 0.8849 0.9202 0.9282 0.9285
WDBC0.9329 0.9351 0.9177 0.9057 0.9345 0.9358 0.9385
HeartEW0.9230 0.9118 0.9143 0.8995 0.9323 0.9271 0.9358
Lymphography0.9022 0.9256 0.9001 0.9141 0.9210 0.9180 0.9268
SonarEW0.9360 0.9115 0.9116 0.9164 0.9336 0.9357 0.9361
IonosphereEW0.9361 0.9448 0.9276 0.9233 0.9384 0.9441 0.9458
Vote0.9136 0.9170 0.9217 0.9008 0.9381 0.9378 0.9451
WaveformEW0.9276 0.9354 0.9229 0.9029 0.9292 0.9284 0.9365
Table 7. The f-score of compared algorithms.
Dataset  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris  0.9362  0.9038  0.9193  0.9007  0.9262  0.9334  0.9498
Wine  0.9118  0.9116  0.9175  0.9219  0.9194  0.9288  0.9433
Sonar  0.9043  0.9130  0.9206  0.8754  0.9223  0.9091  0.9359
Vehicle  0.8782  0.8895  0.9038  0.9142  0.9180  0.9164  0.9387
Balancescale  0.9169  0.9055  0.9139  0.9139  0.9309  0.9276  0.9422
CMC  0.9295  0.9000  0.8988  0.8977  0.9217  0.9109  0.9429
Cancer  0.8861  0.9119  0.8945  0.9104  0.9133  0.9223  0.9396
Vowel  0.9399  0.9140  0.9336  0.9345  0.9464  0.9506  0.9601
Thyroid  0.8812  0.8759  0.8954  0.8585  0.9080  0.9040  0.9211
WDBC  0.9102  0.9039  0.9071  0.8913  0.9128  0.9143  0.9329
HeartEW  0.8936  0.8887  0.8962  0.8892  0.9045  0.9186  0.9275
Lymphography  0.8894  0.8725  0.9117  0.8891  0.8883  0.9109  0.9184
SonarEW  0.9199  0.8988  0.8970  0.8888  0.9066  0.9092  0.9360
IonosphereEW  0.9171  0.9088  0.9212  0.9090  0.9304  0.9168  0.9411
Vote  0.8890  0.9108  0.9057  0.8776  0.9225  0.9204  0.9449
WaveformEW  0.9115  0.8948  0.9106  0.8888  0.9089  0.9102  0.9346
Table 8. The std of classification accuracy.
Dataset  PSO  WOA  HHO  TLBO  HSOA  HTLBO  ETLBO
Iris4.57 × 10−54.69 × 10−55.45 × 10−54.63 × 10−54.90 × 10−54.60 × 10−54.54 × 10−5
Wine6.87 × 10−56.56 × 10−56.08 × 10−56.22 × 10−55.84 × 10−56.90 × 10−55.65 × 10−5
Sonar1.28 × 10−61.12 × 10−51.01 × 10−61.08 × 10−51.10 × 10−61.28 × 10−61.00 × 10−5
Vehicle6.60 × 10−56.19 × 10−56.39 × 10−56.98 × 10−56.14 × 10−56.63 × 10−56.12 × 10−5
Balancescale4.28 × 10−54.50 × 10−54.52 × 10−54.92 × 10−54.18 × 10−54.31 × 10−54.13 × 10−5
CMC4.14 × 10−53.50 × 10−53.97 × 10−53.58 × 10−53.41 × 10−54.18 × 10−53.32 × 10−5
Cancer1.15 × 10−41.14 × 10−41.03 × 10−41.04 × 10−41.02 × 10−41.16 × 10−59.62 × 10−5
Vowel7.80 × 10−57.69 × 10−58.48 × 10−57.98 × 10−57.78 × 10−57.88 × 10−57.55 × 10−5
Thyroid6.86 × 10−56.83 × 10−57.39 × 10−57.28 × 10−57.33 × 10−56.88 × 10−56.74 × 10−5
WDBC1.06 × 10−41.01 × 10−41.02 × 10−41.02 × 10−49.84 × 10−51.06 × 10−59.42 × 10−5
HeartEW9.76 × 10−59.43 × 10−59.11 × 10−51.00 × 10−48.95 × 10−59.86 × 10−58.86 × 10−5
Lymphography8.70 × 10−57.18 × 10−58.07 × 10−58.16 × 10−57.50 × 10−58.73 × 10−56.94 × 10−5
SonarEW4.55 × 10−53.84 × 10−53.92 × 10−54.51 × 10−54.02 × 10−54.56 × 10−53.81 × 10−5
IonosphereEW9.61 × 10−58.88 × 10−58.24 × 10−57.74 × 10−57.87 × 10−59.66 × 10−57.45 × 10−5
Vote8.82 × 10−59.35 × 10−59.97 × 10−59.64 × 10−59.31 × 10−58.89 × 10−58.56 × 10−5
WaveformEW4.38 × 10−54.55 × 10−54.29 × 10−54.35 × 10−53.94 × 10−54.41 × 10−53.92 × 10−5
Table 9. Wilcoxon’s rank-sum test of classification accuracy.
Dataset  PSO (p-Value, h)  WOA (p-Value, h)  HHO (p-Value, h)  TLBO (p-Value, h)  HSOA (p-Value, h)  HTLBO (p-Value, h)
Iris<0.051<0.051<0.051<0.051<0.051<0.051
Wine<0.051<0.051<0.051<0.051<0.051<0.051
Sonar<0.051<0.051<0.051<0.051<0.051<0.051
Vehicle<0.051<0.051<0.051<0.051<0.051<0.051
Balancescale<0.051<0.051<0.051<0.051<0.051<0.051
CMC<0.051<0.051<0.051<0.051<0.051<0.051
Cancer<0.051<0.051<0.051<0.051<0.051<0.051
Vowel<0.051<0.051<0.051<0.051<0.051<0.051
Thyroid<0.051<0.051<0.051<0.051<0.051<0.051
WDBC<0.051<0.051<0.051<0.051<0.051<0.051
HeartEW<0.051<0.051<0.051<0.051<0.051<0.051
Lymphography<0.051<0.051<0.051<0.051<0.051<0.051
SonarEW<0.051<0.051<0.051<0.051<0.051<0.051
IonosphereEW<0.051<0.051<0.051<0.051<0.051<0.051
Vote<0.051<0.051<0.051<0.051<0.051<0.051
WaveformEW<0.051<0.051<0.051<0.051<0.051<0.051
Table 10. The CPU time of the compared algorithms.
Dataset  ETLBO  PSO  WOA  HHO  TLBO  HSOA  HTLBO
Iris  2.3973  1.8441  2.0285  2.1794  1.6764  2.6370  2.4128
Wine  4.1247  3.1729  3.4902  3.7498  2.8844  4.5372  4.1315
Sonar  6.6500  5.1154  5.6269  6.0454  4.6503  7.3150  6.6542
Vehicle  4.1722  3.2094  3.5303  3.7929  2.9176  4.5894  4.1765
Balancescale  2.4302  1.8694  2.0563  2.2093  1.6994  2.6732  2.4462
CMC  3.0792  2.3686  2.6055  2.7993  2.1533  3.3872  3.0803
Cancer  1.5509  1.1930  1.3123  1.4100  1.0846  1.7060  1.5669
Vowel  3.6988  2.8452  3.1298  3.3626  2.5866  4.0687  3.7183
Thyroid  3.6676  2.8212  3.1034  3.3342  2.5648  4.0344  3.6827
WDBC  6.4025  4.9250  5.4175  5.8204  4.4772  7.0427  6.4075
HeartEW  4.5164  3.4741  3.8215  4.1058  3.1583  4.9680  4.5271
Lymphography  6.4656  4.9735  5.4709  5.8778  4.5214  7.1122  6.4850
SonarEW  6.9375  5.3365  5.8702  6.3068  4.8514  7.6313  6.9569
IonosphereEW  7.5520  5.8092  6.3901  6.8654  5.2811  8.3072  7.5664
Vote  5.5987  4.3067  4.7374  5.0898  3.9152  6.1586  5.6065
WaveformEW  3.3821  2.6016  2.8618  3.0746  2.3651  3.7203  3.3848
Table 11. Configuration parameters and characteristics of the classifier models.
Classifier | Caret Method Value | R Package | Tuning Parameters | Characteristics
KNN | knn | — | k = 5 | Unique classifier. The number of neighbors is directly compared to the test data using the knn function in the Caret package.
SVM | svmRadial | e1071 | σ = 7 × 10−2; c = 1 | Radial basis function outperformed the linear SVM.
RF | rf | randomForest | mtry = 8; ntree = 150 | Overcomes the disadvantage of a simple DT by using a large number of DTs to classify by majority vote. Uses the randomForest function.
Table 12. The evaluation index of compared algorithms.
Classifier  Ac  Pc  R  F-Score
KNN  0.9275  0.9135  0.9123  0.9097
SVM  0.9241  0.9178  0.9167  0.9142
RF  0.9352  0.9189  0.9197  0.9261
ETLBO  0.9478  0.9455  0.9427  0.9411
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Wu, D.; Jia, H.; Abualigah, L.; Xing, Z.; Zheng, R.; Wang, H.; Altalhi, M. Enhance Teaching-Learning-Based Optimization for Tsallis-Entropy-Based Feature Selection Classification Approach. Processes 2022, 10, 360. https://doi.org/10.3390/pr10020360

