# Application of the Gravitational Search Algorithm for Constructing Fuzzy Classifiers of Imbalanced Data


## Abstract


## 1. Introduction

- We propose a new metric based on the weighted sum of the overall accuracy and the geometric mean of the per-class accuracies. A coefficient controls the priority given to each of the two estimates.
- We demonstrate the use of a feature selection method based on the binary gravitational search algorithm to reduce the effect of imbalance on classification. Using the new metric as the fitness function helped find subsets of features relevant to both classes.
- We present a combination of the binary and continuous algorithms for constructing fuzzy classifiers of imbalanced data. The continuous gravitational search algorithm increased the quality of classification on the selected features.
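A plausible reading of the combined metric described above is a convex combination of the two estimates; the exact formula appears later in the article, so the form below is an assumption, chosen to be consistent with the reported values γ = 0, 0.25, 0.5, 0.75, 1 (γ = 0 reduces to the overall accuracy, γ = 1 to the geometric mean):

```latex
f(\boldsymbol{\theta}, \mathbf{S}) = (1-\gamma)\,\mathrm{Acc}(\boldsymbol{\theta}, \mathbf{S})
 + \gamma\,\mathrm{GM}(\boldsymbol{\theta}, \mathbf{S}), \qquad \gamma \in [0, 1].
```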

## 2. Related Works

- Threshold metrics geared towards minimizing the number of errors, i.e., the overall accuracy, the averaged accuracy (arithmetic and geometric), the Fβ-measure, and the Kappa statistic;
- Metrics based on a probabilistic understanding of error, used to assess the reliability of classifiers, such as the mean absolute error, the mean squared error, and the cross-entropy;
- Metrics based on estimating instance separability, for example, the AUC, which for two classes is equivalent to the Mann–Whitney–Wilcoxon statistic [9].

## 3. Materials and Methods

### 3.1. The Fuzzy Classifier

#### 3.1.1. The Fuzzy Classifier Structure

The classification task is to assign a class label from the set **C** = {c_{1}, c_{2}, …, c_{l}} to each object **x**_{p} = {x_{p1}, x_{p2}, …, x_{pm}} from a set of n objects (p ∈ [1, n]), where x_{pk} is the value of the kth feature of the pth object, k ∈ [1, m], and m is the number of features. The fuzzy classifier is constructed on the basis of production rules, each of which has its own set of fuzzy terms. A fuzzy term is a structure on the definition domain of a feature that reflects the degree of membership of an object in a rule. Terms can be described by membership functions of various kinds, such as triangular, trapezoidal, bell-shaped, or Gaussian functions. In this work, we used Gaussian membership functions, which differ from the others in their symmetry. Figure 1 shows an example of partitioning an attribute x_{1} by Gaussian terms.
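A Gaussian term is determined by a center b and a width c, matching the (b, c) pairs in the parameter vector below; the exact parameterization is not reproduced in this excerpt, so the common form exp(−(x − b)²/(2c²)) is an assumption:

```python
import math

def gaussian_mf(x, b, c):
    """Gaussian membership function with center b and width c > 0.
    Assumed form exp(-(x - b)^2 / (2 c^2)); peaks at 1 when x == b
    and is symmetric about the center."""
    return math.exp(-((x - b) ** 2) / (2.0 * c ** 2))
```

The symmetry noted in the text is visible directly: the membership of b + d equals that of b − d for any offset d.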

The term parameters form the vector **θ** = (b_{11}, c_{11}, b_{12}, c_{12}, b_{13}, c_{13}, b_{21}, c_{21}, …, b_{mr}, c_{mr}).

R_{i}: If x_{1} is T_{i1} and x_{2} is T_{i2} and … and x_{m} is T_{im} then class is c_{j},

where c_{j} is the label of the jth class from the set of classes **C**, and class is the output variable.

To perform feature selection, a binary vector **S** = (s_{1}, s_{2}, …, s_{m}) must be introduced into the antecedent part. If s_{k} = 1, then the kth feature is taken into account in the classification; otherwise, the feature is ignored. Given the vector **S**, the fuzzy rule changes as follows:

R_{i}: If (s_{1} ∧ x_{1}) is T_{i1} and (s_{2} ∧ x_{2}) is T_{i2} and … and (s_{m} ∧ x_{m}) is T_{im} then class is c_{j},

where (s_{k} ∧ x_{k}) indicates the use (s_{k} = 1) or exclusion (s_{k} = 0) of the kth feature and its terms in the classifier. The binary vector **S** = (s_{1}, s_{2}, …, s_{m}) is formed by the feature selection algorithm.

#### 3.1.2. Generation of the Fuzzy Rule Base

#### 3.1.3. Output of Fuzzy Classifier

The output of the fuzzy classifier for an object **x**_{p} is formed by sequentially performing three steps. In the first step, the value of the membership function of the object in each term is calculated:
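The three steps can be sketched as follows. This is a minimal single-winner sketch: the product t-norm over selected features and the winner-takes-all rule aggregation are assumptions, since the inference formulas themselves are not reproduced in this excerpt:

```python
import math

def gaussian_mf(x, b, c):
    # Gaussian term; b = center, c = width (assumed parameterization)
    return math.exp(-((x - b) ** 2) / (2.0 * c ** 2))

def classify(x, rules, s):
    """Hedged sketch of fuzzy inference.

    x     : feature vector of the object
    rules : list of (terms, class_label), where terms[k] = (b_k, c_k)
    s     : binary feature-selection vector; s[k] = 0 excludes feature k
    """
    best_class, best_strength = None, -1.0
    for terms, label in rules:
        # Steps 1-2: term memberships combined into a rule firing strength
        # (product t-norm over the selected features -- an assumption)
        strength = 1.0
        for k, (b, c) in enumerate(terms):
            if s[k] == 1:
                strength *= gaussian_mf(x[k], b, c)
        # Step 3: the rule with the largest firing strength determines the class
        if strength > best_strength:
            best_class, best_strength = label, strength
    return best_class
```

With two one-feature rules centered at 0 and 5, an object near 0 falls to the first rule's class and an object near 5 to the second's.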

#### 3.1.4. Classification Quality Evaluation

Given a set of instances {(**x**_{p}; c_{p}), p ∈ [1, z]}, where z is the number of instances, the measure of accuracy can be given as follows:

where f(**x**_{p}; **θ**, **S**) is the output of the fuzzy classifier with the parameter vector **θ** and the binary feature vector **S** at the point **x**_{p}. As noted earlier, the overall accuracy is not an objective assessment of classification quality when the class distribution is imbalanced.

where E_{i}(**θ**, **S**) is the classification accuracy of the ith class and n_{i} is the number of instances with the ith class label. Thus, the fewer instances a class has, the more significantly the geometric mean grows with each additional correctly classified instance of that class. If one of the classes is classified entirely incorrectly, the geometric mean is zero.
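The behavior described above is easy to verify numerically. The sketch below computes the overall accuracy, the per-class accuracies, their geometric mean, and a combined fitness; the weighting (1 − γ)·Acc + γ·GM is an assumption consistent with the γ values reported later in the article:

```python
import math

def imbalance_metrics(y_true, y_pred, gamma=0.5):
    """Overall accuracy, geometric mean of per-class accuracies, and a
    combined fitness (1 - gamma) * Acc + gamma * GM (assumed weighting)."""
    classes = sorted(set(y_true))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    gm = math.prod(per_class) ** (1.0 / len(classes))
    return acc, gm, (1.0 - gamma) * acc + gamma * gm
```

For a 9:1 imbalanced sample where every object is labeled with the majority class, the accuracy is 0.9, yet the geometric mean collapses to zero, exactly the effect the text describes.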

### 3.2. Training a Classifier with the Gravitational Search Algorithm

Two versions of the gravitational search algorithm (GSA) are used: a binary one for optimizing the feature vector **S** and a continuous one for optimizing the continuous vector of term parameters **θ**. The GSA was first proposed by Rashedi, Nezamabadi-pour, and Saryazdi in 2009 [33], and its binary version was described shortly afterwards [34]. The algorithm is widely used for various problems: for example, it was applied to optimize the parameters of a geothermal power generation system by Özkaraca and Keçebaş [35] and to locate microseismic sources in order to warn of explosions in tunnels [36]. Mahanipour and Nezamabadi-pour described the use of the GSA for the automatic creation of computer programs [37] and for feature construction [38].

In the binary GSA, the initial population of feature vectors **S** is generated randomly. At each iteration, the algorithm calculates particle masses, gravitational forces, accelerations, and velocities. A transformation function then converts the obtained velocity value into a binary equivalent in order to update the feature vector. In this paper, we use a V-type transformation function:
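The binary update can be sketched as follows. The |tanh(v)| shape is the V-function of the original binary GSA paper [34]; whether this work uses exactly the same V-shape is an assumption:

```python
import math, random

def v_transfer(v):
    # V-shaped transfer function: maps a velocity component to a
    # bit-flip probability in [0, 1).
    return abs(math.tanh(v))

def update_bit(bit, velocity, rng=random):
    # In the binary GSA the bit is *flipped* with probability v_transfer(v),
    # rather than set directly from the velocity.
    return 1 - bit if rng.random() < v_transfer(velocity) else bit
```

A zero velocity leaves the bit untouched, while a very large velocity flips it almost surely, which is what lets the search both exploit and escape a feature subset.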

In the continuous GSA_{c}, the vector value is updated by simply adding the calculated velocity to the current value:

The parameters of the algorithm are the initial gravitational constant G_{0}, the coefficient of the gravitational constant decrease α, and the variable ε used in calculating the attractive force. The computational complexity of the GSA with n agents is O(n × d), where d is the search space dimension [39]. We did not modify the original GSA; therefore, both algorithms have the complexity O(P × d), where P is the number of particles and d is the size of the dataset.
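A minimal sketch of the continuous GSA [33], wired to the hyperparameters named in the text (G_{0}, α, ε). The exponential decay G(t) = G_{0}·exp(−αt/T) follows the original paper; the Kbest schedule and boundary handling are simplified assumptions:

```python
import math, random

def gsa_minimize(f, dim, n_agents=15, iters=100, g0=10.0, alpha=10.0,
                 eps=0.01, lo=-1.0, hi=1.0, seed=0):
    """Sketch of the continuous GSA for minimizing f over [lo, hi]^dim."""
    rng = random.Random(seed)
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    v = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, float('inf')
    for t in range(iters):
        fit = [f(xi) for xi in x]
        for xi, fi in zip(x, fit):
            if fi < best_f:
                best_f, best_x = fi, list(xi)
        worst, best = max(fit), min(fit)
        # Gravitational masses from normalized fitness (best agent -> mass 1)
        m = [(worst - fi) / (worst - best + 1e-12) for fi in fit]
        M = [mi / (sum(m) + 1e-12) for mi in m]
        G = g0 * math.exp(-alpha * t / iters)   # decreasing gravitational constant
        for i in range(n_agents):
            acc = [0.0] * dim
            for j in range(n_agents):
                if j == i:
                    continue
                R = math.dist(x[i], x[j])       # Euclidean distance between agents
                for d in range(dim):
                    # acceleration = force / M_i, so the agent's own mass cancels
                    acc[d] += rng.random() * G * M[j] * (x[j][d] - x[i][d]) / (R + eps)
            for d in range(dim):
                v[i][d] = rng.random() * v[i][d] + acc[d]
                x[i][d] = min(hi, max(lo, x[i][d] + v[i][d]))
    return best_x, best_f
```

Run on a sphere function it converges toward the origin; in the paper's setting, f would instead be the (negated) fitness of a classifier parameterized by the candidate vector **θ**.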

The classifier is constructed in several stages. With the initial parameter vector **θ**, the binary GSA searches for the optimal vector **S**; then, the classifier is rebuilt on the obtained set of features **S**_{best} and the algorithm for optimizing the term parameters is launched; the continuous GSA runs for a given number of iterations and provides the best parameter vector **θ**_{best}; finally, the resulting **S**_{best} and **θ**_{best} are used to construct and validate the classifier on test data.

## 4. Experimental Results

F_{all} denotes the number of features in a dataset, Str_{all} the total number of rows, Str_{+} the number of rows of the smallest class, Str_{-} the number of rows of the largest class, and IR the imbalance ratio, i.e., the ratio of the number of rows of the negative class to the number of rows of the positive class.
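The imbalance ratio is a single division; checking it against the dataset table is a quick sanity test:

```python
def imbalance_ratio(n_negative, n_positive):
    """IR: rows of the majority (negative) class divided by rows of the
    minority (positive) class."""
    return n_negative / n_positive

# Checks against the dataset table: vehicle0 (647 vs. 199 rows)
# and yeast4 (1433 vs. 51 rows)
print(round(imbalance_ratio(647, 199), 2))   # -> 3.25
print(round(imbalance_ratio(1433, 51), 1))   # -> 28.1
```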

The following parameters were used for the continuous GSA_{c}: 750 iterations, 15 particles, G_{0} = 10, α = 10, and ε = 0.01. The particle population was cleared after every 150th iteration, except for the best particle, on the basis of which the population was generated anew. The parameters were chosen empirically as the most universal for the selected datasets.

Table 2 shows the classification results obtained with the GSA_{c}.

The binary GSA_{b} was run with G_{0} = 10, α = 10, and ε = 0.01; the parameters of the continuous algorithm did not differ from those used at the first stage of the experiment. Table 3 shows the results of the classifier on the selected feature sets before parameter tuning (GSA_{b}) and after optimization (GSA_{b} + GSA_{c}). In this and the following tables, cells are formatted according to a color scale to visualize the results: the values in each row are compared with each other, and the hue of a cell depends on the relative magnitude of its value compared to the other cells in the row. Thus, the worst results are marked in red, the best are highlighted in green, and the remaining values are colored in intermediate shades.

## 5. Discussion

For this comparison, the results obtained before tuning the GSA_{c} parameters were taken into account. Table 5 shows the results of the pairwise comparison of the number of features by Wilcoxon's signed-rank test for related samples. The significance level is 0.05; the null hypothesis states that the median difference between the two samples is zero.
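The standardized test statistic reported in these tables behaves like a z-score from the normal approximation of the signed-rank test. A stdlib-only sketch (no tie or continuity corrections, so statistical packages may differ slightly in the last digits):

```python
import math

def wilcoxon_z(a, b):
    """Wilcoxon signed-rank test for paired samples, normal approximation.
    Assumes at least one nonzero paired difference."""
    d = [x - y for x, y in zip(a, b) if x != y]   # drop zero differences
    n = len(d)
    # Rank |d|; tied absolute values receive the average of their ranks
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd
```

With eight datasets and all eight paired differences of the same sign, the approximation gives |z| = 18/√51 ≈ 2.52, close to the extreme ±2.521 entries seen throughout the tables.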

The first three rows compare the number of features in the full feature sets (F_{all}) and in the selected feature sets (F_{bin}). The last three rows compare the number of features selected when using the GSA_{b} with different values of the coefficient γ in the fitness function.

We also compared other feature selection algorithms with the GSA_{b} (Table 3). In this case, we considered the results without parameter optimization. The average performance indexes of the classifiers are given in Table 7 (F is the number of features).

Fuzzy classifiers tuned with the GSA_{c} demonstrate better overall accuracy than fuzzy classifiers built on oversampled data, with comparable recognition quality of the positive class. Therefore, if it is important for the classification task not only to classify the positive class correctly but also to avoid large losses in the recognition of the negative class, a fuzzy classifier with parameter tuning by the GSA_{c} is the preferable tool.

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## Appendix A

**Table A1.**The results of the comparison of various classification algorithms with fuzzy classifiers optimized with the gravitational search algorithm by the value of the overall accuracy.

Algorithms | STS (FC, γ = 0) | p (FC, γ = 0) | NH (FC, γ = 0) | STS (FC, γ = 1) | p (FC, γ = 1) | NH (FC, γ = 1) | STS (FC, γ = 0.5) | p (FC, γ = 0.5) | NH (FC, γ = 0.5)
---|---|---|---|---|---|---|---|---|---
GNB | −2.521 | 0.012 | Reject | −2.521 | 0.012 | Reject | −2.521 | 0.012 | Reject
LR | −0.21 | 0.833 | Retain | 0.7 | 0.484 | Retain | 0.42 | 0.674 | Retain
DT | −0.56 | 0.575 | Retain | 0.42 | 0.674 | Retain | 0.14 | 0.889 | Retain
MLP | −0.14 | 0.889 | Retain | 1.122 | 0.262 | Retain | 0.7 | 0.484 | Retain
LSV | −1.12 | 0.263 | Retain | 0.14 | 0.889 | Retain | −0.28 | 0.779 | Retain
3NN | 0 | 1 | Retain | 0.98 | 0.327 | Retain | 0.7 | 0.484 | Retain
AB | −0.14 | 0.889 | Retain | 0.98 | 0.327 | Retain | 0.7 | 0.484 | Retain
RF | −0.491 | 0.624 | Retain | 1.26 | 0.208 | Retain | 0.771 | 0.441 | Retain
GB | 0.07 | 0.944 | Retain | 1.183 | 0.237 | Retain | 0.845 | 0.398 | Retain

**Table A2.**The results of the comparison of various classification algorithms with fuzzy classifiers optimized with the gravitational search algorithm by the value of the geometric mean accuracy of each class.

Algorithms | STS (FC, γ = 0) | p (FC, γ = 0) | NH (FC, γ = 0) | STS (FC, γ = 1) | p (FC, γ = 1) | NH (FC, γ = 1) | STS (FC, γ = 0.5) | p (FC, γ = 0.5) | NH (FC, γ = 0.5)
---|---|---|---|---|---|---|---|---|---
GNB | 0.28 | 0.779 | Retain | −2.521 | 0.012 | Reject | −2.521 | 0.012 | Reject
LR | 0.421 | 0.674 | Retain | −1.54 | 0.123 | Retain | −1.54 | 0.123 | Retain
DT | 1.12 | 0.263 | Retain | −1.82 | 0.069 | Retain | −1.68 | 0.093 | Retain
MLP | 1.26 | 0.208 | Retain | −1.4 | 0.161 | Retain | −1.26 | 0.208 | Retain
LSV | −0.7 | 0.484 | Retain | −1.963 | 0.05 | Reject | −1.82 | 0.069 | Retain
3NN | 1.26 | 0.208 | Retain | −1.521 | 0.128 | Retain | −1.26 | 0.208 | Retain
AB | 0.84 | 0.401 | Retain | −1.69 | 0.091 | Retain | −1.26 | 0.208 | Retain
RF | 0 | 1 | Retain | −1.68 | 0.093 | Retain | −1.68 | 0.093 | Retain
GB | 0.84 | 0.401 | Retain | −1.54 | 0.123 | Retain | −1.4 | 0.161 | Retain

**Table A3.**The results of the comparison of various classification algorithms with fuzzy classifiers optimized with the gravitational search algorithm by the value of the true positive rate.

Algorithms | STS (FC, γ = 0) | p (FC, γ = 0) | NH (FC, γ = 0) | STS (FC, γ = 1) | p (FC, γ = 1) | NH (FC, γ = 1) | STS (FC, γ = 0.5) | p (FC, γ = 0.5) | NH (FC, γ = 0.5)
---|---|---|---|---|---|---|---|---|---
GNB | 1.859 | 0.063 | Retain | −0.42 | 0.674 | Retain | 0.42 | 0.674 | Retain
LR | 0.338 | 0.735 | Retain | −2.24 | 0.025 | Reject | −1.54 | 0.123 | Retain
DT | 1.014 | 0.31 | Retain | −2.521 | 0.012 | Reject | −1.82 | 0.069 | Retain
MLP | 1.54 | 0.123 | Retain | −2.028 | 0.043 | Reject | −1.26 | 0.208 | Retain
LSV | −0.28 | 0.779 | Retain | −1.992 | 0.046 | Reject | −1.68 | 0.093 | Retain
3NN | 1.521 | 0.128 | Retain | −2.521 | 0.012 | Reject | −1.4 | 0.161 | Retain
AB | 1.014 | 0.31 | Retain | −2.24 | 0.025 | Reject | −1.4 | 0.161 | Retain
RF | −0.169 | 0.866 | Retain | −2.524 | 0.012 | Reject | −1.68 | 0.093 | Retain
GB | 1.4 | 0.161 | Retain | −2.24 | 0.025 | Reject | −1.54 | 0.123 | Retain

**Table A4.**The results of the comparison of various classification algorithms with fuzzy classifiers optimized with the gravitational search algorithm by the value of the true negative rate.

Algorithms | STS (FC, γ = 0) | p (FC, γ = 0) | NH (FC, γ = 0) | STS (FC, γ = 1) | p (FC, γ = 1) | NH (FC, γ = 1) | STS (FC, γ = 0.5) | p (FC, γ = 0.5) | NH (FC, γ = 0.5)
---|---|---|---|---|---|---|---|---|---
GNB | −2.521 | 0.012 | Reject | −2.366 | 0.018 | Reject | −2.521 | 0.012 | Reject
LR | −1.12 | 0.263 | Retain | 1.26 | 0.208 | Retain | 0.98 | 0.327 | Retain
DT | −2.38 | 0.017 | Reject | 1.183 | 0.237 | Retain | 0.98 | 0.327 | Retain
MLP | −1.54 | 0.123 | Retain | 1.26 | 0.208 | Retain | 1.12 | 0.263 | Retain
LSV | −1.68 | 0.093 | Retain | 0.56 | 0.575 | Retain | 0.42 | 0.674 | Retain
3NN | −1.54 | 0.123 | Retain | 1.26 | 0.208 | Retain | 1.332 | 0.183 | Retain
AB | −1.54 | 0.123 | Retain | 2.383 | 0.017 | Reject | 1.96 | 0.05 | Reject
RF | −1.262 | 0.207 | Retain | 2.103 | 0.035 | Reject | 2.1 | 0.036 | Reject
GB | −0.631 | 0.528 | Retain | 2.38 | 0.017 | Reject | 2.1 | 0.036 | Reject

## References

- Peng, L.; Zhang, H.; Yang, B.; Chen, Y. A new approach for imbalanced data classification based on data gravitation. Inf. Sci.
**2014**, 288, 347–373. [Google Scholar] [CrossRef] - Special Issue on Recent advances in Theory, Methodology and Applications of Imbalanced Learning. IEEE Trans. Neural Netw. Learn. Syst.
**2018**, 29, 763. [CrossRef] - He, H.; Garcia, E.A. Learning from Imbalanced Data. IEEE Trans. Know. Data Eng.
**2009**, 21, 1263–1284. [Google Scholar] [CrossRef] - Ali, A.; Shamsuddin, S.M.; Ralescu, A. Classification with class imbalance problem: A review. Int. J. Adv. Soft Comput. Appl.
**2013**, 5, 1–30. [Google Scholar] - Mathew, J.; Pang, C.K.; Luo, M.; Leong, W.H. Classification of Imbalanced Data by Oversampling in Kernel Space of Support Vector Machines. IEEE Trans. Neural Netw. Learn. Syst.
**2018**, 29, 4065–4076. [Google Scholar] [CrossRef] [PubMed] - Bardamova, M.; Konev, A.; Hodashinsky, I.; Shelupanov, A. A Fuzzy Classifier with Feature Selection Based on the Gravitational Search Algorithm. Symmetry
**2018**, 10, 609. [Google Scholar] [CrossRef] - He, H.; Ma, Y. (Eds.) Imbalanced Learning: Foundations, Algorithms, and Applications; John Wiley & Sons Inc.: Hoboken, NJ, USA, 2013; p. 216. [Google Scholar]
- Hand, D. Measuring classifier performance: A coherent alternative to the area under the ROC curve. Mach. Learn.
**2009**, 77, 103–123. [Google Scholar] [CrossRef] - Ferri, C.; Haernandez-Orallo, J.; Modroiu, R. An experimental comparison of performance measures for classification. Pattern Recognit. Lett.
**2009**, 30, 27–38. [Google Scholar] [CrossRef] - Fernandez, J.C.; Carbonero, M.; Gutierrez, P.A.; Hervas-Martınez, C. Multi-objective evolutionary optimization using the relationship between F1 and accuracy metrics in classification tasks. Appl. Intell.
**2019**, 49, 3447–3463. [Google Scholar] [CrossRef] - Lopez, V.; Fernandez, A.; Garcia, S.; Paladec, V.; Herrera, F. An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics. Inf. Sci.
**2013**, 250, 113–141. [Google Scholar] [CrossRef] - Lopez, V.; del Rio, S.; Benitez, J.M.; Herrera, F. Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data. Fuzzy Set Syst.
**2015**, 258, 5–38. [Google Scholar] [CrossRef] - Vluymans, S.; Tarrago, D.S.; Saeys, Y.; Cornelis, C.; Herrera, F. Fuzzy rough classifiers for class imbalanced multi-instance data. Pattern Recognit.
**2016**, 53, 36–45. [Google Scholar] [CrossRef] - Fernández, A.; Del Jesus, M.J.; Herrera, F. Hierarchical fuzzy rule based classification systems with genetic rule selection for imbalanced datasets. Int. J. Approx. Reason.
**2009**, 50, 561–577. [Google Scholar] [CrossRef] - Villarino, G.; Gómez, D.; Rodríguez, J.T.; Montero, J. A bipolar knowledge representation model to improve supervised fuzzy classification algorithms. Soft Comput.
**2018**, 22, 5121–5146. [Google Scholar] [CrossRef] - Haixiang, G.; Li, Y.; Shang, J.; Mingyun, G.; Yuanyue, H.; Gong, B. Learning from class-imbalanced data: Review of methods and application. Expert Syst. Appl.
**2017**, 73, 220–239. [Google Scholar] [CrossRef] - Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res.
**2002**, 16, 321–357. [Google Scholar] [CrossRef] - Liu, G.; Yang, Y.; Li, B. Fuzzy rule-based oversampling technique for imbalanced and incomplete data learning. Knowl. Based Syst.
**2018**, 158, 154–174. [Google Scholar] [CrossRef] - D‘Addabbo, A.; Maglietta, R. Parallel selective sampling method for imbalanced and large data classification. Pattern Recognit. Lett.
**2015**, 62, 61–67. [Google Scholar] [CrossRef] - Diez-Pastor, J.F.; Rodriguez, J.J.; García-Osorio, C.; Kuncheva, L.I. Random balance: ensembles of variable priors classifiers for imbalanced data. Knowl. Based Syst.
**2015**, 85, 96–111. [Google Scholar] [CrossRef] - Saez, J.A.; Luengo, J.; Stefanowski, J.; Herrera, F. SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering. Inf. Sci.
**2015**, 291, 184–203. [Google Scholar] [CrossRef] - Lin, W.-C.; Tsai, C.-F.; Hu, Y.-H.; Jhang, J.-S. Clustering-based undersampling in class-imbalanced data. Inf. Sci.
**2017**, 409–410, 17–26. [Google Scholar] [CrossRef] - Ofek, N.; Rokach, L.; Stern, R.; Shabtai, A. Fast-CBUS: a fast clustering-based undersampling method for addressing the class imbalance problem. Neurocomputing
**2017**, 243, 88–102. [Google Scholar] [CrossRef] - Diao, R. Feature Selection with Harmony Search and Its Applications. Available online: https://www.researchgate.net/publication/283652269_Feature_selection_with_harmony_search_and_its_applications (accessed on 10 March 2019).
- Witten, I.H.; Frank, E. Data Mining Practical Machine Learning Tools and Techniques, 2nd ed.; Morgan Kaufmann: Amsterdam, The Netherlands, 2011; 558p. [Google Scholar]
- Liu, H.; Yu, L. Toward Integrating Feature Selection Algorithms for Classification and Clustering. IEEE Trans. Knowl. Data Eng.
**2005**, 17, 491–502. [Google Scholar] - Senthamarai Kannan, S.; Ramaraj, N. A novel hybrid feature selection via symmetrical uncertainty ranking based local memetic search algorithm. Knowl. Based Syst.
**2010**, 23, 580–585. [Google Scholar] [CrossRef] - Bonnlander, B.; Weigend, A. Selecting input variables using mutual information and nonparametric density estimation. Int. Symp. Artif. Neural Netw.
**1994**, 49, 42–50. [Google Scholar] - Du, L.; Xu, Y.; Zhu, H. Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm. Ann. Data Sci.
**2015**, 2, 293–300. [Google Scholar] [CrossRef] - Hernandez, J.C.H.; Duval, B.; Hao, J.-K. A genetic embedded approach for gene selection and classification of microarray data. In Lecture Notes in Computer Science. Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. 5th European Conference, EvoBIO 2007, Valencia, Spain, 11–13 April 2007; Marchiori, E., Moore, J.H., Rajapakse, J.C., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4447, pp. 90–101. [Google Scholar] [CrossRef]
- Moayedikia, A.; Ong, K.-L.; Boo, Y.L.; Yeoh, W.G.S.; Jensen, R. Feature selection for high dimensional imbalanced class data using harmony search. Eng. Appl. Artif. Intell.
**2017**, 57, 38–49. [Google Scholar] [CrossRef] - Hodashinsky, I.; Sarin, K. Feature Selection for Classification through Population Random Search with Memory. Autom. Remote Control
**2019**, 80, 324–333. [Google Scholar] [CrossRef] - Rashedi, E.; Nezamabadi-pour, H.; Saryazdi, S. GSA: A Gravitational Search Algorithm. Inf. Sci.
**2009**, 179, 2232–2248. [Google Scholar] [CrossRef] - Rashedi, E.; Nezamabadi-pour, H.; Saryazdi, S. BGSA: Binary gravitational search algorithm. Nat. Comput.
**2010**, 9, 727–745. [Google Scholar] [CrossRef] - Özkaraca, O.; Keçebaş, A. Performance analysis and optimization for maximum exergy efficiency of a geothermal power plant using gravitational search algorithm. Energy Convers. Manag.
**2019**, 185, 155–168. [Google Scholar] [CrossRef] - Ma, C.; Jiang, Y.; Li, T. Gravitational Search Algorithm for Microseismic Source Location in Tunneling: Performance Analysis and Engineering Case Study. Rock Mech. Rock Eng.
**2019**, 1–18. [Google Scholar] [CrossRef] - Mahanipour, A.; Nezamabadi-pour, H. GSP: an automatic programming technique with gravitational search algorithm. Appl. Intell.
**2019**, 49, 1502–1516. [Google Scholar] [CrossRef] - Mahanipour, A.; Nezamabadi-pour, H. A multiple feature construction method based on gravitational search algorithm. Expert Syst. Appl.
**2019**, 127, 199–209. [Google Scholar] [CrossRef] - Pelusi, D.; Mascella, R.; Tallini, L.; Nayak, J.; Naik, B.; Abraham, A. Neural network and fuzzy system for the tuning of Gravitational Search Algorithm parameters. Expert Syst. Appl.
**2018**, 102, 234–244. [Google Scholar] [CrossRef] - Knowledge Extraction Based on Evolutionary Learning. Available online: http://keel.es (accessed on 10 May 2019).
- Lemaître, G.; Nogueira, F.; Aridas, C.K. Imbalanced-learn: A Python Toolbox to Tackle the Curse of Imbalanced Datasets in Machine Learning. J. Mach. Learn. Res.
**2017**, 18, 1–5. [Google Scholar] - Scikit-learn. User Guide. Supervised Learning. Available online: https://scikit-learn.org/stable/supervised_learning.html#supervised-learning (accessed on 13 August 2019).

**Table 1.** Characteristics of the datasets used in the experiments.

№ | Data Set | F_{all} | Str_{all} | Str_{+} | Str_{-} | IR
---|---|---|---|---|---|---
1 | vehicle0 | 18 | 846 | 199 | 647 | 3.25
2 | newthyroid2 | 5 | 215 | 35 | 180 | 5.14
3 | segment0 | 19 | 2308 | 329 | 1979 | 6.02
4 | page-blocks0 | 10 | 5472 | 559 | 4913 | 8.79
5 | vowel0 | 13 | 988 | 90 | 898 | 9.98
6 | cleveland-0vs4 | 13 | 177 | 13 | 164 | 12.62
7 | ecoli4 | 7 | 336 | 20 | 316 | 15.8
8 | yeast4 | 8 | 1484 | 51 | 1433 | 28.1

**Table 2.**Classification results obtained while using the continuous gravitational search algorithm for tuning fuzzy classifier parameters.

Metric | Avr. (γ = 0) | Best (γ = 0) | Avr. (γ = 1) | Best (γ = 1) | Avr. (γ = 0.25) | Best (γ = 0.25) | Avr. (γ = 0.5) | Best (γ = 0.5) | Avr. (γ = 0.75) | Best (γ = 0.75)
---|---|---|---|---|---|---|---|---|---|---

vehicle0 | ||||||||||

Acc. | 81.64 | 82.50 | 81.52 | 82.03 | 84.75 | 85.11 | 82.62 | 86.28 | 82.82 | 84.28 |

GM | 57.47 | 59.56 | 69.28 | 70.36 | 74.85 | 81.45 | 74.28 | 81.63 | 74.12 | 80.40 |

TP_{rate} | 34.84 | 37.69 | 54.10 | 55.78 | 62.56 | 77.23 | 63.65 | 74.87 | 62.93 | 74.82 |

TN_{rate} | 96.03 | 96.29 | 89.95 | 90.11 | 91.55 | 87.48 | 88.46 | 89.80 | 88.91 | 87.16 |

newthyroid2 | ||||||||||

Acc. | 98.76 | 99.07 | 98.76 | 99.07 | 98.60 | 99.07 | 98.91 | 99.53 | 98.45 | 99.07 |

GM | 98.05 | 98.24 | 98.85 | 99.44 | 98.35 | 99.44 | 98.54 | 99.72 | 97.46 | 98.24 |

TP_{rate} | 97.14 | 97.14 | 99.05 | 100.00 | 98.10 | 100.00 | 98.10 | 100.00 | 96.19 | 97.14 |

TN_{rate} | 99.07 | 99.44 | 98.70 | 98.89 | 98.70 | 98.89 | 99.07 | 99.44 | 98.89 | 99.44 |

segment0 | ||||||||||

Acc. | 91.29 | 91.42 | 90.83 | 90.90 | 91.41 | 91.46 | 91.13 | 91.16 | 90.87 | 90.90 |

GM | 92.36 | 92.82 | 94.15 | 94.31 | 93.53 | 93.57 | 93.85 | 93.99 | 94.05 | 94.07 |

TP_{rate} | 93.92 | 94.83 | 99.09 | 99.39 | 96.65 | 96.65 | 97.87 | 98.18 | 98.78 | 98.78 |

TN_{rate} | 90.85 | 90.85 | 89.46 | 89.49 | 90.53 | 90.60 | 90.01 | 89.99 | 89.56 | 89.59 |

page-blocks0 | ||||||||||

Acc. | 93.24 | 93.93 | 88.96 | 91.03 | 93.37 | 94.19 | 92.85 | 93.59 | 90.99 | 91.01 |

GM | 65.30 | 72.39 | 76.79 | 80.59 | 74.64 | 77.05 | 74.01 | 79.42 | 74.18 | 78.28 |

TP_{rate} | 44.07 | 53.31 | 64.52 | 69.59 | 57.36 | 60.64 | 56.95 | 65.30 | 58.64 | 65.86 |

TN_{rate} | 98.83 | 98.55 | 91.74 | 93.47 | 97.47 | 98.01 | 96.93 | 96.80 | 94.67 | 93.87 |

vowel0 | ||||||||||

Acc. | 92.11 | 92.71 | 88.59 | 89.67 | 96.86 | 97.67 | 96.19 | 96.86 | 95.75 | 96.46 |

GM | 47.99 | 54.09 | 90.22 | 92.15 | 93.94 | 96.69 | 95.01 | 96.75 | 94.75 | 97.04 |

TP_{rate} | 36.67 | 46.67 | 92.59 | 95.56 | 90.74 | 95.56 | 93.70 | 97.78 | 93.70 | 97.78 |

TN_{rate} | 97.66 | 95.88 | 88.20 | 89.09 | 97.48 | 97.89 | 96.44 | 94.77 | 95.96 | 96.33 |

cleveland-0vs4 | ||||||||||

Acc. | 92.86 | 95.51 | 87.03 | 90.37 | 91.76 | 95.49 | 90.79 | 93.25 | 88.92 | 90.41 |

GM | 54.43 | 73.17 | 74.50 | 80.36 | 71.47 | 73.00 | 66.10 | 72.20 | 70.73 | 76.32 |

TP_{rate} | 38.46 | 53.85 | 61.54 | 69.23 | 56.67 | 56.67 | 46.15 | 53.85 | 57.78 | 66.67 |

TN_{rate} | 97.15 | 98.78 | 89.02 | 92.07 | 94.73 | 98.77 | 94.31 | 96.34 | 91.67 | 92.69 |

ecoli4 | ||||||||||

Acc. | 96.91 | 97.32 | 94.25 | 97.02 | 96.92 | 97.62 | 95.78 | 95.84 | 94.84 | 95.53 |

GM | 76.17 | 79.06 | 91.06 | 95.90 | 78.71 | 81.74 | 78.25 | 85.83 | 82.07 | 85.73 |

TP_{rate} | 61.00 | 65.00 | 88.33 | 95.00 | 65.00 | 70.00 | 66.00 | 80.00 | 73.33 | 80.00 |

TN_{rate} | 99.18 | 99.37 | 94.62 | 97.15 | 98.94 | 99.37 | 97.66 | 96.84 | 96.20 | 96.52 |

yeast4 | ||||||||||

Acc. | 96.52 | 96.63 | 81.81 | 86.66 | 92.57 | 92.05 | 89.31 | 88.54 | 85.47 | 88.88 |

GM | 2.11 | 6.32 | 78.28 | 80.18 | 69.12 | 74.76 | 76.55 | 83.22 | 74.41 | 78.20 |

TP_{rate} | 0.65 | 1.96 | 75.16 | 74.51 | 51.45 | 60.55 | 66.01 | 78.43 | 64.61 | 68.55 |

TN_{rate} | 100.00 | 100.00 | 82.04 | 87.09 | 94.03 | 93.17 | 90.14 | 88.90 | 86.21 | 89.60 |

**Table 3.**The results of constructing fuzzy classifiers on imbalanced datasets obtained with feature selection and parameter tuning.

Metric | GSA_{b} (γ = 0) | GSA_{b} + GSA_{c} (γ = 0) | GSA_{b} (γ = 1) | GSA_{b} + GSA_{c} (γ = 1) | GSA_{b} (γ = 0.5) | GSA_{b} + GSA_{c} (γ = 0.5)
---|---|---|---|---|---|---

Dataset | vehicle0 | |||||

Features | 10.20 | 7.60 | 9.00 | |||

Accuracy | 83.33 | 84.43 | 80.61 | 77.07 | 81.09 | 84.28 |

GM | 66.40 | 67.25 | 75.28 | 78.01 | 71.78 | 78.28 |

TP_{rate} | 47.24 | 47.91 | 67.34 | 83.08 | 58.79 | 69.35 |

TN_{rate} | 94.44 | 95.67 | 84.70 | 75.22 | 87.94 | 88.87 |

Dataset | newthyroid2 | |||||

Features | 3.60 | 3.20 | 3.20 | |||

Accuracy | 99.53 | 99.07 | 98.60 | 98.45 | 98.60 | 98.45 |

GM | 98.52 | 97.03 | 99.16 | 98.66 | 99.16 | 98.66 |

TP_{rate} | 97.14 | 94.29 | 100.00 | 99.05 | 100.00 | 99.05 |

TN_{rate} | 100.00 | 100.00 | 98.33 | 98.33 | 98.33 | 98.33 |

Dataset | segment0 | |||||

Features | 7.20 | 6.60 | 6.80 | |||

Accuracy | 97.36 | 97.88 | 96.45 | 98.73 | 97.40 | 98.60 |

GM | 95.76 | 96.80 | 96.28 | 98.79 | 97.08 | 98.34 |

TP_{rate} | 93.62 | 95.34 | 96.05 | 98.89 | 96.66 | 97.97 |

TN_{rate} | 97.98 | 98.30 | 96.51 | 98.70 | 97.52 | 98.70 |

Dataset | page-blocks0 | |||||

Features | 3.80 | 4.20 | 2.80 | |||

Accuracy | 93.60 | 94.49 | 88.54 | 88.13 | 92.20 | 92.59 |

GM | 67.85 | 74.65 | 74.14 | 81.93 | 73.31 | 76.89 |

TP_{rate} | 46.69 | 56.65 | 60.00 | 75.19 | 56.17 | 61.96 |

TN_{rate} | 98.94 | 98.80 | 91.80 | 89.61 | 96.30 | 96.07 |

Dataset | vowel0 | |||||

Features | 6.20 | 6.60 | 6.60 | |||

Accuracy | 88.86 | 92.11 | 87.45 | 97.64 | 88.25 | 97.20 |

GM | 85.64 | 75.59 | 90.02 | 96.97 | 88.94 | 94.85 |

TP_{rate} | 82.22 | 67.78 | 93.33 | 96.30 | 90.00 | 92.22 |

TN_{rate} | 89.53 | 94.54 | 86.86 | 97.77 | 88.08 | 97.70 |

Data set | cleveland-0vs4 | |||||

Features | 4.00 | 6.80 | 6.60 | |||

Accuracy | 93.78 | 93.79 | 88.70 | 92.06 | 85.86 | 89.97 |

GM | 39.17 | 47.80 | 82.38 | 82.46 | 68.01 | 66.57 |

TP_{rate} | 30.77 | 33.33 | 76.92 | 74.36 | 53.85 | 48.72 |

TN_{rate} | 98.78 | 98.58 | 89.63 | 93.50 | 88.41 | 93.29 |

Dataset | ecoli4 | |||||

Features | 3.00 | 3.20 | 3.00 | |||

Accuracy | 98.21 | 98.02 | 96.13 | 94.14 | 97.92 | 97.12 |

GM | 89.01 | 86.89 | 87.35 | 84.36 | 85.81 | 87.11 |

TP_{rate} | 80.00 | 76.67 | 80.00 | 76.67 | 75.00 | 78.33 |

TN_{rate} | 99.37 | 99.37 | 97.15 | 95.25 | 99.37 | 98.31 |

Dataset | yeast4 | |||||

Features | 3.20 | 3.20 | 2.40 | |||

Accuracy | 96.23 | 96.23 | 78.24 | 84.05 | 87.26 | 90.43 |

GM | 6.30 | 6.30 | 66.99 | 77.69 | 67.05 | 79.26 |

TP_{rate} | 1.96 | 1.96 | 58.82 | 71.90 | 52.94 | 69.28 |

TN_{rate} | 99.58 | 99.58 | 78.93 | 84.48 | 88.49 | 91.18 |

**Table 4.**The results of constructing fuzzy classifiers on the best feature sets found by the binary gravitational algorithm.

Metrics | Results | ||
---|---|---|---|

DataSet | vehicle0 | ||

γ | 0 | 1 | 0.5 |

Features | 1, 4, 8, 9, 10, 13, 14, 15, 16 | 1, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18 | 1, 5, 7, 9, 10, 11, 12, 15, 16, 17, 18 |

F | 9 | 11 | 11 |

Acc. | 85.07 | 78.30 | 84.00 |

GM | 66.87 | 82.86 | 80.58 |

TP_{rate} | 46.40 | 93.33 | 75.88 |

TN_{rate} | 96.96 | 73.64 | 86.50 |

Dataset | newthyroid2 | ||

γ | 0 | 1 | 0.5 |

Features | 1, 2, 3, 5 | 1, 2, 5 | 1, 2, 5 |

F | 4 | 3 | 3 |

Acc. | 99.53 | 99.84 | 99.53 |

GM | 98.52 | 99.51 | 99.32 |

TP_{rate} | 97.14 | 99.05 | 99.05 |

TN_{rate} | 100.00 | 100.00 | 99.63 |

Dataset | segment0 | ||

γ | 0 | 1 | 0.5 |

Features | 1, 4, 6, 11, 14, 18, 19 | 1, 6, 8, 14, 16, 18 | 6, 8, 11, 14, 18, 19 |

F | 7 | 6 | 6 |

Acc. | 98.93 | 99.08 | 99.02 |

GM | 98.22 | 99.08 | 98.66 |

TP_{rate} | 97.26 | 99.09 | 98.18 |

TN_{rate} | 99.21 | 99.07 | 99.16 |

Dataset | page-blocks0 | ||

γ | 0 | 1 | 0.5 |

Features | 1, 2, 5, 10 | 4, 10 | 4, 10 |

F | 4 | 2 | 2 |

Acc. | 94.77 | 91.75 | 92.85 |

GM | 77.06 | 84.92 | 81.52 |

TP_{rate} | 60.35 | 77.22 | 70.60 |

TN_{rate} | 98.68 | 93.41 | 95.39 |

Dataset | vowel0 | ||

γ | 0 | 1 | 0.5 |

Features | 5, 6, 7, 8, 9, 10, 13 | 4, 5, 6, 7, 9, 13 | 4, 5, 6, 7, 8, 13 |

F | 7 | 6 | 6 |

Acc. | 96.39 | 98.18 | 97.74 |

GM | 82.83 | 98.11 | 97.41 |

TP_{rate} | 70.00 | 98.15 | 97.04 |

TN_{rate} | 99.03 | 98.18 | 97.81 |

Dataset | cleveland-0vs4 | ||

γ | 0 | 1 | 0.5 |

Features | 4, 8, 10 | 1, 4, 7, 9, 10, 13 | 10, 12 |

F | 3 | 6 | 2 |

Acc. | 94.72 | 93.04 | 93.02 |

GM | 54.96 | 85.19 | 86.17 |

TP_{rate} | 41.03 | 76.92 | 82.05 |

TN_{rate} | 98.98 | 94.31 | 93.90 |

Dataset | ecoli4 | ||

γ | 0 | 1 | 0.5 |

Features | 5, 6, 7 | 2, 3, 4, 5, 7 | 2, 3, 5, 7 |

F | 3 | 5 | 4 |

Acc. | 98.71 | 96.13 | 97.92 |

GM | 88.90 | 90.11 | 93.88 |

TP_{rate} | 80.00 | 85.00 | 90.00 |

TN_{rate} | 99.89 | 96.84 | 98.42 |

Dataset | yeast4 | ||

γ | 0 | 1 | 0.5 |

Features | 1, 2, 3, 7, 8 | 1, 3, 5 | 1, 3 |

F | 5 | 3 | 2 |

Acc. | 95.62 | 84.05 | 91.19 |

GM | 19.73 | 79.93 | 80.40 |

TP_{rate} | 9.8 | 76.47 | 70.59 |

TN_{rate} | 98.67 | 84.32 | 91.93 |

Feature Sets | Standardized Test Statistic | p-Value | Null Hypothesis
---|---|---|---
F_{all} − F_{bin}, γ = 0 | 2.521 | 0.012 | Reject
F_{all} − F_{bin}, γ = 1 | 2.521 | 0.012 | Reject
F_{all} − F_{bin}, γ = 0.5 | 2.524 | 0.012 | Reject
F_{bin}, γ = 0 − F_{bin}, γ = 1 | 0 | 1 | Retain
F_{bin}, γ = 0 − F_{bin}, γ = 0.5 | 0.851 | 0.395 | Retain
F_{bin}, γ = 1 − F_{bin}, γ = 0.5 | 0.638 | 0.524 | Retain

**Table 6.**The results of comparing classification performance indexes in the absence and presence of feature selection performed using the binary gravitational search algorithm.

Metric | γ | Standardized Test Statistic | p-Value | Null Hypothesis
---|---|---|---|---
Accuracy (all − bin) | 0 | −2.197 | 0.028 | Reject
Accuracy (all − bin) | 1 | −0.98 | 0.327 | Retain
Accuracy (all − bin) | 0.5 | −1.68 | 0.093 | Retain
GM (all − bin) | 0 | −1.82 | 0.069 | Retain
GM (all − bin) | 1 | −1.4 | 0.161 | Retain
GM (all − bin) | 0.5 | −2.24 | 0.025 | Reject
TP_{rate} (all − bin) | 0 | −1.544 | 0.123 | Retain
TP_{rate} (all − bin) | 1 | −1.051 | 0.293 | Retain
TP_{rate} (all − bin) | 0.5 | −2.036 | 0.042 | Reject
TN_{rate} (all − bin) | 0 | −0.73 | 0.465 | Retain
TN_{rate} (all − bin) | 1 | −0.594 | 0.553 | Retain
TN_{rate} (all − bin) | 0.5 | −0.877 | 0.38 | Retain

**Table 7.**The results of constructing fuzzy classifiers obtained with the different feature selection algorithms.

Alg. | GSA_{b} (γ = 0) | GSA_{b} (γ = 1) | GSA_{b} (γ = 0.5) | RS | MI | Alg. | GSA_{b} (γ = 0) | GSA_{b} (γ = 1) | GSA_{b} (γ = 0.5) | RS | MI |
---|---|---|---|---|---|---|---|---|---|---|---|
Data | vehicle0 | | | | | Data | vowel0 | | | | |
F | 10.20 | 7.60 | 9.00 | 6.60 | 8.40 | F | 6.20 | 6.60 | 6.60 | 5.80 | 5.40 |
Acc. | 83.33 | 80.61 | 81.09 | 70.08 | 79.43 | Acc. | 88.86 | 87.45 | 88.25 | 77.93 | 85.29 |
GM | 66.40 | 75.28 | 71.78 | 62.67 | 65.45 | GM | 85.64 | 90.02 | 88.94 | 81.13 | 75.59 |
TP_{rate} | 47.24 | 67.34 | 58.79 | 53.29 | 48.22 | TP_{rate} | 82.22 | 93.33 | 90.00 | 77.17 | 70.00 |
TN_{rate} | 94.44 | 84.70 | 87.94 | 75.26 | 89.02 | TN_{rate} | 89.53 | 86.86 | 88.08 | 85.56 | 86.55 |
Data | newthyroid2 | | | | | Data | cleveland-0_vs_4 | | | | |
F | 3.60 | 3.20 | 3.20 | 2.80 | 3.00 | F | 4.00 | 6.80 | 6.60 | 6.40 | 3.00 |
Acc. | 99.53 | 98.60 | 98.60 | 95.35 | 99.53 | Acc. | 93.78 | 88.70 | 85.86 | 53.51 | 98.22 |
GM | 98.52 | 99.16 | 99.16 | 94.85 | 98.52 | GM | 39.17 | 82.38 | 68.01 | 39.52 | 88.49 |
TP_{rate} | 97.14 | 100.0 | 100.0 | 94.29 | 97.14 | TP_{rate} | 30.77 | 76.92 | 53.85 | 53.75 | 80.00 |
TN_{rate} | 100.0 | 98.33 | 98.33 | 95.56 | 100.0 | TN_{rate} | 98.78 | 89.63 | 88.41 | 56.67 | 99.37 |
Data | segment0 | | | | | Data | ecoli4 | | | | |
F | 7.20 | 6.60 | 6.80 | 10.60 | 9.40 | F | 3.00 | 3.20 | 3.00 | 6.40 | 4.80 |
Acc. | 97.36 | 96.45 | 97.40 | 90.99 | 91.12 | Acc. | 98.21 | 96.13 | 97.92 | 96.73 | 87.34 |
GM | 95.76 | 96.28 | 97.08 | 88.67 | 85.08 | GM | 89.01 | 87.35 | 85.81 | 68.70 | 88.90 |
TP_{rate} | 93.62 | 96.05 | 96.66 | 85.72 | 77.85 | TP_{rate} | 80.00 | 80.00 | 75.00 | 50.00 | 91.11 |
TN_{rate} | 97.98 | 96.51 | 97.52 | 91.86 | 93.33 | TN_{rate} | 99.37 | 97.15 | 99.37 | 99.68 | 86.96 |
Data | page-blocks0 | | | | | Data | yeast4 | | | | |
F | 3.80 | 4.20 | 2.80 | 6.80 | 5.60 | F | 3.20 | 3.20 | 2.40 | 6.40 | 3.00 |
Acc. | 93.60 | 88.54 | 92.20 | 81.49 | 87.65 | Acc. | 96.23 | 78.24 | 87.26 | 94.21 | 91.24 |
GM | 67.85 | 74.14 | 73.31 | 59.80 | 51.98 | GM | 6.30 | 66.99 | 67.05 | 29.46 | 62.79 |
TP_{rate} | 46.69 | 60.00 | 56.17 | 42.04 | 31.82 | TP_{rate} | 1.96 | 58.82 | 52.94 | 16.00 | 45.64 |
TN_{rate} | 98.94 | 91.80 | 96.30 | 85.98 | 94.00 | TN_{rate} | 99.58 | 78.93 | 88.49 | 97.00 | 92.88 |

**Table 8.**Comparison of fuzzy classifier results obtained using different algorithms for feature selection.

Algorithm | STS | p | NH | Algorithm | STS | p | NH |
---|---|---|---|---|---|---|---|
Features | | | | | | | |
RS − GSA (γ = 0) | 0.981 | 0.326 | Retain | MI − GSA (γ = 0) | 0.281 | 0.778 | Retain |
RS − GSA (γ = 1) | 1.123 | 0.261 | Retain | MI − GSA (γ = 1) | 0.421 | 0.674 | Retain |
RS − GSA (γ = 0.5) | 1.122 | 0.262 | Retain | MI − GSA (γ = 0.5) | 0.35 | 0.726 | Retain |
Accuracy | | | | | | | |
RS − GSA (γ = 0) | −2.521 | 0.012 | Reject | MI − GSA (γ = 0) | −1.859 | 0.063 | Retain |
RS − GSA (γ = 1) | −1.4 | 0.161 | Retain | MI − GSA (γ = 1) | −0.14 | 0.889 | Retain |
RS − GSA (γ = 0.5) | −1.96 | 0.05 | Reject | MI − GSA (γ = 0.5) | −0.7 | 0.484 | Retain |
GM | | | | | | | |
RS − GSA (γ = 0) | −1.26 | 0.208 | Retain | MI − GSA (γ = 0) | −0.169 | 0.866 | Retain |
RS − GSA (γ = 1) | −2.521 | 0.012 | Reject | MI − GSA (γ = 1) | −1.68 | 0.093 | Retain |
RS − GSA (γ = 0.5) | −2.521 | 0.012 | Reject | MI − GSA (γ = 0.5) | −1.26 | 0.208 | Retain |
TP_{rate} | | | | | | | |
RS − GSA (γ = 0) | −0.14 | 0.889 | Retain | MI − GSA (γ = 0) | 0.338 | 0.735 | Retain |
RS − GSA (γ = 1) | −2.521 | 0.012 | Reject | MI − GSA (γ = 1) | −1.82 | 0.069 | Retain |
RS − GSA (γ = 0.5) | −2.371 | 0.018 | Reject | MI − GSA (γ = 0.5) | −0.84 | 0.401 | Retain |
TN_{rate} | | | | | | | |
RS − GSA (γ = 0) | −2.383 | 0.017 | Reject | MI − GSA (γ = 0) | −2.197 | 0.028 | Reject |
RS − GSA (γ = 1) | −1.12 | 0.263 | Retain | MI − GSA (γ = 1) | 0.84 | 0.401 | Retain |
RS − GSA (γ = 0.5) | −1.682 | 0.092 | Retain | MI − GSA (γ = 0.5) | −0.14 | 0.889 | Retain |

Metrics | vhc0 | nth2 | sgm0 | pbl0 | vwl0 | clv04 | ecl4 | yst4 |
---|---|---|---|---|---|---|---|---|
Accuracy | 66.46 | 99.17 | 89.97 | 68.49 | 50.00 | 95.57 | 83.91 | 72.29 |
GM | 60.50 | 99.16 | 89.93 | 63.19 | 0.00 | 95.50 | 83.90 | 72.07 |
TP_{rate} | 69.68 | 98.33 | 88.62 | 73.31 | 0.00 | 94.01 | 83.78 | 73.79 |
TN_{rate} | 63.26 | 100.00 | 91.31 | 63.68 | 100.00 | 97.14 | 84.04 | 70.80 |

Metrics | Algorithms | STS | p | NH |
---|---|---|---|---|
Accuracy | SMOTE − GSA (γ = 0) | −1.96 | 0.05 | Reject |
 | SMOTE − GSA (γ = 1) | −1.96 | 0.05 | Reject |
 | SMOTE − GSA (γ = 0.5) | −1.96 | 0.05 | Reject |
GM | SMOTE − GSA (γ = 0) | 0.84 | 0.401 | Retain |
 | SMOTE − GSA (γ = 1) | −1.4 | 0.161 | Retain |
 | SMOTE − GSA (γ = 0.5) | −0.84 | 0.401 | Retain |
TP_{rate} | SMOTE − GSA (γ = 0) | 1.4 | 0.161 | Retain |
 | SMOTE − GSA (γ = 1) | −0.14 | 0.889 | Retain |
 | SMOTE − GSA (γ = 0.5) | 0.84 | 0.401 | Retain |
TN_{rate} | SMOTE − GSA (γ = 0) | −1.26 | 0.208 | Retain |
 | SMOTE − GSA (γ = 1) | −0.84 | 0.401 | Retain |
 | SMOTE − GSA (γ = 0.5) | −1.12 | 0.263 | Retain |

**Table 11.**Results of fuzzy classifier construction on features selected after using the SMOTE algorithm.

Metrics | vhc0 | nth2 | sgm0 | pbl0 | vwl0 | clv04 | ecl4 | yst4 |
---|---|---|---|---|---|---|---|---|
F. | 8.40 | 3.00 | 8.80 | 1.20 | 5.60 | 1.00 | 6.20 | 2.00 |
Acc. | 60.51 | 97.78 | 86.00 | 62.05 | 49.38 | 90.98 | 86.52 | 50.70 |
GM | 44.98 | 97.75 | 85.75 | 48.90 | 14.62 | 90.91 | 86.20 | 11.57 |
TP_{rate} | 69.22 | 100.00 | 92.26 | 71.35 | 6.25 | 93.39 | 87.67 | 40.77 |
TN_{rate} | 51.78 | 95.56 | 79.75 | 52.74 | 92.50 | 88.57 | 85.38 | 60.63 |
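The results above were obtained after SMOTE oversampling, which generates synthetic minority-class instances by interpolating between a minority sample and one of its k nearest minority neighbors. A minimal sketch in pure Python (`k = 5` and the fixed seed are illustrative choices, not taken from the paper):

```python
import math
import random

def smote(minority, n_synthetic, k=5, rng=None):
    """Minimal SMOTE: each synthetic point lies on the segment between a
    minority sample and one of its k nearest minority-class neighbors.

    `minority` is a list of feature tuples; returns `n_synthetic` new tuples.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducibility (illustrative)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest minority neighbors of x (excluding x itself)
        neighbors = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(xi + t * (ni - xi) for xi, ni in zip(x, nb)))
    return synthetic
```

Because every synthetic point is a convex combination of two existing minority samples, the oversampled set stays inside the convex hull of the original minority class.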

**Table 12.**Results of constructing fuzzy classifiers on subsets of features found by the filter after using the oversampling algorithm.

Metrics | vhc0 | nth2 | sgm0 | pbl0 | vwl0 | clv04 | ecl4 | yst4 |
---|---|---|---|---|---|---|---|---|
F. | 8.60 | 1.80 | 12.00 | 4.00 | 6.00 | 5.00 | 5.20 | 5.00 |
Acc. | 69.47 | 94.72 | 90.40 | 53.81 | 52.19 | 91.30 | 89.48 | 71.49 |
GM | 66.65 | 94.55 | 90.34 | 27.49 | 9.84 | 90.86 | 89.44 | 68.40 |
TP_{rate} | 72.75 | 89.44 | 89.08 | 62.83 | 5.00 | 86.75 | 88.78 | 71.49 |
TN_{rate} | 66.20 | 100.00 | 91.72 | 44.78 | 99.38 | 95.87 | 90.18 | 71.50 |

**Table 13.** Comparison of the results of constructing fuzzy classifiers on oversampled and original data with feature selection.

Metrics | Algorithm 1 | Algorithm 2 | Standardized Test Statistic | p-Value | Null Hypothesis |
---|---|---|---|---|---|
Features | SMOTE + RS | GSA (γ = 0) | −0.841 | 0.4 | Retain |
 | | GSA (γ = 1) | −0.631 | 0.528 | Retain |
 | | GSA (γ = 0.5) | −0.7 | 0.484 | Retain |
 | SMOTE + MI | GSA (γ = 0) | 0.983 | 0.326 | Retain |
 | | GSA (γ = 1) | 0.771 | 0.441 | Retain |
 | | GSA (γ = 0.5) | 0.84 | 0.401 | Retain |
Accuracy | SMOTE + RS | GSA (γ = 0) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 1) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 0.5) | −2.24 | 0.025 | Reject |
 | SMOTE + MI | GSA (γ = 0) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 1) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 0.5) | −2.38 | 0.017 | Reject |
GM | SMOTE + RS | GSA (γ = 0) | −0.84 | 0.401 | Retain |
 | | GSA (γ = 1) | −1.823 | 0.068 | Retain |
 | | GSA (γ = 0.5) | −1.963 | 0.05 | Reject |
 | SMOTE + MI | GSA (γ = 0) | −0.42 | 0.674 | Retain |
 | | GSA (γ = 1) | −1.82 | 0.069 | Retain |
 | | GSA (γ = 0.5) | −1.54 | 0.123 | Retain |
TP_{rate} | SMOTE + RS | GSA (γ = 0) | 1.26 | 0.208 | Retain |
 | | GSA (γ = 1) | −0.98 | 0.327 | Retain |
 | | GSA (γ = 0.5) | 0 | 1 | Retain |
 | SMOTE + MI | GSA (γ = 0) | 0.98 | 0.327 | Retain |
 | | GSA (γ = 1) | −0.84 | 0.401 | Retain |
 | | GSA (γ = 0.5) | 0.14 | 0.889 | Retain |
TN_{rate} | SMOTE + RS | GSA (γ = 0) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 1) | −2.521 | 0.012 | Reject |
 | | GSA (γ = 0.5) | −2.521 | 0.012 | Reject |
 | SMOTE + MI | GSA (γ = 0) | −2.028 | 0.043 | Reject |
 | | GSA (γ = 1) | −1.68 | 0.093 | Retain |
 | | GSA (γ = 0.5) | −1.68 | 0.093 | Retain |

Data Sets | Classification Algorithms | | | | | | | | | Fuzzy Classifiers | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
vhc0 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 64.9 | 96.6 | 93.6 | 98.1 | 96.8 | 94.8 | 96.2 | 95.6 | 96.5 | 85.1 | 78.3 | 84.0 |
GM | 70.7 | 95.6 | 91.7 | 97.4 | 96.0 | 92.3 | 95.6 | 94.1 | 95.0 | 66.9 | 82.9 | 80.6 |
TP_{rate} | 85.4 | 94.0 | 88.4 | 96.0 | 94.5 | 87.9 | 94.5 | 91.5 | 92.5 | 46.4 | 93.3 | 75.9 |
TN_{rate} | 58.6 | 97.4 | 95.2 | 98.8 | 97.5 | 96.9 | 96.8 | 96.9 | 97.7 | 97.0 | 73.6 | 86.5 |
nth2 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 96.3 | 98.1 | 95.8 | 97.7 | 98.1 | 98.1 | 99.1 | 96.7 | 97.2 | 99.5 | 99.8 | 99.5 |
GM | 96.5 | 95.3 | 91.1 | 95.0 | 96.5 | 95.1 | 98.2 | 91.5 | 91.6 | 98.5 | 99.5 | 99.3 |
TP_{rate} | 97.1 | 91.4 | 85.7 | 91.4 | 94.3 | 91.4 | 97.1 | 85.7 | 85.7 | 97.1 | 99.0 | 99.0 |
TN_{rate} | 96.1 | 99.4 | 97.8 | 98.9 | 98.9 | 99.4 | 99.4 | 98.9 | 99.4 | 100.0 | 100.0 | 99.6 |
sgm0 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 83.3 | 99.7 | 99.2 | 99.7 | 96.8 | 99.3 | 99.6 | 99.4 | 99.3 | 98.9 | 99.1 | 99.0 |
GM | 89.2 | 99.3 | 98.4 | 98.4 | 97.6 | 99.1 | 99.1 | 98.5 | 98.3 | 98.2 | 99.1 | 98.7 |
TP_{rate} | 98.5 | 98.8 | 97.3 | 99.1 | 99.1 | 98.8 | 98.5 | 97.3 | 97.0 | 97.3 | 99.1 | 98.2 |
TN_{rate} | 80.7 | 99.9 | 99.5 | 99.8 | 96.4 | 99.4 | 99.7 | 99.8 | 99.7 | 99.2 | 99.1 | 99.2 |
pbl0 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 88.7 | 94.1 | 95.3 | 95.4 | 93.9 | 95.3 | 94.3 | 95.7 | 96.5 | 94.8 | 91.8 | 92.9 |
GM | 65.4 | 74.9 | 85.1 | 86.1 | 71.8 | 83.4 | 84.1 | 86.3 | 87.7 | 77.1 | 84.9 | 81.5 |
TP_{rate} | 47.4 | 58.1 | 74.6 | 76.0 | 53.3 | 71.2 | 73.5 | 76.4 | 79.2 | 60.3 | 77.2 | 70.6 |
TN_{rate} | 93.4 | 98.1 | 97.7 | 97.6 | 98.5 | 98.1 | 96.6 | 97.9 | 98.4 | 98.7 | 93.4 | 95.4 |
vwl0 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 93.7 | 91.2 | 95.2 | 94.7 | 89.3 | 94.4 | 96.2 | 95.5 | 97.0 | 96.4 | 98.2 | 97.7 |
GM | 87.3 | 71.0 | 86.2 | 80.5 | 65.8 | 78.3 | 81.7 | 78.5 | 84.2 | 82.8 | 98.1 | 97.4 |
TP_{rate} | 81.1 | 58.9 | 77.8 | 73.3 | 55.6 | 63.3 | 68.9 | 63.3 | 74.4 | 70.0 | 98.1 | 97.0 |
TN_{rate} | 95.0 | 94.4 | 97.0 | 96.9 | 92.7 | 97.6 | 98.9 | 98.8 | 99.2 | 99.0 | 98.2 | 97.8 |
clv04 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 87.9 | 95.4 | 91.3 | 95.9 | 92.5 | 95.9 | 93.1 | 93.1 | 93.0 | 94.7 | 93.0 | 93.0 |
GM | 84.9 | 82.0 | 60.1 | 80.2 | 60.8 | 80.2 | 55.2 | 37.0 | 45.5 | 55.0 | 85.2 | 86.2 |
TP_{rate} | 84.6 | 69.2 | 46.2 | 69.2 | 46.2 | 69.2 | 38.5 | 23.1 | 53.8 | 41.0 | 76.9 | 82.1 |
TN_{rate} | 88.1 | 97.5 | 95.0 | 98.1 | 96.3 | 98.1 | 97.5 | 98.8 | 96.3 | 99.0 | 94.3 | 93.9 |
ecl4 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 81.2 | 93.4 | 94.6 | 94.0 | 94.0 | 93.4 | 95.8 | 96.7 | 96.7 | 98.7 | 96.1 | 97.9 |
GM | 83.9 | 85.7 | 74.6 | 83.6 | 88.6 | 85.7 | 81.8 | 78.2 | 84.5 | 88.9 | 90.1 | 93.9 |
TP_{rate} | 95.0 | 80.0 | 60.0 | 75.0 | 85.0 | 80.0 | 70.0 | 65.0 | 75.0 | 80.0 | 85.0 | 90.0 |
TN_{rate} | 80.4 | 94.3 | 96.8 | 95.3 | 94.6 | 94.3 | 97.5 | 98.7 | 98.1 | 99.9 | 96.8 | 98.4 |
yst4 | GNB | LR | DT | MLP | LSV | 3NN | AB | RF | GB | γ = 0 | γ = 1 | γ = 0.5 |
Acc. | 16.0 | 96.6 | 96.0 | 95.3 | 96.5 | 96.8 | 96.4 | 96.0 | 96.4 | 95.6 | 84.1 | 91.2 |
GM | 34.6 | 30.1 | 54.7 | 44.7 | 6.3 | 47.5 | 47.2 | 30.1 | 45.1 | 19.7 | 79.9 | 80.4 |
TP_{rate} | 96.1 | 11.8 | 31.4 | 27.5 | 2.0 | 23.5 | 23.5 | 11.8 | 23.5 | 9.8 | 76.5 | 70.6 |
TN_{rate} | 13.1 | 99.7 | 98.3 | 97.7 | 99.9 | 99.4 | 99.0 | 99.0 | 99.0 | 98.7 | 84.3 | 91.9 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bardamova, M.; Hodashinsky, I.; Konev, A.; Shelupanov, A.
Application of the Gravitational Search Algorithm for Constructing Fuzzy Classifiers of Imbalanced Data. *Symmetry* **2019**, *11*, 1458.
https://doi.org/10.3390/sym11121458
