Improving the Performance of an Associative Classifier in the Context of Class-Imbalanced Classification
Abstract
1. Introduction
2. Previous Works
2.1. Lernmatrix
2.1.1. Learning Phase
2.1.2. Recalling Phase
2.2. Linear Associator
2.2.1. Learning Phase of the Linear Associator
- For each input pattern xµ, compute the matrix as detailed in Equations (8) and (9).
- Sum the p matrices to obtain the memory M = Σ_{µ=1}^{p} y^µ (x^µ)^T.
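As an illustration (not the paper's code), the two phases of the Linear Associator can be sketched in NumPy: learning sums the outer products y^µ (x^µ)^T, and recall is a single matrix-vector product. With orthonormal inputs the recall is exact:

```python
import numpy as np

# Illustrative sketch: the Linear Associator stores p pairs (x^mu, y^mu)
# as a sum of outer products, M = sum_mu y^mu (x^mu)^T.
def linear_associator_learn(X, Y):
    """X: (p, n) input patterns; Y: (p, m) output patterns."""
    return sum(np.outer(y, x) for x, y in zip(X, Y))

def linear_associator_recall(M, x):
    """Recall is a single matrix-vector product, y = M x."""
    return M @ x

# With orthonormal inputs, recall is exact:
X = np.eye(3)                                  # three orthonormal inputs
Y = np.array([[1., 0.], [0., 1.], [1., 1.]])   # arbitrary outputs
M = linear_associator_learn(X, Y)
print(linear_associator_recall(M, X[0]))       # → [1. 0.]
```

When the input patterns are not orthonormal, recall is only approximate, which motivates the hybrid schemes discussed next.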
2.2.2. Recalling Phase of Linear Associator
2.3. CHAT
Algorithm 1. CHAT algorithm.
Input: dataset, input_pattern, p
Output: recovered_pattern
Initialize translation_vector = [0, 0, …, 0] (n zeros)
for pattern in dataset
    assign a one-hot vector according to its class
end for
for pattern in dataset
    translation_vector = translation_vector + pattern
end for
translation_vector = translation_vector / p
pattern_translation = dataset − translation_vector
associative_memory = Learning phase from Linear Associator (pattern_translation)
recovered_pattern = Lernmatrix recalling phase (associative_memory, input_pattern − translation_vector)

Sub-routine 1: Learning phase from Linear Associator
Input: dataset
Output: associative_memory
Initialize associative_memory = [[ ]]
for pattern in dataset
    matrix = one-hot_vector * transpose(pattern)
    associative_memory = associative_memory + matrix
end for

Sub-routine 2: Lernmatrix recalling phase
Input: associative_memory, input_pattern
Output: recovered_pattern
Initialize recovered_pattern = []
pattern = multiply_matrix(input_pattern, associative_memory)
max_value = maximumValue(pattern)
for i in range(0, len(pattern))
    if pattern[i] == max_value
        recovered_pattern[i] = 1
    else
        recovered_pattern[i] = 0
    end if
end for
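The listing above can be made runnable. The following Python version is one reading of Algorithm 1; the function names `chat_train` and `chat_classify` and the array layout are not from the paper, but the steps mirror the pseudocode: translate the data by the mean vector, learn a Linear Associator against one-hot class vectors, and recall in Lernmatrix fashion by marking the maximum responses.

```python
import numpy as np

# Sketch of CHAT: mean-vector translation + Linear Associator learning
# + Lernmatrix-style (argmax) recall. Names are illustrative.
def chat_train(X, labels):
    """X: (p, n) training patterns; labels: (p,) integer class labels."""
    classes = np.unique(labels)
    Y = np.eye(len(classes))[np.searchsorted(classes, labels)]  # one-hot rows
    t = X.mean(axis=0)        # translation vector: mean of the p patterns
    M = Y.T @ (X - t)         # Linear Associator: sum of outer products
    return M, t, classes

def chat_classify(M, t, classes, x):
    y = M @ (x - t)                            # Linear Associator recall
    recovered = (y == y.max()).astype(int)     # Lernmatrix recall: 1 at maxima
    return classes[int(np.argmax(recovered))]
```

Translating the patterns by the mean vector is what lets the Lernmatrix-style argmax recall operate on real-valued data; the proposed methodology in Section 3 replaces this fixed mean with a searched translation vector.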
3. Proposed Methodology
Extreme Center Points
1. Generate the extreme center points for each attribute of the training dataset. The generation of these points constructs a mesh for exploring a wide range of values per attribute, in order to select the translation vector that best fits the CHAT algorithm for pattern classification. The 7-point version of the ECP model, ECP (7), considers the following points:
- Minimum value of attribute;
- Minimum value of attribute + one standard deviation;
- Mean value of attribute − one standard deviation;
- Mean value of attribute;
- Mean value of attribute + one standard deviation;
- Maximum value of attribute − one standard deviation;
- Maximum value of attribute.
For the ECP (9), the attribute points for the search space are the following:
- Minimum value of attribute − one standard deviation;
- Minimum value of attribute;
- Minimum value of attribute + one standard deviation;
- Mean value of attribute − one standard deviation;
- Mean value of attribute;
- Mean value of attribute + one standard deviation;
- Maximum value of attribute − one standard deviation;
- Maximum value of attribute;
- Maximum value of attribute + one standard deviation.
Note that the proposed methodology obtains results that are better than, or at worst equal to, those of the original CHAT model, since the mean vector, i.e., the original choice of translation vector, is included in the search space of translation vectors.
2. Generate all possible combinations of the ECP values over the n attributes of the training dataset. Every combination represents a candidate translation vector for the original CHAT algorithm.
3. Test all candidate solutions in the resulting search space, i.e., evaluate the CHAT algorithm using each of the points generated in the previous step as the translation vector, and select the point that most improves the classification results. This point is called the center point (CP).
4. To refine the values used to generate the translation vector, a neighborhood of the center point is then analyzed: a finer-grained spatial search is performed within ±1 standard deviation of the CP. That is, for each attribute of the center point, n additional points are equally distributed around it (in this work, n = 10). This new set of points, the deep points (DP), is distributed for each attribute as depicted in Figure 1.
5. Afterwards, the CHAT is re-evaluated using the DP as candidate translation vectors. To this end, steps 2 and 3 of the proposed method are repeated with the newly obtained deep points. Finally, the best point is selected as the translation vector to be used for classifying unknown patterns with the CHAT algorithm.
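Steps 1-5 can be sketched as follows. This is an illustrative implementation under my own naming; the scoring callback `evaluate` is an assumption standing in for a CHAT classification score computed with the candidate translation vector.

```python
import numpy as np
from itertools import product

# Sketch of the ECP(7) search: build a mesh of candidate translation
# vectors, pick the best (the center point), then refine within +/- 1
# standard deviation using n_deep equally spaced "deep points" per attribute.
def ecp7_points(col):
    mn, mx, mu, sd = col.min(), col.max(), col.mean(), col.std()
    return [mn, mn + sd, mu - sd, mu, mu + sd, mx - sd, mx]

def ecp_search(X, evaluate, n_deep=10):
    # Steps 1-2: mesh of candidate translation vectors (7 points per attribute).
    grids = [ecp7_points(X[:, j]) for j in range(X.shape[1])]
    candidates = [np.array(c) for c in product(*grids)]
    # Step 3: the best-scoring candidate becomes the center point (CP).
    cp = max(candidates, key=evaluate)
    # Step 4: deep points, n_deep values within +/- 1 std around each CP entry.
    sds = X.std(axis=0)
    deep = [np.linspace(cp[j] - sds[j], cp[j] + sds[j], n_deep)
            for j in range(X.shape[1])]
    # Step 5: re-evaluate and keep the best translation vector found.
    return max((np.array(c) for c in product(*deep)), key=evaluate)
```

Since the mesh has 7^n combinations (9^n for ECP (9)), this exhaustive search is only practical for low-dimensional data, such as the 3- to 7-attribute datasets used in the experiments.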
4. Results and Discussion
4.1. Datasets
4.2. Classifiers
4.3. Validation Method
4.4. Performance Evaluation Metrics
4.5. Classification Results
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Burkart, N.; Huber, M.F. A Survey on the Explainability of Supervised Machine Learning. J. Artif. Intell. Res. 2021, 70, 245–317. [Google Scholar] [CrossRef]
- Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification, 2nd ed.; John Wiley & Sons: New York, NY, USA, 2001; pp. 20–450. [Google Scholar]
- Wolpert, D.H.; Macready, W.G. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1997, 1, 67–82. [Google Scholar] [CrossRef] [Green Version]
- Adam, S.P.; Alexandropoulos, S.-A.N.; Pardalos, P.M.; Vrahatis, M.N. No Free Lunch Theorem: A Review. In Dynamics of Disasters; Springer Science and Business Media LLC: Berlin/Heidelberg, Germany, 2019; Volume 145, pp. 57–82. [Google Scholar]
- Ruan, S.; Li, H.; Li, C.; Song, K. Class-Specific Deep Feature Weighting for Naïve Bayes Text Classifiers. IEEE Access 2020, 8, 20151–20159. [Google Scholar] [CrossRef]
- Paranjape, P.; Dhabu, M.; Deshpande, P. A novel classifier for multivariate instance using graph class signatures. Front. Comput. Sci. 2020, 14, 144307. [Google Scholar] [CrossRef]
- Fernández, A.; López, V.; Galar, M.; del Jesus, M.J.; Herrera, F. Analysing the classification of unbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches. Knowl.-Based Syst. 2013, 42, 97–110. [Google Scholar] [CrossRef]
- Mullick, S.S.; Datta, S.; Dhekane, S.G.; Das, S. Appropriateness of performance indices for imbalanced data classification: An analysis. Pattern Recognit. 2020, 102, 107197. [Google Scholar] [CrossRef]
- Karpov, Y.L.; Karpov, L.E.; Smetanin, Y.G. Some Aspects of Associative Memory Construction Based on a Hopfield Network. Program. Comput. Softw. 2020, 46, 305–311. [Google Scholar] [CrossRef]
- Steinbuch, K. Die Lernmatrix. Biol. Cybern. 1961, 1, 36–45. [Google Scholar] [CrossRef]
- Kohonen, T. Correlation Matrix Memories. IEEE Trans. Comput. 1972, 21, 353–359. [Google Scholar] [CrossRef]
- Anderson, J.A. A simple neural network generating an interactive memory. Math. Biosci. 1972, 14, 197–220. [Google Scholar] [CrossRef]
- Reid, R.; Frame, J. Convergence in Iteratively Formed Correlation Matrix Memories. IEEE Trans. Comput. 1975, C-24, 827–830. [Google Scholar] [CrossRef]
- Turner, M.; Austin, J. Matching performance of binary correlation matrix memories. Neural Netw. 1997, 10, 1637–1648. [Google Scholar] [CrossRef]
- Austin, J.; Lees, K. A search engine based on neural correlation matrix memories. Neurocomputing 2000, 35, 55–72. [Google Scholar] [CrossRef]
- Santiago-Montero, R. Clasificador Híbrido de Patrones Basado en la Lernmatrix de Steinbuch y en el Linear Associator de Anderson-Kohonen. Master Thesis, Centro de Investigación en Computación del Instituto Politécnico Nacional, Ciudad de México, México, 2003. [Google Scholar]
- Uriarte-Arcia, A.V.; López-Yáñez, I.; Yáñez-Márquez, C. One-hot vector hybrid associative classifier for medical data classification. PLoS ONE 2014, 9, e95715. [Google Scholar]
- Cleofas-Sánchez, L.; Sánchez, J.S.; García, V.; Valdovinos, R.M. Associative learning on imbalanced environments: An empirical study. Expert Syst. Appl. 2016, 54, 387–397. [Google Scholar] [CrossRef] [Green Version]
- Zhang, S. Cost-sensitive KNN classification. Neurocomputing 2020, 391, 234–242. [Google Scholar] [CrossRef]
- Gopi, A.P.; Jyothi, R.N.S.; Narayana, V.L.; Sandeep, K.S. Classification of tweets data based on polarity using improved RBF kernel of SVM. Int. J. Inf. Technol. 2020, 1–16. [Google Scholar] [CrossRef]
- Shi, B.; Liu, J. Nonlinear metric learning for kNN and SVMs through geometric transformations. Neurocomputing 2018, 318, 18–29. [Google Scholar] [CrossRef] [Green Version]
- Zhao, Z.; Xu, S.; Kang, B.H.; Kabir, M.M.J.; Liu, Y.; Wasinger, R. Investigation and improvement of multi-layer perceptron neural networks for credit scoring. Expert Syst. Appl. 2015, 42, 3508–3516. [Google Scholar] [CrossRef]
- Hassoun, M.H. Associative Neural Memories, 1st ed.; Oxford University Press, Inc: Ann Arbor, MI, USA, 1993. [Google Scholar]
- Velázquez-Rodríguez, J.-L.; Villuendas-Rey, Y.; Camacho-Nieto, O.; Yáñez-Márquez, C. A Novel and Simple Mathematical Transform Improves the Performance of Lernmatrix in Pattern Classification. Mathematics 2020, 8, 732. [Google Scholar] [CrossRef]
- Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I.H. The WEKA data mining software: An update. ACM SIGKDD Explor. Newsl. 2009, 11, 10–18. [Google Scholar] [CrossRef]
- Tsalera, E.; Papadakis, A.; Samarakou, M. Monitoring, profiling and classification of urban environmental noise using sound characteristics and the KNN algorithm. Energy Rep. 2020, 6, 223–230. [Google Scholar] [CrossRef]
- Luo, Y.; Xiong, Z.; Xia, S.; Tan, H.; Gou, J. Classification noise detection based SMO algorithm. Optik 2016, 127, 7021–7029. [Google Scholar] [CrossRef]
- Hoffmann, L.F.S.; Bizarria, F.C.P.; Bizarria, J.W.P. Detection of liner surface defects in solid rocket motors using multi-layer perceptron neural networks. Polym. Test. 2020, 88, 106559. [Google Scholar] [CrossRef]
- Toneva, D.H.; Nikolova, S.Y.; Agre, G.P.; Zlatareva, D.K.; Hadjidekov, V.G.; Lazarov, N.E. Data mining for sex estimation based on cranial measurements. Forensic Sci. Int. 2020, 315, 110441. [Google Scholar] [CrossRef] [PubMed]
- Andrejiova, M.; Grincova, A. Classification of impact damage on a rubber-textile conveyor belt using Naïve-Bayes methodology. Wear 2018, 414–415, 59–67. [Google Scholar] [CrossRef]
- Mohanty, M.; Sahoo, S.; Biswal, P.; Sabut, S. Efficient classification of ventricular arrhythmias using feature selection and C4.5 classifier. Biomed. Signal Process. Control. 2018, 44, 200–208. [Google Scholar] [CrossRef]
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
- Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 1937, 32, 675–701. [Google Scholar] [CrossRef]
Dataset | Attributes | Patterns | IR* |
---|---|---|---|
Haberman | 3 | 306 | 2.78 |
New-thyroid1 | 5 | 215 | 5.14 |
Iris0 | 4 | 150 | 2 |
E. coli (imbalanced: 0–4–6 vs. 5) | 6 | 203 | 9.15 |
E. coli (imbalanced: 0–1 vs. 5) | 6 | 240 | 11 |
E. coli (imbalanced: 0–6–7 vs. 5) | 6 | 220 | 10 |
E. coli (imbalanced: 0–1–4–7 vs. 5–6) | 6 | 332 | 12.28 |
E. coli (imbalanced: 0–1–4–6 vs. 5) | 6 | 280 | 13 |
E. coli (imbalanced: 2–6 vs. 0–1–3–7) | 7 | 281 | 39.14 |
LED display domain (imbalanced: 0–2–4–5–6–7–8–9 vs. 1) | 7 | 443 | 10.97 |
Hayes-Roth | 5 | 160 | 1.7 |
Balance scale | 4 | 625 | 5.88 |
Dataset | CHAT ECP (9) | CHAT ECP (7) | CHAT (Original) | IB1 | IB3 | IB5 | JRip | Random Forest | SMO | Naive Bayes | MLP | J48 (C4.5) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Haberman | 0.635 | 0.635 | 0.632 | 0.557 | 0.556 | 0.529 | 0.598 | 0.541 | 0.498 | 0.588 | 0.595 | 0.635 |
New-thyroid1 | 0.886 | 0.886 | 0.746 | 0.98 | 0.937 | 0.937 | 0.926 | 0.929 | 0.786 | 0.989 | 0.966 | 0.98 |
Iris0 | 0.98 | 0.99 | 0.96 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0.99 |
E. coli (imbalanced: 0–4–6 vs. 5) | 0.897 | 0.897 | 0.809 | 0.872 | 0.922 | 0.922 | 0.839 | 0.87 | 0.847 | 0.881 | 0.892 | 0.756 |
E. coli (imbalanced: 0–1 vs. 5) | 0.923 | 0.923 | 0.775 | 0.87 | 0.923 | 0.898 | 0.832 | 0.868 | 0.85 | 0.923 | 0.895 | 0.782 |
E. coli (imbalanced: 0–6–7 vs. 5) | 0.89 | 0.89 | 0.798 | 0.84 | 0.82 | 0.848 | 0.868 | 0.873 | 0.8 | 0.878 | 0.87 | 0.843 |
E. coli (imbalanced: 0–1–4–7 vs. 5–6) | 0.918 | 0.918 | 0.792 | 0.87 | 0.873 | 0.915 | 0.789 | 0.817 | 0.778 | 0.853 | 0.897 | 0.917 |
E. coli (imbalanced: 0–1–4–6 vs. 5) | 0.925 | 0.925 | 0.777 | 0.873 | 0.923 | 0.898 | 0.763 | 0.844 | 0.798 | 0.888 | 0.84 | 0.752 |
E. coli (imbalanced: 2–6 vs. 0–1–3–7) | 0.857 | 0.857 | 0.772 | 0.848 | 0.852 | 0.853 | 0.852 | 0.784 | 0.852 | 0.853 | 0.855 | 0.855 |
LED display domain (imbalanced: 0–2–4–5–6–7–8–9 vs. 1) | 0.87 | 0.87 | 0.823 | 0.909 | 0.91 | 0.91 | 0.893 | 0.896 | 0.88 | 0.849 | 0.863 | 0.88 |
Hayes-Roth | 0.556 | 0.556 | 0.516 | 0.744 | 0.373 | 0.288 | 0.784 | 0.817 | 0.567 | 0.732 | 0.751 | 0.784 |
Balance scale | 0.618 | 0.618 | 0.618 | 0.63 | 0.63 | 0.643 | 0.582 | 0.58 | 0.639 | 0.657 | 0.828 | 0.563 |
Classifier | Mean Ranks 1 |
---|---|
CHAT-ECP (7) | 4.833 |
CHAT-ECP (9) | 4.958 |
Naive Bayes | 5.125 |
MLP | 5.167 |
IB5 | 5.375 |
IB3 | 5.667 |
IB1 | 6.208 |
Random Forest | 7.042 |
J48 (C4.5) | 7.042 |
JRip | 7.583 |
SMO | 8.583 |
CHAT (original) | 10.417 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Rolón-González, C.A.; Castañón-Méndez, R.; Alarcón-Paredes, A.; López-Yáñez, I.; Yáñez-Márquez, C. Improving the Performance of an Associative Classifier in the Context of Class-Imbalanced Classification. Electronics 2021, 10, 1095. https://doi.org/10.3390/electronics10091095