# Intelligent Clustering and Dynamic Incremental Learning to Generate Multi-Codebook Fuzzy Neural Network for Multi-Modal Data Classification

## Abstract


## 1. Introduction

## 2. Related Works

## 3. Fuzzy-Neuro Generalized Learning Vector Quantization (FNGLVQ)

LVQ assigns an input vector to the winner class ($w_1$), the class whose weight (codebook/reference vector) is the closest to the input vector. During the training process, the weight of the winner vector is updated: if the winner class is the same as the class of the input vector, its weight is adjusted closer to the input vector; otherwise, its weight is adjusted further from the input vector. LVQ2.1 modifies LVQ by adding an update rule for the runner up class ($w_2$), the class whose weight is closest to the input vector but comes from a different class label. GLVQ and FNGLVQ are enhancements of LVQ2.1 that use the same approach: both use the winner and runner up classes in their training processes, and during training they minimize the misclassification error.

In FNGLVQ, each class is represented by a codebook $w = (w_1, w_2, w_3, \ldots, w_n)$, where $n$ is the number of features of the dataset. Please note that $w_1$ and $w_2$ in this context are the weights of features 1 and 2 of their class, not the symbols of the winner and runner up vectors, as discussed before. Each weight has three values ($w_{i\,min}$, $w_{i\,mean}$, $w_{i\,max}$) because the method uses fuzzy membership. In the learning process, the method computes the distance between the input vector and the weights of each class. As the data have $n$ features, the distance is measured for every feature and then averaged. The method then finds the winner and runner up classes based on the mean distance from the classes to the input vector. Last, the method updates the weights of the winner and runner up classes. The GLVQ classifier defines the misclassification error as written in the equation below (reconstructed here from the update rules that follow):$$\phi \left(x\right)=\frac{{\mu}_{2}-{\mu}_{1}}{2-{\mu}_{1}-{\mu}_{2}}$$

Here, $\mu_1$ is the similarity value between the input sample (input vector) and the winner vector, whereas $\mu_2$ is the similarity between the input sample and the runner up vector. The winner vector is the closest existing reference vector (codebook) from the same class ($C_x = C_{w1}$), whereas the runner up vector is the closest existing reference vector from a different class ($C_x \ne C_{w2}$). The similarity value is computed by the triangular fuzzy membership function below:$$\mu =\left\{\begin{array}{ll}\frac{x-{w}_{min}}{{w}_{mean}-{w}_{min}}, & {w}_{min}<x\le {w}_{mean}\\ \frac{{w}_{max}-x}{{w}_{max}-{w}_{mean}}, & {w}_{mean}<x<{w}_{max}\\ 0, & \mathrm{otherwise}\end{array}\right.$$

Each weight of the fuzzy codebook is represented by three values ($w_{min}$, $w_{mean}$, $w_{max}$). The derivation of the update rule for $w_{mean}$ is divided into three conditions and leads to the FNGLVQ learning (training) formulas, as follows:

- If ${w}_{min}<x\le {w}_{mean}$:$${w}_{1}\left(t+1\right)\leftarrow {w}_{1}\left(t\right)-\alpha \cdot \frac{\delta f}{\delta \phi}\cdot \frac{2\left(1-{\mu}_{2}\right)}{{\left(2-{\mu}_{1}-{\mu}_{2}\right)}^{2}}\cdot \left(\frac{x-{w}_{min}}{{\left({w}_{mean}-{w}_{min}\right)}^{2}}\right)$$$${w}_{2}\left(t+1\right)\leftarrow {w}_{2}\left(t\right)+\alpha \cdot \frac{\delta f}{\delta \phi}\cdot \frac{2\left(1-{\mu}_{1}\right)}{{\left(2-{\mu}_{1}-{\mu}_{2}\right)}^{2}}\cdot \left(\frac{x-{w}_{min}}{{\left({w}_{mean}-{w}_{min}\right)}^{2}}\right)$$
- If ${w}_{mean}<x<{w}_{max}$:$${w}_{1}\left(t+1\right)\leftarrow {w}_{1}\left(t\right)+\alpha \cdot \frac{\delta f}{\delta \phi}\cdot \frac{2\left(1-{\mu}_{2}\right)}{{\left(2-{\mu}_{1}-{\mu}_{2}\right)}^{2}}\cdot \left(\frac{{w}_{max}-x}{{\left({w}_{max}-{w}_{mean}\right)}^{2}}\right)$$$${w}_{2}\left(t+1\right)\leftarrow {w}_{2}\left(t\right)-\alpha \cdot \frac{\delta f}{\delta \phi}\cdot \frac{2\left(1-{\mu}_{1}\right)}{{\left(2-{\mu}_{1}-{\mu}_{2}\right)}^{2}}\cdot \left(\frac{{w}_{max}-x}{{\left({w}_{max}-{w}_{mean}\right)}^{2}}\right)$$
- If $x\le {w}_{min}$ or $x\ge {w}_{max}$:$${w}_{i}\left(t+1\right)\leftarrow {w}_{i}\left(t\right),\quad i=1,2$$
Here, $w_1$ (the winner class) is the closest reference vector (codebook) from the same class as the input vector ($C_x = C_{w1}$), and $w_2$ (the runner up class) is the closest reference vector (codebook) from a different class. The updates for the other two values of the fuzzy membership ($w_{min}$ and $w_{max}$) are conducted by using the equations below, so that the triangle keeps its shape around the new mean:$${w}_{min}\leftarrow {w}_{mean}\left(t+1\right)-\left({w}_{mean}\left(t\right)-{w}_{min}\left(t\right)\right)$$$${w}_{max}\leftarrow {w}_{mean}\left(t+1\right)+\left({w}_{max}\left(t\right)-{w}_{mean}\left(t\right)\right)$$
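The per-feature update above can be sketched in code. A minimal numpy sketch for a single scalar feature, taking $\delta f/\delta \phi = 1$ and folding the winner/runner-up sign difference into one loop; the function names and the `alpha` default are illustrative, not from the paper:

```python
import numpy as np

def tri_mu(x, w):
    """Triangular fuzzy membership of scalar x in codebook w = (w_min, w_mean, w_max)."""
    w_min, w_mean, w_max = w
    if w_min < x <= w_mean:
        return (x - w_min) / (w_mean - w_min)
    if w_mean < x < w_max:
        return (w_max - x) / (w_max - w_mean)
    return 0.0

def fnglvq_step(x, w1, w2, alpha=0.05):
    """One FNGLVQ update on a scalar feature.

    w1 is the winner codebook (same class as x), w2 the runner-up
    (closest codebook of another class); both are 3-element arrays
    [w_min, w_mean, w_max]. Returns the two updated codebooks.
    """
    mu1, mu2 = tri_mu(x, w1), tri_mu(x, w2)
    denom = (2.0 - mu1 - mu2) ** 2
    out = []
    for w, mu_other, sign in ((w1, mu2, +1.0), (w2, mu1, -1.0)):
        w_min, w_mean, w_max = w
        factor = alpha * 2.0 * (1.0 - mu_other) / denom  # df/dphi taken as 1
        if w_min < x <= w_mean:
            # winner moves down toward x, runner-up is pushed up
            step = -sign * factor * (x - w_min) / (w_mean - w_min) ** 2
        elif w_mean < x < w_max:
            # winner moves up toward x, runner-up is pushed down
            step = sign * factor * (w_max - x) / (w_max - w_mean) ** 2
        else:
            step = 0.0  # x outside the triangle: codebook unchanged
        new_mean = w_mean + step
        # shift min/max so the triangle keeps its width around the new mean
        out.append(np.array([new_mean - (w_mean - w_min),
                             new_mean,
                             new_mean + (w_max - w_mean)]))
    return out
```

Note how a single `sign` flips the attraction applied to the winner into the repulsion applied to the runner-up, exactly mirroring the paired update equations.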

In addition, the method adjusts $w_{min}$ and $w_{max}$ to gain better performance, as follows: if ($\mu_1 > 0$ or $\mu_2 > 0$) and $\phi < 0$, then the method increases the width of the fuzzy triangle.

- Given that the training set consists of m record instances (X1, X2, …, Xm), each instance is a vector of n elements because the data have n features.
- Initiate the codebook (reference vector/weight) of each class (w) by a random selection from the training set of the respective class.
- For each instance of the training data, train the weights of the classes by using Equation (3) and Equations (7)–(20).
- Repeat the previous step until the number of iterations (epochs) N is reached.
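The training steps above require, for every instance, a search for the winner and runner-up codebooks. A small sketch, assuming each codebook is an `(n_features, 3)` array of (min, mean, max) triples and that similarity is the mean triangular membership over the features; the helper names are hypothetical:

```python
import numpy as np

def tri_mu(x, w):
    """Mean triangular membership of instance x over all features.

    w has shape (n_features, 3); its columns are (w_min, w_mean, w_max).
    """
    lo, mid, hi = w[:, 0], w[:, 1], w[:, 2]
    rising = (x > lo) & (x <= mid)
    falling = (x > mid) & (x < hi)
    mu = np.zeros_like(x, dtype=float)
    mu[rising] = (x[rising] - lo[rising]) / (mid[rising] - lo[rising])
    mu[falling] = (hi[falling] - x[falling]) / (hi[falling] - mid[falling])
    return mu.mean()  # average the per-feature similarities

def winner_runner_up(x, codebooks, labels, y):
    """Winner = most similar codebook of class y; runner-up = most
    similar codebook of any other class. Returns their indices."""
    mus = np.array([tri_mu(x, w) for w in codebooks])
    same = np.array(labels) == y
    return (int(np.flatnonzero(same)[np.argmax(mus[same])]),
            int(np.flatnonzero(~same)[np.argmax(mus[~same])]))
```

An epoch then amounts to looping over the training instances, calling `winner_runner_up`, and applying the per-feature update rules to the two selected codebooks.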

## 4. Intelligent Clustering and Dynamic Incremental Learning

#### 4.1. Intelligent Clustering

#### 4.1.1. Intelligent K-means Based on Anomalous Pattern

**Algorithm 1:** Intelligent K-Means Clustering based on Anomalous Pattern

```
01: procedure IK-MEANS-AnomalousPattern
02:   setting:
03:     t = 1, E_t = original data set on feature space
04:     denote R as the threshold for small cluster removal
05:   find anomalous pattern:
06:     apply procedure Find Anomalous Pattern for each t or at t = 1 (recommended)
07:   control statement:
08:     if S_t != E_t then              // there is a possibility to build a new cluster
09:       E_t = E_t - S_t
10:       t += 1
11:       go to step setting (line 02)
12:   small cluster removal:
13:     for i = 1:T do                  // T is the number of clusters
14:       if |C_i| < R then
15:         remove C_i
16:     end for
17:     denote the remaining clusters C_1, C_2, C_3, ..., C_n and their centroids c_1, c_2, c_3, ..., c_n
18:     for i = 1:T' do                 // T' is the number of remaining clusters
19:       do K-Means for K_i with c_i as initial seed
20:     end for
21:
22: procedure Find Anomalous Pattern
23:   preprocessing step:
24:     denote the reference point x = x_1, x_2, x_3, ..., x_n
25:     normalize the original data following the defined standard    // this step is not mandatory
26:   initial setup:
27:     find the most distant point k as the tentative centroid
28:   cluster update:
29:     determine cluster list L around k against the only other centroid: y_t is assigned to S if d(y_t, c_t) < d(y_t, c_0)
30:   centroid update:
31:     calculate k', the mean value within L
32:     if k' != k then
33:       k = k'
34:       go to step cluster update
35:     else
36:       go to step output
37:   output:
38:     return list L and its centroid k as the anomalous pattern
```
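A compact numpy sketch of the Find Anomalous Pattern procedure (lines 22–38), assuming Euclidean distance and the grand mean as the default reference point; function and variable names are illustrative:

```python
import numpy as np

def anomalous_pattern(data, reference=None):
    """Extract one anomalous-pattern cluster.

    Start from the point farthest from the reference point (the grand
    mean by default), then alternate assignment (each point joins the
    tentative centroid only if it is closer to it than to the reference)
    and centroid recomputation until the centroid stops moving.
    """
    data = np.asarray(data, dtype=float)
    c0 = data.mean(axis=0) if reference is None else np.asarray(reference, float)
    # the most distant point from the reference is the tentative centroid
    k = data[np.argmax(np.linalg.norm(data - c0, axis=1))]
    while True:
        in_cluster = (np.linalg.norm(data - k, axis=1)
                      < np.linalg.norm(data - c0, axis=1))
        k_new = data[in_cluster].mean(axis=0)
        if np.allclose(k_new, k):            # centroid converged
            return data[in_cluster], k_new
        k = k_new
```

The iK-means driver then calls this repeatedly on the remaining data, drops clusters smaller than the threshold R, and seeds ordinary k-means with the surviving centroids.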

#### 4.1.2. Intelligent K-Means Based on Histogram Information

**Algorithm 2:** Intelligent K-Means Clustering based on Histogram Information

```
01: procedure IK-MEANS-HistogramInformation
02:   denote f_1, f_2, f_3, ..., f_m as data features
03:   maxPeak = 0 as the current maximum peak
04:   for i = 1:m do
05:     peak_i = Approximate Histogram Peak(f_i)
06:     if peak_i > maxPeak then
07:       maxPeak = peak_i
08:   end for
09:   do K-Means clustering with K = maxPeak
10:
11: procedure Approximate Histogram Peak(feature)
12:   sign1 = 0, sign2 = 0, numPeak = 0
13:   histVal = histogram(feature)          // array of histogram values
14:   for i = 2:lengthof(histVal)-1 do
15:     sign2 = sign1
16:     if histVal(i) > histVal(i-1) then
17:       sign1 = 1
18:     else if histVal(i) < histVal(i-1) then
19:       sign1 = -1
20:     else
21:       sign1 = 0
22:     if sign2 == 1 and sign1 == -1 then  // a rise followed by a fall is a peak
23:       numPeak = numPeak + 1
24:   end for
25:   return numPeak
```
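The peak counting of Algorithm 2 can be sketched as follows. This version zero-pads the histogram so that modes sitting at the first or last bin are also counted, and skips flat segments so plateaus do not reset the slope; both are small assumed deviations from the listing:

```python
import numpy as np

def count_histogram_peaks(feature, bins=10):
    """Approximate the number of modes of one feature by counting
    rise-then-fall patterns in its histogram."""
    hist, _ = np.histogram(feature, bins=bins)
    h = np.concatenate(([0], hist, [0]))  # padding so boundary modes count
    peaks, prev_sign = 0, 0
    for i in range(1, len(h)):
        if h[i] > h[i - 1]:
            sign = 1
        elif h[i] < h[i - 1]:
            sign = -1
        else:
            continue  # flat segment: keep the previous slope
        if prev_sign == 1 and sign == -1:  # rising then falling: a peak
            peaks += 1
        prev_sign = sign
    return peaks

def choose_k(features_matrix, bins=10):
    """K for k-means = the largest peak count over all feature columns."""
    return max(count_histogram_peaks(col, bins) for col in features_matrix.T)
```

`choose_k` then feeds K directly into an ordinary k-means run, as in line 09 of the listing.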

#### 4.2. Dynamic Incremental Learning

- The similarity between the input sample and the closest codebook from the same class is too low.
- The similarity between the input sample and the most distant codebook from the different classes is higher than the similarity between the input sample and the closest codebook from the same class.

- The similarity between the input sample and the closest codebook from the different classes is too high.
- The similarity between the input sample and the most distant reference vector from the same class is less than the similarity between the input sample and the closest codebook from the different classes.

## 5. Proposed Method: Multi-Codebook Fuzzy Neuro Generalized Learning Vector Quantization (Multi-codebook FNGLVQ)

#### 5.1. The Problem, Motivation, and Idea

Figure 5 illustrates the overlap between the classes, where the similarity between triangles 1 and 2 is denoted as $h_{12}$. Each fuzzy triangle has three values (min, mean, and max); e.g., triangle 1 has $a_1$, $b_1$, and $c_1$ representing its min, mean, and max. FNGLVQ uses a balanced triangle, so that $c_1 - b_1$ is the same as $b_1 - a_1$. In the multi-codebook approach, triangle 1 is substituted by triangles 3 and 4, and the similarities between the classes are noted as $h_{23}$ and $h_{24}$. Since there are two codebooks, the worst case overlapping value is $\max(h_{23}, h_{24})$. The figure shows that $h_{23}$ and $h_{24}$ are less than $h_{12}$. By using the fuzzy membership function, as used in FNLVQ [37], we could measure $h_{12}$, $h_{23}$, and $h_{24}$ as the heights of the intersections of the corresponding triangles.

Without loss of generality, suppose that $a_1 = 0$, $c_1 = 1$, $l_1 = 0.5$, $0 < a_2 < 0.5$, $0.5 < c_2 < 1$, $0.5 < c_3 < 1$, $a_4 = c_3$, $a_3 = a_1$, $c_4 = c_1$, $l_4 + l_3 = l_1 = 0.5$, $c_3 < c_2$, $0 < l_2 < 0.5$, and $0 < l_3 < 0.5$. Table 1 shows several probable values of the triangle parameters and the resulting values of $h_{12}$, $h_{23}$, and $h_{24}$ that satisfy the conditions above. Table 1 shows that the values of $h_{23}$ and $h_{24}$ were less than $h_{12}$. The table supports the visual description in Figure 5: by using the multi-codebook approach, the similarity (overlap) between class A and class B in FNGLVQ was less than the similarity (overlap) between class A and class B when using the single-codebook approach.
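The overlap values of Table 1 can be checked numerically: the similarity $h$ between two fuzzy triangles is the height of their intersection, $\max_x \min(\mu_1(x), \mu_2(x))$. A small sketch; the grid resolution is an arbitrary choice:

```python
import numpy as np

def tri_mu(x, a, b, c):
    """Triangular membership with support [a, c] and peak 1 at b."""
    return np.where((x > a) & (x <= b), (x - a) / (b - a),
           np.where((x > b) & (x < c), (c - x) / (c - b), 0.0))

def overlap(t1, t2, grid=100001):
    """Approximate h = max_x min(mu1(x), mu2(x)) on a dense grid."""
    lo, hi = min(t1[0], t2[0]), max(t1[2], t2[2])
    x = np.linspace(lo, hi, grid)
    return float(np.max(np.minimum(tri_mu(x, *t1), tri_mu(x, *t2))))
```

For the first row of Table 1 (triangle 2 with $a_2 = 0.4$, $c_2 = 0.9$, $l_2 = 0.25$, i.e., mean 0.65, and triangles 3 and 4 splitting triangle 1), this reproduces $h_{12} \approx 0.80$, $h_{23} \approx 0.62$, and $h_{24} \approx 0.29$.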

#### 5.2. Architecture

#### 5.3. Proposed Method: Multi-codebook Fuzzy Neuro Generalized Learning Vector Quantization by Using Intelligent Clustering

**Algorithm 3:** Multi-Codebook Fuzzy Neuro Generalized Learning Vector Quantization Using Intelligent Clustering

```
01: procedure MC-FNGLVQ-CLUSTERING
02:   denote x_1, x_2, x_3, ..., x_n as instances
03:   denote f_1, f_2, f_3, ..., f_m as data features
04:   denote c_1, c_2, c_3, ..., c_y as class labels
05:   for c_i = c_1:c_y do
06:     denote C as the number of clusters
07:     denote f_c = f_1:f_m where c_c = c_i
08:     do clustering on f_c, where C is the number of clusters found by intelligent clustering
09:     for j = 1:C do                  // C = number of clusters
10:       denote C_j as the members of cluster j
11:       find the min, mean, and max of C_j
12:       use the min, mean, and max to generate an FNGLVQ codebook
13:     end for
14:   end for
15:   for x_i = x_1:x_n do
16:     train x_i by using the FNGLVQ method
17:   end for
18:   repeat lines 15-17 until the number of iterations (epochs) is satisfied
```
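Lines 09–12 of Algorithm 3 turn each cluster of one class's data into a codebook built from its per-feature statistics. A minimal sketch, assuming cluster labels have already been produced by the intelligent clustering step:

```python
import numpy as np

def codebooks_from_clusters(X, labels):
    """Build one FNGLVQ codebook per cluster.

    X has shape (n_samples, n_features); labels assigns each sample to a
    cluster. Each codebook is an (n_features, 3) array whose rows are
    the per-feature (min, mean, max) fuzzy triangle values.
    """
    books = []
    for c in np.unique(labels):
        member = X[labels == c]
        books.append(np.stack([member.min(axis=0),
                               member.mean(axis=0),
                               member.max(axis=0)], axis=-1))
    return books
```

Running this once per class yields the multi-codebook initialization; the usual FNGLVQ epochs (lines 15–17) then refine the triangles.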

#### 5.4. Proposed Method: Multi-Codebook FNGLVQ by Using Dynamic Incremental Learning

- Multi-codebook FNGLVQ that uses static incremental learning: In this version, a new codebook is generated if the similarity of the winner vector is less than a given threshold. The winner vector is the nearest codebook (reference vector) of the same class as the input vector. The condition is written in the equation below, and the threshold is a constant that is defined before training:$${\mu}_{1}<threshold$$
- Multi-codebook FNGLVQ that uses dynamic incremental learning 1: In this version, a new codebook is generated if the similarity of the input vector to the codebooks of its own class is less than the minimum similarity of the input vector to the codebooks of the other classes. The condition is written in the equation below:$${\mu}_{1}<{\mu}_{i\,min},\quad {\mu}_{i\,min}=\mathrm{min}\left({\mu}_{i}\right),\quad i \mathrm{\ is\ a\ class\ label},\ i\ne \mathrm{input\ class\ label}$$
- Multi-codebook FNGLVQ that uses dynamic incremental learning 2: In this version, a new codebook is generated if the similarity of the input vector to the codebooks of its own class is less than the mean similarity of the input vector to the codebooks of the other classes. The condition is written in the equation below:$${\mu}_{1}<{\mu}_{i\,mean},\quad {\mu}_{i\,mean}=\mathrm{mean}\left({\mu}_{i}\right),\quad i \mathrm{\ is\ a\ class\ label},\ i\ne \mathrm{input\ class\ label}$$

**Algorithm 4:** Multi-Codebook Fuzzy Neuro Generalized Learning Vector Quantization Using Incremental Learning

```
01: procedure MC-FNGLVQ-INCREMENTAL
02:   denote x_1, x_2, x_3, ..., x_n as instances
03:   denote w_1, w_2, w_3, ..., w_p as the reference vectors of classes 1, 2, ..., p
04:   initiate w_1, w_2, w_3, ..., w_p
05:   denote T as the condition for generating a new codebook, as defined in Equation (34) or (35)   // Equation (33) for the static version
06:   for x_i = x_1:x_n do
07:     denote y as the class label of x
08:     compute mu of x using Equation (3), i.e., the similarity of x to the codebooks of all classes
09:     if mu does not satisfy T then      // T can be Equation (34) or (35)
10:       train x using the FNGLVQ method
11:     else
12:       state w_y_new as a new codebook for class y
13:       w_y_new_mean = x
14:       w_y_new_min = x - average(w_y_mean - w_y_min)
15:       w_y_new_max = x + average(w_y_max - w_y_mean)
16:   end for
```
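The spawning decision and the codebook-generation branch can be sketched as follows: a new codebook is centred at x, with the class's average left and right triangle half-widths on each side (so the new max sits above the mean). Function names and the `threshold` default are illustrative:

```python
import numpy as np

def should_spawn(mu_own, mu_other_by_class, mode="dynamic2", threshold=0.25):
    """Decide whether to generate a new codebook for the input's class.

    mu_own: similarity of x to its best same-class codebook (mu_1).
    mu_other_by_class: best similarity of x to each other class.
    """
    others = np.asarray(mu_other_by_class, dtype=float)
    if mode == "static":
        return mu_own < threshold        # static condition, Equation (33)
    if mode == "dynamic1":
        return mu_own < others.min()     # dynamic 1, Equation (34)
    return mu_own < others.mean()        # dynamic 2, Equation (35)

def new_codebook(x, class_books):
    """Initialize a new codebook centred at x using the class's average
    triangle half-widths (lines 12-15 of Algorithm 4).

    Each existing codebook in class_books is an (n_features, 3) array of
    (min, mean, max) rows; the result has the same shape.
    """
    x = np.asarray(x, dtype=float)
    left = np.mean([b[:, 1] - b[:, 0] for b in class_books], axis=0)
    right = np.mean([b[:, 2] - b[:, 1] for b in class_books], axis=0)
    return np.stack([x - left, x, x + right], axis=-1)
```

When `should_spawn` is false, the instance is simply trained with the ordinary FNGLVQ update instead.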

## 6. Experiment Result and Analysis

#### 6.1. Dataset

#### 6.2. Experiment Setup

- CNN2L: Input-Padd-Conv-Padd-Conv-Pool-Flat-Dense-Drop-Dense-Drop-Output.
- CNN10L: Input-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Padd-Conv-Pool-Flat-Dense-Drop-Dense-Drop-Output.
- Dense2L: Dense-Drop-Dense-Drop-Output.
- Dense10L: Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Output.
- Dense10L02: Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Dense-Drop-Output, dropout value = 0.2.

#### 6.3. Result of Scenario 1: Experiment on SyntheticA Dataset

#### 6.4. Result of Scenario 2: Experiment on SyntheticB Dataset

#### 6.5. Result of Scenario 3: Experiment on Benchmark Dataset

#### 6.6. Improvement of Proposed Method Compared to Original FNGLVQ

#### 6.7. Analysis of Variance (ANOVA) Test for Significance Testing

#### 6.8. Discussion and Recommendation

## 7. Conclusions

## 8. Future Works

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. The Distribution of SyntheticA Dataset

## Appendix B. The Distribution of SyntheticB Dataset

## Appendix C. The Distribution of Benchmark Dataset

## References

- Ma’sum, M.A.; Arrofi, M.K.; Jati, G.; Arifin, F.; Kurniawan, M.N.; Mursanto, P.; Jatmiko, W. Simulation of intelligent unmanned aerial vehicle (UAV) for military surveillance. In Proceedings of the IEEE 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Kuta, Bali, 28–29 September 2013; pp. 161–166.
- Nüchter, A.; Joachim, H. Towards semantic maps for mobile robots. Robot. Auton. Syst. **2008**, 56, 915–926.
- Martinez-Cantin, R.; Nando de, F.; Arnaud, D.; José, A.C. Active policy learning for robot planning and exploration under uncertainty. In Robotics: Science and Systems; MIT Press: Cambridge, MA, USA, 2007; Volume 3, pp. 334–341.
- Baltrušaitis, T.; Chaitanya, A.; Louis-Philippe, M. Multi-modal machine learning: A survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. **2018**.
- Corneanu, C.A.; Simón, M.O.; Cohn, J.F.; Guerrero, S.E. Survey on RGB, 3D, thermal, and multimodal approaches for facial expression recognition: History, trends, and affect-related applications. IEEE Trans. Pattern Anal. Mach. Intell. **2016**, 38, 1548–1568.
- Soleymani, M.; Garcia, D.; Jou, B.; Schuller, B.; Chang, S.F.; Pantic, M. A survey of multimodal sentiment analysis. Image Vis. Comput. **2017**, 65, 3–14.
- Kumar, A.; Kim, J.; Cai, W.; Fulham, M.; Feng, D. Content-based medical image retrieval: A survey of applications to multidimensional and multimodality data. J. Digit. Imaging **2013**, 26, 1025–1039.
- Oskouie, P.; Alipour, S.; Eftekhari-Moghadam, A.M. Multimodal feature extraction and fusion for semantic mining of soccer video: A survey. Artif. Intell. Rev. **2014**, 42, 173–210.
- Abidi, B.R.; Aragam, N.R.; Yao, Y.; Abidi, M.A. Survey and analysis of multimodal sensor planning and integration for wide area surveillance. ACM Comput. Surv. (CSUR) **2009**, 41, 7.
- Kiela, D.; Grave, E.; Joulin, A.; Mikolov, T. Efficient large-scale multi-modal classification. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Vortmann, L.M.; Schult, M.; Benedek, M.; Walcher, S.; Putze, F. Real-Time Multimodal Classification of Internal and External Attention. In Proceedings of the Adjunct of the 2019 International Conference on Multimodal Interaction, ACM, Suzhou, China, 14–18 October 2019; p. 14.
- Poria, S.; Cambria, E.; Hussain, A.; Huang, G.B. Towards an intelligent framework for multimodal affective data analysis. Neural Netw. **2015**, 63, 104–116.
- Atrey, P.K.; Hossain, M.A.; El Saddik, A.; Kankanhalli, M.S. Multimodal fusion for multimedia analysis: A survey. Multimed. Syst. **2010**, 16, 345–379.
- Ma’sum, M.; Sanabila Anwar, H.R.; Jatmiko, W. Multi codebook LVQ-based artificial neural networks using clustering approach. In Proceedings of the IEEE 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Depok, Indonesia, 10–11 October 2015; pp. 263–268.
- Anwar Ma’sum, M.; Wisnu, J. Multi-codebook Fuzzy Neural Network Using Incremental Learning for Multimodal Data Classification. In Proceedings of the IEEE 2019 4th Asia-Pacific Conference on Intelligent Robot Systems (ACIRS), Nagoya, Japan, 13–15 July 2019; pp. 205–210.
- Losing, V.; Barbara, H.; Heiko, W. Incremental on-line learning: A review and comparison of state of the art algorithms. Neurocomputing **2018**, 275, 1261–1274.
- Krawczyk, B.; Minku, L.L.; Gama, J.; Stefanowski, J.; Woźniak, M. Ensemble learning for data stream analysis: A survey. Inf. Fusion **2017**, 37, 132–156.
- Hartigan, J.A.; Manchek, A.W. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) **1979**, 28, 100–108.
- Banfield, J.D.; Adrian, E. Model-based Gaussian and non-Gaussian clustering. Biometrics **1993**, 49, 803–821.
- Chiang, M.M.-T.; Boris, M. Intelligent choice of the number of clusters in k-means clustering: An experimental study with different cluster spreads. J. Classif. **2010**, 27, 3–40.
- Anwar Ma’sum, M.; Dewa, M.S.A.; Indra, H.; Wisnu, J.; Adi, N. Multicodebook Neural Network Using Intelligent K-Means Clustering Based on Histogram Information for Multimodal Data Classification. In Proceedings of the IEEE 2018 International Workshop on Big Data and Information Security (IWBIS), Jakarta, Indonesia, 12–13 May 2018; pp. 129–135.
- Safavian, S.R.; David, L. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man Cybern. **1991**, 21, 660–674.
- Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, DC, USA, 4 August 2001; Volume 3, pp. 41–46.
- Chung, K.-M.; Kao, W.-C.; Sun, C.-L.; Wang, L.-L.; Lin, C.-J. Radius margin bounds for support vector machines with the RBF kernel. Neural Comput. **2003**, 15, 2643–2681.
- Pal, S.K.; Sushmita, M. Multilayer perceptron, fuzzy sets, and classification. IEEE Trans. Neural Netw. **1992**, 3, 683–697.
- Breiman, L. Bagging predictors. Mach. Learn. **1996**, 24, 123–140.
- Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
- Simonyan, K.; Andrew, Z. Very deep convolutional networks for large-scale image recognition. arXiv **2014**, arXiv:1409.1556.
- Setiawan, I.M.A.; Imah, E.M.; Jatmiko, W. Arrhytmia classification using Fuzzy-Neuro Generalized Learning Vector Quantization. In Proceedings of the 2011 International Conference on Advanced Computer Science and Information System (ICACSIS), Jakarta, Indonesia, 17–18 December 2011; pp. 385–390.
- Kohonen, T. Learning Vector Quantization for Pattern Recognition. In Report TKK-F-A601; Helsinki University of Technology: Espoo, Finland, 1986.
- Sato, A.; Yamada, K. A formulation of learning vector quantization using a new misclassification measure. In Proceedings of the IEEE Computer Society 14th International Conference on Pattern Recognition, ICPR ’98, Washington, DC, USA, 16–20 August 1998; Volume 1, p. 322.
- Zhang, D.; Wang, Y.; Zhou, L.; Yuan, H.; Shen, D.; Initiative, A.D.N. Multimodal classification of Alzheimer’s disease and mild cognitive impairment. Neuroimage **2011**, 55, 856–867.
- Cauwenberghs, G.; Tomaso, P. Incremental and decremental support vector machine learning. Adv. Neural Inf. Process. Syst. **2001**, 409–415. Available online: http://papers.nips.cc/paper/1814-incremental-and-decremental-support-vector-machine-learning.pdf (accessed on 1 January 2020).
- Molina, J.F.G.; Zheng, L.; Sertdemir, M.; Dinter, D.J.; Schönberg, S.; Rädle, M. Incremental learning with SVM for multimodal classification of prostatic adenocarcinoma. PLoS ONE **2014**, 9, e93600.
- Huang, G.-B.; Chen, L. Convex incremental extreme learning machine. Neurocomputing **2007**, 70, 3056–3062.
- Anwar Ma’sum, M.; Dewa, M.S.A.; Novian, H.; Wisnu, J. Enhance generalized learning vector quantization using unsupervised extreme learning machine and intelligent k-means clustering. In Proceedings of the IEEE 2017 International Workshop on Big Data and Information Security (IWBIS), Jakarta, Indonesia, 23–24 September 2017; pp. 77–83.
- Roszkowska, E.; Tomasz, W. Application of fuzzy TOPSIS to scoring the negotiation offers in ill-structured negotiation problems. Eur. J. Oper. Res. **2015**, 242, 920–932.
- Kusumoputro, B.; Hary, B.; Wisnu, J. Fuzzy-neuro LVQ and its comparison with fuzzy algorithm LVQ in artificial odor discrimination system. ISA Trans. **2002**, 41, 395–407.
- Jatmiko, W.; Rochmatullah, R.; Kusumoputro, B.; Sekiyama, K.; Fukuda, T. Fuzzy learning vector quantization based on particle swarm optimization for artificial odor discrimination system. WSEAS Trans. Syst. **2009**, 8, 1239–1252.
- Imah, E.M.; Wisnu, J.; Basaruddin, T. Adaptive Multilayer Generalized Learning Vector Quantization (AMGLVQ) as new algorithm with integrating feature extraction and classification for Arrhythmia heartbeats classification. In Proceedings of the 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Seoul, Korea, 14–17 October 2012; pp. 150–155.
- Krizhevsky, A.; Ilya, S.; Geoffrey, E.H. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. **2017**, 1097–1105.
- Huang, G.; Zhuang, L.; Laurens, V.D.M.; Kilian, Q.W. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- Zoph, B.; Vijay, V.; Jonathon, S.; Quoc, V.L. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8697–8710.
- Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv **2017**, arXiv:1704.04861.
- Losing, V.; Barbara, H.; Heiko, W. Interactive online learning for obstacle classification on a mobile robot. In Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–16 July 2015; pp. 1–8.
- Liang, N.-Y.; Huang, G.-B.; Saratchandran, P.; Sundararajan, N. A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans. Neural Netw. **2006**, 17, 1411–1423.
- Saffari, A.; Christian, L.; Jakob, S.; Martin, G.; Horst, B. On-line random forests. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, Kyoto, Japan, 27 September–4 October 2009; pp. 1393–1400.
- Glarner, T.; Patrick, H.; Janek, E.; Reinhold, H.-U. Full Bayesian Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery. In Proceedings of the Interspeech, Hyderabad, India, 2–6 September 2018; pp. 2688–2692.
- Howell, D.C. Statistical methods for psychology. In Wadsworth Cengage Learning; Cengage Wadsworth: Belmont, CA, USA, 2009.

**Figure 5.** The similarity between classes in single-codebook vs. multi-codebook for multi-modal data fitting.

**Figure 7.** Distribution of the SyntheticA dataset: (**a**) instance 1, (**b**) instance 2, (**c**) instance 3, and (**d**) instance 4.

**Figure 8.** Results on the SyntheticA dataset: (**a**) instance 1, (**b**) instance 2, (**c**) instance 3, and (**d**) instance 4. X-axis = epoch; y-axis = accuracy.

**Table 1.** Comparison of overlapping values between the single-codebook and multi-codebook approaches, given the triangle parameters.

| a_{2} | c_{2} | l_{2} | c_{3} = a_{4} | l_{3} | l_{4} = 0.5 − l_{3} | h_{12} | h_{23} | h_{24} |
|---|---|---|---|---|---|---|---|---|
| 0.4 | 0.9 | 0.25 | 0.8 | 0.4 | 0.1 | 0.80 | 0.62 | 0.29 |
| 0.3 | 0.9 | 0.3 | 0.7 | 0.3 | 0.2 | 0.88 | 0.67 | 0.40 |
| 0.25 | 0.9 | 0.325 | 0.6 | 0.2 | 0.3 | 0.91 | 0.67 | 0.48 |
| 0.4 | 0.8 | 0.2 | 0.7 | 0.4 | 0.1 | 0.86 | 0.50 | 0.33 |
| 0.3 | 0.8 | 0.25 | 0.6 | 0.3 | 0.2 | 0.93 | 0.55 | 0.44 |
| 0.25 | 0.8 | 0.275 | 0.5 | 0.2 | 0.3 | 0.97 | 0.53 | 0.52 |
| 0.4 | 0.75 | 0.175 | 0.65 | 0.4 | 0.1 | 0.89 | 0.43 | 0.36 |
| 0.3 | 0.75 | 0.225 | 0.6 | 0.3 | 0.2 | 0.97 | 0.57 | 0.35 |
| 0.25 | 0.75 | 0.25 | 0.55 | 0.2 | 0.3 | 1.00 | 0.67 | 0.36 |

| No | Dataset | Information | #Classes | #Features | #Instances |
|---|---|---|---|---|---|
| 1 | SyntheticA-i1 | 2peak;5class;2feature | 5 | 2 | 10,000 |
| 2 | SyntheticA-i2 | 2peak;5class;2feature | 5 | 2 | 10,000 |
| 3 | SyntheticA-i3 | 2peak;5class;2feature | 5 | 2 | 10,000 |
| 4 | SyntheticA-i4 | 2peak;5class;2feature | 5 | 2 | 10,000 |
| 5 | SyntheticB-i1 | 2peak;2class;5feature | 2 | 5 | 4000 |
| 6 | SyntheticB-i2 | 2peak;3class;5feature | 3 | 5 | 6000 |
| 7 | SyntheticB-i3 | 2peak;4class;5feature | 4 | 5 | 8000 |
| 8 | SyntheticB-i4 | 2peak;5class;5feature | 5 | 5 | 10,000 |
| 9 | SyntheticB-i5 | 3peak;2class;5feature | 2 | 5 | 6000 |
| 10 | SyntheticB-i6 | 3peak;3class;5feature | 3 | 5 | 9000 |
| 11 | SyntheticB-i7 | 3peak;4class;5feature | 4 | 5 | 12,000 |
| 12 | SyntheticB-i8 | 3peak;5class;5feature | 5 | 5 | 15,000 |
| 13 | Pinwheel | [49] | 5 | 2 | 5000 |
| 14 | Glass | UCI | 6 | 9 | 214 |
| 15 | Ionosphere | UCI | 2 | 33 | 351 |
| 16 | Breast Cancer Coimbra | UCI | 2 | 9 | 116 |
| 17 | Wall Following | UCI | 3 | 2 | 5456 |
| 18 | Segment | UCI | 7 | 19 | 2310 |
| 19 | Ecoli | UCI | 4 | 7 | 336 |
| 20 | Odor | [37] | 12 | 8 | 2400 |

| Dataset | FNGLVQ ORIGINAL | MC FNGLVQ IL STATIC 0.1 | MC FNGLVQ IL STATIC 0.2 | MC FNGLVQ IL STATIC 0.25 | MC FNGLVQ IL DYNAMIC1 | MC FNGLVQ IL DYNAMIC2 | MC FNGLVQ IK-MEANS ANO | MC FNGLVQ IK-MEANS HIST |
|---|---|---|---|---|---|---|---|---|
| SyntheticB-i1 | 70.1 | 70.02 | 70.6 | 71.15 | 85.97 | 84.8 | 85.88 | 86.8 |
| SyntheticB-i2 | 52.35 | 52.3 | 74.28 | 75.35 | 82.1 | 85.58 | 87.3 | 87.85 |
| SyntheticB-i3 | 53.46 | 50 | 59.94 | 59.59 | 78.61 | 83.34 | 86.46 | 85.32 |
| SyntheticB-i4 | 46.17 | 46.67 | 45.91 | 45.59 | 58.63 | 70.6 | 73.02 | 71.63 |
| SyntheticB-i5 | 62.62 | 63.1 | 74.56 | 86.97 | 89.32 | 89.48 | 87.21 | 91.5 |
| SyntheticB-i6 | 54.3 | 54.23 | 66.3 | 73.14 | 82.8 | 84.51 | 83.59 | 89.29 |
| SyntheticB-i7 | 53.54 | 51.41 | 56.99 | 61.71 | 64.84 | 82.47 | 78.76 | 84.48 |
| SyntheticB-i8 | 48.17 | 51.6 | 51.53 | 52.31 | 60.37 | 73.5 | 71.93 | 74.43 |
| Average | 55.09 | 54.92 | 62.51 | 65.73 | 75.33 | 81.79 | 81.77 | 83.91 |

| Method | Pinwheel | Glass | Ionosphere | Breast Cancer Coimbra | Wall Following | Segment | Ecoli | Odor | Average |
|---|---|---|---|---|---|---|---|---|---|
| FNGLVQ-ORIGINAL | 92.24 | 59.25 | 88.60 | 64.67 | 78.42 | 80.86 | 89.28 | 75.45 | 78.60 |
| MC-FNGLVQ-IL-STATIC-0.1 | 92.96 | 57.01 | 90.31 | 64.67 | 83.76 | 79.69 | 86.28 | 75.21 | 78.74 |
| MC-FNGLVQ-IL-STATIC-0.2 | 94.78 | 60.20 | 90.60 | 64.67 | 79.46 | 85.67 | 90.46 | 81.54 | 80.92 |
| MC-FNGLVQ-IL-STATIC-0.25 | 95.24 | 60.74 | 90.31 | 66.30 | 79.03 | 87.01 | 91.63 | 85.08 | 81.92 |
| MC-FNGLVQ-IL-DYNAMIC1 | 94.76 | 62.20 | 91.17 | 68.20 | 86.44 | 87.88 | 94.05 | 86.58 | 83.91 |
| MC-FNGLVQ-IL-DYNAMIC2 | 99.44 | 71.43 | 90.04 | 62.86 | 86.99 | 93.68 | 75.46 | 85.92 | 83.23 |
| MC-FNGLVQ-IK-MEANS-ANO | 94.41 | 55.72 | 93.43 | 63.96 | 73.15 | 85.58 | 87.20 | 77.58 | 78.88 |
| MC-FNGLVQ-IK-MEANS-HIST | 91.08 | 59.85 | 91.15 | 69.18 | 84.04 | 86.70 | 86.00 | 78.25 | 80.78 |
| Naïve Bayes | 90.00 | 53.00 | 84.00 | 61.00 | 90.00 | 79.00 | 63.00 | 71.00 | 73.88 |
| SVM | 100.00 | 56.00 | 90.00 | 52.00 | 93.00 | 60.00 | 85.00 | 89.00 | 78.13 |
| MLP | 99.00 | 35.00 | 90.00 | 48.00 | 40.00 | 13.00 | 72.00 | 58.00 | 56.88 |
| Bagging Tree | 73.00 | 72.00 | 89.00 | 61.00 | 84.00 | 95.00 | 82.00 | 87.00 | 80.38 |
| Random Forest | 93.00 | 58.00 | 86.00 | 57.00 | 94.00 | 78.00 | 82.00 | 45.00 | 74.13 |
| CNN2L | 99.9 | 24.34 | 59.14 | 53.04 | 97.51 | 13.55 | 99.4 | 99.37 | 68.28 |
| CNN10L | 39.72 | 31.44 | 70.29 | 55.65 | 40.42 | 15.29 | 52.23 | 24.42 | 41.18 |
| Dense2L | 98.16 | 21.11 | 57.14 | 44.35 | 98.55 | 13.82 | 99.7 | 60.23 | 56.41 |
| Dense10L | 23.49 | 31.37 | 55.99 | 46.96 | 66.95 | 14.42 | 42.69 | 15.71 | 37.20 |
| Dense10L02 | 62.28 | 23.52 | 61.14 | 46.96 | 40.42 | 14.02 | 34.92 | 88.38 | 46.46 |

| Dataset | MC FNGLVQ IL STATIC 0.1 | MC FNGLVQ IL STATIC 0.2 | MC FNGLVQ IL STATIC 0.25 | MC FNGLVQ IL DYNAMIC1 | MC FNGLVQ IL DYNAMIC2 | MC FNGLVQ IK-MEANS ANO | MC FNGLVQ IK-MEANS HIST |
|---|---|---|---|---|---|---|---|
| SyntheticA-i1 | 2.63 | 3.56 | 6.39 | 15.43 | 18.83 | 21.12 | 23.66 |
| SyntheticA-i2 | 1.1 | −1.02 | 0.01 | 1.56 | 2.48 | 7.83 | 8.16 |
| SyntheticA-i3 | 3.83 | 2.91 | 1.99 | 6.7 | 13.47 | 14.33 | 15.12 |
| SyntheticA-i4 | 0.52 | 0.39 | −1.14 | 2.16 | 4.62 | 10.2 | 9.32 |
| SyntheticB-i1 | −0.08 | 0.5 | 1.05 | 15.87 | 14.7 | 15.78 | 16.7 |
| SyntheticB-i2 | −0.05 | 21.93 | 23 | 29.75 | 33.23 | 34.95 | 35.5 |
| SyntheticB-i3 | −3.46 | 6.48 | 6.13 | 25.15 | 29.88 | 33 | 31.86 |
| SyntheticB-i4 | 0.5 | −0.26 | −0.58 | 12.46 | 24.43 | 26.85 | 25.46 |
| SyntheticB-i5 | 0.48 | 11.94 | 24.35 | 26.7 | 26.86 | 24.59 | 28.88 |
| SyntheticB-i6 | −0.07 | 12 | 18.84 | 28.5 | 30.21 | 29.29 | 34.99 |
| SyntheticB-i7 | −2.13 | 3.45 | 8.17 | 11.3 | 28.93 | 25.22 | 30.94 |
| SyntheticB-i8 | 3.43 | 3.36 | 4.14 | 12.2 | 25.33 | 23.76 | 26.26 |
| Pinwheel | 0.72 | 2.54 | 3 | 2.52 | 7.2 | 2.17 | −1.16 |
| Glass | −2.24 | 0.95 | 1.49 | 2.95 | 12.18 | −3.53 | 0.6 |
| Ionosphere | 1.71 | 2 | 1.71 | 2.57 | 1.44 | 4.83 | 2.55 |
| Breast Cancer Coimbra | 0 | 0 | 1.631 | 3.531 | −1.811 | −0.714 | 4.51 |
| Wall Following | 5.34 | 1.04 | 0.61 | 8.02 | 8.57 | −5.27 | 5.62 |
| Segment | −1.17 | 4.81 | 6.15 | 7.02 | 12.82 | 4.72 | 5.84 |
| Ecoli | −3 | 1.18 | 2.35 | 4.77 | −13.82 | −2.08 | −3.28 |
| Odor | −0.24 | 6.09 | 9.63 | 11.133 | 10.47 | 2.13 | 2.8 |
| Average Synthetic | 0.56 | 5.44 | 7.70 | 15.65 | 21.08 | 22.24 | 23.90 |
| Average Benchmark | 0.14 | 2.33 | 3.32 | 5.31 | 4.63 | 0.28 | 2.19 |
| Average All | 0.39 | 4.16 | 5.91 | 11.42 | 14.35 | 13.26 | 15.02 |

| Method | F | Fc | p-Value | Significance |
|---|---|---|---|---|
| MC_FNGLVQ_STATIC_0.1 | 0.005 | 4.098 | 0.9439 | No |
| MC_FNGLVQ_STATIC_0.2 | 0.559 | 4.098 | 0.4593 | No |
| MC_FNGLVQ_STATIC_0.25 | 1.083 | 4.098 | 0.3047 | No |
| MC_FNGLVQ_DYNAMIC1 | 4.355 | 4.098 | 0.0437 | Yes |
| MC_FNGLVQ_DYNAMIC2 | 7.63 | 4.098 | 0.0088 | Yes |
| MC_FNGLVQ_IKMEANS_ANO | 7.066 | 4.098 | 0.0114 | Yes |
| MC_FNGLVQ_IKMEANS_HIST | 9.215 | 4.098 | 0.0043 | Yes |
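The F values above come from a one-way ANOVA comparing each proposed variant's accuracies against the original FNGLVQ across the datasets; significance holds when F exceeds the critical value Fc at the chosen level. A minimal two-group F-statistic sketch in numpy (an illustration of the test, not the paper's exact computation):

```python
import numpy as np

def one_way_anova_f(group_a, group_b):
    """F statistic of a one-way ANOVA with two groups.

    F = (between-group mean square) / (within-group mean square);
    with two groups this equals the square of the two-sample t statistic.
    """
    groups = [np.asarray(group_a, float), np.asarray(group_b, float)]
    n = sum(len(g) for g in groups)
    grand = np.concatenate(groups).mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between, df_within = len(groups) - 1, n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)
```

The p-value then follows from the F distribution with (df_between, df_within) degrees of freedom, e.g., via `scipy.stats.f.sf`.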

© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ma’sum, M.A. Intelligent Clustering and Dynamic Incremental Learning to Generate Multi-Codebook Fuzzy Neural Network for Multi-Modal Data Classification. *Symmetry* **2020**, *12*, 679. https://doi.org/10.3390/sym12040679
