FP-Conv-CM: Fuzzy Probabilistic Convolution C-Means
Abstract
1. Introduction
2. Methodology Overview
- By summarizing it as a set Centers = {c_1, …, c_k}, where c_j ∈ R^n; these vectors will be called referents (centers) throughout the rest of the article;
- By defining an assignment function, χ, which maps the dataset into the set of indices {1, …, k}; this function makes it possible to realize a partition of the dataset into subsets C_1, …, C_k.
- Data = {x_1, …, x_N} is the dataset under study;
- Centers = {c_1, …, c_k} is the set of the centers to be determined;
- χ is the allocation function of data to the groups represented by the centers c_1, …, c_k;
- C_j, j = 1, …, k, are the groups determined based on χ;
- u_j is the membership function of group C_j, where m is a real number strictly greater than 1 (the fuzzifier);
- p_j(x) is the probability of x being in group C_j, where Σ_j is the covariance of the component j.
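Under the notation above, the allocation function χ and the partition it induces can be sketched as follows; the nearest-center rule used for χ and the toy data are illustrative assumptions, not the paper's exact choice:

```python
import numpy as np

def chi(x, centers):
    """Allocation function: index of the nearest center (illustrative rule)."""
    return int(np.argmin(np.linalg.norm(centers - x, axis=1)))

def partition(data, centers):
    """Split the dataset into k groups C_1..C_k according to chi."""
    groups = [[] for _ in range(len(centers))]
    for x in data:
        groups[chi(x, centers)].append(x)
    return groups

data = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
groups = partition(data, centers)
print([len(g) for g in groups])  # [2, 2]
```

Each point is assigned to exactly one index, so the groups form a partition of the dataset.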
- Silhouette index:
- Let N be the number of patterns. The Silhouette index [43] finds the optimal clustering effect using the difference between the average distance within the cluster and the minimum distance between the clusters; i.e., the silhouette coefficient of a sample x_i is given as follows: s(i) = (b(i) − a(i)) / max{a(i), b(i)}, where:
- a(i) represents the average distance from sample x_i to the other samples in its cluster;
- b(i) represents the minimum average distance from sample x_i to the samples of the other clusters.
- The silhouette coefficient ranges from −1 to 1, where −1 denotes that the data point is not assigned to the relevant cluster, 0 denotes that the clusters are overlapping, and 1 denotes that the cluster is dense and well-separated. This metric is one of the most popular measurements in clustering. It can distinguish between objects that were placed wisely within their cluster and those that are in the outlier zone between clusters.
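The silhouette computation described above can be sketched directly from its definition; the tiny two-cluster dataset is illustrative:

```python
import numpy as np

def silhouette(X, labels):
    """Mean silhouette coefficient s(i) = (b_i - a_i) / max(a_i, b_i)."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    s = np.zeros(n)
    for i in range(n):
        same = labels == labels[i]
        same[i] = False                                  # exclude the point itself
        a = D[i, same].mean() if same.any() else 0.0     # mean intra-cluster distance
        b = min(D[i, labels == c].mean()                 # nearest other cluster
                for c in set(labels) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s.mean()

X = np.array([[0.0, 0.0], [0.1, 0.0], [4.0, 4.0], [4.1, 4.0]])
labels = np.array([0, 0, 1, 1])
print(round(silhouette(X, labels), 3))  # close to 1: dense, well-separated clusters
```

Swapping in a label vector that mixes the two blobs drives the score toward −1, matching the interpretation given above.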
- (a) Mean Squared Error (MSE), which calculates the error between the initial image I and the compressed image K. It is, in fact, the distance between the two matrices that represent the images to be compared: MSE = (1/(M N)) Σ_{i=1}^{M} Σ_{j=1}^{N} (I(i, j) − K(i, j))².
- (b) Peak Signal-to-Noise Ratio (PSNR), which implements the following equation: PSNR = 10 log₁₀(MAX² / MSE), where MAX is the maximum possible pixel value (255 for 8-bit images).
- (c) Structural SIMilarity (SSIM) index, which is calculated on various windows of an image. The measure between two windows x and y of common size is: SSIM(x, y) = ((2 μ_x μ_y + c₁)(2 σ_xy + c₂)) / ((μ_x² + μ_y² + c₁)(σ_x² + σ_y² + c₂)).
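The MSE and PSNR metrics above can be sketched as follows (SSIM is omitted for brevity); the 2×2 images and the 8-bit peak value of 255 are illustrative assumptions:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images of equal shape."""
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def psnr(a, b, peak=255.0):
    """PSNR in dB; peak is the maximum possible pixel value."""
    e = mse(a, b)
    return float("inf") if e == 0 else 10.0 * np.log10(peak ** 2 / e)

orig = np.array([[100, 110], [120, 130]], dtype=np.uint8)
comp = np.array([[101, 108], [121, 131]], dtype=np.uint8)
print(mse(orig, comp))               # 1.75
print(round(psnr(orig, comp), 2))    # ≈ 45.7 dB
```

Note the cast to float before subtracting: differencing `uint8` arrays directly would wrap around and corrupt the error.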
3. Drawbacks of Fuzzy and Probabilistic Approaches
3.1. K-Means
3.2. Probabilistic K-Means
- The prior probabilities are all equal to 1/k;
- The k normal functions have identical variance–covariance matrices equal to σ²I, where I represents the unit matrix and σ is the standard deviation, considered constant for all these normal distributions.
- In that case, the density function has the following expression: f(x) = (1/k) Σ_{j=1}^{k} N(x; μ_j, σ²I).
- The probabilistic version of k-means involves estimating the mean vectors μ_1, …, μ_k and the common standard deviation σ so as to make the observed sample as likely as possible. This method, known as the maximum likelihood method, involves maximizing the probability of these observations.
- Maximizing the classifying likelihood amounts to minimizing the within-cluster sum of squared distances Σ_{j=1}^{k} Σ_{x∈C_j} ‖x − μ_j‖² (up to constants independent of the partition).
- Probabilistic k-means has a running time of O(CONST · N · k · d · T) [52], where N is the number of d-dimensional vectors, k is the number of clusters, and T is the number of iterations required to reach convergence; CONST is a constant that depends only on the data.
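The equal-prior, spherical-covariance mixture density and the likelihood criterion above can be sketched as follows; the toy sample and center configurations are illustrative:

```python
import numpy as np

def spherical_density(x, mus, sigma):
    """Mixture density with equal priors 1/k and covariance sigma^2 * I."""
    k, d = mus.shape
    sq = np.sum((mus - x) ** 2, axis=1)
    comp = np.exp(-sq / (2 * sigma ** 2)) / ((2 * np.pi * sigma ** 2) ** (d / 2))
    return comp.mean()  # mean over components = (1/k) * sum

def neg_log_likelihood(X, mus, sigma):
    """Quantity minimized by the maximum likelihood method over the sample."""
    return -sum(np.log(spherical_density(x, mus, sigma)) for x in X)

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
good = np.array([[0.05, 0.0], [5.0, 5.0]])   # centers matching the data
bad = np.array([[2.0, 2.0], [3.0, 3.0]])     # centers far from both groups
print(neg_log_likelihood(X, good, 1.0) < neg_log_likelihood(X, bad, 1.0))  # True
```

Centers that match the sample make the observations more likely, hence a lower negative log-likelihood, which is exactly what the estimation step exploits.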
3.3. Fuzzy C-Means (FCM)
3.4. Fuzzy Reasoning and Probabilistic Reasoning Are Complementary
- (a) The fuzzy model, based on the membership functions μ₁ and μ₂ presented in Figure 1, predicts x as an element of C1.
- (b) The probabilistic model, based on the densities p₁ and p₂ presented in Figure 1, predicts x as an element of C2.
- (a) The fuzzy model, based on the membership functions μ₁ and μ₂ given below, predicts x as an element of C1.
- (b) The probabilistic model, based on the densities p₁ and p₂ given below, predicts x as an element of C2.
4. Proposed Approach
4.1. Fuzzy Probability Convolution Measure
4.2. Fuzzy Probabilistic Convolution C-Means
4.3. Fuzzy Probability Convolution for the Clustering Task
Algorithm 1. Fuzzy-Probabilistic-Convolution-C-Means.
Requires: Data = {x_1, …, x_N}, k (number of groups), m (memberships parameter), b (mini-batch size), ITER (maximum number of iterations), and the remaining model parameters.
Ensure: centers matrix, memberships matrix of the data to the groups.
Initialization: t = 0; the k centers are randomly chosen.
For all t = 1, …, ITER Do
  For all j = 1, …, k Do
    For all d = 1, …, N Do
    End For
  End For
End For
END
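The loop structure of Algorithm 1 can be sketched as follows. Since the update equations are not reproduced above, the combined weight `U**m * P` used here (a plain product of FCM-style memberships and Gaussian responsibilities) is an assumption standing in for the paper's convolution measure, as are the parameter defaults:

```python
import numpy as np

def fp_conv_cm_sketch(X, k, m=2.0, sigma=1.0, iters=30):
    """Illustrative hybrid c-means loop; not the authors' exact update rule."""
    C = X[np.linspace(0, len(X) - 1, k, dtype=int)].copy()  # spread-out initial centers
    for _ in range(iters):
        D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=-1) + 1e-12
        U = D ** (-2.0 / (m - 1.0))                 # FCM-style memberships
        U /= U.sum(axis=1, keepdims=True)
        P = np.exp(-(D ** 2) / (2.0 * sigma ** 2))  # spherical Gaussian responsibilities
        P /= P.sum(axis=1, keepdims=True)
        W = (U ** m) * P                            # combined fuzzy-probabilistic weight
        C = (W.T @ X) / W.sum(axis=0)[:, None]      # weighted center update
    return C, U

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.2, (20, 2)), rng.normal(5.0, 0.2, (20, 2))])
C, U = fp_conv_cm_sketch(X, k=2)
print(np.round(np.sort(C[:, 0])))  # centers land near 0 and 5 on the first axis
```

Combining the two weightings lets the fuzzy term shape the memberships while the probabilistic term discounts points that are implausible under the Gaussian model, which mirrors the complementarity argued in Section 3.4.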
5. Experimental Results
5.1. Datasets
5.2. Clustering Results
5.3. Image Compression
5.4. Compression Results
6. Conclusions and Future Perspectives
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Rokach, L.; Maimon, O. Clustering methods. In Data Mining and Knowledge Discovery Handbook; Springer: Boston, MA, USA, 2005; pp. 321–352. [Google Scholar] [CrossRef]
- MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 1 January 1967; pp. 281–297. [Google Scholar]
- Kriegel, H.P.; Kröger, P.; Sander, J.; Zimek, A. Density-based clustering. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 231–240. [Google Scholar] [CrossRef]
- Govaert, G.; Nadif, M. Block clustering with Bernoulli mixture models: Comparison of different approaches. Comput. Stat. Data Anal. 2008, 52, 3233–3245. [Google Scholar] [CrossRef]
- Mirkin, B. Mathematical Classification and Clustering; Springer Science & Business Media: New York, NY, USA, 1996; Volume 11, ISBN 0-7923-4159-7. [Google Scholar]
- Hartuv, E.; Shamir, R. A clustering algorithm based on graph connectivity. Inf. Process. Lett. 2000, 76, 175–181. [Google Scholar] [CrossRef]
- Kohonen, T. Self-organized formation of topologically correct feature maps. Biol. Cybern. 1982, 43, 59–69. [Google Scholar] [CrossRef]
- Venkatkumar, I.A.; Shardaben, S.J.K. Comparative study of data mining clustering algorithms. In Proceedings of the 2016 International Conference on Data Science and Engineering (ICDSE), Cochin, India, 23–25 August 2016; pp. 1–7. [Google Scholar] [CrossRef]
- Rueda, A.; Krishnan, S. Clustering Parkinson’s and age-related voice impairment signal features for unsupervised learning. Adv. Data Sci. Adapt. Anal. 2018, 10, 1840007. [Google Scholar] [CrossRef]
- Mahdavi, M.; Chehreghani, M.H.; Abolhassani, H.; Forsati, R. Novel meta-heuristic algorithms for clustering web documents. Appl. Math. Comput. 2008, 201, 441–451. [Google Scholar] [CrossRef]
- Schubert, E.; Rousseeuw, P.J. Faster k-medoids clustering: Improving the PAM, CLARA, and CLARANS algorithms. In Similarity Search and Applications, Proceedings of the International Conference on Similarity Search and Applications, Newark, NJ, USA, 2–4 October 2019; Springer: Cham, Switzerland, 2019; pp. 171–187. [Google Scholar]
- Samudi, S.; Widodo, S.; Brawijaya, H. The K-Medoids clustering method for learning applications during the COVID-19 pandemic. Sinkron 2020, 5, 116–121. [Google Scholar] [CrossRef]
- Cao, F.; Liang, J.; Li, D.; Bai, L.; Dang, C. A dissimilarity measure for the k-Modes clustering algorithm. Knowl.-Based Syst. 2012, 26, 120–127. [Google Scholar] [CrossRef]
- Oyewole, G.J.; Thopil, G.A. Data clustering: Application and trends. Artif. Intell. Rev. 2022, in press. [Google Scholar] [CrossRef]
- Li, T.; Cai, Y.; Zhang, Y.; Cai, Z.; Liu, X. Deep mutual information subspace clustering network for hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6009905. [Google Scholar] [CrossRef]
- Zhou, Z.; Dong, X.; Li, Z.; Yu, K.; Ding, C.; Yang, Y. Spatio-temporal feature encoding for traffic accident detection in VANET environment. IEEE Trans. Intell. Transp. Syst. 2022, 23, 19772–19781. [Google Scholar] [CrossRef]
- Kang, Z.; Zhao, X.; Peng, C.; Zhu, H.; Zhou, J.T.; Peng, X.; Chen, W.; Xu, Z. Partition level multiview subspace clustering. Neural Netw. 2020, 122, 279–288. [Google Scholar] [CrossRef]
- Dunn, J.C. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybern. 1973, 3, 32–57. [Google Scholar] [CrossRef]
- Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms, 2nd ed.; Springer: New York, NY, USA, 1987. [Google Scholar]
- Liao, T.W.; Celmins, A.K.; Hammell, R.J., II. A fuzzy c-means variant for the generation of fuzzy term sets. Fuzzy Sets Syst. 2003, 135, 241–257. [Google Scholar] [CrossRef]
- Krishnapuram, R.; Keller, J.M. A possibilistic approach to clustering. IEEE Trans. Fuzzy Syst. 1993, 1, 98–110. [Google Scholar] [CrossRef]
- Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification and Scene Analysis; Wiley: New York, NY, USA, 1973; Volume 3, pp. 731–739. [Google Scholar]
- Alon, N.; Spencer, J.H. The Probabilistic Method; John Wiley & Sons: Hoboken, NJ, USA, 2016. [Google Scholar]
- Pal, N.R.; Pal, K.; Bezdek, J.C. A mixed c-means clustering model. In Proceedings of the 6th International Fuzzy Systems Conference, Barcelona, Spain, 5 July 1997; Volume 1, pp. 11–21. [Google Scholar]
- Timm, H.; Kruse, R. A modification to improve possibilistic fuzzy cluster analysis. In Proceedings of the 2002 IEEE World Congress on Computational Intelligence, Honolulu, HI, USA, 12–17 May 2002; Volume 2, pp. 1460–1465. [Google Scholar] [CrossRef]
- Timm, H.; Borgelt, C.; Döring, C.; Kruse, R. An extension to possibilistic fuzzy cluster analysis. Fuzzy Sets Syst. 2004, 147, 3–16. [Google Scholar] [CrossRef]
- Zhang, J.S.; Leung, Y.W. Improved possibilistic c-means clustering algorithms. IEEE Trans. Fuzzy Syst. 2004, 12, 209–217. [Google Scholar] [CrossRef]
- Jafar, O.M.; Sivakumar, R. A study on possibilistic and fuzzy possibilistic c-means clustering algorithms for data clustering. In Proceedings of the 2012 International Conference on Emerging Trends in Science, Engineering and Technology (INCOSET), Tamilnadu, India, 13–14 December 2012; pp. 90–95. [Google Scholar] [CrossRef]
- Pal, N.R.; Pal, K.; Keller, J.M.; Bezdek, J.C. A new hybrid c-means clustering model. In Proceedings of the 2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No. 04CH37542), Budapest, Hungary, 25–29 July 2004; Volume 1, pp. 179–184. [Google Scholar] [CrossRef]
- Pal, N.R.; Pal, K.; Keller, J.M.; Bezdek, J.C. A possibilistic fuzzy c-means clustering algorithm. IEEE Trans. Fuzzy Syst. 2005, 13, 517–530. [Google Scholar] [CrossRef]
- Azzouzi, S.; El-Mekkaoui, J.; Hjouji, A.; El Khalfi, A. An effective modified possibilistic Fuzzy C-Means clustering algorithm for noisy data problems. In Proceedings of the 2021 Fifth International Conference on Intelligent Computing in Data Sciences (ICDS), Fez, Morocco, 20–22 October 2021; pp. 1–7. [Google Scholar] [CrossRef]
- Guo, Y.; Sengur, A. NCM: Neutrosophic c-means clustering algorithm. Pattern Recognit. 2015, 48, 2710–2724. [Google Scholar] [CrossRef]
- Guo, Y.; Sengur, A. NECM: Neutrosophic evidential c-means clustering algorithm. Neural Comput. Appl. 2015, 26, 561–571. [Google Scholar] [CrossRef]
- Akbulut, Y.; Şengür, A.; Guo, Y.; Polat, K. KNCM: Kernel neutrosophic c-means clustering. Appl. Soft Comput. 2017, 52, 714–724. [Google Scholar] [CrossRef]
- Chiang, J.H.; Hao, P.Y. A new kernel-based fuzzy clustering approach: Support vector clustering with cell growing. IEEE Trans. Fuzzy Syst. 2003, 11, 518–527. [Google Scholar] [CrossRef]
- Graves, D.; Pedrycz, W. Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets Syst. 2010, 161, 522–543. [Google Scholar] [CrossRef]
- Huang, H.C.; Chuang, Y.Y.; Chen, C.S. Multiple kernel fuzzy clustering. IEEE Trans. Fuzzy Syst. 2011, 20, 120–134. [Google Scholar] [CrossRef]
- Chen, L.; Chen, C.P.; Lu, M. A multiple-kernel fuzzy c-means algorithm for image segmentation. IEEE Trans. Syst. Man Cybern. Part B 2011, 41, 1263–1274. [Google Scholar] [CrossRef] [PubMed]
- Crespo, F.; Weber, R. A methodology for dynamic data mining based on fuzzy clustering. Fuzzy Sets Syst. 2005, 150, 267–284. [Google Scholar] [CrossRef]
- Munusamy, S.; Murugesan, P. Modified dynamic fuzzy c-means clustering algorithm–application in dynamic customer segmentation. Appl. Intell. 2020, 50, 1922–1942. [Google Scholar] [CrossRef]
- Ruspini, E.H.; Bezdek, J.C.; Keller, J.M. Fuzzy clustering: A historical perspective. IEEE Comput. Intell. Mag. 2019, 14, 45–55. [Google Scholar] [CrossRef]
- El Moutaouakil, K.; Touhafi, A. A New Recurrent Neural Network Fuzzy Mean Square Clustering Method. In Proceedings of the 2020 5th International Conference on Cloud Computing and Artificial Intelligence: Technologies and Applications (CloudTech), Marrakesh, Morocco, 28–30 May 2020; pp. 1–5. [Google Scholar] [CrossRef]
- El Moutaouakil, K.; Yahyaouy, A.; Chellak, S.; Baizri, H. An Optimized Gradient Dynamic-Neuro-Weighted-Fuzzy Clustering Method: Application in the Nutrition Field. Int. J. Fuzzy Syst. 2022, 24, 3731–3744. [Google Scholar] [CrossRef]
- El Moutaouakil, K.; El Ouissari, A.; Hicham, B.; Saliha, C.; Cheggour, M. Multi-objectives optimization and convolution fuzzy C-means: Control of diabetic population dynamic. RAIRO-Oper. Res. 2022, 56, 3245–3256. [Google Scholar] [CrossRef]
- Saberi, H.; Sharbati, R.; Farzanegan, B. A gradient ascent algorithm based on possibilistic fuzzy C-Means for clustering noisy data. Expert Syst. Appl. 2021, 191, 116153. [Google Scholar] [CrossRef]
- Surono, S.; Putri, R.D.A. Optimization of Fuzzy C-Means Clustering Algorithm with Combination of Minkowski and Chebyshev Distance Using Principal Component Analysis. Int. J. Fuzzy Syst. 2021, 23, 139–144. [Google Scholar] [CrossRef]
- Xu, W.; Xu, Y. An improved index for clustering validation based on Silhouette index and Calinski-Harabasz index. IOP Conf. Ser. Mater. Sci. Eng. 2019, 569, 052024. [Google Scholar]
- Pérez-Ortega, J.; Roblero-Aguilar, S.S.; Almanza-Ortega, N.N.; Frausto Solís, J.; Zavala-Díaz, C.; Hernández, Y.; Landero-Nájera, V. Hybrid Fuzzy C-Means Clustering Algorithm Oriented to Big Data Realms. Axioms 2022, 11, 377. [Google Scholar] [CrossRef]
- Gu, Y.; Ni, T.; Jiang, Y. Deep Possibilistic C-means Clustering Algorithm on Medical Datasets. Comput. Math. Methods Med. 2022, 2022, 3469979. [Google Scholar] [CrossRef] [PubMed]
- Inaba, M.; Katoh, N.; Imai, H. Applications of weighted Voronoi diagrams and randomization to variance-based k-clustering. In Proceedings of the Tenth Annual Symposium on Computational Geometry, New York, NY, USA, 6–8 June 1994; pp. 332–339. [Google Scholar]
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 9–12 November 2003; Volume 2, pp. 1398–1402. [Google Scholar]
- Hartigan, J.A.; Wong, M.A. Algorithm AS 136: A k-means clustering algorithm. J. R. Stat. Soc. Ser. C 1979, 28, 100–108. [Google Scholar] [CrossRef]
- Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Springer Science & Business Media: New York, NY, USA, 2013. [Google Scholar]
- Ngomo, M.; Kimbonguila, A. Caractérisation des agglomérats des fines particules par combinaison des techniques numériques de la géométrie algorithmique et la méthode de Monte-Carlo: Détermination de la morphologie, de la compacité et de la porosité. Ann. Sci. Tech. 2022, 21, 1–18. [Google Scholar]
- Machine Learning Repository UCI. Available online: http://archive.ics.uci.edu/ml/datasets.html (accessed on 10 January 2022).
- Hancer, E.; Bing, X.; Mengjie, Z. Differential evolution for filter feature selection based on information theory and feature ranking. Knowl.-Based Syst. 2018, 140, 103–119. [Google Scholar] [CrossRef]
- Ahourag, A.; Chellak, S.; Cheggour, M.; Baizri, H.; Bahri, A. Quadratic Programming and Triangular Numbers Ranking to an Optimal Moroccan Diet with Minimal Glycemic Load. Stat. Optim. Inf. Comput. 2023, 11, 85–94. [Google Scholar]
- El Moutaouakil, K.; Baizri, H.; Chellak, S. Optimal fuzzy deep daily nutrients requirements representation: Application to optimal Morocco diet problem. Math. Model. Comput. 2022, 9, 607–615. [Google Scholar] [CrossRef]
- Abdellatif, E.O.; Karim, E.M.; Saliha, C.; Hicham, B. Genetic algorithms for optimal control of a continuous model of a diabetic population. In Proceedings of the 2022 IEEE 3rd International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS), Kenitra, Morocco, 1–2 December 2022. [Google Scholar]
- Charroud, A.; El Moutaouakil, K.; Palade, V.; Yahyaouy, A. XDLL: Explained Deep Learning LiDAR-Based Localization and Mapping Method for Self-Driving Vehicles. Electronics 2023, 12, 567. [Google Scholar] [CrossRef]
Dataset | Features | Samples
---|---|---
Iris | 4 | 150
Pima | 8 | 768
Foods | 19 | 177
Abalone | 8 | 326
MAGIC Gamma Telescope | 3 | 306
Cloud | 4 | 625
Data Set | FCM Silhouette | FCM Dunn's Index | GMM Silhouette | GMM Dunn's Index | FP-Conv-CM Silhouette | FP-Conv-CM Dunn's Index
---|---|---|---|---|---|---
Iris | 120.883 | 0.0468 | 117.402 | 0.041 | 117.872 | 0.065
Foods | 131.175 | 0.012 | −113.774 | 0.002 | 131.175 | 0.012
Abalone | 6.07 × 10³ | 0.0025 | 6.078 × 10³ | 0.0038 | 6.054 × 10³ | 0.0042
Pima | 284.783 | 0.0120 | 50.691 | 0.0111 | 278.28 | 0.0142
MAGIC Gamma Telescope | 4.05 × 10³ | 6.46 × 10⁻⁴ | 3.52 × 10³ | 5.59 × 10⁻⁴ | 3.60 × 10³ | 0.0011
Cloud | 1.3 × 10³ | 0.0022 | 366.8076 | 0.0027 | 1.3 × 10³ | 0.0052
Attribute selection evaluates the worth of a subset of attributes.

Data Set | Selected Attributes | Silhouette (Full) | Dunn's Index (Full) | CPU Time (s) (Full) | Silhouette (Reduced) | Dunn's Index (Reduced) | CPU Time (s) (Reduced)
---|---|---|---|---|---|---|---
Iris | Petal length, petal width | 117.87 | 0.065 | 0.321 | 123.995 | 0.0936 | 0.453
Foods | VA, VC, VE, VB6, V12, Iron, Proteins, Carbohydrates, Lipids (Tf) | 131.18 | 0.012 | 0.386 | 120.1 | 33.4 × 10⁻³ | 0.482
Abalone | Shell weight, Diameter, Height, Length, Whole weight, Viscera weight | 6.05 × 10³ | 4.2 × 10⁻³ | 9.433 | 1.493 × 10³ | 2.4 × 10⁻³ | 6.205
Pima | Plas, mass, pedi, age | 278.28 | 14.2 × 10⁻³ | 0.540 | 208.77 | 16.3 × 10⁻³ | 2.291
MAGIC Gamma Telescope | fLength, fWidth, fAlpha | 3.60 × 10³ | 1.1 × 10⁻³ | 40.031 | 1.215 × 10⁴ | 1.0 × 10⁻³ | 48.401
Cloud | North Control Area, South Control Area | 1.3 × 10³ | 5.2 × 10⁻³ | 1.628 | 1.0 × 10³ | 5.0 × 10⁻³ | 0.697
Mean | 4.3 | 1913.56 | 0.017 | 8.72 | 2516.13 | 0.025 | 9.76
Method | MSE | PSNR | SSIM
---|---|---|---
Cameraman | | |
GMM | 1.163 × 10³ | 17.4748 | 0.6507
FCM | 597.6039 | 20.3667 | 0.7325
FP-Conv-CM | 597.4229 | 20.3680 | 0.7308
Khwarizmi | | |
GMM | 3.0 × 10³ | 13.3192 | 0.2574
FCM | 1.2 × 10³ | 17.2057 | 0.6609
FP-Conv-CM | 1.2 × 10³ | 17.1000 | 0.6846
Archimedes | | |
GMM | 5.262 × 10³ | 10.92 | 0.46
FCM | 455.66 | 21.54 | 0.88
FP-Conv-CM | 456.13 | 22.00 | 0.90
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
El Moutaouakil, K.; Palade, V.; Safouan, S.; Charroud, A. FP-Conv-CM: Fuzzy Probabilistic Convolution C-Means. Mathematics 2023, 11, 1931. https://doi.org/10.3390/math11081931