# Silhouette Analysis for Performance Evaluation in Machine Learning with Applications to Clustering

## Abstract

## 1. Introduction

## 2. Background

#### 2.1. K-Means

#### 2.2. Kernel K-Means

#### 2.3. Silhouette Index

## 3. Weighted Clustering Method Using Silhouette Index

1. Let $\Omega_{d}$ be a set of $d$ kernel functions $h_{1}, \dots, h_{d}$. Run the kernel k-means method with each kernel in the set to generate $d$ clustering results $\Gamma_{j}$, for $j = 1, 2, \dots, d$;
2. Compute the average Silhouette width $\gamma_{j}$ of the clustering result $\Gamma_{j}$ obtained with kernel $h_{j}$, for $j = 1, 2, \dots, d$;
3. Shift the Silhouette values from the range $[-1, 1]$ to $[0, 2]$ to obtain a non-negative weight $\delta_{j}$ for each kernel;
4. For each data point, use the weights $\delta_{j}$ computed in step (3) to combine the clustering results $\Gamma_{j}$, for $j = 1, 2, \dots, d$, as follows:
   - Sum the weights of the kernels that assign the same cluster label to the data point;
   - Compare the total weight of each cluster for the data point;
   - Assign the data point to the cluster with the highest total weight.
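The combination steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The function name `weighted_vote` and the example values are hypothetical. The shift is assumed to be $\delta_{j} = \gamma_{j} + 1$, and cluster labels are assumed to be already aligned across kernels, so that a given label denotes the same cluster in every result. Ties are broken in favor of the smallest label, which is also an assumption.

```python
def weighted_vote(labelings, avg_silhouettes):
    """Combine d clusterings of n points by Silhouette-weighted voting.

    labelings: list of d lists, each holding n integer cluster labels.
    avg_silhouettes: list of d average Silhouette widths, each in [-1, 1].
    Assumes labels are already aligned across the d clusterings.
    """
    # Assumed shift: gamma_j in [-1, 1] -> delta_j = gamma_j + 1 in [0, 2].
    weights = [gamma + 1.0 for gamma in avg_silhouettes]
    n_points = len(labelings[0])
    combined = []
    for i in range(n_points):
        # Total weight of the kernels voting for each cluster label.
        totals = {}
        for labels, delta in zip(labelings, weights):
            totals[labels[i]] = totals.get(labels[i], 0.0) + delta
        # Highest total weight wins; ties go to the smallest label.
        best = min(totals, key=lambda c: (-totals[c], c))
        combined.append(best)
    return combined


# Hypothetical example: three kernels, four points, two clusters.
labelings = [
    [0, 0, 1, 1],  # kernel 1
    [1, 0, 1, 1],  # kernel 2
    [1, 1, 0, 1],  # kernel 3
]
avg_silhouettes = [0.9, 0.0, -0.2]  # shifted weights: 1.9, 1.0, 0.8
print(weighted_vote(labelings, avg_silhouettes))  # [0, 0, 1, 1]
```

In this example only kernel 1 assigns the first point to cluster 0, yet the combined label is 0 because its weight (1.9) exceeds the combined weight of the other two kernels (1.8). Unweighted majority voting would assign that point to cluster 1.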

## 4. Simulation

- Bensaid data: it contains 49 two-dimensional elements grouped in three classes [16].
- Dunn data: it contains 90 two-dimensional elements grouped in two classes [17].
- Iris data: it contains 150 four-dimensional elements grouped in three classes [18].
- Seed data: it contains 210 seven-dimensional elements grouped in three classes [19].

## 5. Results

## 6. Conclusions and Future Research

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Gentleman, R.; Carey, V.; Huber, W.; Irizarry, R.; Dudoit, S. (Eds.) Bioinformatics and Computational Biology Solutions Using R and Bioconductor; Springer Science & Business Media: New York, NY, USA, 2006.
2. Monti, S.; Tamayo, P.; Mesirov, J.; Golub, T. Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data. Mach. Learn. **2003**, 52, 91–118.
3. Wu, J. Advances in K-Means Clustering: A Data Mining Thinking; Springer Science & Business Media: New York, NY, USA, 2012.
4. Kassambara, A. Practical Guide to Cluster Analysis in R: Unsupervised Machine Learning; CreateSpace Independent Publishing Platform: Scotts Valley, CA, USA, 2017; Volume 1.
5. Brocard, D.; Gillet, F.; Legendre, P. Numerical Ecology with R (Use R!); Springer: New York, NY, USA, 2011; pp. 978–981.
6. Nguyen, N.; Caruana, R. Consensus clusterings. In Proceedings of the Seventh IEEE International Conference on Data Mining (ICDM 2007), Omaha, NE, USA, 28–31 October 2007; pp. 607–612.
7. Jain, A.K.; Murty, M.N.; Flynn, P.J. Data clustering: A review. ACM Comput. Surv. (CSUR) **1999**, 31, 264–323.
8. Arthur, D.; Vassilvitskii, S. Society for Industrial and Applied Mathematics. k-means++: The advantages of careful seeding. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, New Orleans, LA, USA, 7–9 January 2007; pp. 1027–1035.
9. Kachouie, N.N.; Shutaywi, M. Weighted Mutual Information for Aggregated Kernel Clustering. Entropy **2020**, 22, 351.
10. Dhillon, I.S.; Guan, Y.; Kulis, B. Kernel k-means: Spectral clustering and normalized cuts. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; ACM: New York, NY, USA, 2004; pp. 551–556.
11. Camps-Valls, G. (Ed.) Kernel Methods in Bioengineering, Signal and Image Processing; Igi Global: Hershey, PA, USA, 2006.
12. Campbell, C. An introduction to kernel methods. Stud. Fuzziness Soft Comput. **2001**, 66, 155–192.
13. Yeung, K.Y.; Haynor, D.R.; Ruzzo, W.L. Validating clustering for gene expression data. Bioinformatics **2001**, 17, 309–318.
14. Hubert, L.; Arabie, P. Comparing partitions. J. Classif. **1985**, 2, 193–218.
15. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. **1987**, 20, 53–65.
16. Bensaid, A.M.; Hall, L.O.; Bezdek, J.C.; Clarke, L.P.; Silbiger, M.L.; Arrington, J.A.; Murtagh, R.F. Validity-guided (re) clustering with applications to image segmentation. IEEE Trans. Fuzzy Syst. **1996**, 4, 112–123.
17. Bezdek, J.C.; Keller, J.; Krisnapuram, R.; Pal, N. Fuzzy Models and Algorithms for Pattern Recognition and Image Processing; Springer Science & Business Media: New York, NY, USA, 1999; Volume 4.
18. Kozak, M.; Łotocka, B. What should we know about the famous Iris data? Curr. Sci. **2013**, 104, 579–580.
19. Charytanowicz, M.; Niewczas, J.; Kulczycki, P.; Kowalski, P.A.; Łukasik, S.; Żak, S. Complete Gradient Clustering Algorithm for Features Analysis of X-Ray Images. In Information Technologies in Biomedicine; Springer: Berlin/Heidelberg, Germany, 2010; pp. 15–24.

**Figure 1.** Procedure to calculate the Silhouette index for a typical data point X. Red lines show the distances between X and every data point clustered in the same group. Blue lines show the distances between X and every data point clustered in the nearest group.
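The computation depicted in Figure 1 can be sketched in Python using the standard definition from Rousseeuw [15]: with $a$ the mean distance from X to the other points in its own group and $b$ the mean distance from X to the points in the nearest other group, the Silhouette value is $s = (b - a)/\max(a, b)$. This is a minimal sketch, not the authors' code. The function names and the example points are hypothetical, and Euclidean distance is an assumption. A point that is alone in its group is given a value of 0, following Rousseeuw's convention.

```python
import math


def silhouette_value(i, points, labels):
    """Silhouette value of points[i] under the given cluster labels."""
    own = labels[i]

    def mean_distance_to(cluster):
        # Mean Euclidean distance from point i to the members of `cluster`,
        # excluding point i itself; None if there are no such members.
        dists = [math.dist(points[i], p)
                 for k, (p, c) in enumerate(zip(points, labels))
                 if c == cluster and k != i]
        return sum(dists) / len(dists) if dists else None

    other_clusters = [c for c in set(labels) if c != own]
    if not other_clusters:
        raise ValueError("the Silhouette index needs at least two clusters")

    a = mean_distance_to(own)  # within-group distance (red lines)
    if a is None:
        return 0.0  # singleton cluster: value set to 0 by convention
    # Nearest other group: smallest mean distance (blue lines).
    b = min(mean_distance_to(c) for c in other_clusters)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0


def average_silhouette_width(points, labels):
    """Mean Silhouette value over all points."""
    values = [silhouette_value(i, points, labels) for i in range(len(points))]
    return sum(values) / len(values)


# Hypothetical example: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
labels = [0, 0, 1, 1]
print(round(silhouette_value(0, points, labels), 3))  # 0.675
```

Values close to 1 indicate that the point sits much closer to its own group than to the nearest other group, and negative values indicate the opposite.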

**Figure 2.** Silhouette index obtained for clustering results of the Bensaid dataset using kernel k-means with three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method.

**Figure 3.** Original groups in the Bensaid data and the clustering results of kernel k-means obtained using three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method. Results are visualized using the first and second principal components.

**Figure 4.** Silhouette index obtained for clustering results of the Dunn dataset using kernel k-means with three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method.

**Figure 5.** Original groups in the Dunn data and the clustering results of kernel k-means obtained using three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method. Results are visualized using the first and second principal components.

**Figure 6.** Silhouette index obtained for clustering results of the Iris dataset using kernel k-means with three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method.

**Figure 7.** Original groups in the Iris data and the clustering results of kernel k-means obtained using three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method. Results are visualized using the first and second principal components.

**Figure 8.** Silhouette index obtained for clustering results of the Seed dataset using kernel k-means with three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method.

**Figure 9.** Original groups in the Seed data and the clustering results of kernel k-means obtained using three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted method. Results are visualized using the first and second principal components.

**Table 1.** Monte Carlo average of the average Silhouette index (ASI) and Monte Carlo average of the true rate (TR) for the clustering results obtained using kernel k-means with three different kernel functions (Gaussian, polynomial, and tangent), majority voting, and the proposed weighted majority voting based on the average Silhouette.

| Data | Metric | Gaussian | Polynomial | Tangent | Majority Voting | Weighted Majority Voting |
|---|---|---|---|---|---|---|
| Bensaid | ASI | 0.159 | 0.453 | −0.127 | 0.132 | 0.167 |
| Bensaid | TR | 0.528 | 0.641 | 0.413 | 0.592 | 0.611 |
| Dunn | ASI | 0.269 | 0.418 | −0.003 | 0.293 | 0.293 |
| Dunn | TR | 0.453 | 0.521 | 0.417 | 0.474 | 0.474 |
| Iris | ASI | 0.609 | 0.609 | −0.068 | 0.482 | 0.589 |
| Iris | TR | 0.856 | 0.850 | 0.381 | 0.833 | 0.873 |
| Seed | ASI | 0.594 | 0.601 | −0.052 | 0.537 | 0.570 |
| Seed | TR | 0.869 | 0.842 | 0.376 | 0.852 | 0.862 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Shutaywi, M.; Kachouie, N.N.
Silhouette Analysis for Performance Evaluation in Machine Learning with Applications to Clustering. *Entropy* **2021**, *23*, 759.
https://doi.org/10.3390/e23060759
