Among many clustering algorithms, the K-means clustering algorithm is widely used because of its simple algorithm and fast convergence. However, the K-value of clustering needs to be given in advance and the choice of K-value directly affect the convergence result. To solve this problem, we mainly analyze four K-value selection algorithms, namely Elbow Method, Gap Statistic, Silhouette Coefficient, and Canopy; give the pseudo code of the algorithm; and use the standard data set Iris for experimental verification. Finally, the verification results are evaluated, the advantages and disadvantages of the above four algorithms in a K-value selection are given, and the clustering range of the data set is pointed out.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited