# Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis


## Abstract


## 1. Introduction

## 2. Theoretical Backgrounds

#### 2.1. GSCA Model Specification

#### 2.2. Estimation of GSCA

#### 2.3. Optimal Scaling for Categorical Variables

#### 2.4. Fuzzy Clustering Algorithm

#### 2.5. Fuzzy Clusterwise GSCA for Latent Class Analysis

#### 2.6. Model Evaluation

## 3. Method

#### 3.1. Cluster Validity Indexes

- (a) Dave’s modified partition coefficient (MPC; [15]): Using the partition coefficient defined by $PC=\frac{1}{N}\sum_{k=1}^{C}\sum_{i=1}^{N}u_{ki}^{2}$ with $PC\in\left[\frac{1}{C},1\right]$, Dave defined the MPC, also known as the fuzzy performance index (FPI), as $$MPC=1-\frac{C\cdot(1-PC)}{C-1}.$$
- (b) Bezdek’s normalized classification entropy (NCE; [19]): Using the partition entropy $PE=-\frac{1}{N}\sum_{k=1}^{C}\sum_{i=1}^{N}u_{ki}\log u_{ki}$, Bezdek defined the NCE as $$NCE=\frac{PE}{\log C},$$ and the normalized partition entropy (NPE) is $$NPE=\frac{N\cdot PE}{N-C}.$$ The same criterion as in the NCE is applied to the NPE.
- (c) Chen and Linkens’ validity index (CLVI; [20]): They defined the CLVI as $$CLVI=\frac{1}{N}\sum_{i=1}^{N}\max_{k}\left(u_{ki}\right)-\frac{1}{C^{*}}\sum_{k=1}^{C-1}\sum_{l=k+1}^{C}\left[\frac{1}{N}\sum_{i=1}^{N}\min(u_{ki},u_{li})\right].$$
- (d) Fukuyama and Sugeno’s validity index (FS; [21]): They proposed the index $$\begin{aligned} FS &= J_{m}(u,v)-K_{m}(u,v) \\ &= \sum_{k=1}^{C}\sum_{i=1}^{N}u_{ki}^{m}\|x_{i}-v_{k}\|^{2}-\sum_{k=1}^{C}\sum_{i=1}^{N}u_{ki}^{m}\|v_{k}-\overline{v}\|^{2}. \end{aligned}$$
- (e) Gath and Geva’s fuzzy hypervolume validity index (FHV; [22]): They defined the FHV as $$FHV=\sum_{k=1}^{C}\left[\det\left(F_{k}\right)\right]^{2}.$$
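The membership-only indexes in (a) and (b) can be computed directly from a fuzzy membership matrix. The following is a minimal sketch rather than the authors' implementation: it assumes the memberships are stored as a list of $C$ rows with $N$ entries each (so `u[k][i]` plays the role of $u_{ki}$), uses the natural logarithm, and treats $0\cdot\log 0$ as $0$. All function names are hypothetical.

```python
import math


def partition_coefficient(u):
    """PC = (1/N) * sum over k and i of u_ki^2; u has C rows of N memberships."""
    n = len(u[0])
    return sum(x * x for row in u for x in row) / n


def modified_partition_coefficient(u):
    """MPC = 1 - C * (1 - PC) / (C - 1)."""
    c = len(u)
    return 1.0 - c * (1.0 - partition_coefficient(u)) / (c - 1)


def partition_entropy(u):
    """PE = -(1/N) * sum over k and i of u_ki * log(u_ki)."""
    n = len(u[0])
    # Assumption: terms with u_ki = 0 contribute 0 (the limit of x * log x).
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / n


def normalized_classification_entropy(u):
    """NCE = PE / log(C)."""
    return partition_entropy(u) / math.log(len(u))


def normalized_partition_entropy(u):
    """NPE = N * PE / (N - C)."""
    c, n = len(u), len(u[0])
    return n * partition_entropy(u) / (n - c)


# Two illustrative partitions of N = 4 observations into C = 2 clusters.
crisp = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
uniform = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5]]

for name, u in (("crisp", crisp), ("uniform", uniform)):
    print(name,
          round(modified_partition_coefficient(u), 3),
          round(normalized_classification_entropy(u), 3),
          round(normalized_partition_entropy(u), 3))
```

On the crisp partition the MPC reaches 1 and both entropy-based indexes reach 0. On the fully uniform partition the MPC is 0 and the NCE is 1, which matches the stated bound $PC\in\left[\frac{1}{C},1\right]$.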

#### 3.2. Holistic Approach to Enumerate the Number of Clusters

#### 3.3. Simulation Design

- three levels of sample size (N),
- three levels of the number of latent classes/clusters (K),
- two levels of the number of indicators (V),
- three levels of prevalence of the cluster membership (T), and
- three levels of error rate in the cluster structure (ER).
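Crossing these five factors gives $3\times 3\times 2\times 3\times 3=162$ simulation conditions. The short sketch below enumerates them, using the factor levels listed in the simulation design table of this paper. The variable names are illustrative only.

```python
from itertools import product

# Factor levels as reported in the simulation design table.
sample_sizes = [200, 500, 1000]           # N
n_clusters = [2, 4, 6]                    # K
n_indicators = [6, 9]                     # V
prevalence_schemes = ["T1", "T2", "T3"]   # T
error_rates = [0.05, 0.10, 0.15]          # ER

# Full factorial crossing of the five design factors.
conditions = list(product(sample_sizes, n_clusters, n_indicators,
                          prevalence_schemes, error_rates))

print(len(conditions))  # 162
print(conditions[0])    # (200, 2, 6, 'T1', 0.05)
```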

#### 3.3.1. Sample Size

#### 3.3.2. The Number of Classes/Clusters

#### 3.3.3. The Number of Indicators

#### 3.3.4. Prevalence of Class Membership

#### 3.3.5. Error Rates

## 4. Results

#### 4.1. FIT-FHV Method

**FIT-FHV** method. Only 11 of the 162 simulation conditions did not follow the FIT-FHV criterion. All 11 of those conditions, however, involved the small sample size of $N=200$, and fuzzy clusterwise GSCA is rarely fitted to data sets that small.

#### 4.2. Stability of the FIT-FHV Method

#### 4.3. Prevalence of Clusters When C Is Assumed to be the Optimal Number of Clusters

#### 4.4. Real World Application in the Field of Public Health

## 5. Discussion and Conclusions

- (Step 1) Find the drop point in FIT and AFIT. The last point on the higher plateau, just before the drop, marks the maximum of the range in which the true number of clusters is located.
- (Step 2) Within the range found in Step 1, find the smallest FHV; the corresponding number of clusters is the optimal one.
- (Step 3) Explore the prevalence distribution of the clusters and confirm that none of the prevalence rates is too low.
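The three steps can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' procedure: the "drop point" is operationalized here as the largest decrease in FIT between consecutive candidate numbers of clusters, and the 5% prevalence cutoff is a hypothetical value, since the steps do not define "too low" numerically. The function names are also hypothetical. The example values are copied from the T1 rows of the results tables reported in this paper.

```python
def fit_fhv_select(candidates, fit, fhv):
    """Return the number of clusters chosen by Steps 1 and 2.

    Needs at least two candidates so that a drop can be located.
    """
    # Step 1 (assumed rule): the drop point is the largest decrease in FIT
    # between consecutive candidates; the candidate just before the drop is
    # the upper end of the search range.
    drops = [fit[j] - fit[j + 1] for j in range(len(fit) - 1)]
    last_before_drop = max(range(len(drops)), key=lambda j: drops[j])
    # Step 2: smallest FHV among candidates up to and including that point.
    best = min(range(last_before_drop + 1), key=lambda j: fhv[j])
    return candidates[best]


def prevalence_ok(prevalence_pct, min_pct=5.0):
    """Step 3: min_pct is a hypothetical cutoff, not a value from the paper."""
    return min(prevalence_pct) >= min_pct


candidates = [2, 3, 4, 5, 6]
fit = [0.999, 0.996, 0.997, 0.783, 0.820]
fhv = [40.668, 71.985, 22.694, 18.840, 41.356]

chosen = fit_fhv_select(candidates, fit, fhv)
print(chosen)                                        # 4
print(prevalence_ok([38.73, 29.21, 18.78, 13.28]))   # True
```

Only FIT is used for Step 1 here; the same function can be called with the AFIT values to check that both criteria point to the same range.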

#### 5.1. Limitation

#### 5.2. Concluding Remarks

## Author Contributions

## Funding

## Conflicts of Interest

## Abbreviations

CLVI | Chen and Linkens’ validity index |

FHV | Gath and Geva’s fuzzy hypervolume validity index |

FS | Fukuyama and Sugeno’s validity index |

GSCA | Generalized structured component analysis |

LCA | Latent class analysis |

MPC | Modified partition coefficient |

NCE | Normalized classification entropy |

NPE | Normalized partition entropy |

SEM | Structural equation modeling |

## References

1. Muthén, B. Latent variable mixture modeling. In New Developments and Techniques in Structural Equation Modeling; Marcoulides, G.A., Schumacker, R.E., Eds.; Erlbaum: Mahwah, NJ, USA, 2001; pp. 1–33.
2. Hwang, H.; Takane, Y. Generalized Structured Component Analysis: A Component-Based Approach to Structural Equation Modeling; CRC Press: Boca Raton, FL, USA, 2014.
3. Hwang, H.; DeSarbo, W.S.; Takane, Y. Fuzzy clusterwise generalized structured component analysis. Psychometrika **2007**, 72, 181–198.
4. Ryoo, J.H.; Park, S.; Kim, S. Categorical latent variable modeling utilizing fuzzy clustering generalized structured component analysis as an alternative to latent class analysis. Behaviormetrika **2020**, 47, 291–306.
5. Roubens, M. Fuzzy clustering algorithms and their cluster validity. Eur. J. Oper. Res. **1982**, 10, 294–301.
6. Wang, W.; Zhang, Y. On fuzzy cluster validity indices. Fuzzy Sets Syst. **2007**, 158, 2095–2117.
7. Bezdek, J.C. Numerical taxonomy with fuzzy sets. J. Math. Biol. **1974**, 1, 57–71.
8. Jöreskog, K.G. A general method for estimating a linear structural equation system. In Structural Equation Models in the Social Sciences; Goldberger, A.S., Duncan, O.D., Eds.; Seminar Press: New York, NY, USA, 1973.
9. Hwang, H.; Takane, Y. Nonlinear generalized structured component analysis. Behaviormetrika **2010**, 37, 1–14.
10. McDonald, R.P. Test Theory: A Unified Treatment; Lawrence Erlbaum Associates: Mahwah, NJ, USA, 1999.
11. de Leeuw, J.; Young, F.W.; Takane, Y. Additive structure in qualitative data: An alternating least squares method with optimal scaling features. Psychometrika **1976**, 41, 471–503.
12. Young, F.W. Quantitative analysis of qualitative data. Psychometrika **1981**, 46, 347–388.
13. Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Plenum Press: New York, NY, USA, 1981.
14. Hwang, H.; Takane, Y.; Jung, K. Generalized structured component analysis with uniqueness terms for accommodating measurement error. Front. Psychol. **2017**, 8, 2137.
15. Dave, R.N. Validating fuzzy partitions obtained through c-shells clustering. Pattern Recognit. Lett. **1996**, 17, 613–623.
16. Dayton, C.M.; Macready, G.B. Concomitant-variable latent-class models. J. Am. Stat. Assoc. **1988**, 83, 173–178.
17. DeSarbo, W.S.; Oliver, R.L.; Rangaswamy, A. A simulated annealing methodology for clusterwise linear regression. Psychometrika **1989**, 54, 707–736.
18. Van der Heijden, P.G.M.; Dessens, J.; Bockenholt, U. Estimating the concomitant-variable latent-class model with the EM algorithm. J. Educ. Behav. Stat. **1996**, 21, 215–229.
19. Bezdek, J.C. Mathematical models for systematics and taxonomy. In Proceedings of the 8th International Conference on Numerical Taxonomy; Freeman: San Francisco, CA, USA, 1975; pp. 143–166.
20. Chen, M.Y.; Linkens, D.A. Rule-base self-generation and simplification for data-driven fuzzy models. Fuzzy Sets Syst. **2004**, 142, 243–265.
21. Fukuyama, Y.; Sugeno, M. A new method of choosing the number of clusters for the fuzzy c-means method. In Proceedings of the Fifth Fuzzy Systems Symposium, Kobe, Japan, June 1989; pp. 247–250. Available online: https://jglobal.jst.go.jp/en/detail?JGLOBAL_ID=200902072543924485 (accessed on 14 September 2020).
22. Gath, I.; Geva, A.B. Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern Anal. Mach. Intell. **1989**, 11, 773–781.
23. Brusco, M.J.; Shireman, E.; Steinley, D. A comparison of latent class, K-means, and K-median methods for clustering dichotomous data. Psychol. Methods **2017**, 22, 563–580.
24. Dimitriadou, E.; Dolničar, S.; Weingessel, A. An examination of indices for determining the number of clusters in binary data sets. Psychometrika **2002**, 67, 137–159.
25. Ryoo, J.; Park, S.; Kim, S.; Hwang, H. gscaLCA: Generalized Structure Component Analysis—Latent Class Analysis & Latent Class Regression. R Package Version 0.0.5. Available online: https://CRAN.R-project.org/package=gscaLCA (accessed on 8 June 2020).
26. Harris, K.M. The National Longitudinal Study of Adolescent to Adult Health (Add Health), Waves I & II, 1994–1996; Wave III, 2001–2002; Wave IV, 2007–2009 (Machine-Readable Data File and Documentation); Carolina Population Center, University of North Carolina at Chapel Hill: Chapel Hill, NC, USA. Available online: https://www.icpsr.umich.edu/web/DSDR/studies/21600/versions/V21 (accessed on 8 June 2020).
27. Zhang, Y.; Martinez-Garcia, M.; Latimer, A. Estimating gas turbine compressor discharge temperature using Bayesian neuro-fuzzy modelling. In Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Banff, AB, Canada, 5–8 October 2017; pp. 3619–3623.
28. Zhang, Y.; Martínez-García, M.; Latimer, A. Selecting Optimal Features for Cross-Fleet Analysis and Fault Diagnosis of Industrial Gas Turbines. In Proceedings of the ASME Turbo Expo 2018: Turbomachinery Technical Conference and Exposition, Oslo, Norway, 11–15 June 2018.

**Figure 2.** Performance of validation indexes over $K=4$, $N=500$, $V=6$, and $ER=15\%$. Note: The first row is T1, the second row is T2, and the third row is T3.

**Figure 3.** Performance of validation indexes over $K=6$, $N=200$, $V=6$, and $ER=15\%$. Note: The first row is T1, the second row is T2, and the third row is T3.

Variable | Number of Conditions | Condition |
---|---|---|
Sample size (N) | 3 | 200, 500, and 1000 |
Number of clusters (K) | 3 | 2, 4, and 6 |
Number of indicators (V) | 2 | 6 and 9 |
Prevalence of cluster membership (T) | 3 | (T1) Equally clustered: ${\lambda}_{k}=\frac{1}{K}$ where $1\le k\le K$ |
 | | (T2) One cluster has large proportion: ${\lambda}_{1}=0.6$ and ${\lambda}_{k}=\frac{0.4}{K-1}$ where $2\le k\le K$ |
 | | (T3) One cluster has small proportion: ${\lambda}_{1}=0.1$ and ${\lambda}_{k}=\frac{0.9}{K-1}$ where $2\le k\le K$ |
Error rates (ER) | 3 | 5%, 10%, and 15% |
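The three prevalence schemes can be written as a small helper that returns the vector $({\lambda}_{1},\dots,{\lambda}_{K})$. This is a sketch only: it reads the index range for T2 and T3 as the remaining clusters $k=2,\dots,K$, which is the reading under which the proportions sum to one, and the function name `prevalence` is hypothetical.

```python
def prevalence(k, scheme):
    """Return the cluster prevalence vector (lambda_1, ..., lambda_K)."""
    if scheme == "T1":
        # Equally clustered: every cluster has proportion 1/K.
        return [1.0 / k] * k
    if scheme == "T2":
        first = 0.6   # one cluster has a large proportion
    elif scheme == "T3":
        first = 0.1   # one cluster has a small proportion
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    # Assumption: the remaining K - 1 clusters split the rest equally.
    rest = (1.0 - first) / (k - 1)
    return [first] + [rest] * (k - 1)


for scheme in ("T1", "T2", "T3"):
    print(scheme, [round(p, 3) for p in prevalence(4, scheme)])
```

For $K=4$ this gives proportions of 0.25 each under T1, 0.6 and three clusters of about 0.133 under T2, and 0.1 and three clusters of 0.3 under T3.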

**Table 2.** Response patterns for combinations of the number of classes (K) and the number of item indicators (V).

Number of Clusters | Number of Indicators (V = 6) | Number of Indicators (V = 9) |
---|---|---|
K = 2 | (1, 1, 1, 1, 0, 0) | (1, 1, 1, 1, 1, 1, 0, 0, 0) |
K = 2 | (0, 0, 1, 1, 1, 1) | (0, 0, 0, 1, 1, 1, 1, 1, 1) |
K = 4 | (1, 1, 1, 1, 0, 0) | (1, 1, 1, 1, 1, 1, 0, 0, 0) |
K = 4 | (0, 0, 1, 1, 1, 1) | (0, 0, 0, 1, 1, 1, 1, 1, 1) |
K = 4 | (0, 0, 0, 0, 1, 1) | (0, 0, 0, 0, 0, 0, 1, 1, 1) |
K = 4 | (1, 1, 0, 0, 0, 0) | (1, 1, 1, 0, 0, 0, 0, 0, 0) |
K = 6 | (1, 1, 1, 1, 0, 0) | (1, 1, 1, 1, 1, 1, 0, 0, 0) |
K = 6 | (0, 0, 1, 1, 1, 1) | (0, 0, 0, 1, 1, 1, 1, 1, 1) |
K = 6 | (0, 0, 0, 0, 1, 1) | (0, 0, 0, 0, 0, 0, 1, 1, 1) |
K = 6 | (1, 1, 0, 0, 0, 0) | (1, 1, 1, 0, 0, 0, 0, 0, 0) |
K = 6 | (0, 0, 1, 1, 0, 0) | (0, 0, 0, 1, 1, 1, 0, 0, 0) |
K = 6 | (1, 1, 0, 0, 1, 1) | (1, 1, 1, 0, 0, 0, 1, 1, 1) |

T | # of Clusters | FIT | AFIT | MPC | NCE | NPE | CLVI | FS | FHV |
---|---|---|---|---|---|---|---|---|---|
T1 | C = 2 | 0.999 | 0.999 | 0.478 | 0.579 | 0.403 | 0.616 | 14,398.987 | 40.668 |
T1 | C = 3 | 0.996 | 0.996 | 0.847 | 0.159 | 0.176 | 0.884 | 16,516.787 | 71.985 |
T1 | C = 4 | 0.997 | 0.997 | 0.568 | 0.395 | 0.552 | 0.697 | −744.482 | 22.694 |
T1 | C = 5 | 0.783 | 0.766 | 0.682 | 0.282 | 0.458 | 0.780 | −5009.347 | 18.840 |
T1 | C = 6 | 0.820 | 0.803 | 0.751 | 0.212 | 0.384 | 0.831 | −6074.661 | 41.356 |
T2 | C = 2 | 0.996 | 0.996 | 0.838 | 0.210 | 0.146 | 0.901 | 27,725.400 | 36.559 |
T2 | C = 3 | 0.996 | 0.995 | 0.656 | 0.343 | 0.379 | 0.773 | 20,992.742 | 43.942 |
T2 | C = 4 | 0.995 | 0.995 | 0.706 | 0.273 | 0.382 | 0.802 | 11,024.970 | 28.450 |
T2 | C = 5 | 0.883 | 0.874 | 0.648 | 0.287 | 0.466 | 0.762 | −1184.398 | 22.112 |
T2 | C = 6 | 0.920 | 0.912 | 0.733 | 0.211 | 0.383 | 0.825 | −7349.990 | 15.296 |
T3 | C = 2 | 0.999 | 0.999 | 0.520 | 0.541 | 0.377 | 0.657 | 15,769.265 | 42.760 |
T3 | C = 3 | 0.998 | 0.998 | 0.817 | 0.201 | 0.222 | 0.861 | 5193.876 | 27.043 |
T3 | C = 4 | 0.995 | 0.994 | 0.792 | 0.203 | 0.284 | 0.852 | 363.110 | 13.909 |
T3 | C = 5 | 0.791 | 0.776 | 0.833 | 0.156 | 0.254 | 0.881 | −3904.634 | 10.771 |
T3 | C = 6 | 0.856 | 0.843 | 0.863 | 0.126 | 0.228 | 0.904 | −7034.518 | 18.275 |

T | # of Clusters | FIT | AFIT | MPC | NCE | NPE | CLVI | FS | FHV |
---|---|---|---|---|---|---|---|---|---|
T1 | C = 4 | 0.980 | 0.977 | 0.648 | 0.335 | 0.474 | 0.757 | 4360.678 | 70.794 |
T1 | C = 5 | 0.981 | 0.977 | 0.741 | 0.243 | 0.401 | 0.818 | 1459.266 | 55.370 |
T1 | C = 6 | 0.988 | 0.985 | 0.758 | 0.231 | 0.426 | 0.820 | −2711.276 | 9.458 |
T1 | C = 7 | 0.962 | 0.950 | 0.867 | 0.127 | 0.255 | 0.904 | −3627.897 | 26.785 |
T1 | C = 8 | 0.911 | 0.877 | 0.913 | 0.081 | 0.175 | 0.937 | −4281.379 | 43.793 |
T2 | C = 4 | 0.982 | 0.979 | 0.651 | 0.326 | 0.461 | 0.769 | 4563.684 | 51.954 |
T2 | C = 5 | 0.971 | 0.965 | 0.720 | 0.252 | 0.416 | 0.813 | 1278.064 | 43.150 |
T2 | C = 6 | 0.971 | 0.963 | 0.804 | 0.177 | 0.327 | 0.869 | −2523.121 | 34.236 |
T2 | C = 7 | 0.912 | 0.884 | 0.859 | 0.121 | 0.243 | 0.907 | −4044.144 | 31.770 |
T2 | C = 8 | 0.882 | 0.836 | 0.897 | 0.089 | 0.192 | 0.932 | −4714.370 | 26.868 |
T3 | C = 4 | 0.988 | 0.986 | 0.704 | 0.293 | 0.414 | 0.791 | 3051.090 | 59.722 |
T3 | C = 5 | 0.990 | 0.988 | 0.804 | 0.198 | 0.327 | 0.851 | −591.218 | 31.221 |
T3 | C = 6 | 0.989 | 0.986 | 0.829 | 0.170 | 0.314 | 0.859 | −3049.061 | 5.911 |
T3 | C = 7 | 0.927 | 0.903 | 0.883 | 0.113 | 0.229 | 0.905 | −3481.078 | 20.337 |
T3 | C = 8 | 0.906 | 0.869 | 0.908 | 0.088 | 0.190 | 0.927 | −3957.682 | 35.981 |

T | # of Clusters | C = 1 | C = 2 | C = 3 | C = 4 | C = 5 | C = 6 |
---|---|---|---|---|---|---|---|
T1 | C = 2 | 96.28 | 3.72 | | | | |
T1 | C = 3 | 53.39 | 46.60 | 0.01 | | | |
T1 | C = 4 | 38.73 | 29.21 | 18.78 | 13.28 | | |
T1 | C = 5 | 31.48 | 25.67 | 22.36 | 19.36 | 1.13 | |
T1 | C = 6 | 29.38 | 25.22 | 22.04 | 19.08 | 3.21 | 1.07 |
T2 | C = 2 | 99.71 | 0.29 | | | | |
T2 | C = 3 | 71.94 | 20.78 | 7.28 | | | |
T2 | C = 4 | 64.86 | 16.95 | 13.86 | 4.33 | | |
T2 | C = 5 | 55.93 | 18.16 | 14.80 | 9.69 | 1.42 | |
T2 | C = 6 | 53.03 | 16.88 | 14.48 | 11.34 | 2.64 | 1.62 |
T3 | C = 2 | 91.52 | 8.48 | | | | |
T3 | C = 3 | 37.72 | 33.55 | 28.74 | | | |
T3 | C = 4 | 35.34 | 32.15 | 28.27 | 4.24 | | |
T3 | C = 5 | 34.71 | 30.80 | 26.74 | 6.69 | 1.07 | |
T3 | C = 6 | 33.50 | 29.02 | 26.08 | 8.00 | 2.40 | 1.00 |

T | # of Clusters | C = 1 | C = 2 | C = 3 | C = 4 | C = 5 | C = 6 | C = 7 | C = 8 |
---|---|---|---|---|---|---|---|---|---|
T1 | C = 4 | 50.98 | 29.86 | 15.56 | 3.61 | | | | |
T1 | C = 5 | 37.09 | 24.22 | 17.86 | 13.53 | 7.31 | | | |
T1 | C = 6 | 23.31 | 19.28 | 17.15 | 15.12 | 13.62 | 11.54 | | |
T1 | C = 7 | 19.32 | 17.28 | 15.93 | 14.99 | 13.77 | 12.98 | 5.75 | |
T1 | C = 8 | 17.62 | 16.20 | 15.31 | 14.68 | 13.84 | 12.94 | 6.34 | 3.08 |
T2 | C = 4 | 63.70 | 21.00 | 10.47 | 4.85 | | | | |
T2 | C = 5 | 57.73 | 17.92 | 11.46 | 7.93 | 4.97 | | | |
T2 | C = 6 | 53.02 | 13.29 | 10.73 | 9.23 | 7.86 | 5.88 | | |
T2 | C = 7 | 52.21 | 11.55 | 9.65 | 8.53 | 7.47 | 6.65 | 3.95 | |
T2 | C = 8 | 51.93 | 10.43 | 8.94 | 8.13 | 7.41 | 6.59 | 4.07 | 2.53 |
T3 | C = 4 | 44.52 | 26.87 | 17.92 | 10.70 | | | | |
T3 | C = 5 | 31.09 | 20.82 | 18.28 | 16.19 | 13.63 | | | |
T3 | C = 6 | 23.05 | 19.88 | 17.91 | 16.22 | 14.80 | 8.16 | | |
T3 | C = 7 | 20.59 | 18.64 | 17.37 | 16.08 | 14.70 | 8.96 | 3.67 | |
T3 | C = 8 | 19.65 | 17.66 | 16.74 | 15.82 | 14.57 | 9.09 | 4.22 | 2.27 |

Sample Sizes | # of Clusters | FIT | AFIT | MPC | NCE | NPE | CLVI | FS | FHV |
---|---|---|---|---|---|---|---|---|---|
N = 250 | C = 2 | 0.9924 | 0.9920 | 0.226 | 0.824 | 0.576 | 0.421 | 3843.301 | 40.657 |
N = 250 | C = 3 | 0.9811 | 0.9797 | 0.522 | 0.506 | 0.562 | 0.669 | 2533.668 | 28.853 |
N = 250 | C = 4 | 0.9771 | 0.9747 | 0.559 | 0.458 | 0.646 | 0.665 | 1013.757 | 25.313 |
N = 250 | C = 5 | 0.9606 | 0.9551 | 0.792 | 0.209 | 0.343 | 0.868 | 1364.492 | 33.586 |
N = 250 | C = 6 | 0.9361 | 0.9252 | 0.730 | 0.261 | 0.478 | 0.787 | −427.198 | 15.310 |
N = 250 | C = 7 | 0.9364 | 0.9234 | 0.817 | 0.172 | 0.345 | 0.865 | −584.003 | 21.159 |
N = 250 | C = 8 | 0.9474 | 0.9348 | 0.941 | 0.054 | 0.117 | 0.960 | −2022.800 | 19.674 |
N = 500 | C = 2 | 0.9973 | 0.9973 | 0.338 | 0.728 | 0.507 | 0.532 | 4369.003 | 22.563 |
N = 500 | C = 3 | 0.9937 | 0.9935 | 0.574 | 0.464 | 0.513 | 0.720 | 4115.353 | 31.469 |
N = 500 | C = 4 | 0.9887 | 0.9881 | 0.597 | 0.424 | 0.593 | 0.678 | −884.863 | 18.487 |
N = 500 | C = 5 | 0.9791 | 0.9778 | 0.736 | 0.264 | 0.430 | 0.802 | −3310.829 | 11.697 |
N = 500 | C = 6 | 0.9721 | 0.9699 | 0.783 | 0.213 | 0.386 | 0.833 | −3623.326 | 10.524 |
N = 500 | C = 7 | 0.9699 | 0.9671 | 0.922 | 0.076 | 0.150 | 0.947 | −4895.632 | 22.753 |
N = 500 | C = 8 | 0.9727 | 0.9698 | 0.898 | 0.097 | 0.205 | 0.918 | −8078.892 | 2.895 |
N = 1000 | C = 2 | 0.9987 | 0.9986 | 0.306 | 0.761 | 0.529 | 0.529 | 9036.031 | 22.017 |
N = 1000 | C = 3 | 0.9961 | 0.9960 | 0.577 | 0.463 | 0.510 | 0.709 | 6404.905 | 27.542 |
N = 1000 | C = 4 | 0.9949 | 0.9948 | 0.617 | 0.404 | 0.562 | 0.700 | −6839.396 | 9.283 |
N = 1000 | C = 5 | 0.9870 | 0.9865 | 0.746 | 0.251 | 0.407 | 0.803 | −7828.256 | 9.667 |
N = 1000 | C = 6 | 0.9859 | 0.9854 | 0.801 | 0.195 | 0.352 | 0.851 | −4721.540 | 15.512 |
N = 1000 | C = 7 | 0.9854 | 0.9847 | 0.856 | 0.136 | 0.266 | 0.892 | −12,560.950 | 9.850 |
N = 1000 | C = 8 | 0.9850 | 0.9842 | 0.899 | 0.095 | 0.198 | 0.917 | −15,097.641 | 2.638 |
N = 5114 | C = 2 | 0.9997 | 0.9997 | 0.318 | 0.750 | 0.520 | 0.533 | 50,391.382 | 22.212 |
N = 5114 | C = 3 | 0.9993 | 0.9993 | 0.540 | 0.500 | 0.550 | 0.669 | 28,373.742 | 21.772 |
N = 5114 | C = 4 | 0.9983 | 0.9983 | 0.668 | 0.342 | 0.475 | 0.760 | −3925.974 | 17.109 |
N = 5114 | C = 5 | 0.9976 | 0.9975 | 0.747 | 0.251 | 0.405 | 0.808 | −39,992.537 | 9.340 |
N = 5114 | C = 6 | 0.9974 | 0.9974 | 0.805 | 0.191 | 0.343 | 0.848 | −42,944.767 | 10.994 |
N = 5114 | C = 7 | 0.9974 | 0.9974 | 0.993 | 0.009 | 0.018 | 0.995 | −37,415.660 | 20.548 |
N = 5114 | C = 8 | 0.9966 | 0.9966 | 0.905 | 0.088 | 0.183 | 0.930 | −76,661.158 | 5.493 |

Sample Sizes | # of Clusters | C = 1 | C = 2 | C = 3 | C = 4 | C = 5 | C = 6 | C = 7 | C = 8 |
---|---|---|---|---|---|---|---|---|---|
N = 250 | C = 2 | 56.68 | 43.32 | | | | | | |
N = 250 | C = 3 | 55.47 | 29.55 | 14.98 | | | | | |
N = 250 | C = 4 | 37.65 | 23.08 | 21.46 | 17.81 | | | | |
N = 250 | C = 5 | 33.20 | 24.29 | 18.62 | 14.17 | 9.72 | | | |
N = 250 | C = 6 | 24.70 | 19.84 | 19.03 | 16.60 | 14.17 | 5.67 | | |
N = 250 | C = 7 | 27.13 | 16.60 | 15.38 | 14.98 | 14.57 | 7.29 | 4.05 | |
N = 250 | C = 8 | 20.24 | 17.81 | 15.38 | 14.57 | 14.17 | 8.91 | 5.67 | 3.24 |
N = 500 | C = 2 | 56.57 | 43.43 | | | | | | |
N = 500 | C = 3 | 46.67 | 34.75 | 18.59 | | | | | |
N = 500 | C = 4 | 32.93 | 28.48 | 20.00 | 18.59 | | | | |
N = 500 | C = 5 | 31.11 | 23.03 | 18.59 | 15.96 | 11.31 | | | |
N = 500 | C = 6 | 22.22 | 20.81 | 18.99 | 16.36 | 13.94 | 7.68 | | |
N = 500 | C = 7 | 27.27 | 19.39 | 13.74 | 13.33 | 10.71 | 9.49 | 6.06 | |
N = 500 | C = 8 | 20.81 | 18.59 | 16.97 | 13.94 | 10.71 | 7.68 | 6.06 | 5.25 |
N = 1000 | C = 2 | 55.16 | 44.84 | | | | | | |
N = 1000 | C = 3 | 47.47 | 33.70 | 18.83 | | | | | |
N = 1000 | C = 4 | 34.82 | 27.23 | 19.13 | 18.83 | | | | |
N = 1000 | C = 5 | 29.45 | 25.10 | 18.83 | 14.88 | 11.74 | | | |
N = 1000 | C = 6 | 26.01 | 21.96 | 19.03 | 15.38 | 11.74 | 5.87 | | |
N = 1000 | C = 7 | 24.39 | 18.93 | 14.88 | 14.57 | 14.37 | 6.98 | 5.87 | |
N = 1000 | C = 8 | 21.05 | 18.42 | 17.71 | 14.07 | 10.93 | 7.09 | 5.87 | 4.86 |
N = 5114 | C = 2 | 55.09 | 44.91 | | | | | | |
N = 5114 | C = 3 | 47.73 | 32.02 | 20.25 | | | | | |
N = 5114 | C = 4 | 29.59 | 26.69 | 24.14 | 19.58 | | | | |
N = 5114 | C = 5 | 29.45 | 25.05 | 19.52 | 15.42 | 10.56 | | | |
N = 5114 | C = 6 | 23.02 | 22.38 | 19.64 | 18.38 | 10.56 | 6.02 | | |
N = 5114 | C = 7 | 28.54 | 19.58 | 15.77 | 13.98 | 9.91 | 6.85 | 5.37 | |
N = 5114 | C = 8 | 21.08 | 20.23 | 15.79 | 14.11 | 10.56 | 6.89 | 6.61 | 4.72 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ryoo, J.H.; Park, S.; Kim, S.; Ryoo, H.S.
Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis. *Symmetry* **2020**, *12*, 1514.
https://doi.org/10.3390/sym12091514

**Chicago/Turabian Style**

Ryoo, Ji Hoon, Seohee Park, Seongeun Kim, and Hyun Suk Ryoo.
2020. "Efficiency of Cluster Validity Indexes in Fuzzy Clusterwise Generalized Structured Component Analysis" *Symmetry* 12, no. 9: 1514.
https://doi.org/10.3390/sym12091514