Incremental Beta Distribution Weighted Fuzzy C-Ordered Means Clustering
Abstract
1. Introduction
2. Related Work
2.1. Beta Distribution Weighted Fuzzy C-Ordered-Means Clustering (BDFCOM)
2.2. Incremental Frameworks (Single-Pass and Online)
3. Incremental Beta Distribution Fuzzy C-Ordered Means Clustering
3.1. Single-Pass Beta Distribution Fuzzy C-Ordered Means Clustering (SPBDFCOM)
3.2. Online Beta Distribution Weighted Fuzzy C-Ordered-Means (OBDFCOM)
4. Analysis and Results
4.1. Experimental Datasets
4.2. Evaluation Metric
4.2.1. F1-Score
4.2.2. Rand Index (RI)/Adjusted Rand Index (ARI)
4.2.3. Fowlkes–Mallows Index (FMI)
4.2.4. Jaccard Index (JI)
4.2.5. Time Cost
4.3. Experimental Results and Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix A.1. F1-Score
Dataset | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.10 | 0.10 | 0.17 | 0.19 |
BT | 20 | 0.18 | 0.09 | 0.20 | 0.25 |
BT | 30 | 0.17 | 0.12 | 0.15 | 0.20 |
BT | 40 | 0.17 | 0.22 | 0.14 | 0.25 |
BT | 50 | 0.17 | 0.22 | 0.19 | 0.24 |
Zoo | 10 | 0.42 | 0.08 | 0.37 | 0.43 |
Zoo | 20 | 0.26 | 0.27 | 0.28 | 0.48 |
Zoo | 30 | 0.27 | 0.21 | 0.26 | 0.36 |
Zoo | 40 | 0.20 | 0.14 | 0.26 | 0.28 |
Zoo | 50 | 0.29 | 0.27 | 0.27 | 0.37 |
Mice | 10 | 0.20 | 0.54 | 0.54 | 0.56 |
Mice | 20 | 0.19 | 0.17 | 0.17 | 0.20 |
Mice | 30 | 0.52 | 0.17 | 0.17 | 0.56 |
Mice | 40 | 0.19 | 0.22 | 0.16 | 0.24 |
Mice | 50 | 0.17 | 0.20 | 0.57 | 0.23 |
HCV | 10 | 0.21 | 0.11 | 0.17 | 0.32 |
HCV | 20 | 0.24 | 0.19 | 0.11 | 0.24 |
HCV | 30 | 0.26 | 0.12 | 0.20 | 0.32 |
HCV | 40 | 0.18 | 0.10 | 0.12 | 0.22 |
HCV | 50 | 0.18 | 0.10 | 0.17 | 0.22 |
SBN | 10 | 0.16 | 0.25 | 0.29 | 0.36 |
SBN | 20 | 0.16 | 0.25 | 0.29 | 0.36 |
SBN | 30 | 0.16 | 0.25 | 0.28 | 0.36 |
SBN | 40 | 0.16 | 0.25 | 0.28 | 0.36 |
SBN | 50 | 0.16 | 0.25 | 0.29 | 0.36 |
CKD | 10 | 0.40 | 0.42 | 0.22 | 0.44 |
CKD | 20 | 0.22 | 0.24 | 0.22 | 0.44 |
CKD | 30 | 0.21 | 0.23 | 0.22 | 0.47 |
CKD | 40 | 0.39 | 0.23 | 0.40 | 0.46 |
CKD | 50 | 0.39 | 0.46 | 0.39 | 0.47 |
MD | 10 | 0.21 | 0.21 | 0.21 | 0.46 |
MD | 20 | 0.44 | 0.24 | 0.21 | 0.45 |
MD | 30 | 0.21 | 0.21 | 0.21 | 0.46 |
MD | 40 | 0.44 | 0.45 | 0.44 | 0.46 |
MD | 50 | 0.21 | 0.21 | 0.44 | 0.46 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.05 | 0.05 | 0.05 | 0.22 |
BT | 20 | 0.18 | 0.19 | 0.16 | 0.22 |
BT | 30 | 0.17 | 0.22 | 0.16 | 0.27 |
BT | 40 | 0.17 | 0.13 | 0.19 | 0.28 |
BT | 50 | 0.17 | 0.05 | 0.10 | 0.24 |
Zoo | 10 | 0.27 | 0.32 | 0.30 | 0.33 |
Zoo | 20 | 0.27 | 0.41 | 0.39 | 0.42 |
Zoo | 30 | 0.23 | 0.21 | 0.29 | 0.32 |
Zoo | 40 | 0.27 | 0.26 | 0.29 | 0.40 |
Zoo | 50 | 0.25 | 0.26 | 0.34 | 0.45 |
Mice | 10 | 0.18 | 0.21 | 0.17 | 0.24 |
Mice | 20 | 0.18 | 0.17 | 0.18 | 0.22 |
Mice | 30 | 0.18 | 0.17 | 0.18 | 0.24 |
Mice | 40 | 0.16 | 0.17 | 0.17 | 0.20 |
Mice | 50 | 0.17 | 0.20 | 0.18 | 0.23 |
HCV | 10 | 0.12 | 0.11 | 0.10 | 0.23 |
HCV | 20 | 0.16 | 0.11 | 0.11 | 0.27 |
HCV | 30 | 0.16 | 0.11 | 0.11 | 0.23 |
HCV | 40 | 0.17 | 0.08 | 0.11 | 0.23 |
HCV | 50 | 0.15 | 0.11 | 0.11 | 0.28 |
SBN | 10 | 0.16 | 0.25 | 0.31 | 0.36 |
SBN | 20 | 0.16 | 0.25 | 0.30 | 0.36 |
SBN | 30 | 0.16 | 0.36 | 0.28 | 0.36 |
SBN | 40 | 0.16 | 0.25 | 0.31 | 0.36 |
SBN | 50 | 0.16 | 0.25 | 0.30 | 0.36 |
CKD | 10 | 0.39 | 0.47 | 0.21 | 0.50 |
CKD | 20 | 0.39 | 0.22 | 0.21 | 0.44 |
CKD | 30 | 0.41 | 0.45 | 0.39 | 0.48 |
CKD | 40 | 0.22 | 0.41 | 0.21 | 0.46 |
CKD | 50 | 0.23 | 0.42 | 0.39 | 0.48 |
MD | 10 | 0.21 | 0.20 | 0.44 | 0.45 |
MD | 20 | 0.21 | 0.42 | 0.21 | 0.45 |
MD | 30 | 0.44 | 0.22 | 0.44 | 0.45 |
MD | 40 | 0.21 | 0.20 | 0.44 | 0.46 |
MD | 50 | 0.44 | 0.45 | 0.44 | 0.45 |
Appendix A.2. Rand Index (RI)/Adjusted Rand Index (ARI)
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.42 | 0.59 | 0.68 | 0.68 |
BT | 20 | 0.65 | 0.58 | 0.71 | 0.71 |
BT | 30 | 0.61 | 0.61 | 0.67 | 0.69 |
BT | 40 | 0.63 | 0.72 | 0.61 | 0.72 |
BT | 50 | 0.63 | 0.68 | 0.59 | 0.71 |
Zoo | 10 | 0.86 | 0.23 | 0.85 | 0.90 |
Zoo | 20 | 0.87 | 0.67 | 0.86 | 0.88 |
Zoo | 30 | 0.80 | 0.73 | 0.83 | 0.88 |
Zoo | 40 | 0.74 | 0.70 | 0.80 | 0.84 |
Zoo | 50 | 0.85 | 0.71 | 0.84 | 0.88 |
Mice | 10 | 0.50 | 0.50 | 0.50 | 0.51 |
Mice | 20 | 0.50 | 0.50 | 0.50 | 0.50 |
Mice | 30 | 0.50 | 0.50 | 0.50 | 0.51 |
Mice | 40 | 0.50 | 0.50 | 0.50 | 0.51 |
Mice | 50 | 0.50 | 0.50 | 0.51 | 0.51 |
HCV | 10 | 0.86 | 0.53 | 0.52 | 0.89 |
HCV | 20 | 0.54 | 0.56 | 0.55 | 0.73 |
HCV | 30 | 0.77 | 0.56 | 0.43 | 0.79 |
HCV | 40 | 0.50 | 0.35 | 0.49 | 0.79 |
HCV | 50 | 0.49 | 0.45 | 0.58 | 0.78 |
SBN | 10 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 20 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 30 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 40 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 50 | 0.50 | 0.50 | 0.50 | 0.50 |
CKD | 10 | 0.49 | 0.50 | 0.50 | 0.53 |
CKD | 20 | 0.50 | 0.51 | 0.50 | 0.53 |
CKD | 30 | 0.49 | 0.50 | 0.50 | 0.59 |
CKD | 40 | 0.49 | 0.50 | 0.50 | 0.55 |
CKD | 50 | 0.49 | 0.54 | 0.49 | 0.60 |
MD | 10 | 0.49 | 0.50 | 0.50 | 0.50 |
MD | 20 | 0.49 | 0.50 | 0.50 | 0.50 |
MD | 30 | 0.49 | 0.50 | 0.49 | 0.50 |
MD | 40 | 0.49 | 0.50 | 0.49 | 0.50 |
MD | 50 | 0.49 | 0.50 | 0.49 | 0.50 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.16 | 0.16 | 0.16 | 0.71 |
BT | 20 | 0.64 | 0.65 | 0.65 | 0.71 |
BT | 30 | 0.62 | 0.70 | 0.66 | 0.73 |
BT | 40 | 0.63 | 0.62 | 0.66 | 0.74 |
BT | 50 | 0.63 | 0.16 | 0.58 | 0.71 |
Zoo | 10 | 0.77 | 0.89 | 0.85 | 0.92 |
Zoo | 20 | 0.85 | 0.81 | 0.84 | 0.91 |
Zoo | 30 | 0.85 | 0.75 | 0.85 | 0.87 |
Zoo | 40 | 0.82 | 0.80 | 0.84 | 0.87 |
Zoo | 50 | 0.77 | 0.80 | 0.84 | 0.87 |
Mice | 10 | 0.49 | 0.50 | 0.49 | 0.51 |
Mice | 20 | 0.49 | 0.50 | 0.18 | 0.51 |
Mice | 30 | 0.50 | 0.51 | 0.50 | 0.51 |
Mice | 40 | 0.50 | 0.50 | 0.50 | 0.50 |
Mice | 50 | 0.49 | 0.50 | 0.18 | 0.51 |
HCV | 10 | 0.52 | 0.42 | 0.53 | 0.53 |
HCV | 20 | 0.50 | 0.52 | 0.53 | 0.53 |
HCV | 30 | 0.49 | 0.37 | 0.53 | 0.53 |
HCV | 40 | 0.48 | 0.42 | 0.55 | 0.55 |
HCV | 50 | 0.40 | 0.38 | 0.52 | 0.52 |
SBN | 10 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 20 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 30 | 0.50 | 0.36 | 0.50 | 0.50 |
SBN | 40 | 0.50 | 0.50 | 0.50 | 0.50 |
SBN | 50 | 0.50 | 0.50 | 0.50 | 0.50 |
CKD | 10 | 0.49 | 0.66 | 0.49 | 0.72 |
CKD | 20 | 0.49 | 0.50 | 0.49 | 0.52 |
CKD | 30 | 0.50 | 0.56 | 0.49 | 0.81 |
CKD | 40 | 0.50 | 0.50 | 0.49 | 0.55 |
CKD | 50 | 0.50 | 0.50 | 0.49 | 0.64 |
MD | 10 | 0.49 | 0.49 | 0.49 | 0.50 |
MD | 20 | 0.49 | 0.49 | 0.50 | 0.50 |
MD | 30 | 0.49 | 0.50 | 0.50 | 0.50 |
MD | 40 | 0.49 | 0.49 | 0.49 | 0.50 |
MD | 50 | 0.49 | 0.50 | 0.49 | 0.50 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.03 | 0.12 | 0.06 | 0.18 |
BT | 20 | 0.17 | 0.12 | 0.08 | 0.19 |
BT | 30 | 0.12 | 0.16 | 0.06 | 0.20 |
BT | 40 | 0.14 | 0.22 | 0.16 | 0.22 |
BT | 50 | 0.14 | 0.18 | 0.10 | 0.19 |
Zoo | 10 | 0.59 | 0.00 | 0.60 | 0.73 |
Zoo | 20 | 0.64 | 0.17 | 0.58 | 0.54 |
Zoo | 30 | 0.39 | 0.22 | 0.50 | 0.66 |
Zoo | 40 | 0.28 | 0.22 | 0.41 | 0.55 |
Zoo | 50 | 0.60 | 0.23 | 0.54 | 0.66 |
Mice | 10 | 0.00 | 0.00 | 0.00 | 0.01 |
Mice | 20 | 0.00 | 0.00 | 0.00 | 0.01 |
Mice | 30 | 0.00 | 0.01 | 0.00 | 0.01 |
Mice | 40 | 0.00 | 0.01 | 0.01 | 0.02 |
Mice | 50 | 0.00 | 0.00 | 0.02 | 0.02 |
HCV | 10 | 0.59 | 0.10 | 0.15 | 0.69 |
HCV | 20 | 0.16 | 0.22 | 0.15 | 0.37 |
HCV | 30 | 0.42 | 0.15 | 0.09 | 0.51 |
HCV | 40 | 0.13 | 0.01 | 0.11 | 0.49 |
HCV | 50 | 0.13 | 0.04 | 0.12 | 0.48 |
SBN | 10 | −0.01 | −0.01 | −0.01 | 0.00 |
SBN | 20 | −0.01 | −0.01 | −0.01 | 0.00 |
SBN | 30 | −0.01 | −0.01 | 0.00 | 0.00 |
SBN | 40 | −0.01 | −0.01 | −0.01 | 0.00 |
SBN | 50 | −0.01 | −0.01 | −0.01 | 0.00 |
CKD | 10 | 0.00 | 0.00 | 0.00 | 0.00 |
CKD | 20 | 0.00 | −0.01 | 0.00 | 0.00 |
CKD | 30 | −0.01 | −0.01 | 0.00 | 0.00 |
CKD | 40 | −0.01 | 0.00 | 0.00 | 0.00 |
CKD | 50 | −0.01 | 0.00 | −0.01 | 0.00 |
MD | 10 | 0.00 | 0.00 | 0.00 | 0.00 |
MD | 20 | 0.00 | 0.00 | 0.00 | 0.00 |
MD | 30 | 0.00 | 0.00 | −0.01 | 0.00 |
MD | 40 | 0.00 | 0.00 | 0.00 | 0.00 |
MD | 50 | 0.00 | 0.00 | 0.00 | 0.00 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.00 | 0.00 | 0.00 | 0.23 |
BT | 20 | 0.15 | 0.20 | 0.07 | 0.24 |
BT | 30 | 0.12 | 0.18 | 0.04 | 0.18 |
BT | 40 | 0.14 | 0.17 | 0.17 | 0.22 |
BT | 50 | 0.14 | 0.00 | 0.13 | 0.19 |
Zoo | 10 | 0.35 | 0.69 | 0.58 | 0.77 |
Zoo | 20 | 0.56 | 0.44 | 0.51 | 0.75 |
Zoo | 30 | 0.58 | 0.27 | 0.56 | 0.64 |
Zoo | 40 | 0.51 | 0.42 | 0.55 | 0.60 |
Zoo | 50 | 0.34 | 0.43 | 0.54 | 0.61 |
Mice | 10 | −0.01 | 0.00 | −0.01 | 0.03 |
Mice | 20 | −0.01 | 0.01 | 0.02 | 0.02 |
Mice | 30 | 0.00 | 0.02 | 0.00 | 0.03 |
Mice | 40 | 0.00 | 0.01 | 0.00 | 0.01 |
Mice | 50 | −0.01 | 0.01 | 0.01 | 0.02 |
HCV | 10 | 0.11 | 0.07 | 0.18 | 0.18 |
HCV | 20 | 0.13 | 0.13 | 0.15 | 0.19 |
HCV | 30 | 0.13 | 0.02 | 0.16 | 0.18 |
HCV | 40 | 0.10 | 0.02 | 0.16 | 0.20 |
HCV | 50 | 0.07 | 0.02 | 0.17 | 0.18 |
SBN | 10 | −0.01 | −0.01 | −0.01 | 0.00 |
SBN | 20 | −0.01 | 0.00 | −0.01 | 0.00 |
SBN | 30 | −0.01 | 0.00 | 0.00 | 0.00 |
SBN | 40 | −0.01 | −0.01 | −0.01 | 0.00 |
SBN | 50 | −0.01 | −0.01 | −0.01 | 0.00 |
CKD | 10 | −0.01 | −0.03 | −0.01 | 0.00 |
CKD | 20 | −0.01 | −0.01 | −0.01 | 0.00 |
CKD | 30 | 0.00 | −0.02 | −0.01 | −0.02 |
CKD | 40 | 0.00 | 0.00 | −0.01 | 0.00 |
CKD | 50 | −0.01 | 0.00 | −0.01 | 0.00 |
MD | 10 | 0.00 | −0.01 | 0.00 | 0.00 |
MD | 20 | 0.00 | −0.01 | 0.00 | 0.00 |
MD | 30 | 0.00 | 0.00 | 0.00 | 0.00 |
MD | 40 | 0.00 | −0.01 | 0.00 | 0.00 |
MD | 50 | 0.00 | 0.00 | 0.00 | 0.00 |
Appendix A.3. Fowlkes–Mallows Index (FMI)
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.35 | 0.37 | 0.25 | 0.39 |
BT | 20 | 0.39 | 0.38 | 0.24 | 0.39 |
BT | 30 | 0.36 | 0.40 | 0.26 | 0.40 |
BT | 40 | 0.37 | 0.42 | 0.41 | 0.42 |
BT | 50 | 0.37 | 0.39 | 0.35 | 0.40 |
Zoo | 10 | 0.68 | 0.48 | 0.70 | 0.79 |
Zoo | 20 | 0.72 | 0.39 | 0.67 | 0.75 |
Zoo | 30 | 0.52 | 0.40 | 0.61 | 0.74 |
Zoo | 40 | 0.45 | 0.42 | 0.54 | 0.66 |
Zoo | 50 | 0.70 | 0.43 | 0.64 | 0.73 |
Mice | 10 | 0.51 | 0.50 | 0.50 | 0.51 |
Mice | 20 | 0.50 | 0.50 | 0.50 | 0.51 |
Mice | 30 | 0.50 | 0.51 | 0.50 | 0.51 |
Mice | 40 | 0.50 | 0.62 | 0.51 | 0.62 |
Mice | 50 | 0.50 | 0.58 | 0.51 | 0.63 |
HCV | 10 | 0.91 | 0.66 | 0.64 | 0.93 |
HCV | 20 | 0.66 | 0.67 | 0.67 | 0.82 |
HCV | 30 | 0.85 | 0.67 | 0.54 | 0.86 |
HCV | 40 | 0.61 | 0.45 | 0.60 | 0.86 |
HCV | 50 | 0.61 | 0.57 | 0.70 | 0.86 |
SBN | 10 | 0.64 | 0.64 | 0.63 | 0.64 |
SBN | 20 | 0.64 | 0.64 | 0.63 | 0.64 |
SBN | 30 | 0.64 | 0.64 | 0.64 | 0.64 |
SBN | 40 | 0.64 | 0.64 | 0.64 | 0.64 |
SBN | 50 | 0.64 | 0.64 | 0.63 | 0.64 |
CKD | 10 | 0.65 | 0.65 | 0.65 | 0.68 |
CKD | 20 | 0.65 | 0.66 | 0.65 | 0.68 |
CKD | 30 | 0.65 | 0.65 | 0.65 | 0.73 |
CKD | 40 | 0.65 | 0.65 | 0.65 | 0.70 |
CKD | 50 | 0.65 | 0.69 | 0.65 | 0.74 |
MD | 10 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 20 | 0.60 | 0.61 | 0.60 | 0.61 |
MD | 30 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 40 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 50 | 0.60 | 0.60 | 0.60 | 0.61 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.40 | 0.40 | 0.40 | 0.42 |
BT | 20 | 0.37 | 0.42 | 0.28 | 0.43 |
BT | 30 | 0.36 | 0.41 | 0.24 | 0.42 |
BT | 40 | 0.37 | 0.41 | 0.39 | 0.42 |
BT | 50 | 0.37 | 0.40 | 0.38 | 0.42 |
Zoo | 10 | 0.50 | 0.77 | 0.67 | 0.82 |
Zoo | 20 | 0.65 | 0.56 | 0.62 | 0.81 |
Zoo | 30 | 0.68 | 0.43 | 0.65 | 0.72 |
Zoo | 40 | 0.62 | 0.55 | 0.65 | 0.69 |
Zoo | 50 | 0.48 | 0.55 | 0.63 | 0.69 |
Mice | 10 | 0.50 | 0.60 | 0.50 | 0.65 |
Mice | 20 | 0.50 | 0.52 | 0.18 | 0.58 |
Mice | 30 | 0.50 | 0.55 | 0.51 | 0.66 |
Mice | 40 | 0.50 | 0.52 | 0.52 | 0.52 |
Mice | 50 | 0.50 | 0.51 | 0.18 | 0.63 |
HCV | 10 | 0.64 | 0.53 | 0.61 | 0.65 |
HCV | 20 | 0.61 | 0.64 | 0.61 | 0.64 |
HCV | 30 | 0.61 | 0.47 | 0.64 | 0.65 |
HCV | 40 | 0.60 | 0.53 | 0.66 | 0.66 |
HCV | 50 | 0.50 | 0.49 | 0.61 | 0.64 |
SBN | 10 | 0.64 | 0.64 | 0.63 | 0.64 |
SBN | 20 | 0.64 | 0.64 | 0.63 | 0.64 |
SBN | 30 | 0.64 | 0.25 | 0.64 | 0.64 |
SBN | 40 | 0.64 | 0.64 | 0.63 | 0.64 |
SBN | 50 | 0.64 | 0.64 | 0.63 | 0.64 |
CKD | 10 | 0.65 | 0.78 | 0.65 | 0.83 |
CKD | 20 | 0.65 | 0.65 | 0.65 | 0.67 |
CKD | 30 | 0.65 | 0.71 | 0.65 | 0.90 |
CKD | 40 | 0.65 | 0.65 | 0.65 | 0.70 |
CKD | 50 | 0.65 | 0.65 | 0.65 | 0.77 |
MD | 10 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 20 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 30 | 0.60 | 0.61 | 0.60 | 0.61 |
MD | 40 | 0.60 | 0.60 | 0.60 | 0.61 |
MD | 50 | 0.60 | 0.60 | 0.60 | 0.61 |
Appendix A.4. Jaccard Index (JI)
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.08 | 0.09 | 0.13 | 0.16 |
BT | 20 | 0.15 | 0.08 | 0.16 | 0.19 |
BT | 30 | 0.13 | 0.12 | 0.12 | 0.16 |
BT | 40 | 0.14 | 0.17 | 0.16 | 0.20 |
BT | 50 | 0.14 | 0.18 | 0.16 | 0.19 |
Zoo | 10 | 0.53 | 0.16 | 0.51 | 0.62 |
Zoo | 20 | 0.51 | 0.25 | 0.48 | 0.60 |
Zoo | 30 | 0.39 | 0.19 | 0.42 | 0.54 |
Zoo | 40 | 0.31 | 0.17 | 0.37 | 0.44 |
Zoo | 50 | 0.48 | 0.28 | 0.45 | 0.52 |
Mice | 10 | 0.22 | 0.37 | 0.37 | 0.39 |
Mice | 20 | 0.21 | 0.18 | 0.18 | 0.22 |
Mice | 30 | 0.36 | 0.18 | 0.18 | 0.39 |
Mice | 40 | 0.21 | 0.27 | 0.17 | 0.24 |
Mice | 50 | 0.18 | 0.22 | 0.18 | 0.28 |
HCV | 10 | 0.82 | 0.36 | 0.52 | 0.86 |
HCV | 20 | 0.45 | 0.44 | 0.34 | 0.73 |
HCV | 30 | 0.77 | 0.34 | 0.36 | 0.79 |
HCV | 40 | 0.44 | 0.25 | 0.46 | 0.78 |
HCV | 50 | 0.43 | 0.30 | 0.61 | 0.77 |
SBN | 10 | 0.12 | 0.23 | 0.28 | 0.34 |
SBN | 20 | 0.12 | 0.23 | 0.28 | 0.34 |
SBN | 30 | 0.12 | 0.23 | 0.27 | 0.34 |
SBN | 40 | 0.12 | 0.23 | 0.26 | 0.34 |
SBN | 50 | 0.12 | 0.23 | 0.27 | 0.34 |
CKD | 10 | 0.45 | 0.49 | 0.48 | 0.56 |
CKD | 20 | 0.47 | 0.53 | 0.45 | 0.56 |
CKD | 30 | 0.44 | 0.48 | 0.47 | 0.66 |
CKD | 40 | 0.44 | 0.49 | 0.45 | 0.61 |
CKD | 50 | 0.45 | 0.59 | 0.44 | 0.67 |
MD | 10 | 0.38 | 0.39 | 0.39 | 0.44 |
MD | 20 | 0.41 | 0.39 | 0.39 | 0.44 |
MD | 30 | 0.38 | 0.40 | 0.38 | 0.44 |
MD | 40 | 0.41 | 0.42 | 0.41 | 0.44 |
MD | 50 | 0.38 | 0.40 | 0.41 | 0.44 |
Datasets | Proportion of Data Chunks (%) | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM |
---|---|---|---|---|---|
BT | 10 | 0.03 | 0.15 | 0.14 | 0.14 |
BT | 20 | 0.03 | 0.19 | 0.20 | 0.14 |
BT | 30 | 0.03 | 0.13 | 0.13 | 0.16 |
BT | 40 | 0.19 | 0.20 | 0.22 | 0.25 |
BT | 50 | 0.03 | 0.15 | 0.14 | 0.14 |
Zoo | 10 | 0.03 | 0.19 | 0.20 | 0.14 |
Zoo | 20 | 0.03 | 0.13 | 0.13 | 0.16 |
Zoo | 30 | 0.19 | 0.20 | 0.22 | 0.25 |
Zoo | 40 | 0.03 | 0.15 | 0.14 | 0.14 |
Zoo | 50 | 0.03 | 0.19 | 0.20 | 0.14 |
Mice | 10 | 0.03 | 0.13 | 0.13 | 0.16 |
Mice | 20 | 0.19 | 0.20 | 0.22 | 0.25 |
Mice | 30 | 0.03 | 0.15 | 0.14 | 0.14 |
Mice | 40 | 0.03 | 0.19 | 0.20 | 0.14 |
Mice | 50 | 0.03 | 0.13 | 0.13 | 0.16 |
HCV | 10 | 0.19 | 0.20 | 0.22 | 0.25 |
HCV | 20 | 0.03 | 0.15 | 0.14 | 0.14 |
HCV | 30 | 0.03 | 0.19 | 0.20 | 0.14 |
HCV | 40 | 0.03 | 0.13 | 0.13 | 0.16 |
HCV | 50 | 0.19 | 0.20 | 0.22 | 0.25 |
SBN | 10 | 0.03 | 0.15 | 0.14 | 0.14 |
SBN | 20 | 0.03 | 0.19 | 0.20 | 0.14 |
SBN | 30 | 0.03 | 0.13 | 0.13 | 0.16 |
SBN | 40 | 0.19 | 0.20 | 0.22 | 0.25 |
SBN | 50 | 0.03 | 0.15 | 0.14 | 0.14 |
CKD | 10 | 0.03 | 0.19 | 0.20 | 0.14 |
CKD | 20 | 0.03 | 0.13 | 0.13 | 0.16 |
CKD | 30 | 0.19 | 0.20 | 0.22 | 0.25 |
CKD | 40 | 0.03 | 0.15 | 0.14 | 0.14 |
CKD | 50 | 0.03 | 0.19 | 0.20 | 0.14 |
MD | 10 | 0.03 | 0.13 | 0.13 | 0.16 |
MD | 20 | 0.19 | 0.20 | 0.22 | 0.25 |
MD | 30 | 0.03 | 0.15 | 0.14 | 0.14 |
MD | 40 | 0.03 | 0.19 | 0.20 | 0.14 |
MD | 50 | 0.03 | 0.13 | 0.13 | 0.16 |
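For readers who want to reproduce indices of the kind tabulated in Appendices A.2 to A.4, the following is a minimal sketch of the standard pair-counting definitions of RI, ARI, FMI, and JI. It assumes crisp (hard) cluster labels and a ground-truth labelling; the exact definitions used for the tables are not given in this section, and the function names `pair_counts` and `pair_indices` are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pair_counts(truth, pred):
    """Count sample pairs by agreement between two hard labellings.

    Returns (tp, fp, fn, tn): pairs grouped together in both labellings,
    only in `pred`, only in `truth`, and in neither.
    """
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pair_indices(truth, pred):
    """Standard pair-counting RI, ARI, FMI, and JI (assumed definitions)."""
    tp, fp, fn, tn = pair_counts(truth, pred)
    total = tp + fp + fn + tn
    ari_den = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return {
        "RI": (tp + tn) / total,
        # Pair-count form of the adjusted Rand index; 1.0 in the degenerate
        # case where both labellings are trivially identical.
        "ARI": 2 * (tp * tn - fp * fn) / ari_den if ari_den else 1.0,
        "FMI": tp / sqrt((tp + fp) * (tp + fn)) if tp else 0.0,
        "JI": tp / (tp + fp + fn) if (tp + fp + fn) else 1.0,
    }


if __name__ == "__main__":
    # Toy labels, not taken from any dataset in the paper.
    print(pair_indices([0, 0, 1, 1], [0, 0, 1, 2]))
```

The pair loop is quadratic in the number of samples, which is acceptable for a sketch; a contingency-table formulation would scale better on datasets the size of SBN.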
Appendix A.5. Time Cost
Dataset | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM | OFCM | OFCOM | OFRFCM | OBDFCOM |
---|---|---|---|---|---|---|---|---|
BT | 0.15 | 6.58 | 1.28 | 1.02 | 0.14 | 8.94 | 1.32 | 1.20 |
Zoo | 0.34 | 2.68 | 5.76 | 1.84 | 0.45 | 2.19 | 5.74 | 2.16 |
Mice | 1.17 | 883.40 | 52.62 | 8.29 | 1.68 | 164.05 | 60.66 | 12.25 |
HCV | 0.88 | 26.19 | 12.37 | 5.96 | 1.05 | 38.57 | 15.60 | 6.81 |
SBN | 0.85 | 626.52 | 7.59 | 5.14 | 1.06 | 1422.28 | 9.11 | 7.77 |
CKD | 1.46 | 790.32 | 36.93 | 8.05 | 1.78 | 266.35 | 17.48 | 11.52 |
MD | 0.68 | 332.34 | 9.92 | 4.77 | 0.92 | 632.40 | 11.54 | 6.52 |
Appendix B
Comparison of Structural Characteristics
Algorithm | Incremental Framework | Ordered Mechanism | Feature Weighted | Algorithmic Features |
---|---|---|---|---|
SPFCM | Single-Pass | × | × | The original single-pass fuzzy clustering algorithm. It runs faster, but scores lower on the evaluation criteria. |
SPFCOM | Single-Pass | √ | √ | Single-pass fuzzy clustering with an ordered mechanism. It scores higher on the evaluation criteria, but its efficiency is low. |
SPFRFCM | Single-Pass | × | √ | Single-pass fuzzy clustering with feature reduction. It improves both the evaluation scores and the efficiency, but leaves room for further improvement. |
SPBDFCOM | Single-Pass | √ | √ | Single-pass fuzzy clustering with a beta-distribution-weighted ordered mechanism. It inherits the ordered mechanism, improves efficiency, and raises the score on each evaluation criterion. |
OFCM | Online | × | × | The original online fuzzy clustering algorithm. It runs faster, but scores lower on the evaluation criteria. |
OFCOM | Online | √ | √ | Online fuzzy clustering with an ordered mechanism. It scores higher on the evaluation criteria, but its efficiency is low. |
OFRFCM | Online | × | √ | Online fuzzy clustering with feature reduction. It improves both the evaluation scores and the efficiency, but leaves room for further improvement. |
OBDFCOM | Online | √ | √ | Online fuzzy clustering with a beta-distribution-weighted ordered mechanism. It inherits the ordered mechanism, improves efficiency, and raises the score on each evaluation criterion. |
Dataset | Sample Size | Attribute Count | Cluster Number | Source | Domain |
---|---|---|---|---|---|
Breast Tissue (BT) | 106 | 9 | 6 | UCI | Medical dataset |
Zoo | 101 | 16 | 7 | UCI | Animal dataset |
Mice | 1077 | 68 | 8 | UCI | Medical dataset |
HCV | 589 | 12 | 2 | UCI | Medical dataset |
Shill Bidding Data (SBD) | 6321 | 9 | 2 | Kaggle | E-commerce dataset |
Chronic Kidney Disease (CKD) | 1659 | 51 | 2 | Kaggle | Medical dataset |
Manufacturing Defect (MD) | 3240 | 16 | 2 | Kaggle | Industrial dataset |
Evaluation Criteria | SPFCM | SPFCOM | SPFRFCM | SPBDFCOM | OFCM | OFCOM | OFRFCM | OBDFCOM |
---|---|---|---|---|---|---|---|---|
F1-Score | 0.24 | 0.22 | 0.26 | 0.36 | 0.22 | 0.24 | 0.24 | 0.34 |
RI | 0.58 | 0.53 | 0.57 | 0.63 | 0.54 | 0.53 | 0.54 | 0.61 |
ARI | 0.13 | 0.06 | 0.11 | 0.19 | 0.09 | 0.09 | 0.11 | 0.16 |
FMI | 0.59 | 0.55 | 0.56 | 0.64 | 0.56 | 0.56 | 0.55 | 0.64 |
JI | 0.34 | 0.30 | 0.34 | 0.46 | 0.29 | 0.30 | 0.32 | 0.42 |
Evaluation Criteria | Average Improvement (%) |
---|---|
F1-Score | 44.01% |
RI | 13.35% |
ARI | 81.73% |
FMI | 13.16% |
JI | 37.34% |
Proportion of Data Chunks | F1-Score | RI | ARI | FMI | JI | Average Improvement |
---|---|---|---|---|---|---|
10 | 54.62% | 18.48% | 267.36% | 12.04% | 38.87% | 78.27% |
20 | 57.87% | 8.95% | 60.30% | 10.39% | 38.17% | 35.14% |
30 | 81.10% | 11.78% | 110.89% | 14.39% | 50.08% | 53.65% |
40 | 32.77% | 14.96% | 138.06% | 16.97% | 44.54% | 49.46% |
50 | 29.47% | 13.63% | 114.42% | 14.82% | 40.18% | 42.50% |
Proportion of Data Chunks | F1-Score | RI | ARI | FMI | JI | Average Improvement |
---|---|---|---|---|---|---|
10 | 53.68% | 24.90% | 105.86% | 12.76% | 42.99% | 48.04% |
20 | 46.80% | 8.02% | 56.81% | 13.42% | 32.56% | 31.52% |
30 | 32.12% | 13.92% | 57.58% | 21.67% | 46.53% | 34.36% |
40 | 58.01% | 7.26% | 43.54% | 6.22% | 29.21% | 28.85% |
50 | 45.15% | 19.22% | 78.34% | 17.24% | 41.72% | 40.33% |
Proportion of Data Chunks | F1-Score | RI | ARI | FMI | JI | Average Improvement |
---|---|---|---|---|---|---|
10 | 54.15% | 21.69% | 186.61% | 12.40% | 40.93% | 63.16% |
20 | 52.33% | 8.49% | 58.56% | 11.90% | 35.36% | 33.33% |
30 | 56.61% | 12.85% | 84.23% | 18.03% | 48.30% | 44.00% |
40 | 45.39% | 11.11% | 90.80% | 11.60% | 36.87% | 39.16% |
50 | 37.31% | 16.43% | 96.38% | 16.03% | 40.95% | 41.42% |
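The three improvement tables above appear to be linked by simple arithmetic: in each table the "Average Improvement" column matches the mean of the five metric columns, and the third table matches the element-wise mean of the first two, up to rounding in the second decimal place. The sketch below checks this against the printed values. It is an observation about the published figures, not a statement of how the authors computed them, and the names `first`, `second`, and `third` are simply labels for the tables in order of appearance.

```python
# Percentage improvements copied from the three tables above.
# Keys: proportion of data chunks (%).
# Columns: F1-Score, RI, ARI, FMI, JI, Average Improvement.
first = {
    10: [54.62, 18.48, 267.36, 12.04, 38.87, 78.27],
    20: [57.87, 8.95, 60.30, 10.39, 38.17, 35.14],
    30: [81.10, 11.78, 110.89, 14.39, 50.08, 53.65],
    40: [32.77, 14.96, 138.06, 16.97, 44.54, 49.46],
    50: [29.47, 13.63, 114.42, 14.82, 40.18, 42.50],
}
second = {
    10: [53.68, 24.90, 105.86, 12.76, 42.99, 48.04],
    20: [46.80, 8.02, 56.81, 13.42, 32.56, 31.52],
    30: [32.12, 13.92, 57.58, 21.67, 46.53, 34.36],
    40: [58.01, 7.26, 43.54, 6.22, 29.21, 28.85],
    50: [45.15, 19.22, 78.34, 17.24, 41.72, 40.33],
}
third = {
    10: [54.15, 21.69, 186.61, 12.40, 40.93, 63.16],
    20: [52.33, 8.49, 58.56, 11.90, 35.36, 33.33],
    30: [56.61, 12.85, 84.23, 18.03, 48.30, 44.00],
    40: [45.39, 11.11, 90.80, 11.60, 36.87, 39.16],
    50: [37.31, 16.43, 96.38, 16.03, 40.95, 41.42],
}


def row_average_gap(table):
    """Largest gap between the last column and the mean of the five metrics."""
    return max(abs(sum(row[:5]) / 5 - row[5]) for row in table.values())


def combined_gap(a, b, combined):
    """Largest gap between `combined` and the element-wise mean of `a` and `b`."""
    return max(
        abs((x + y) / 2 - z)
        for key in combined
        for x, y, z in zip(a[key], b[key], combined[key])
    )


if __name__ == "__main__":
    for name, table in (("first", first), ("second", second), ("third", third)):
        print(f"{name}: max row-average gap = {row_average_gap(table):.4f}")
    print(f"third vs mean(first, second): max gap = {combined_gap(first, second, third):.4f}")
```

Every gap stays within about one unit in the second decimal place, which is consistent with the published figures having been rounded from unrounded intermediate values.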
Dataset | SPFCOM | SPBDFCOM | OFCOM | OBDFCOM |
---|---|---|---|---|
BT | 6.58 | 1.02 | 8.94 | 1.20 |
Zoo | 2.68 | 1.84 | 2.19 | 2.16 |
Mice | 883.40 | 8.29 | 164.05 | 12.25 |
HCV | 26.19 | 5.96 | 38.57 | 6.81 |
SBN | 626.52 | 5.14 | 1422.28 | 7.77 |
CKD | 790.32 | 8.05 | 266.35 | 11.52 |
MD | 332.34 | 4.77 | 632.40 | 6.52 |
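One way to read the time-cost table above is as a speed-up ratio of each FCOM variant over its beta-distribution-weighted counterpart. The sketch below computes those ratios from the printed values. The table does not state a time unit, so the ratios are unitless; `times` and `speedups` are illustrative names, not part of the original work.

```python
# Time costs copied from the table above, per dataset:
# (SPFCOM, SPBDFCOM, OFCOM, OBDFCOM). The unit is not stated in the table.
times = {
    "BT": (6.58, 1.02, 8.94, 1.20),
    "Zoo": (2.68, 1.84, 2.19, 2.16),
    "Mice": (883.40, 8.29, 164.05, 12.25),
    "HCV": (26.19, 5.96, 38.57, 6.81),
    "SBN": (626.52, 5.14, 1422.28, 7.77),
    "CKD": (790.32, 8.05, 266.35, 11.52),
    "MD": (332.34, 4.77, 632.40, 6.52),
}


def speedups(table):
    """Return {dataset: (single-pass ratio, online ratio)}.

    Each ratio is the FCOM time divided by the BDFCOM time in the same
    framework, so a value above 1 means the BDFCOM variant was faster.
    """
    return {
        name: (sp_fcom / sp_bdfcom, o_fcom / o_bdfcom)
        for name, (sp_fcom, sp_bdfcom, o_fcom, o_bdfcom) in table.items()
    }


if __name__ == "__main__":
    for name, (single_pass, online) in speedups(times).items():
        print(f"{name}: single-pass {single_pass:.1f}x, online {online:.1f}x")
```

On these figures every ratio is above 1, with the largest gaps on the bigger datasets, although the near-parity on Zoo in the online setting shows the advantage is not uniform.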
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, H.; Mohamad Mohsin, M.F.; Mohd Pozi, M.S.; Zeng, Z. Incremental Beta Distribution Weighted Fuzzy C-Ordered Means Clustering. Information 2025, 16, 663. https://doi.org/10.3390/info16080663