Article

Ensemble Clustering Method via Robust Consensus Learning

1 Department of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213159, China
2 Department of AI & Computer Science, Jiangnan University, Wuxi 214122, China
3 Taihu Jiangsu Key Construction Laboratory of IoT Application Technologies, Wuxi 214122, China
4 Department of Computer Science and Engineering, Shaoxing University, Shaoxing 312000, China
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(23), 4764; https://doi.org/10.3390/electronics14234764
Submission received: 12 November 2025 / Revised: 30 November 2025 / Accepted: 2 December 2025 / Published: 3 December 2025
(This article belongs to the Special Issue Multimodal Learning for Multimedia Content Analysis and Understanding)

Abstract

Although ensemble clustering methods based on the co-association (CA) matrix have achieved considerable success, they still face the following challenges: (1) in the label space, the noise within the connective matrices and the structural differences between them are often neglected, and (2) the rich structural information inherent in the feature space is overlooked. To address these challenges, we propose an ensemble clustering method via robust consensus learning (ECM-RCL). Specifically, for each connective matrix, a symmetric error matrix is first introduced in the label space to characterize the noise. Then, a set of mapping models is designed, each of which processes a denoised connective matrix to recover a reliable consensus matrix. Moreover, multi-order graph structures are introduced into the feature space to further enhance the expressiveness of the consensus matrix. To preserve a clear cluster structure, a theoretical rank constraint with a block-diagonal enhancement property is imposed on the consensus matrix. Finally, spectral clustering is applied to the refined consensus matrix to obtain the final clustering result. Experimental results demonstrate that ECM-RCL achieves superior clustering performance compared to several state-of-the-art methods.

1. Introduction

Clustering analysis [1,2] is a fundamental task in unsupervised learning. To further improve the robustness and reliability of clustering results, ensemble learning has been incorporated into clustering research [3,4,5,6,7,8,9]. Among ensemble clustering methods, methods based on the co-association (CA) matrix have been widely adopted, owing to their simplicity and strong theoretical foundation [10]. Typically, these methods follow a standard two-stage framework. In the first stage, multiple connective matrices are integrated to construct the CA matrix. Then, the CA matrix is used as input to a traditional clustering method (e.g., K-means [11]) to obtain the final clustering result. Due to the critical role of the CA matrix in ensemble clustering, a series of ensemble clustering methods [3,7,10,12,13,14,15,16,17,18,19,20] have been proposed to enhance the CA matrix in the following aspects:
First, some studies [3,12,16,19,20] enhance the CA matrix by introducing the weighting mechanism. For example, Bagherinia et al. [12] selected base clustering results based on their diversity and quality to construct a fuzzy CA matrix. Huang et al. [3] applied entropy theory to assign weights to clusters, thereby constructing a locally weighted CA matrix. Gu et al. [16] jointly considered uncertainties in base clustering results and clusters, using ensemble-driven cluster-uncertainty estimation to construct a two-level weighted CA matrix. Zheng et al. [20] enhanced the CA matrix by weighting base clustering results according to their correlations. Yang et al. [19] proposed a novel CA matrix learning method by dynamically adjusting cluster weights. Zeng et al. [21] designed a cluster confidence measure to assess low-dimensional deep embeddings, ensuring the generation of high-quality and diverse base clustering results for learning the CA matrix. Huang et al. [22] generated base clustering results by leveraging multi-level features from a neural network. An entropy criterion was then introduced to weight clusters, thereby providing a reliable basis for constructing the CA matrix.
Second, some methods [7,10,13,17,18] aim to enhance the structural representation of the CA matrix. For example, Tao et al. [13] improved the expressiveness of the CA matrix by leveraging its low-rank representation. Jia et al. [7] proposed a self-enhancement mechanism to refine the CA matrix representation. Xu et al. [17] employed a self-expressive model combined with self-enhancement to capture both global and local CA matrix structures, thereby improving clustering performance. Xu et al. [18] integrated a self-enhancement mechanism with topological consistency propagation to improve the quality of the CA matrix. Li et al. [10] combined kernel learning with local information preservation to propose a novel CA matrix learning method. Hao et al. [23] employed feature representations from an encoder to strengthen the structure of the CA matrix.
Third, several studies [14,15] have integrated external knowledge or refinement strategies into CA matrix learning. For example, Zhou et al. [14] enhanced the CA matrix using an active learning strategy to automatically label key samples. Zhou et al. [15] integrated consensus learning with base clustering refinement, thereby enhancing the CA matrix.
Additionally, inspired by the success of hybrid deep learning architectures in capturing complex data patterns and robust feature representations [24], several recent methods [25,26,27] have been proposed to improve ensemble clustering performance. For example, Miklautz et al. [25] employed deep learning to generate a set of base clustering results and performed ensemble clustering in the feature space. This extends the utilization of the label space to some extent; however, it still falls short of fully leveraging the rich information embedded in the feature space, such as higher-order relationships between samples. Liang et al. [26] utilized deep learning to produce base clustering results and integrated multiple such outcomes, with the ensemble result subsequently being used to refine the base clustering results. This enables the transfer of final clustering information from the label space back to the feature space, thereby improving clustering performance. Nevertheless, it fails to adequately account for the noise in the connective matrices within the label space and the structural differences among clusters across different matrices. Huang et al. [27] adopted graph convolutional networks (GCNs) to explore high-order structural information embedded in the data and integrated raw data features into the CA matrix. Although this method attempts to combine information from both the feature space and the label space, it still overlooks the noise in the connective matrices and the variations in cluster structures associated with them.
In summary, although the aforementioned ensemble clustering methods have successfully improved the CA matrix and achieved notable clustering performance, they still face the following challenges.
(1) These methods typically directly fuse multiple connective matrices derived from different cluster structures. The resulting CA matrix may suffer from noise introduced by individual connective matrices and structural corruption caused by differences in the cluster structures of these matrices, which may adversely affect the overall performance of ensemble clustering.
(2) These methods primarily exploit information in the label space (i.e., base clustering results). As a result, the CA matrix they construct captures only label space information, neglecting the rich structural and relational information inherent in the feature space (i.e., the original dataset), which may significantly hinder ensemble clustering performance.
To address the aforementioned challenges, we make the following attempts: (1) Due to the symmetry of connective matrices [28], a symmetric error matrix is introduced for each connective matrix to characterize its potential noise distribution. Subsequently, we design a set of mapping models. By assigning a mapping model to each connective matrix, we recover a highly reliable consensus matrix. This process eliminates structural corruption caused by differences in cluster structures among the connective matrices. (2) Multi-order graph structures are designed to explore high-order structural information in the feature space, allowing richer relational patterns among samples to be effectively captured. Based on this, a dynamic combination strategy is further proposed to adaptively integrate information from graph structures of different orders, thereby refining and enhancing the representation capability of the consensus matrix. In addition, a theoretical rank constraint with a block-diagonal structure is imposed on the consensus matrix. Then, we explain our method in detail from the following aspects.
(1) Recent studies [29] have shown that accounting for differences among cluster structures in the connective matrices is essential for learning a reliable consensus matrix. Furthermore, as the connective matrices are generated under different clustering configurations, they can be viewed as observations of the consensus matrix from diverse structural perspectives. However, these observations may be contaminated by noise. Existing studies, such as [28], provide indirect evidence of noise and outliers within the connective matrices. Therefore, an effective strategy is to first denoise the connective matrices and then recover a reliable consensus matrix. Motivated by this, we introduce a symmetric error matrix for each connective matrix to capture its inherent noise. In doing so, the influence of unreliable sample relationships is automatically attenuated during the consensus learning process. This mechanism serves a similar purpose to cluster weighting. Then, we design a set of mapping models to eliminate structural corruption caused by differences among connective matrices, enabling more robust consensus matrix recovery.
(2) The CA matrix captures only the association relationships between pairwise samples within the label space [30]. However, when the label space contains noise or erroneous information, the information in the feature space can serve as a powerful supplement to correct such biases. Therefore, integrating the structural information of samples and the latent cluster structure in the feature space into ensemble clustering is a practical strategy to compensate for deficiencies in the label space and significantly enhance clustering performance. In addition, as noted in [31], the association relationships between pairwise samples are not limited to direct first-order connections but also encompass higher-order connections. Motivated by these observations, we propose multi-order graph structures to incorporate multi-scale connectivity information between pairwise samples from the feature space.
(3) Existing studies [32,33] have demonstrated that the block-diagonal structure plays a critical role in improving clustering performance. Following the idea in [32], we introduce a theoretical rank constraint to enhance the block-diagonal property of the consensus matrix while preserving a clear cluster structure. It is worth noting that the Laplacian rank constraint [18] can also be employed to reinforce the block-diagonal structure of a matrix. However, this Laplacian rank constraint is not a viable option. The reason is that the number of samples n is typically much larger than the number of ground-truth clusters c , resulting in a high rank for the Laplacian matrix. Therefore, it is generally unreasonable to seek such a high-rank matrix by minimizing the rank of the Laplacian matrix [33].
In summary, motivated by the aforementioned theories and methods, we propose a novel ensemble clustering method, termed ECM-RCL. Specifically, we first introduce a symmetric error matrix for each connective matrix to identify noise components and then design a set of mapping models to recover the consensus matrix robustly. We further introduce multi-order graph structures to refine the consensus matrix. Finally, a theoretical rank constraint is applied to enhance the block-diagonal structure, thereby more effectively preserving a clear cluster structure.
To describe the proposal clearly, we summarize the differences between the proposed ECM-RCL and some existing methods [3,7,10,12,13,14,15,16,17,18,19,20] from the following six perspectives (i.e., WBC, WC, CSD, CD, BDS, and OD) in Table 1. Specifically, WBC indicates whether weighting or selection of base clustering results is considered. WC indicates whether clusters in the base clustering results are weighted. CSD indicates whether differences in cluster structures across connective matrices are considered (note that weighting a cluster does not modify its cluster structure). CD indicates whether noise in the connective matrices is considered. BDS denotes whether the block-diagonal property of the CA matrix is considered. OD indicates whether the original dataset information is utilized. In Table 1, “√” indicates that the method considers the corresponding feature, while “Weak” denotes limited or indirect consideration of that feature.
The core contributions of this study can be summarized as follows:
(1)
A symmetric error matrix is introduced for each connective matrix to identify noise components effectively. Additionally, a reliable consensus matrix is recovered by designing a set of mapping models that address structural differences among the connective matrices and enable robust consensus learning.
(2)
Multi-order graph structures are designed to fully exploit the association relationships between samples in the feature space, thereby enhancing the structural quality of the consensus matrix.
(3)
A theoretical rank constraint is incorporated to reinforce the block-diagonal property of the consensus matrix, ensuring a clear cluster structure.
(4)
The experimental results on all adopted datasets demonstrate that ECM-RCL is effective compared to the state-of-the-art methods.
For clarity, the mathematical notations used in this study are summarized in Table 2.
The remainder of this paper is organized as follows: We present preliminaries on ECM-RCL in Section 2. We elaborate further on the ECM-RCL model in Section 3. We present the experimental results of ECM-RCL in Section 4. We conclude with a summary of the findings and future study directions for ECM-RCL in Section 5.

2. Preliminaries

2.1. Co-Association Matrix

The co-association (CA) matrix quantifies association relationships between pairs of samples. Under the theoretical framework in [8], the CA matrix can be systematically derived from a given set of connective matrices (i.e., $A_1, A_2, \ldots, A_m$):
$$S = \frac{1}{m} \sum_{i=1}^{m} A_i \quad (1)$$
$$(A_i)_{pq} = \beta\left( \mathrm{cls}_i(x_p), \mathrm{cls}_i(x_q) \right) \quad (2)$$
where $\mathrm{cls}_i(x_p)$ denotes the cluster membership of $x_p$ in the $i$-th base clustering result $\pi_i$, and $\beta(a, b)$ is the Kronecker delta function, which can be defined as follows:
$$\beta(a, b) = \begin{cases} 1, & a = b \\ 0, & a \neq b \end{cases} \quad (3)$$
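For concreteness, the following NumPy sketch shows how the connective matrices in Equations (2) and (3) and the CA matrix in Equation (1) can be assembled from a set of base clustering label vectors; the function and variable names are illustrative and not part of the paper.

```python
import numpy as np

def connective_matrix(labels):
    """Connective matrix A_i of one base clustering (Eqs. (2)-(3)):
    entry (p, q) is 1 if samples p and q share a cluster label, else 0."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def co_association_matrix(base_labels):
    """Average the m connective matrices into the CA matrix S (Eq. (1))."""
    mats = [connective_matrix(lab) for lab in base_labels]
    return np.mean(mats, axis=0)

# toy usage: three base clusterings of five samples
base_labels = [[0, 0, 1, 1, 2], [0, 0, 0, 1, 1], [1, 1, 2, 2, 2]]
S = co_association_matrix(base_labels)
print(S)
```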

2.2. Multi-Order Graph Structures

First-order proximity is a fundamental concept in graph analysis, capturing the direct relationships between samples as indicated by adjacency information.
$$G^{1}_{pq} = \begin{cases} \kappa(x_p, x_q), & x_q \in \mathcal{N}_K(x_p) \\ 0, & \text{otherwise} \end{cases} \quad (4)$$
where $\kappa(x_p, x_q)$ measures the similarity between $x_p$ and $x_q$, and $\mathcal{N}_K(x_p)$ denotes the set of $K$ nearest neighbors of $x_p$; the parameter $K$ denotes the number of nearest neighbors. Although the first-order graph structure captures direct similarities, it cannot reveal global structural patterns. To overcome this, higher-order structures leverage shared neighbors to uncover latent associations among samples [34]. The specific construction method is as follows:
$$G^{j} = \begin{cases} G^{1}, & j = 1 \\ G^{j-1} G^{1}, & j > 1 \end{cases} \quad (5)$$
Multi-order graph structures integrate both first-order and higher-order proximities, systematically capturing multi-level relationships among samples and thereby fully exploiting the structural information present in the feature space.
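A minimal sketch of this construction is given below, assuming a Gaussian kernel as the similarity function $\kappa$ (the paper does not fix a specific kernel) and the matrix-product recursion of Equation (5); any normalization details of the actual implementation may differ.

```python
import numpy as np

def first_order_graph(X, K, sigma=1.0):
    """First-order kNN graph (Eq. (4)); a Gaussian kernel stands in for kappa."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    G1 = np.zeros((n, n))
    for p in range(n):
        # indices of the K nearest neighbours of sample p (excluding itself)
        nbrs = np.argsort(d2[p])[1:K + 1]
        G1[p, nbrs] = np.exp(-d2[p, nbrs] / (2 * sigma ** 2))
    return G1

def multi_order_graphs(X, K, V):
    """Multi-order graphs G^1, ..., G^V via Eq. (5): G^j = G^{j-1} @ G^1."""
    G1 = first_order_graph(X, K)
    graphs = [G1]
    for _ in range(2, V + 1):
        graphs.append(graphs[-1] @ G1)
    return graphs
```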

3. Ensemble Clustering Method via Robust Consensus Learning

In this section, we provide a detailed description of ECM-RCL, as illustrated in Figure 1. First, for each connective matrix, a symmetric error matrix is constructed to characterize potential noise. Next, a set of mapping models is designed to reliably recover the consensus matrix. Second, multi-order graph structures are built in the feature space to capture the rich structural information embedded in the samples, further refining the consensus matrix. Third, a block-diagonal enhancement constraint is incorporated to preserve the clear structural properties of the consensus matrix. Finally, spectral clustering is applied to the refined consensus matrix to obtain the final clustering result.

3.1. The Objective Function of ECM-RCL

In this subsection, we formally introduce the objective function of ECM-RCL. As discussed in the preceding sections, ensemble clustering methods based on the CA matrix still face two critical but often overlooked challenges. First, during the consensus matrix learning process, these methods commonly ignore noise contamination within individual connective matrices and structural differences among connective matrices. It is worth noting that although cluster weighting can enhance the influence of reliable clusters, it cannot effectively modify the inherent structural differences among connective matrices. Second, most existing methods focus primarily on the label space, overlooking the rich structural and relational information embedded in the feature space.
In the label space, given a set of connective matrices (i.e., $A_1, A_2, \ldots, A_m$), the first step is to recover a reliable consensus matrix from them. To address the noise within each connective matrix, we leverage its symmetry [28]. Accordingly, we introduce a symmetric error matrix $E_i$ for each matrix to account for potential observation noise. The denoised connective matrices are thus obtained via error compensation (i.e., $A_1 - E_1, A_2 - E_2, \ldots, A_m - E_m$). Subsequently, we propose a set of mapping models (i.e., $P_1, P_2, \ldots, P_m$) to address differences in cluster structures across connective matrices. Each denoised matrix is assigned to a mapping model and projected into a unified space for consensus matrix recovery:
$$\min_{P_i, E_i, H, \alpha_i} \sum_{i=1}^{m} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 + \lambda_1 \sum_{i=1}^{m} \left\| E_i \right\|_F^2 + \lambda_3 \sum_{i=1}^{m} \left\| P_i \right\|_F^2 \quad \text{s.t. } E_i = E_i^T,\; H = H^T,\; \sum_{i=1}^{m} \alpha_i = 1,\; 0 \le \alpha_i \le 1 \quad (6)$$
where $\lambda_1$ and $\lambda_3$ are hyperparameters, and $\alpha_i$ denotes the weight assigned to the $i$-th connective matrix.
The consensus matrix obtained from the label space alone may not sufficiently represent feature space structures, limiting clustering performance. To enhance its expressiveness, we integrate it with structural information from the feature space. Our goal is to learn a unified consensus matrix that jointly captures cluster structures from both spaces. Accordingly, the objective function of ECM-RCL is defined as
$$\min_{P_i, H, E_i, \alpha_i, \beta_j} \sum_{i=1}^{m} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 + \lambda_1 \sum_{i=1}^{m} \left\| E_i \right\|_F^2 + \lambda_2 \sum_{j=1}^{V} \beta_j^2 \left\| H - G^j \right\|_F^2 + \lambda_3 \sum_{i=1}^{m} \left\| P_i \right\|_F^2$$
$$\text{s.t. } 0 \le H_{pq} \le 1,\; H = H^T,\; \mathrm{tr}(H) = c,\; E_i = E_i^T,\; \sum_{i=1}^{m} \alpha_i = 1,\; 0 \le \alpha_i \le 1,\; \sum_{j=1}^{V} \beta_j = 1,\; 0 \le \beta_j \le 1 \quad (7)$$
where $\lambda_2$ is a hyperparameter, and $\beta_j$ denotes the weight assigned to the $j$-th order graph structure. The theoretical rank constraint $\mathrm{tr}(H) = c$ approximates the ideal graph structure by preserving the block-diagonal property of the matrix, which helps preserve a clear cluster structure.
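For reference, a small helper that evaluates the objective value of Equation (7) (used, for example, in the convergence check of Algorithm 1 below) could look as follows; all names are illustrative.

```python
import numpy as np

def ecm_rcl_objective(A, E, P, H, G, alpha, beta, lam1, lam2, lam3):
    """Value of the ECM-RCL objective in Eq. (7).
    A, E, P are lists of m (n x n) matrices; G is a list of V graph matrices."""
    fro2 = lambda M: np.linalg.norm(M, "fro") ** 2
    term_fit = sum(alpha[i] ** 2 * fro2(A[i] - E[i] - P[i] @ H) for i in range(len(A)))
    term_E = lam1 * sum(fro2(Ei) for Ei in E)
    term_G = lam2 * sum(beta[j] ** 2 * fro2(H - G[j]) for j in range(len(G)))
    term_P = lam3 * sum(fro2(Pi) for Pi in P)
    return term_fit + term_E + term_G + term_P
```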
After obtaining the consensus matrix, spectral clustering is applied to generate the final ensemble clustering result. The spectral clustering method has two main advantages. First, the consensus matrix H can directly serve as input for spectral clustering. Second, H exhibits a block-diagonal structure that aligns well with the structural requirements of an ideal similarity matrix in spectral clustering [33].

3.2. Optimization and Analysis

3.2.1. Optimization

In this subsection, we optimize the objective function in Equation (7).
Step 1:  P i subproblem
When optimizing $P_i$, we fix the other variables, and the objective function in Equation (7) reduces to
$$\min_{P_i} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 + \lambda_3 \left\| P_i \right\|_F^2 \quad (8)$$
Given that Equation (8) is an unconstrained convex optimization problem, its global minimum can be efficiently found by setting the gradient of the objective function with respect to $P_i$ to zero:
$$P_i = \alpha_i^2 \left( A_i - E_i \right) H^T \left( \alpha_i^2 H H^T + \lambda_3 I \right)^{-1} \quad (9)$$
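A hedged NumPy sketch of this closed-form update is shown below; it solves the linear system rather than forming the explicit inverse in Equation (9).

```python
import numpy as np

def update_P(A_i, E_i, H, alpha_i, lam3):
    """Closed-form update of P_i in Eq. (9), computed via a linear solve."""
    n = H.shape[0]
    rhs = alpha_i ** 2 * (A_i - E_i) @ H.T
    lhs = alpha_i ** 2 * H @ H.T + lam3 * np.eye(n)
    # P_i = rhs @ inv(lhs)  <=>  lhs^T P_i^T = rhs^T
    return np.linalg.solve(lhs.T, rhs.T).T
```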
Step 2:  H subproblem
When optimizing $H$, we fix the other variables, and the objective function in Equation (7) reduces to
$$\min_{H} \sum_{i=1}^{m} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 + \lambda_2 \sum_{j=1}^{V} \beta_j^2 \left\| H - G^j \right\|_F^2 \quad \text{s.t. } 0 \le H_{pq} \le 1,\; H = H^T,\; \mathrm{tr}(H) = c \quad (10)$$
Since direct optimization is challenging, we adopt a two-step approximation strategy based on [35] to optimize Equation (10). Specifically, we first solve for an unconstrained surrogate $\tilde{H}$, disregarding the constraints on $H$, and then project $\tilde{H}$ onto the constraint set to obtain an approximation of $H$. We thus transform the problem in Equation (10) into the problems in Equations (11) and (12):
$$\min_{\tilde{H}} \sum_{i=1}^{m} \alpha_i^2 \left\| A_i - E_i - P_i \tilde{H} \right\|_F^2 + \lambda_2 \sum_{j=1}^{V} \beta_j^2 \left\| \tilde{H} - G^j \right\|_F^2 \quad (11)$$
$$\min_{H} \left\| H - \tilde{H} \right\|_F^2 \quad \text{s.t. } 0 \le H_{pq} \le 1,\; H = H^T,\; \mathrm{tr}(H) = c \quad (12)$$
Given that Equation (11) is an unconstrained convex optimization problem, its global minimum can be efficiently found by setting the gradient of the objective function with respect to $\tilde{H}$ to zero:
$$\tilde{H} = \left( \sum_{i=1}^{m} \alpha_i^2 P_i^T P_i + \lambda_2 \sum_{j=1}^{V} \beta_j^2 I \right)^{-1} \left( \sum_{i=1}^{m} \alpha_i^2 P_i^T \left( A_i - E_i \right) + \lambda_2 \sum_{j=1}^{V} \beta_j^2 G^j \right) \quad (13)$$
Once $\tilde{H}$ is obtained, $H$ can be solved from Equation (12) following the procedure in [32].
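The sketch below computes $\tilde{H}$ via Equation (13) and then applies a simplified projection (symmetrization, clipping to $[0, 1]$, and a diagonal rescaling toward $\mathrm{tr}(H) = c$); the exact projection used in the paper follows [32], so this last step is only a stand-in.

```python
import numpy as np

def update_H(A, E, P, G, alpha, beta, lam2, c):
    """Two-step H update: closed-form H_tilde (Eq. (13)) followed by a simplified
    projection onto the constraints of Eq. (12); the paper's projection is per [32]."""
    n = A[0].shape[0]
    m, V = len(A), len(G)
    lhs = sum(alpha[i] ** 2 * P[i].T @ P[i] for i in range(m)) \
        + lam2 * sum(beta[j] ** 2 for j in range(V)) * np.eye(n)
    rhs = sum(alpha[i] ** 2 * P[i].T @ (A[i] - E[i]) for i in range(m)) \
        + lam2 * sum(beta[j] ** 2 * G[j] for j in range(V))
    H_tilde = np.linalg.solve(lhs, rhs)
    H = np.clip((H_tilde + H_tilde.T) / 2, 0.0, 1.0)   # symmetry and 0 <= H_pq <= 1
    diag = np.clip(np.diag(H), 1e-12, None)
    np.fill_diagonal(H, np.clip(diag * c / diag.sum(), 0.0, 1.0))  # push tr(H) toward c
    return H
```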
Step 3:  E i subproblem
When optimizing $E_i$, we fix the other variables, and the objective function in Equation (7) reduces to
$$\min_{E_i} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 + \lambda_1 \left\| E_i \right\|_F^2 \quad \text{s.t. } E_i = E_i^T \quad (14)$$
According to [7], the objective function in Equation (14) can be recast as Equation (15):
$$\min_{E_i} \left\| E_i - \frac{\alpha_i^2 \left( A_i - P_i H \right)}{\alpha_i^2 + \lambda_1} \right\|_F^2 \quad \text{s.t. } E_i = E_i^T \quad (15)$$
Although Equation (15) is a constrained optimization problem, it can be effectively solved in the manner of [7]:
$$E_i = \frac{\alpha_i^2 \left[ \left( A_i - P_i H \right) + \left( A_i - P_i H \right)^T \right]}{2 \left( \alpha_i^2 + \lambda_1 \right)} \quad (16)$$
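A minimal sketch of the symmetric closed-form update in Equation (16):

```python
import numpy as np

def update_E(A_i, P_i, H, alpha_i, lam1):
    """Closed-form symmetric update of E_i in Eq. (16)."""
    R = A_i - P_i @ H
    return alpha_i ** 2 * (R + R.T) / (2 * (alpha_i ** 2 + lam1))
```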
Step 4:  α i subproblem
When optimizing $\alpha_i$, we fix the other variables, and the objective function in Equation (7) reduces to
$$\min_{\alpha_i} \sum_{i=1}^{m} \alpha_i^2 \left\| A_i - E_i - P_i H \right\|_F^2 \quad \text{s.t. } \sum_{i=1}^{m} \alpha_i = 1,\; 0 \le \alpha_i \le 1 \quad (17)$$
We derive the solution for $\alpha_i$ based on the Cauchy–Schwarz inequality:
$$\alpha_i = \frac{\left\| A_i - E_i - P_i H \right\|_F^{-2}}{\sum_{i'=1}^{m} \left\| A_{i'} - E_{i'} - P_{i'} H \right\|_F^{-2}} \quad (18)$$
For details on the application of the Cauchy–Schwarz inequality to weight allocation, refer to Appendix A.
Step 5:  β j subproblem
When optimizing $\beta_j$, we fix the other variables, and the objective function in Equation (7) reduces to
$$\min_{\beta_j} \sum_{j=1}^{V} \beta_j^2 \left\| H - G^j \right\|_F^2 \quad \text{s.t. } \sum_{j=1}^{V} \beta_j = 1,\; 0 \le \beta_j \le 1 \quad (19)$$
We derive the solution for $\beta_j$ based on the Cauchy–Schwarz inequality:
$$\beta_j = \frac{\left\| H - G^j \right\|_F^{-2}}{\sum_{j'=1}^{V} \left\| H - G^{j'} \right\|_F^{-2}} \quad (20)$$
For details on the application of the Cauchy–Schwarz inequality to weight allocation, refer to Appendix A.
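Both weight updates share the same inverse-residual form derived in Appendix A, so a single helper suffices in practice. The sketch below is illustrative; the small epsilon guarding against division by zero is ours and not part of Equations (18) and (20).

```python
import numpy as np

def inverse_residual_weights(residuals, eps=1e-12):
    """Cauchy-Schwarz weights (Eqs. (18) and (20)): w_i proportional to 1 / d_i,
    where d_i is the squared Frobenius residual of the i-th term."""
    inv = 1.0 / (np.asarray(residuals, dtype=float) + eps)
    return inv / inv.sum()

# alpha uses d_i = ||A_i - E_i - P_i H||_F^2, beta uses d_j = ||H - G^j||_F^2, e.g.:
# alpha = inverse_residual_weights(
#     [np.linalg.norm(A[i] - E[i] - P[i] @ H, "fro") ** 2 for i in range(m)])
```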
The optimization process and the overall workflow of ECM-RCL are summarized in Algorithms 1 and 2, respectively.
Algorithm 1: Optimization of Objective Function
Input: Connective matrices $A_1, A_2, \ldots, A_m$; multi-order graph structures $G^1, G^2, \ldots, G^V$; hyperparameters $\lambda_1$, $\lambda_2$, and $\lambda_3$.
  • Initialize: $iter = 0$, $\alpha_i = \frac{1}{m}$, $\beta_j = \frac{1}{V}$, $E_i = \mathbf{0}$, $H = \mathbf{0}$, $P_i = I$.
  •     Obtain $P_i$ by Equation (9).
  •     Obtain $H$ by Equation (10).
  • while not converged do
  •     $iter = iter + 1$
  •     Update $E_i$ by Equation (16).
  •     Update $P_i$ by Equation (9).
  •     Update $H$ by Equation (10).
  •     Update $\alpha_i$ by Equation (18).
  •     Update $\beta_j$ by Equation (20).
  •     Check the convergence condition:
  •         $\left| obj_{iter} - obj_{iter-1} \right| < 1 \times 10^{-4} \cdot obj_{iter}$
  • end while
Output: Consensus matrix $H$
Algorithm 2: ECM-RCL
Input: Dataset $X \in \mathbb{R}^{n \times d}$, the number of base clustering results $m$, the number of ground-truth clusters $c$, and the numbers of clusters of the base clustering results $c_1, c_2, \ldots, c_m$.
  • Step 1: Obtain $m$ connective matrices $A_1, A_2, \ldots, A_m$.
  • Step 2: Obtain multi-order graph structures $G^1, G^2, \ldots, G^V$.
  • Step 3: Call Algorithm 1 to obtain the consensus matrix $H$.
  • Step 4: Obtain the final clustering result by spectral clustering.
Output: Final ensemble clustering result.
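Putting the pieces together, a hedged end-to-end sketch of Algorithm 2 might look as follows. It reuses the illustrative helpers sketched above (connective_matrix, multi_order_graphs, update_E, update_P, update_H, inverse_residual_weights), simplifies the range of base cluster numbers, runs a fixed number of iterations instead of the relative-objective stopping rule, and relies on scikit-learn for K-means and spectral clustering; default hyperparameter values are placeholders, since the paper tunes them by grid search.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering

def ecm_rcl(X, c, m=10, V=2, K=10, lam1=1.0, lam2=1.0, lam3=1.0,
            n_iter=50, rng=np.random.default_rng(0)):
    """Illustrative end-to-end sketch of Algorithm 2 (not the authors' code)."""
    n = X.shape[0]
    # Step 1: base clusterings with randomly chosen cluster numbers
    # (simplified range; the paper draws from a wider interval).
    A = []
    for _ in range(m):
        k = int(rng.integers(c, max(c + 1, int(np.sqrt(n)) + 1)))
        labels = KMeans(n_clusters=k, n_init=5,
                        random_state=int(rng.integers(10**6))).fit_predict(X)
        A.append(connective_matrix(labels))
    # Step 2: multi-order graphs from the feature space.
    G = multi_order_graphs(X, K, V)
    # Step 3: alternating optimization (Algorithm 1).
    E = [np.zeros((n, n)) for _ in range(m)]
    P = [np.eye(n) for _ in range(m)]
    H = np.zeros((n, n))
    alpha, beta = np.full(m, 1.0 / m), np.full(V, 1.0 / V)
    P = [update_P(A[i], E[i], H, alpha[i], lam3) for i in range(m)]
    H = update_H(A, E, P, G, alpha, beta, lam2, c)
    for _ in range(n_iter):
        E = [update_E(A[i], P[i], H, alpha[i], lam1) for i in range(m)]
        P = [update_P(A[i], E[i], H, alpha[i], lam3) for i in range(m)]
        H = update_H(A, E, P, G, alpha, beta, lam2, c)
        alpha = inverse_residual_weights(
            [np.linalg.norm(A[i] - E[i] - P[i] @ H, "fro") ** 2 for i in range(m)])
        beta = inverse_residual_weights(
            [np.linalg.norm(H - G[j], "fro") ** 2 for j in range(V)])
    # Step 4: spectral clustering on the consensus matrix H.
    return SpectralClustering(n_clusters=c, affinity="precomputed").fit_predict(np.abs(H))
```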

3.2.2. Complexity Analysis

This subsection analyzes the time complexity of ECM-RCL. According to Algorithm 2, it comprises four steps:
Step 1 aims to generate connective matrices. In this study, the K-means method is employed to produce the base clustering results, from which the corresponding connective matrices are constructed. Assuming that all K-means runs have the same number of iterations, the time complexity of this step is dominated by that of K-means, i.e., $\max_i O(n c_i d t)$, where $t$ denotes the number of iterations.
Step 2 involves constructing multi-order graph structures. The time complexity of building the first-order graph structure is $O(n^3)$, while the overall time complexity of generating the multi-order graph structures is $O(V n^3)$.
Step 3 corresponds to the optimization process in Algorithm 1. The updates of $\{P_i\}_{i=1}^{m}$, $H$, and $\{E_i\}_{i=1}^{m}$ require $O(m n^3)$, updating $\{\alpha_i\}_{i=1}^{m}$ requires $O(m n^2)$, and updating $\{\beta_j\}_{j=1}^{V}$ requires $O(V n^2)$. This reveals that the dominant time complexity of the algorithm is $O(n^3)$, a cost arising primarily from the iterative updates of $\{P_i\}_{i=1}^{m}$ and $H$. As the sample size $n$ increases, the computational burden grows significantly. Compared with the improvement in robustness and accuracy of ECM-RCL on large-scale datasets, this computational burden is worthwhile. Additionally, some methods (e.g., the Nyström method [36] or anchor-based methods [37]) can be adopted to further reduce the computational burden.
Step 4 corresponds to obtaining the final clustering result using spectral clustering. The time complexity of spectral clustering is O n 3 + c n 2 .
In summary, the total time cost of ECM-RCL is $\max_i O(n c_i d t) + O((V + m) n^3)$.
We observe that as the sample size increases, the time complexity of the method shows a significant upward trend. Meanwhile, the clustering performance of the proposed model also improves. Although the computational cost increases, the rise in time complexity remains acceptable given the performance gains achieved. In subsequent research, optimization strategies such as the introduction of the Nyström method or the anchor-based method can be explored to effectively reduce time complexity.

4. Experimental Section

In this section, extensive experiments are conducted to evaluate the effectiveness of the proposed ECM-RCL. To ensure fairness, all experiments are performed on a machine equipped with a 14th-generation Intel Core i7-14700KF processor and 64 GB of RAM.

4.1. Experimental Organization

4.1.1. Datasets

To systematically evaluate the clustering performance of the proposed ECM-RCL, 19 real-world benchmark datasets were selected. The details of each dataset are summarized in Table 3.

4.1.2. Comparative Methods

In this subsection, we introduce the seven comparative methods: K-means clustering method (K-means) [11], Locally Weighted Ensemble Clustering (LWEA) [3], Ensemble Clustering Based on Dense Representation (DREC) [38], Enhanced Ensemble Clustering via Fast Propagation of Cluster-Wise Similarities (ECPCS-HC) [4], Ensemble Clustering via Co-Association Matrix Self-Enhancement (EC-CMS) [7], Ensemble Clustering with Attentional Representation (ECAR) [23], and Clustering Ensemble via Diffusion on Adaptive Multiplex (CEAM) [39].

4.1.3. Experimental Setup and Evaluation Indices

In this study, following the common practice in the literature [40], we adopted K-means as the base clustering method and set the number of base clustering results to 10 (i.e., $m = 10$). For each base clustering, the number of clusters was randomly selected from the range $[c, \max(100, \sqrt{n})]$. For the comparative methods, all parameter settings followed the configurations described in [3,4,7,23,38,39]. All experimental results were averaged over 10 independent runs.
In ECM-RCL, the number of neighbors $K$ for constructing the $k$-nearest neighbor graph is set to $\min\{10, n/(2c)\}$ when the number of samples is less than 1000 ($n < 1000$), and to $\min\{20, n/(2c)\}$ otherwise. The maximum order of the multi-order graph structures is set to 2 (i.e., $V = 2$). The parameters of ECM-RCL are determined via grid search, where the candidate values for $\lambda_1$, $\lambda_2$, and $\lambda_3$ are $\{0.001, 0.01, 0.1, 1, 10, 100, 1000\}$.
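The following snippet encodes our reading of the neighbor-size rule and the hyperparameter grid described above; the exact rounding of $n/(2c)$ in the original implementation may differ.

```python
import itertools

def neighbor_size(n, c):
    """Neighbour count K as described above (our reading of the rule)."""
    cap = 10 if n < 1000 else 20
    return int(min(cap, n / (2 * c)))

# candidate grid for lambda_1, lambda_2, lambda_3
grid = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
param_combinations = list(itertools.product(grid, repeat=3))
print(neighbor_size(150, 3), len(param_combinations))  # e.g. K = 10, 343 configurations
```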
In addition, three commonly used evaluation metrics are employed to assess the clustering performance of the proposed method: Accuracy (ACC) [41], Normalized Mutual Information (NMI) [42], and Adjusted Rand Index (ARI) [43]. The values of ACC and NMI lie in $[0, 1]$, while ARI lies in $[-1, 1]$. Higher values indicate better clustering performance.
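For reproducibility, the three metrics can be computed as sketched below; ACC uses the usual optimal label matching via the Hungarian algorithm, which may differ in minor details from the exact protocol of [41], while NMI and ARI are taken from scikit-learn.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_accuracy(y_true, y_pred):
    """ACC via optimal matching between predicted and true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    D = max(y_pred.max(), y_true.max()) + 1
    cost = np.zeros((D, D), dtype=int)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1
    row, col = linear_sum_assignment(-cost)   # maximize matched counts
    return cost[row, col].sum() / y_true.size

# NMI and ARI from scikit-learn:
# nmi = normalized_mutual_info_score(y_true, y_pred)
# ari = adjusted_rand_score(y_true, y_pred)
```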

4.2. Discussions on Datasets

4.2.1. Clustering Performance

In this section, we compare the clustering performance of the proposed ECM-RCL with seven clustering methods on all the adopted datasets. The average ACC, NMI, and ARI scores for all adopted methods across all datasets are presented in Table 4, Table 5 and Table 6, respectively.
As shown in Table 4, Table 5 and Table 6, the performance of ECM-RCL surpasses that of the other compared methods on most datasets, particularly on large-scale ones. This is because ECM-RCL accounts for structural differences among connective matrices in the label space and captures rich multi-scale structural information of samples in the feature space. K-means achieves the lowest clustering performance due to the lack of an ensemble learning strategy. In contrast to K-means, ECAR introduces a novel autoencoder with an attention mechanism that adaptively learns both sample representations and weights for base clustering results, thereby enhancing ensemble clustering performance. CEAM iteratively performs representation learning and structural optimization on a multiplex graph. This process enables mutual enhancement between the base clustering results and the ensemble clustering result, ultimately achieving better performance. LWEA improves the quality of the CA matrix by introducing a cluster weighting strategy, while ECPCS-HC further refines the CA matrix by considering indirect association relationships among clusters. Compared to other methods, EC-CMS and DREC achieve better performance on the datasets, primarily due to their more comprehensive optimization strategies and effective incorporation of high-confidence information.
Furthermore, according to the results in Table 4, Table 5 and Table 6, the proposed ECM-RCL yields relatively poor performance on the Lymphoma, WDBC, and Zoo datasets, performing comparably to, or even worse than, some ensemble clustering methods. The potential reasons for this are analyzed as follows: (1) These datasets have complex structures that contain redundant information, which weakens the accuracy of the symmetric error matrix of ECM-RCL. (2) These datasets may have a high degree of overlap between clusters, making it difficult to effectively align the multi-order graph structure information extracted from the feature space with the information in the label space. (3) The multi-scale structure of samples in the feature space introduces redundant information, which compromises the clarity of the consensus matrix. (4) Although the cluster structure and noise in the base clustering results within the label space are considered, other information in the base clustering results (e.g., redundant information) has not been fully utilized. Future studies could explore adaptive model ensembles that integrate the dataset’s original features to address these limitations.

4.2.2. Statistical Analysis

In this subsection, we employ the Friedman test [42], the Wilcoxon signed-rank test [44], and the post hoc Holm test [45] to evaluate the differences in clustering performance (i.e., ACC) between ECM-RCL and all the adopted comparative methods on all the adopted datasets.
Table 7 presents the results of the Friedman test, where the null hypothesis assumes no significant differences among all the adopted methods. As shown in Table 7, the $p$-value is less than 0.05, leading to the rejection of the null hypothesis, which indicates significant differences in the clustering performance of the adopted methods.
Table 8 presents the Wilcoxon signed-rank test results for pairwise comparisons between ECM-RCL and each comparative method, assuming no significant difference for each method pair. As shown in Table 8, all the $p$-values are less than 0.05, and we therefore reject the null hypothesis, indicating significant differences in the clustering performance between ECM-RCL and each adopted comparative method.
Table 9 presents the post hoc Holm test results for pairwise comparisons between ECM-RCL and each comparative method, assuming no significant difference for each method pair. According to [42], if the $p$-value is less than $\alpha / i$, we can conclude that the two methods differ significantly, where $i$ represents the rank of the $i$-th comparative method and $\alpha = 0.05$. As shown in Table 9, all the $p$-values are less than $\alpha / i$, and we therefore reject the null hypothesis, indicating significant differences in the clustering performance between ECM-RCL and each adopted comparative method.
In summary, ECM-RCL exhibits significant differences compared with the other comparative methods. Moreover, based on the results in Table 4, Table 5 and Table 6, ECM-RCL demonstrates superior clustering performance over the adopted comparative methods.

4.2.3. Ablation Study

In this subsection, due to space limitations, we compare the proposed ECM-RCL with several of its degraded variants on 9 adopted datasets (i.e., GLIMA, Heart, Iris, LM, Orlraws10p, WarpPIE10p, WDBC, Wine, and Zoo) to validate the effectiveness and importance of each key component. These datasets cover diverse scenarios, ranging from low-dimensional to ultra-high-dimensional data, from small-sample to medium-scale settings, and across varying numbers of clusters.
EECM-RCL: It removes the noise matrices (i.e., $E_1, E_2, \ldots, E_m$) and directly recovers the consensus matrix from the connective matrices.
PECM-RCL: It removes the mapping models (i.e., $P_1, P_2, \ldots, P_m$) and directly performs weighted fusion of the denoised connective matrices.
RECM-RCL: It removes the theoretical rank constraint.
DECM-RCL: It refines the consensus matrix using only the first-order graph structure in the feature space.
As shown in Table 10, ECM-RCL achieves the best clustering performance in most cases. In contrast, EECM-RCL suffers from performance degradation due to noise in the recovered consensus matrix, arising from ignoring noise in each connective matrix. PECM-RCL fails to account for differences in cluster structures across individual connective matrices, leading to structural disruption in the fused consensus matrix. RECM-RCL lacks modeling of the block-diagonal structure, which reduces the structural clarity of the consensus matrix. DECM-RCL ignores the multi-scale structural information of samples in the feature space, resulting in insufficient refinement of the consensus matrix and ultimately impairing clustering performance.

4.2.4. Parameter Sensitivity Analysis

In this subsection, we conduct a parameter sensitivity analysis with clustering accuracy (ACC) as the evaluation metric. Figure 2 presents the effects of parameters $\lambda_1$ and $\lambda_2$ on three adopted datasets (i.e., Heart, Iris, and LM). We can observe that the impact of parameter combinations on method performance varies among datasets. In general, ECM-RCL achieves favorable results when $\lambda_1 \ge 0.1$ and $\lambda_2 \ge 0.1$. Figure 3 presents the effect of parameter $\lambda_3$ on three adopted datasets (i.e., Heart, Iris, and LM). The results indicate that the optimal value of $\lambda_3$ typically lies within $[1, 10]$. The experimental results indicate that although different parameter combinations lead to fluctuations in the results, the overall magnitude of variation is limited. It is noteworthy that parameter $\lambda_3$ demonstrates significantly higher sensitivity compared to $\lambda_1$ and $\lambda_2$. A plausible explanation for this observation is that the quality of the mapping matrix directly influences the recovery of the consensus matrix. In addition, Table 11 reports the effect of the maximum order $V$ of the multi-order graph structures on clustering performance on six adopted datasets (i.e., ALLAML, GLIOMA, Heart, Iris, Lymphoma, and Orlraws10p). It is evident that the method generally performs better when $V = 2$ or $V = 3$.
Additionally, although the above parameter analysis has confirmed that parameter setting by grid search is reliable in our experiments, grid search is difficult to implement in applications, and the settings of these parameters should be further improved. From the parameter analysis results, we can observe that ECM-RCL exhibits stable performance with respect to changes in parameters $\lambda_1$ and $\lambda_2$, so their search ranges can be narrowed to $\lambda_1 \in \{0.1, 1, 10, 100, 1000\}$ and $\lambda_2 \in \{0.1, 1, 10, 100, 1000\}$ in applications. Parameters $\lambda_3$ and $V$ exhibit the highest sensitivity with respect to the performance of ECM-RCL, and they should be set more precisely according to the structure of the dataset in the application, which has also been confirmed in [24].

4.2.5. Robustness Analysis of Base Clustering Results

In this subsection, the influence of both the quantity and variety of base clustering results on ECM-RCL is investigated. The evaluation is conducted from two perspectives: (1) assessing the impact of the ensemble size $m$ on performance, and (2) comparing the results when using spectral clustering as an alternative to K-means for generating base clustering results. For spectral clustering, the number of base clustering results is set to 10 (i.e., $m = 10$), and the number of clusters within each is randomly selected from $[c, \max(100, \sqrt{n})]$. Experiments are performed on seven datasets (i.e., CLL_SUB, Heart, Iris, Lymphoma, Orlraws10p, Wine, and Zoo). Figure 4 presents the average ACC, NMI, and ARI results of ECM-RCL under different $m$, while Figure 5 compares the average ACC, NMI, and ARI results obtained using K-means and spectral clustering.
As shown in Figure 4, although the performance of ECM-RCL varies slightly with different numbers of base clustering results, the experimental results demonstrate that the method achieves optimal performance when the number of base clustering results is set to 10 (i.e., m = 10 ). This finding is consistent with the parameter setting in [40], further validating the rationality of this parameter selection. From Figure 5, it can be observed that ECM-RCL also exhibits stable clustering performance when K-means and spectral clustering are used as the base clustering methods.
In summary, ECM-RCL demonstrates good robustness and reliability under different base clustering configurations.

4.2.6. Visual Analysis

In this section, we present a visual analysis of the consensus matrix learned by ECM-RCL, as well as the graph structural weights corresponding to different orders in the feature space. Specifically, Figure A1 visualizes the consensus matrix, while Figure A2, Figure A3, Figure A4, Figure A5, Figure A6, Figure A7 and Figure A8 illustrate the distribution of graph structural weights across different orders. The corresponding figures can be found in the Appendix B and Appendix C (Figure A1, Figure A2, Figure A3, Figure A4, Figure A5, Figure A6, Figure A7 and Figure A8).
As shown in Figure A1, most of the learned consensus matrices exhibit a clear block-diagonal structure. However, we also observe that a few matrices lack a distinct block-diagonal pattern, which may be due to ambiguous sample relationships within the corresponding datasets. Furthermore, results presented in Table 4, Table 5, Table 6 and Table 10 demonstrate that the superior clustering performance of ECM-RCL is closely associated with the presence of a well-defined block-diagonal structure in the consensus matrix.
As shown in Figure A2, Figure A3, Figure A4, Figure A5, Figure A6, Figure A7 and Figure A8, despite variations across the fifteen datasets, some consistent trends in the weight distributions of graph structures emerge. When the maximum order is 2, the second-order graph structure is dominant. When the maximum order is 3, the weights are more evenly distributed across the first, second, and third orders. Notably, when the maximum order reaches 4 or higher, the weights do not increase with order. In addition, the parameter analysis in Table 11 further validates the choice of 2 or 3 as a reasonable maximum order.

4.2.7. Execution Time and Convergence Analysis

This subsection presents the execution times of all adopted methods and the convergence analysis of ECM-RCL. Table 12 reports the execution times for each method. As shown in Table 12, LWEA and ECPCS-HC exhibit faster execution times, primarily because these methods construct the consensus matrix using heuristic strategies with fewer optimization steps. In contrast, EC-CMS and DREC are more time-consuming: the former processes first-order high-confidence information among samples, while the latter integrates the global structure of the samples. CEAM exhibits a relatively long runtime due to dynamically adjusting intra-cluster connections based on cluster similarity during clustering. The computation of ECAR is relatively time-consuming, primarily due to the complexity of its pre-training and training phases. Although ECM-RCL is not the slowest method across all datasets, its overall execution time remains relatively long, particularly on large-scale datasets, which is mainly attributed to its time complexity being cubic with respect to the number of samples. Nevertheless, this method can effectively account for the noise within connective matrices, the structural differences among these matrices, and the rich sample structure information extracted from the feature space, resulting in superior clustering performance across multiple datasets, especially on large-scale ones. Although the increase in sample size significantly raises the computational cost, the results in Table 4, Table 5 and Table 6 indicate that the improvement in the accuracy and stability of clustering performance is worthwhile. Furthermore, optimized acceleration strategies (e.g., the Nyström method [36] or anchor-based method [37]) can be employed to effectively reduce the time complexity.
Furthermore, to verify the convergence of ECM-RCL, we visualize its convergence process on three datasets (i.e., Iris, WarpPIE10p, and Zoo). Figure 6 illustrates the evolution of the objective function, showing that ECM-RCL typically converges within approximately 50 iterations.

5. Conclusions

In this study, we propose a novel ensemble clustering method, termed ECM-RCL. First, a symmetric error matrix is introduced for each connective matrix to identify its noise components. Based on these matrices, a set of mapping models is designed to recover a reliable consensus matrix. Second, in the feature space, multi-order graph structures are constructed via random walks to capture high-order associations among samples, enabling multi-granularity refinement of the consensus matrix. Finally, a rank constraint incorporating a block-diagonal structure is imposed to ensure the structural clarity of the consensus matrix. Experimental results demonstrate that ECM-RCL achieves competitive clustering performance compared to all the adopted comparative methods.
Although the proposed ECM-RCL achieves good performance, some limitations still exist. First, the optimization process of ECM-RCL remains computationally expensive. Developing more efficient optimization techniques is crucial, such as the Nyström method or anchor-based methods. Second, this study mainly focuses on differences in cluster structures across connective matrices, without considering the cluster weights within them. Future work will explore the distribution of cluster weights in the label space. Third, we will develop an adaptive deep learning hybrid model. This framework will leverage advanced techniques (e.g., graph attention networks) to achieve intelligent redundancy reduction in a data-driven manner, while adaptively adjusting its learning strategy based on inherent dataset characteristics (e.g., scale and noise patterns). Ultimately, this leads to a more powerful and versatile clustering framework. Fourth, we will devise efficient hyperparameter optimization strategies to ascertain optimal configurations.

Author Contributions

Conceptualization, J.Q. and Q.D.; methodology, Q.D.; software, Q.D.; validation, J.Q. and Q.D.; formal analysis, Q.D.; investigation, J.Q., Z.B., J.Z. and Z.J.; data curation, Q.D.; writing—original draft preparation, Q.D.; writing—review and editing, J.Q., Z.B., J.Z. and Z.J.; visualization, Q.D.; supervision, Z.B.; funding acquisition, J.Q., Z.B., J.Z. and Z.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62306126, Grant 62206177 and Grant 62106145, in part by the Fundamental Research Funds for the Central Universities under Grant JUSRP123036, in part by the Jiangsu Province Youth Science and Technology Talent Support Project under Grant JSTJ2024283, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20220621, in part by the Wuxi Science and Technology Development Fund Project under Grant K20231006, in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LY23F020007 and Grant LQ22F020024, and in part by the General Scientific Research Project of Zhejiang Education Department under Grant Y202248951.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

The Cauchy–Schwarz inequality states that for any vectors $(a_1, a_2, \ldots, a_m)$ and $(b_1, b_2, \ldots, b_m)$,
$$\left( \sum_{i=1}^{m} a_i b_i \right)^2 \le \left( \sum_{i=1}^{m} a_i^2 \right) \left( \sum_{i=1}^{m} b_i^2 \right) \quad (A1)$$
Equality holds if and only if the sequences are proportional (i.e., $a_i = k b_i$ for some constant $k$ and all $i$).
To connect this to our problem, we make a suitable choice for $a_i$ and $b_i$:
Let $a_i = c_i \sqrt{d_i}$. Notice that $\sum_{i=1}^{m} a_i^2 = \sum_{i=1}^{m} c_i^2 d_i$, which is exactly our objective function.
Let $b_i = \frac{1}{\sqrt{d_i}}$.
Substituting these into the inequality,
$$\left( \sum_{i=1}^{m} c_i \sqrt{d_i} \cdot \frac{1}{\sqrt{d_i}} \right)^2 \le \left( \sum_{i=1}^{m} c_i^2 d_i \right) \left( \sum_{i=1}^{m} \frac{1}{d_i} \right) \quad (A2)$$
Assuming the constraint $\sum_{i=1}^{m} c_i = 1$ and substituting it on the left-hand side gives the following:
$$1 \le \left( \sum_{i=1}^{m} c_i^2 d_i \right) \left( \sum_{i=1}^{m} \frac{1}{d_i} \right) \quad (A3)$$
Rearranging this inequality isolates the objective function:
$$\sum_{i=1}^{m} c_i^2 d_i \ge \frac{1}{\sum_{i=1}^{m} \frac{1}{d_i}} \quad (A4)$$
The minimum value of the objective function is achieved when the Cauchy–Schwarz inequality holds with equality. This happens when $a_i$ is proportional to $b_i$:
$$c_i \sqrt{d_i} = \frac{k}{\sqrt{d_i}}, \quad \forall i \quad (A5)$$
where $k$ is some constant. Solving for $c_i$,
$$c_i = \frac{k}{d_i}, \quad \forall i \quad (A6)$$
We find the constant $k$ by enforcing the constraint $\sum_{i=1}^{m} c_i = 1$:
$$\sum_{i=1}^{m} c_i = \sum_{i=1}^{m} \frac{k}{d_i} = 1 \quad (A7)$$
Therefore, $k = \frac{1}{\sum_{i=1}^{m} \frac{1}{d_i}}$.
Substituting the value of $k$ back into $c_i = \frac{k}{d_i}$ gives the optimal solution:
$$c_i = \frac{1/d_i}{\sum_{i=1}^{m} 1/d_i} \quad (A8)$$
For the $\alpha_i$ subproblem, the parameters are assigned as $c_i = \alpha_i$ and $d_i = \left\| A_i - E_i - P_i H \right\|_F^2$. For the $\beta_j$ subproblem, the parameters are set to $c_j = \beta_j$ and $d_j = \left\| H - G^j \right\|_F^2$.
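A quick numerical check of this closed-form solution (illustrative only): the weights $c_i \propto 1/d_i$ attain the lower bound $1/\sum_i (1/d_i)$, whereas randomly drawn feasible weights on the simplex do not fall below it.

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.uniform(0.5, 5.0, size=6)                 # arbitrary positive residuals d_i
c_star = (1.0 / d) / np.sum(1.0 / d)              # closed-form weights (Eq. (A8))
obj_star = np.sum(c_star ** 2 * d)                # objective at the closed-form solution
bound = 1.0 / np.sum(1.0 / d)                     # lower bound from Eq. (A4)
c_rand = rng.dirichlet(np.ones(6), size=1000)     # random feasible weights on the simplex
obj_rand = np.min(np.sum(c_rand ** 2 * d, axis=1))
print(obj_star, bound, obj_rand)                  # obj_star == bound, obj_rand >= obj_star
```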

Appendix B

Figure A1. Visualization of the consensus matrix learned by ECM-RCL on all adopted datasets. (a) ALLAML. (b) CLL_SUB. (c) GLIMA. (d) Heart. (e) Iris. (f) LM. (g) Lymphoma. (h) MF. (i) Orlraws10p. (j) USPS. (k) Vertebral. (l) WarpPIE10p. (m) WDBC. (n) Wine. (o) Zoo. (p) ISOLET. (q) LS. (r) ODR. (s) PD.

Appendix C

Figure A2. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) ALLAML. (b) CLL_SUB. (c) GLIMA.
Figure A3. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) Heart. (b) Iris. (c) LM.
Figure A4. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) Lymphoma. (b) MF. (c) Orlraws10p.
Figure A5. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) USPS. (b) Vertebral. (c) WarpPIE10p.
Figure A6. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) WDBC. (b) Wine. (c) Zoo.
Figure A7. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on three adopted datasets. (a) ISOLET. (b) LS. (c) ODR.
Figure A8. Visualization of the distribution of graph structural weights across different orders learned by ECM-RCL on the PD dataset.

References

  1. Popescu, M.; Keller, J.; Bezdek, J.; Zare, A. Random projections fuzzy c-means (RPFCM) for big data clustering. In Proceedings of the 2015 IEEE International Conference on Fuzzy Systems, Istanbul, Turkey, 2–5 August 2015. [Google Scholar]
  2. Sun, H.; Liu, L.; Li, F. A lie group semi-supervised FCM clustering method for image segmentation. Pattern Recognit. 2024, 155, 110681. [Google Scholar] [CrossRef]
  3. Huang, D.; Wang, C.D.; Lai, J.H. Locally Weighted Ensemble Clustering. IEEE Trans. Cybern. 2018, 48, 1460–1473. [Google Scholar] [CrossRef]
  4. Huang, D.; Wang, C.-D.; Peng, H.; Lai, J.; Kwoh, C.-K. Enhanced Ensemble Clustering via Fast Propagation of Cluster-Wise Similarities. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 508–520. [Google Scholar] [CrossRef]
  5. Ji, X.; Sun, J.; Peng, J.; Pang, Y.; Zhou, P. Clustering Ensemble Based on Fuzzy Matrix Self-Enhancement. IEEE Trans. Knowl. Data Eng. 2024, 37, 148–161. [Google Scholar] [CrossRef]
  6. Jia, Y.; Liu, H.; Hou, J.; Zhang, Q. Clustering Ensemble Meets Low-rank Tensor Approximation. In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 7970–7978. [Google Scholar]
  7. Jia, Y.; Tao, S.; Wang, R.; Wang, Y. Ensemble Clustering via Co-Association Matrix Self-Enhancement. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 11168–11179. [Google Scholar] [CrossRef]
  8. Shi, Y.; Yu, Z.; Chen, C.L.P.; Zeng, H. Consensus Clustering with Co-Association Matrix Optimization. IEEE Trans. Neural Netw. Learn. Syst. 2024, 35, 4192–4205. [Google Scholar] [CrossRef]
  9. Chen, M.S.; Lin, J.Q.; Wang, C.D.; Huang, D.; Lai, J.H. Contrastive Ensemble Clustering. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 14678–14690. [Google Scholar] [CrossRef]
  10. Li, T.; Shu, X.; Wu, J.; Zheng, Q.; Lv, X.; Xu, J. Adaptive weighted ensemble clustering via kernel learning and local information preservation. Knowl.-Based Syst. 2024, 294, 111793. [Google Scholar] [CrossRef]
  11. Ay, M.; Özbakır, L.; Kulluk, S.; Gülmez, B.; Öztürk, G.; Özer, S. FC-Kmeans: Fixed-centered K-means algorithm. Expert Syst. Appl. 2023, 211, 118656. [Google Scholar] [CrossRef]
  12. Bagherinia, A.; Minaei-Bidgoli, B.; Hossinzadeh, M.; Parvin, H. Elite fuzzy clustering ensemble based on clustering diversity and quality measures. Appl. Intell. 2018, 49, 1724–1747. [Google Scholar] [CrossRef]
  13. Tao, Z.; Liu, H.; Li, S.; Ding, Z.; Fu, Y. Robust Spectral Ensemble Clustering via Rank Minimization. ACM Trans. Knowl. Discov. Data 2019, 13, 1–25. [Google Scholar] [CrossRef]
  14. Zhou, P.; Sun, B.; Liu, X.; Du, L.; Li, X. Active Clustering Ensemble with Self-Paced Learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 12186–12200. [Google Scholar] [CrossRef] [PubMed]
  15. Zhou, P.; Du, L.; Li, X. Adaptive Consensus Clustering for Multiple K-Means Via Base Results Refining. IEEE Trans. Knowl. Data Eng. 2023, 35, 10251–10264. [Google Scholar] [CrossRef]
  16. Gu, Q.; Wang, Y.; Wang, P.; Li, X.; Chen, L.; Xiong, N.N.; Liu, D. An improved weighted ensemble clustering based on two-tier uncertainty measurement. Expert Syst. Appl. 2024, 238, 121672. [Google Scholar] [CrossRef]
  17. Xu, J.; Li, T.; Zhang, D.; Wu, J. Ensemble clustering via fusing global and local structure information. Expert Syst. Appl. 2024, 237, 121557. [Google Scholar] [CrossRef]
  18. Xu, J.; Li, T.; Wu, J.; Zhang, D. Ensemble clustering via dual self-enhancement by alternating denoising and topological consistency propagation. Appl. Soft Comput. J. 2024, 167, 112299. [Google Scholar] [CrossRef]
  19. Yang, X.; Zheng, Z.; Xie, J.; Zhao, W.; Xue, J.; Nie, F. Spectral ensemble clustering from graph reconstruction with auto-weighted cluster. Pattern Recognit. Lett. 2025, 196, 243–249. [Google Scholar] [CrossRef]
  20. Zheng, X.; Lu, Y.; Wang, R.; Nie, F.; Li, X. Structured Graph-Based Ensemble Clustering. IEEE Trans. Knowl. Data Eng. 2025, 37, 3728–3738. [Google Scholar] [CrossRef]
  21. Zeng, L.; Yao, S.; Liu, X.; Xiao, L.; Qian, Y. A clustering ensemble algorithm for handling deep embeddings using cluster confidence. Comput. J. 2025, 68, 163–174. [Google Scholar] [CrossRef]
  22. Huang, D.; Chen, D.-H.; Chen, X.; Wang, C.-D.; Lai, J.-H. DeepCluE: Enhanced Deep Clustering via Multi-Layer Ensembles in Neural Networks. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 1582–1594. [Google Scholar] [CrossRef]
  23. Hao, Z.; Lu, Z.; Li, G.; Nie, F.; Wang, R.; Li, X. Ensemble Clustering with Attentional Representation. IEEE Trans. Knowl. Data Eng. 2023, 36, 581–593. [Google Scholar] [CrossRef]
  24. Shukla, P.K.; Veerasamy, B.D.; Alduaiji, N.; Addula, S.R.; Pandey, A.; Shukla, P.K. Fraudulent account detection in social media using hybrid deep transformer model and hyperparameter optimization. Sci. Rep. 2025, 15, 38447. [Google Scholar] [CrossRef]
  25. Miklautz, L.; Teuffenbach, M.; Weber, P.; Perjuci, R.; Durani, W.; Bohm, C.; Plant, C. Deep Clustering with Consensus Representations. In Proceedings of the 2022 IEEE International Conference on Data Mining (ICDM), Orlando, FL, USA, 28 November 2022–1 December 2022; pp. 1119–1124. [Google Scholar]
  26. Liang, C.; Dong, Z.; Yang, S.; Zhou, P. Jointly Learn the Base Clustering and Ensemble for Deep Image Clustering. In Proceedings of the 2024 IEEE International Conference on Multimedia and Expo, Niagara Falls, ON, Canada, 15–19 July 2024. [Google Scholar]
  27. Huang, D.; Lai, Q.; Wang, H.; Xu, Y.-K.; Guan, C.-B.; Wang, C.-D. REC-GCN: Robust ensemble clustering with graph convolutional networks. Pattern Recognit. 2026, 172, 112717. [Google Scholar] [CrossRef]
  28. Zhou, P.; Du, L.; Wang, H.; Shi, L.; Shen, Y.-D. Learning a Robust Consensus Matrix for Clustering Ensemble via Kullback-Leibler Divergence Minimization. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015. [Google Scholar]
  29. Chen, M.; Lin, J.; Wang, C.; Xi, W.; Huang, D. On Regularizing Multiple Clusterings for Ensemble Clustering by Graph Tensor Learning. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023. [Google Scholar]
  30. Zhou, P.; Du, L.; Liu, X.; Shen, Y.D.; Fan, M.; Li, X. Self-Paced Clustering Ensemble. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 1497–1511. [Google Scholar] [CrossRef]
  31. Xu, J.; Li, T. Ensemble clustering with low-rank optimal Laplacian matrix learning. Appl. Soft Comput. 2024, 150, 111095. [Google Scholar] [CrossRef]
  32. Ren, Z.; Sun, Q.; Wei, D. Multiple Kernel Clustering with Kernel k-Means Coupled Graph Tensor Learning. In Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, Online, 2–9 February 2021. [Google Scholar]
  33. Lu, C.; Feng, J.; Lin, Z.; Mei, T.; Yan, S. Subspace Clustering by Block Diagonal Representation. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 487–501. [Google Scholar] [CrossRef]
  34. Pan, E.; Kang, Z. High-order multi-view clustering for generic data. Inf. Fusion 2023, 100, 101947. [Google Scholar] [CrossRef]
  35. Wang, Z.; Li, L.; Ning, X.; Tan, W.; Liu, Y.; Song, H. Incomplete multi-view clustering via structure exploration and missing-view inference. Inf. Fusion 2024, 103, 102123. [Google Scholar] [CrossRef]
  36. Wang, Y.; Wu, N.-C.; Liu, Y.; Xiang, H. A generalized Nyström method with subspace iteration for low-rank approximations of large-scale nonsymmetric matrices. Appl. Math. Lett. 2025, 166, 109531. [Google Scholar] [CrossRef]
  37. Zhou, S.; Yang, M.; Wang, X.; Song, W. Anchor-based scalable multi-view subspace clustering. Inf. Sci. 2024, 666, 120374. [Google Scholar] [CrossRef]
  38. Zhou, J.; Zheng, H.; Pan, L. Ensemble clustering based on dense representation. Neurocomputing 2019, 357, 66–76. [Google Scholar] [CrossRef]
  39. Zhou, P.; Hu, B.; Yan, D.; Du, L. Clustering Ensemble via Diffusion on Adaptive Multiplex. IEEE Trans. Knowl. Data Eng. 2024, 36, 1463–1474. [Google Scholar] [CrossRef]
  40. Qin, Y.; Pu, N.; Sebe, N.; Feng, G. Latent Space Learning-Based Ensemble Clustering. IEEE Trans. Image Process. 2025, 34, 1259–1270. [Google Scholar] [CrossRef] [PubMed]
  41. Bian, Z.; Ishibuchi, H.; Wang, S. Joint Learning of Spectral Clustering Structure and Fuzzy Similarity Matrix of Data. IEEE Trans. Fuzzy Syst. 2019, 27, 31–44. [Google Scholar] [CrossRef]
  42. Bian, Z.; Qu, J.; Zhou, J.; Jiang, Z.; Wang, S. Weighted adaptively ensemble clustering method based on fuzzy Co-association matrix. Inf. Fusion 2024, 103, 102099. [Google Scholar] [CrossRef]
  43. Zhou, P.; Liu, X.; Du, L.; Li, X. Self-paced Adaptive Bipartite Graph Learning for Consensus Clustering. ACM Trans. Knowl. Discov. Data 2023, 17, 1–35. [Google Scholar] [CrossRef]
  44. Taheri, S.M.; Hesamian, G. A generalization of the Wilcoxon signed-rank test and its applications. Stat. Pap. 2012, 54, 457–470. [Google Scholar] [CrossRef]
  45. Bian, Z.; Yu, L.; Qu, J.; Deng, Z.; Wang, S. An ensemble clustering method via learning the CA matrix with fuzzy neighbors. Inf. Fusion 2025, 120, 103105. [Google Scholar] [CrossRef]
Figure 1. Illustration of ECM-RCL.
Figure 2. The effect of λ1 and λ2 on clustering results (ACC). (a) Heart. (b) Iris. (c) LM.
Figure 3. The effect of λ3 on clustering results (ACC). (a) Heart. (b) Iris. (c) LM.
Figure 4. Average ACC, NMI, and ARI of the proposed ECM-RCL with different ensemble sizes on the adopted datasets. (a) ACC. (b) NMI. (c) ARI.
Figure 5. Average ACC, NMI, and ARI of the proposed ECM-RCL with different types of base clustering methods on the adopted datasets. (a) ACC. (b) NMI. (c) ARI.
Figure 6. Convergence curves. (a) Iris. (b) WarpPIE10p. (c) Zoo.
Table 1. The six perspectives of the proposed ECM-RCL and some existing methods.
Methods | WBC | WCC | SD | CD | BD | SOD
Bagherinia et al. [12]
Huang et al. [3]
Tao et al. [13] Weak Weak
Zhou et al. [14]
Jia et al. [7] Weak
Zhou et al. [15]
Gu et al. [16]
Xu et al. [17] Weak
Li et al. [10] Weak
Xu et al. [18] Weak
Zheng et al. [20]
Yang et al. [19]
ECM-RCL
Table 2. Description of notation.
Notations | Descriptions
X | The dataset.
S | The CA matrix.
S_pq | The (p, q)-th element in S.
A_i | The i-th connective matrix.
E_i | The i-th error matrix.
I | The identity matrix.
P_i | The i-th mapping matrix.
G^j | The j-th order graph structure.
G^j_pq | The (p, q)-th element in G^j.
H | The consensus matrix.
H_pq | The (p, q)-th element in H.
π_i | The i-th base clustering result.
m | The number of base clustering results.
V | The maximum order of the multi-order graph structures.
n | The number of samples.
c | The number of ground-truth clusters.
c_i | The number of clusters of the i-th base clustering result.
d | The number of features in X.
x_p | The p-th sample.
‖·‖_F^2 | The squared Frobenius norm of ·.
tr(·) | The trace of ·.
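For readers who prefer a concrete view of the notation in Table 2, the following minimal Python sketch builds the connective matrices A_i as binary co-cluster indicators of the base clustering results π_i and averages them into the plain CA matrix S. This is only the standard first-stage construction of co-association-based ensemble clustering, not the robust consensus matrix H learned by ECM-RCL, and all variable names are illustrative.

import numpy as np

def connective_matrix(labels):
    # A_i: A_i[p, q] = 1 if samples p and q share a cluster in base clustering pi_i.
    return (labels[:, None] == labels[None, :]).astype(float)

def ca_matrix(base_labels):
    # Plain CA matrix S: the average of the m connective matrices.
    m = len(base_labels)
    return sum(connective_matrix(lb) for lb in base_labels) / m

# Toy usage with m = 3 base clusterings of n = 5 samples.
pi = [np.array([0, 0, 1, 1, 2]),
      np.array([1, 1, 1, 0, 0]),
      np.array([0, 0, 1, 1, 1])]
S = ca_matrix(pi)   # S[p, q] is the fraction of base clusterings grouping p with q
print(np.round(S, 2))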
Table 3. The details of the datasets.
Datasets | Number of Samples | Number of Features | Number of Clusters
ALLAML | 72 | 7129 | 2
CLL_SUB | 111 | 11,340 | 3
GLIOMA | 50 | 4434 | 4
Heart | 270 | 13 | 2
Iris | 150 | 4 | 3
LM | 360 | 90 | 15
Lymphoma | 96 | 4026 | 9
MF | 2000 | 649 | 10
Orlraws10p | 100 | 10,304 | 10
USPS | 1854 | 256 | 10
Vertebral | 310 | 6 | 3
WarpPIE10p | 210 | 2420 | 10
WDBC | 569 | 30 | 2
Wine | 178 | 13 | 3
Zoo | 101 | 16 | 7
ISOLET | 7797 | 617 | 26
LS | 6435 | 36 | 6
ODR | 5620 | 64 | 10
PD | 10,992 | 16 | 10
Table 4. Average ACC of all the adopted methods on all adopted datasets.
Datasets | K-Means | LWEA | DREC | ECPCS-HC | EC-CMS | ECAR | CEAM | ECM-RCL
ALLAML | 0.6903 ± 0.0371 | 0.6833 ± 0.0667 | 0.6903 ± 0.0687 | 0.6569 ± 0.0365 | 0.6722 ± 0.0736 | 0.6806 ± 0.0000 | 0.5972 ± 0.0729 | 0.7153 ± 0.0150
CLL_SUB | 0.5252 ± 0.0085 | 0.5279 ± 0.0047 | 0.5279 ± 0.0047 | 0.5279 ± 0.0047 | 0.5270 ± 0.0047 | 0.4775 ± 0.0000 | 0.4486 ± 0.0337 | 0.5459 ± 0.0047
GLIOMA | 0.5600 ± 0.0794 | 0.5900 ± 0.0216 | 0.5860 ± 0.0232 | 0.5740 ± 0.0212 | 0.5900 ± 0.0368 | 0.3000 ± 0.0000 | 0.5680 ± 0.0473 | 0.6440 ± 0.0430
Heart | 0.5904 ± 0.0019 | 0.5856 ± 0.0271 | 0.6037 ± 0.0287 | 0.5933 ± 0.0298 | 0.6100 ± 0.0243 | 0.6185 ± 0.0000 | 0.5837 ± 0.0379 | 0.6244 ± 0.0053
Iris | 0.8560 ± 0.1041 | 0.8747 ± 0.0301 | 0.9020 ± 0.0336 | 0.8993 ± 0.0320 | 0.8987 ± 0.0320 | 0.6667 ± 0.0000 | 0.7300 ± 0.0668 | 0.9140 ± 0.0685
LM | 0.4406 ± 0.0267 | 0.4503 ± 0.0153 | 0.4558 ± 0.0117 | 0.4292 ± 0.0184 | 0.4550 ± 0.0194 | 0.4611 ± 0.0000 | 0.4583 ± 0.0154 | 0.5200 ± 0.0140
Lymphoma | 0.5427 ± 0.0546 | 0.5740 ± 0.0724 | 0.5646 ± 0.0576 | 0.6948 ± 0.0908 | 0.5198 ± 0.0700 | 0.5000 ± 0.0000 | 0.5677 ± 0.0256 | 0.6552 ± 0.0421
MF | 0.5036 ± 0.0453 | 0.5493 ± 0.0298 | 0.5666 ± 0.0338 | 0.5447 ± 0.0359 | 0.6057 ± 0.0302 | 0.5200 ± 0.0000 | 0.5751 ± 0.0647 | 0.8004 ± 0.0096
Orlraws10P | 0.6730 ± 0.0536 | 0.7960 ± 0.0481 | 0.7830 ± 0.0447 | 0.7220 ± 0.0377 | 0.7980 ± 0.0424 | 0.6500 ± 0.0000 | 0.8010 ± 0.0567 | 0.8470 ± 0.0408
USPS | 0.6182 ± 0.0278 | 0.6836 ± 0.0254 | 0.6901 ± 0.0372 | 0.6859 ± 0.0560 | 0.7273 ± 0.0453 | 0.6208 ± 0.0000 | 0.6972 ± 0.0471 | 0.7573 ± 0.0000
Vertebral | 0.5984 ± 0.0546 | 0.6177 ± 0.1135 | 0.5352 ± 0.0284 | 0.6110 ± 0.1337 | 0.5358 ± 0.0195 | 0.6516 ± 0.0000 | 0.5652 ± 0.0680 | 0.6703 ± 0.0866
WarpPIE10p | 0.2605 ± 0.0182 | 0.2357 ± 0.0093 | 0.2552 ± 0.0202 | 0.2124 ± 0.0068 | 0.2386 ± 0.0182 | 0.1952 ± 0.0000 | 0.2676 ± 0.0214 | 0.4762 ± 0.0135
WDBC | 0.8541 ± 0.0000 | 0.8035 ± 0.0876 | 0.8283 ± 0.0296 | 0.8738 ± 0.0411 | 0.8190 ± 0.0763 | 0.9244 ± 0.0000 | 0.8649 ± 0.0335 | 0.8903 ± 0.0191
Wine | 0.6292 ± 0.0644 | 0.6713 ± 0.0651 | 0.6809 ± 0.0388 | 0.6494 ± 0.0746 | 0.6691 ± 0.0701 | 0.6404 ± 0.0000 | 0.6758 ± 0.0464 | 0.7247 ± 0.0000
Zoo | 0.7347 ± 0.0800 | 0.7426 ± 0.0585 | 0.7455 ± 0.0469 | 0.7436 ± 0.0211 | 0.7188 ± 0.0725 | 0.7822 ± 0.0000 | 0.6267 ± 0.0841 | 0.7248 ± 0.0283
ISOLET | 0.5255 ± 0.0284 | 0.5620 ± 0.0100 | 0.5624 ± 0.0163 | 0.5226 ± 0.0119 | 0.5508 ± 0.0015 | 0.5256 ± 0.0000 | 0.5319 ± 0.0107 | 0.5645 ± 0.0023
LS | 0.6342 ± 0.0678 | 0.6276 ± 0.1915 | 0.6494 ± 0.0026 | 0.6709 ± 0.1789 | 0.6253 ± 0.0065 | 0.7052 ± 0.0000 | 0.5460 ± 0.1282 | 0.7239 ± 0.0036
ODR | 0.7592 ± 0.0649 | 0.8391 ± 0.0316 | 0.9017 ± 0.0457 | 0.8669 ± 0.0025 | 0.9212 ± 0.0045 | 0.7194 ± 0.0000 | 0.8484 ± 0.0033 | 0.9790 ± 0.0000
PD | 0.7039 ± 0.0494 | 0.7836 ± 0.0031 | 0.8088 ± 0.0519 | 0.7426 ± 0.0270 | 0.7832 ± 0.0060 | 0.6543 ± 0.0000 | 0.7212 ± 0.0384 | 0.8890 ± 0.0021
Table 5. Average NMI of all the adopted methods on all adopted datasets.
Datasets | K-Means | LWEA | DREC | ECPCS-HC | EC-CMS | ECAR | CEAM | ECM-RCL
ALLAML | 0.0897 ± 0.0393 | 0.1119 ± 0.0333 | 0.1209 ± 0.0310 | 0.0417 ± 0.0452 | 0.0857 ± 0.0656 | 0.1003 ± 0.0000 | 0.0505 ± 0.0379 | 0.1448 ± 0.0173
CLL_SUB | 0.1804 ± 0.0004 | 0.1805 ± 0.0003 | 0.1805 ± 0.0003 | 0.1805 ± 0.0003 | 0.1804 ± 0.0003 | 0.0967 ± 0.0000 | 0.0686 ± 0.0289 | 0.2627 ± 0.0005
GLIOMA | 0.4257 ± 0.1202 | 0.4921 ± 0.0226 | 0.4943 ± 0.0267 | 0.4780 ± 0.0164 | 0.4938 ± 0.0293 | 0.0000 ± 0.0000 | 0.4044 ± 0.0449 | 0.5253 ± 0.0198
Heart | 0.0187 ± 0.0007 | 0.0176 ± 0.0134 | 0.0280 ± 0.0136 | 0.0210 ± 0.0151 | 0.0295 ± 0.0133 | 0.0410 ± 0.0000 | 0.0214 ± 0.0156 | 0.0409 ± 0.0042
Iris | 0.7204 ± 0.0669 | 0.7505 ± 0.0437 | 0.7915 ± 0.0528 | 0.7870 ± 0.0482 | 0.7862 ± 0.0483 | 0.7337 ± 0.0000 | 0.5398 ± 0.0878 | 0.8131 ± 0.0853
LM | 0.5623 ± 0.0243 | 0.5906 ± 0.0129 | 0.5930 ± 0.0148 | 0.5612 ± 0.0176 | 0.5961 ± 0.0172 | 0.5801 ± 0.0000 | 0.5862 ± 0.0171 | 0.6460 ± 0.0082
Lymphoma | 0.5607 ± 0.0424 | 0.6141 ± 0.0423 | 0.6161 ± 0.0483 | 0.6964 ± 0.0509 | 0.5851 ± 0.0358 | 0.5575 ± 0.0000 | 0.5926 ± 0.0174 | 0.6711 ± 0.0179
MF | 0.5594 ± 0.0182 | 0.6045 ± 0.0252 | 0.6152 ± 0.0321 | 0.6151 ± 0.0245 | 0.6462 ± 0.0233 | 0.6184 ± 0.0000 | 0.5896 ± 0.0344 | 0.7644 ± 0.0083
Orlraws10P | 0.7607 ± 0.0323 | 0.8484 ± 0.0337 | 0.8437 ± 0.0310 | 0.8112 ± 0.0304 | 0.8392 ± 0.0286 | 0.7846 ± 0.0000 | 0.8374 ± 0.0413 | 0.9207 ± 0.0158
USPS | 0.6138 ± 0.0150 | 0.6757 ± 0.0210 | 0.6797 ± 0.0244 | 0.6655 ± 0.0180 | 0.6981 ± 0.0154 | 0.6405 ± 0.0000 | 0.6734 ± 0.0283 | 0.7998 ± 0.0000
Vertebral | 0.3898 ± 0.0299 | 0.3658 ± 0.1729 | 0.4404 ± 0.1083 | 0.2954 ± 0.1938 | 0.2808 ± 0.1750 | 0.4209 ± 0.0000 | 0.3926 ± 0.0902 | 0.4970 ± 0.0576
WarpPIE10p | 0.2448 ± 0.0322 | 0.2115 ± 0.0287 | 0.2412 ± 0.0291 | 0.1726 ± 0.0136 | 0.2210 ± 0.0321 | 0.1400 ± 0.0000 | 0.2676 ± 0.0242 | 0.5562 ± 0.0127
WDBC | 0.4223 ± 0.0000 | 0.3262 ± 0.1703 | 0.3666 ± 0.0639 | 0.4696 ± 0.0911 | 0.3549 ± 0.1524 | 0.6064 ± 0.0000 | 0.4313 ± 0.0872 | 0.5045 ± 0.0441
Wine | 0.4100 ± 0.0165 | 0.4118 ± 0.0264 | 0.4032 ± 0.0424 | 0.4083 ± 0.0317 | 0.4046 ± 0.0365 | 0.2973 ± 0.0000 | 0.3546 ± 0.0960 | 0.4225 ± 0.0116
Zoo | 0.7179 ± 0.0521 | 0.7287 ± 0.0434 | 0.7254 ± 0.0348 | 0.7068 ± 0.0498 | 0.7166 ± 0.0497 | 0.7921 ± 0.0000 | 0.6485 ± 0.0578 | 0.7219 ± 0.0093
ISOLET | 0.7114 ± 0.0123 | 0.7334 ± 0.0094 | 0.7468 ± 0.0139 | 0.7101 ± 0.0110 | 0.7362 ± 0.0067 | 0.6674 ± 0.0000 | 0.7232 ± 0.0002 | 0.7633 ± 0.0015
LS | 0.5485 ± 0.0682 | 0.4937 ± 0.1476 | 0.6157 ± 0.0120 | 0.5290 ± 0.1467 | 0.5215 ± 0.0132 | 0.6366 ± 0.0000 | 0.4477 ± 0.1125 | 0.6732 ± 0.0004
ODR | 0.7274 ± 0.0310 | 0.8206 ± 0.012 | 0.8565 ± 0.0298 | 0.8323 ± 0.0016 | 0.8628 ± 0.0048 | 0.7045 ± 0.0000 | 0.8154 ± 0.0052 | 0.9505 ± 0.0000
PD | 0.6712 ± 0.0205 | 0.7581 ± 0.0443 | 0.7965 ± 0.0046 | 0.7327 ± 0.0137 | 0.7742 ± 0.0161 | 0.6945 ± 0.0000 | 0.7265 ± 0.0217 | 0.8522 ± 0.0051
Table 6. Average ARI of all the adopted methods on all adopted datasets.
Datasets | K-Means | LWEA | DREC | ECPCS-HC | EC-CMS | ECAR | CEAM | ECM-RCL
ALLAML | 0.1333 ± 0.0596 | 0.1329 ± 0.0836 | 0.1451 ± 0.0868 | 0.0372 ± 0.0797 | 0.1044 ± 0.1106 | 0.1189 ± 0.0000 | 0.0413 ± 0.0650 | 0.1754 ± 0.0256
CLL_SUB | 0.0933 ± 0.0209 | 0.0867 ± 0.0008 | 0.0867 ± 0.0008 | 0.0867 ± 0.0008 | 0.0866 ± 0.0008 | 0.0436 ± 0.0000 | 0.0270 ± 0.0227 | 0.1230 ± 0.0011
GLIOMA | 0.3004 ± 0.1308 | 0.3588 ± 0.0348 | 0.3566 ± 0.0390 | 0.3876 ± 0.0382 | 0.3760 ± 0.0434 | 0.0000 ± 0.0000 | 0.2467 ± 0.0333 | 0.3854 ± 0.0311
Heart | 0.0287 ± 0.0013 | 0.0266 ± 0.0207 | 0.0411 ± 0.0227 | 0.0318 ± 0.0245 | 0.0459 ± 0.0194 | 0.0527 ± 0.0000 | 0.0292 ± 0.0257 | 0.0584 ± 0.0052
Iris | 0.6911 ± 0.0946 | 0.7016 ± 0.0517 | 0.7548 ± 0.0736 | 0.7489 ± 0.0688 | 0.7476 ± 0.0690 | 0.5681 ± 0.0000 | 0.4603 ± 0.1019 | 0.7975 ± 0.1174
LM | 0.2930 ± 0.0273 | 0.3283 ± 0.0218 | 0.3262 ± 0.0183 | 0.3252 ± 0.0245 | 0.3432 ± 0.0245 | 0.3076 ± 0.0000 | 0.3123 ± 0.0223 | 0.3895 ± 0.0122
Lymphoma | 0.3214 ± 0.0893 | 0.3547 ± 0.0733 | 0.3426 ± 0.0580 | 0.5594 ± 0.1150 | 0.3012 ± 0.0704 | 0.3270 ± 0.0000 | 0.3064 ± 0.0263 | 0.4119 ± 0.0433
MF | 0.4242 ± 0.0232 | 0.4652 ± 0.0304 | 0.4762 ± 0.0409 | 0.4823 ± 0.0277 | 0.5128 ± 0.0307 | 0.4293 ± 0.0000 | 0.4436 ± 0.0361 | 0.6819 ± 0.0072
Orlraws10P | 0.5703 ± 0.0541 | 0.7080 ± 0.0644 | 0.6986 ± 0.0607 | 0.6244 ± 0.0589 | 0.6959 ± 0.0419 | 0.4328 ± 0.0000 | 0.7094 ± 0.0732 | 0.8330 ± 0.0218
USPS | 0.5182 ± 0.0228 | 0.5790 ± 0.0255 | 0.5760 ± 0.0411 | 0.5963 ± 0.0480 | 0.6354 ± 0.0488 | 0.5347 ± 0.0000 | 0.5790 ± 0.0451 | 0.7061 ± 0.0000
Vertebral | 0.3145 ± 0.0089 | 0.3105 ± 0.2637 | 0.2952 ± 0.1225 | 0.2585 ± 0.3133 | 0.1374 ± 0.1809 | 0.3240 ± 0.0000 | 0.3198 ± 0.1127 | 0.4652 ± 0.1460
WarpPIE10p | 0.0592 ± 0.0185 | 0.0381 ± 0.0161 | 0.0551 ± 0.0191 | 0.0208 ± 0.0067 | 0.0407 ± 0.0182 | 0.0133 ± 0.0000 | 0.0788 ± 0.0157 | 0.3181 ± 0.0129
WDBC | 0.4914 ± 0.0000 | 0.3746 ± 0.2025 | 0.4199 ± 0.0812 | 0.5568 ± 0.1219 | 0.4100 ± 0.1872 | 0.7182 ± 0.0000 | 0.5294 ± 0.1024 | 0.6050 ± 0.0615
Wine | 0.3575 ± 0.0126 | 0.3705 ± 0.0191 | 0.3569 ± 0.0297 | 0.3563 ± 0.0286 | 0.3608 ± 0.0414 | 0.2745 ± 0.0000 | 0.3320 ± 0.0871 | 0.4007 ± 0.0000
Zoo | 0.6281 ± 0.1125 | 0.6390 ± 0.0789 | 0.6405 ± 0.0695 | 0.6226 ± 0.0501 | 0.6117 ± 0.0977 | 0.7178 ± 0.0000 | 0.5097 ± 0.1345 | 0.6282 ± 0.0154
ISOLET | 0.4692 ± 0.0185 | 0.4921 ± 0.0268 | 0.5216 ± 0.0150 | 0.4827 ± 0.0404 | 0.5447 ± 0.0031 | 0.4270 ± 0.0000 | 0.4629 ± 0.0275 | 0.5320 ± 0.0022
LS | 0.4572 ± 0.0856 | 0.4504 ± 0.2156 | 0.5274 ± 0.0073 | 0.5244 ± 0.1947 | 0.4632 ± 0.0152 | 0.5715 ± 0.0000 | 0.3090 ± 0.1029 | 0.6167 ± 0.0003
ODR | 0.6410 ± 0.0608 | 0.7644 ± 0.0249 | 0.8262 ± 0.0511 | 0.7832 ± 0.0067 | 0.8404 ± 0.0070 | 0.6047 ± 0.0000 | 0.7560 ± 0.0062 | 0.9545 ± 0.0000
PD | 0.5589 ± 0.0404 | 0.6492 ± 0.0399 | 0.6987 ± 0.0258 | 0.6105 ± 0.0247 | 0.6594 ± 0.0200 | 0.5298 ± 0.0000 | 0.6052 ± 0.0366 | 0.7971 ± 0.0040
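Tables 4–6 report ACC, NMI, and ARI, the three standard external validity indices. As a minimal sketch (assuming SciPy and scikit-learn are available, and that ACC uses the usual Hungarian matching between predicted and ground-truth clusters), the three indices can be computed from a predicted label vector as follows; the exact evaluation scripts of the compared methods are not restated here.

import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_acc(y_true, y_pred):
    # ACC: best one-to-one cluster-to-class matching found by the Hungarian algorithm.
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    count = np.zeros((clusters.size, classes.size))
    for i, k in enumerate(clusters):
        for j, c in enumerate(classes):
            count[i, j] = np.sum((y_pred == k) & (y_true == c))
    row, col = linear_sum_assignment(-count)   # maximize the number of matched samples
    return count[row, col].sum() / y_true.size

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])          # a relabeled but perfect partition
print(clustering_acc(y_true, y_pred))                  # 1.0
print(normalized_mutual_info_score(y_true, y_pred))    # 1.0
print(adjusted_rand_score(y_true, y_pred))             # 1.0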
Table 7. The Friedman test of all the adopted methods.
Methods | Rank | p-Value | The Null Hypothesis
K-means | 6.0789 | 0 | rejected
LWEA | 4.6579
DREC | 3.7105
ECPCS-HC | 4.8421
EC-CMS | 4.6053
ECAR | 5.5263
CEAM | 5.2105
ECM-RCL | 1.3684
Table 8. The Wilcoxon signed-rank test of all the adopted methods.
Methods | p-Value | The Null Hypothesis
K-means vs. ECM-RCL | 0.000143 | rejected
LWEA vs. ECM-RCL | 0.000168 | rejected
DREC vs. ECM-RCL | 0.000271 | rejected
ECPCS-HC vs. ECM-RCL | 0.00058 | rejected
EC-CMS vs. ECM-RCL | 0.000121 | rejected
ECAR vs. ECM-RCL | 0.000673 | rejected
CEAM vs. ECM-RCL | 0.000121 | rejected
Table 9. The post hoc Holm test of all the adopted methods.
i | Methods | z = (R_0 − R_i)/SE | p-Value | α_i | The Null Hypothesis
7 | K-means vs. ECM-RCL | 5.927282 | 0 | 0.007143 | rejected
6 | ECAR vs. ECM-RCL | 5.231903 | 0 | 0.008333 | rejected
5 | CEAM vs. ECM-RCL | 4.834543 | 0.000001 | 0.01 | rejected
4 | ECPCS-HC vs. ECM-RCL | 4.370957 | 0.000012 | 0.0125 | rejected
3 | LWEA vs. ECM-RCL | 4.138164 | 0.000035 | 0.0167 | rejected
2 | EC-CMS vs. ECM-RCL | 4.072937 | 0.000046 | 0.025 | rejected
1 | DREC vs. ECM-RCL | 2.947084 | 0.003208 | 0.05 | rejected
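The significance analysis in Tables 7–9 follows the usual Friedman / Wilcoxon signed-rank / Holm pipeline over the per-dataset scores. The sketch below (assuming SciPy is available; the scores array is a placeholder for the ACC values in Table 4, with ECM-RCL in the last column) shows how such tables can be reproduced. It is an illustration of the tests, not the authors' exact script.

import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon, rankdata, norm

rng = np.random.default_rng(0)
scores = rng.random((19, 8))     # placeholder: 19 datasets x 8 methods, last = ECM-RCL

# Friedman test and average ranks (Table 7); rank 1 is the best method on a dataset.
stat, p_friedman = friedmanchisquare(*[scores[:, j] for j in range(scores.shape[1])])
mean_ranks = rankdata(-scores, axis=1).mean(axis=0)

# Pairwise Wilcoxon signed-rank tests against ECM-RCL (Table 8).
p_wilcoxon = np.array([wilcoxon(scores[:, j], scores[:, -1]).pvalue
                       for j in range(scores.shape[1] - 1)])

# Holm post hoc test on the Friedman ranks (Table 9):
# z is the rank difference to ECM-RCL divided by SE = sqrt(k(k+1) / (6N)).
k, N, alpha = scores.shape[1], scores.shape[0], 0.05
se = np.sqrt(k * (k + 1) / (6 * N))
z = (mean_ranks[:-1] - mean_ranks[-1]) / se
p_z = 2 * norm.sf(np.abs(z))                 # two-sided normal p-values
order = np.argsort(p_z)                      # smallest p-value first
rejected = []
for i, j in enumerate(order):
    if p_z[j] <= alpha / (len(p_z) - i):     # Holm threshold alpha / (m - i)
        rejected.append(j)
    else:
        break                                # Holm stops at the first non-rejection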
Table 10. Ablation study under three indices of ACC, NMI, and ARI.
Datasets | Indices | EECM-RCL | PECM-RCL | RECM-RCL | DECM-RCL | ECM-RCL
GLIOMA | ACC | 0.6000 ± 0.0452 | 0.6080 ± 0.0551 | 0.6200 ± 0.0639 | 0.6320 ± 0.0567 | 0.6440 ± 0.0430
GLIOMA | NMI | 0.5143 ± 0.0353 | 0.4986 ± 0.0380 | 0.5010 ± 0.0322 | 0.5171 ± 0.0183 | 0.5253 ± 0.0198
GLIOMA | ARI | 0.3763 ± 0.0563 | 0.3652 ± 0.0611 | 0.3601 ± 0.0366 | 0.3817 ± 0.0300 | 0.3854 ± 0.0311
Heart | ACC | 0.6200 ± 0.0058 | 0.6189 ± 0.0064 | 0.6163 ± 0.0043 | 0.6222 ± 0.0000 | 0.6244 ± 0.0053
Heart | NMI | 0.0367 ± 0.0053 | 0.0375 ± 0.0033 | 0.0378 ± 0.0184 | 0.0408 ± 0.0000 | 0.0409 ± 0.0042
Heart | ARI | 0.0539 ± 0.0059 | 0.0530 ± 0.0062 | 0.0505 ± 0.0041 | 0.0562 ± 0.0000 | 0.0584 ± 0.0052
Iris | ACC | 0.9100 ± 0.0248 | 0.9080 ± 0.0208 | 0.9067 ± 0.0000 | 0.9067 ± 0.0000 | 0.9140 ± 0.0685
Iris | NMI | 0.8029 ± 0.0411 | 0.7985 ± 0.0305 | 0.7960 ± 0.0000 | 0.7960 ± 0.0000 | 0.8131 ± 0.0853
Iris | ARI | 0.7787 ± 0.1101 | 0.7641 ± 0.0494 | 0.7592 ± 0.0000 | 0.7592 ± 0.0000 | 0.7975 ± 0.1174
LM | ACC | 0.5122 ± 0.0096 | 0.5008 ± 0.0166 | 0.5169 ± 0.0070 | 0.5058 ± 0.0161 | 0.5200 ± 0.0140
LM | NMI | 0.6412 ± 0.0057 | 0.6337 ± 0.0099 | 0.6403 ± 0.0055 | 0.6398 ± 0.0026 | 0.6460 ± 0.0082
LM | ARI | 0.3828 ± 0.0099 | 0.3705 ± 0.0162 | 0.3825 ± 0.0068 | 0.3697 ± 0.0180 | 0.3895 ± 0.0122
Orlraws10p | ACC | 0.8470 ± 0.0432 | 0.8400 ± 0.0000 | 0.8430 ± 0.0067 | 0.8450 ± 0.0341 | 0.8470 ± 0.0408
Orlraws10p | NMI | 0.9113 ± 0.0310 | 0.9097 ± 0.0279 | 0.9172 ± 0.0159 | 0.9156 ± 0.0233 | 0.9207 ± 0.0158
Orlraws10p | ARI | 0.8161 ± 0.0106 | 0.8148 ± 0.0502 | 0.8285 ± 0.0222 | 0.8208 ± 0.0452 | 0.8330 ± 0.0218
WarpPIE10p | ACC | 0.4714 ± 0.0151 | 0.4671 ± 0.0163 | 0.4648 ± 0.0121 | 0.4648 ± 0.0181 | 0.4762 ± 0.0135
WarpPIE10p | NMI | 0.5528 ± 0.0093 | 0.5512 ± 0.0141 | 0.5540 ± 0.0042 | 0.5512 ± 0.0083 | 0.5562 ± 0.0127
WarpPIE10p | ARI | 0.3132 ± 0.0159 | 0.3159 ± 0.0194 | 0.3121 ± 0.0112 | 0.3072 ± 0.0468 | 0.3181 ± 0.0129
WDBC | ACC | 0.8896 ± 0.0190 | 0.8842 ± 0.0188 | 0.8989 ± 0.0124 | 0.8793 ± 0.0258 | 0.8903 ± 0.0191
WDBC | NMI | 0.5026 ± 0.0420 | 0.4878 ± 0.0436 | 0.5218 ± 0.0481 | 0.4817 ± 0.0733 | 0.5045 ± 0.0441
WDBC | ARI | 0.6028 ± 0.0613 | 0.5862 ± 0.0597 | 0.6329 ± 0.0405 | 0.5706 ± 0.0825 | 0.6050 ± 0.0615
Wine | ACC | 0.7247 ± 0.0000 | 0.7247 ± 0.0000 | 0.7253 ± 0.0018 | 0.7247 ± 0.0000 | 0.7247 ± 0.0000
Wine | NMI | 0.4155 ± 0.0146 | 0.4231 ± 0.0138 | 0.4210 ± 0.0171 | 0.4217 ± 0.0119 | 0.4225 ± 0.0116
Wine | ARI | 0.4007 ± 0.0000 | 0.4007 ± 0.0000 | 0.4007 ± 0.0000 | 0.4007 ± 0.0000 | 0.4007 ± 0.0000
Zoo | ACC | 0.7129 ± 0.0229 | 0.6960 ± 0.0959 | 0.7386 ± 0.0314 | 0.6366 ± 0.0247 | 0.7248 ± 0.0283
Zoo | NMI | 0.7215 ± 0.0093 | 0.7015 ± 0.0201 | 0.7191 ± 0.0246 | 0.6217 ± 0.0137 | 0.7219 ± 0.0093
Zoo | ARI | 0.6214 ± 0.0732 | 0.6061 ± 0.0322 | 0.6220 ± 0.0410 | 0.4636 ± 0.0124 | 0.6282 ± 0.0154
Table 11. The comparison of the maximum order of ECM-RCL.
Maximum Order | ALLAML | GLIOMA | Heart | Iris | Lymphoma | Orlraws10p
2nd | 0.7153 ± 0.0150 | 0.6440 ± 0.0430 | 0.6244 ± 0.0053 | 0.9140 ± 0.0685 | 0.6552 ± 0.0421 | 0.8470 ± 0.0408
3rd | 0.7139 ± 0.0199 | 0.6440 ± 0.0741 | 0.6148 ± 0.0000 | 0.9220 ± 0.0340 | 0.6406 ± 0.0403 | 0.8560 ± 0.0425
4th | 0.7139 ± 0.0360 | 0.6160 ± 0.0540 | 0.6111 ± 0.0000 | 0.9220 ± 0.0340 | 0.6365 ± 0.0356 | 0.8500 ± 0.0406
5th | 0.7153 ± 0.0410 | 0.6240 ± 0.0523 | 0.6078 ± 0.0012 | 0.9213 ± 0.0345 | 0.6542 ± 0.0343 | 0.8500 ± 0.0406
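Table 11 varies the maximum order V of the multi-order graph structures built in the feature space. The construction used by ECM-RCL is defined in the method section; purely as an illustration of what "order" means here, the sketch below follows one common convention in which G^j is the j-th power of a row-normalized kNN affinity, so that higher orders connect samples reachable through more intermediate neighbors. This convention is an assumption for illustration, not necessarily the authors' exact formulation.

import numpy as np
from sklearn.neighbors import kneighbors_graph

def multi_order_graphs(X, n_neighbors=10, V=2):
    # Illustrative G^1, ..., G^V as successive powers of a normalized kNN affinity.
    W = kneighbors_graph(X, n_neighbors=n_neighbors, mode='connectivity').toarray()
    W = np.maximum(W, W.T)                       # symmetrize the kNN graph
    P = W / W.sum(axis=1, keepdims=True)         # row-normalize
    graphs, G = [], np.eye(X.shape[0])
    for _ in range(V):
        G = G @ P                                # G^j encodes j-hop relations
        graphs.append(G.copy())
    return graphs

X = np.random.default_rng(0).random((150, 4))    # e.g., an Iris-sized feature matrix
G_list = multi_order_graphs(X, n_neighbors=10, V=2)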
Table 12. Average execution time (s) on all adopted datasets.
Datasets | LWEA | DREC | ECPCS-HC | EC-CMS | ECAR | CEAM | ECM-RCL
ALLAML | 0.0278 | 1.1962 | 0.0076 | 0.0437 | 5.3731 | 0.2442 | 0.0124
CLL_SUB | 0.0023 | 0.0191 | 0.0019 | 0.0179 | 6.1449 | 0.2303 | 0.0317
GLIOMA | 0.0018 | 0.0248 | 0.0014 | 0.0014 | 4.7770 | 0.1304 | 0.0111
Heart | 0.0035 | 0.0264 | 0.0025 | 0.0381 | 4.6620 | 0.4803 | 0.1100
Iris | 0.0025 | 0.0201 | 0.0018 | 0.0183 | 4.3012 | 0.3162 | 0.1263
LM | 0.0055 | 0.0630 | 0.0052 | 0.0640 | 7.5051 | 0.7737 | 0.2029
Lymphoma | 0.0030 | 0.0544 | 0.0016 | 0.0080 | 5.9408 | 0.2820 | 0.1428
MF | 0.0293 | 0.1930 | 0.0942 | 6.1873 | 22.2389 | 9.7868 | 18.3646
Orlraws10P | 0.0028 | 0.0427 | 0.0023 | 0.0094 | 7.7261 | 0.2722 | 0.0563
USPS | 0.0278 | 0.3771 | 0.0770 | 4.9987 | 18.8030 | 9.6544 | 21.2204
Vertebral | 0.0045 | 0.0321 | 0.0042 | 0.0539 | 5.0715 | 0.5352 | 0.1206
WarpPIE10p | 0.0042 | 0.0534 | 0.0039 | 0.0247 | 7.0190 | 0.4849 | 0.0877
WDBC | 0.0057 | 0.0238 | 0.0111 | 0.4950 | 6.3375 | 1.8485 | 0.4830
Wine | 0.0024 | 0.0204 | 0.0019 | 0.0263 | 4.6208 | 0.3426 | 1.3352
Zoo | 0.0019 | 0.0368 | 0.0016 | 0.0128 | 4.7620 | 0.2212 | 0.0541
ISOLET | 0.2668 | 3.2181 | 2.0049 | 339.9318 | 316.5550 | 249.7655 | 1035.0218
LS | 0.2283 | 1.7801 | 1.1972 | 132.5164 | 100.9128 | 143.3467 | 872.7249
ODR | 0.1741 | 1.8478 | 0.8966 | 65.0205 | 95.9148 | 130.5054 | 448.3944
PD | 0.5057 | 2.2972 | 4.1880 | 732.8491 | 316.2745 | 442.8964 | 3118.2037